
> Those systems are predicated on denotational semantics, which is a formalization that identifies a computation with the function it computes ... rather than view the computation as built up from functions

You know there's operational semantics too, right? Operational semantics typically describes the behaviour of your program as a state machine, especially in its small-step variety.



Of course. But I was talking about FP there, and, I think, operational semantics must be embedded in FP. In the state-machine view, the operational semantics is simply a state machine (SM) that refines the denotational semantics (which could itself be specified as a nondeterministic SM).


> I was talking about FP there, and, I think, operational semantics must be embedded in FP.

I'm not sure what you mean. Are you claiming that operational semantics is somehow a poor fit for FP?


I was being vague and imprecise. Obviously, any semantics is the semantics of a language, and every language has its own perfectly fine operational semantics. I think that FP is a non-optimal first choice for operational reasoning; let's call it that.

But while I have your attention, let me try to put my finger on two concrete problem areas in typed-PFP reasoning. Now, I'm not saying that this isn't useful for a programming language; I am saying that it is a poor choice for reasoning about programs. I think that FP has two discrete "jumps" between modes that are simply unnecessary for reasoning and do nothing but complicate matters (again, they may be useful for other reasons):

The first is the jump between "plain" functional code and monadic code. Consider an algorithm that sorts a list of numbers using mergesort. The program could be written to use simple recursion, or it could be written using a monad, with the monadic function encoding just a single step. Those two programs are very different, yet they encode the very same algorithm! The first invites a more denotational reasoning, and the second a more operational one. In TLA, there is no such jarring break. You can specify an algorithm anywhere you want on a refinement spectrum. Each step may be as small as moving a single number in memory or splitting the array, all the way up to sorting in one step. The relationship between all these different refinement levels is a simple implication:

  R1 => R2 => R3
where R3 may be completely denotational and not even algorithmic as in:

   done => result = Sorted(input)
(assuming some operator Sorted).
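To make the contrast concrete, here's a hedged Python sketch (my own, not from the thread; all names are mine) of the same mergesort written both ways: once as plain recursion, and once as an explicit state machine where each call to `step` performs exactly one merge of two adjacent runs:

```python
def merge(xs, ys):
    """Merge two sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(xs) and j < len(ys):
        if xs[i] <= ys[j]:
            out.append(xs[i]); i += 1
        else:
            out.append(ys[j]); j += 1
    return out + xs[i:] + ys[j:]

# Style 1: plain recursion -- reads denotationally, as an equation
# defining the sorted result.
def mergesort(xs):
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    return merge(mergesort(xs[:mid]), mergesort(xs[mid:]))

# Style 2: an explicit state machine -- the state is a list of sorted
# runs, and each step merges the first two, mimicking the "one step at
# a time" (monadic / operational) presentation of the same algorithm.
def step(runs):
    return [merge(runs[0], runs[1])] + runs[2:]

def mergesort_machine(xs):
    runs = [[x] for x in xs] or [[]]
    while len(runs) > 1:
        runs = step(runs)
    return runs[0]
```

The two encode the same algorithm; only the presentation (denotational vs. operational) differs.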

The second discrete jump is between the program itself and the types. I was just talking to an Agda-using type theorist today, and we noted how a type is really a nondeterministic program specifying all possible deterministic programs that can yield it. This is a refinement relation. Yet, in FP, types are distinct from programs (even in languages where they use the same syntax). In TLA, the relationship between a "type", i.e., a proposition about a program, and the program itself is, you guessed it, a simple refinement, i.e., simple logical implication (he figures that intermediate refinement steps are analogous to a "program with holes" in Agda). So, the following is a program that returns the maximal element in a list:

A2 ≜ done = FALSE ∧ max = 0 ∧ [](done' = TRUE ∧ max' = Max(input))

but it is also the type (assuming dependent types) of all programs that find the maximum element in a list, say (details omitted):

A1 ≜ done = FALSE ∧ max = {} ∧ i = 0 ∧ [](IF i = Len(input) THEN done' = TRUE ELSE (input[i] > max => max' = input[i]) ∧ i' = i + 1)

Then, A1 => A2, because A1 is a refinement of A2.
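As a sanity check, here is a hypothetical Python transcription of A1 (0-indexed, with the unset `max = {}` modeled as `None`; details adapted from the sketch above), together with A2 read as a property of A1's traces:

```python
def run_A1(inputs):
    """Execute A1's state machine and return the trace of states
    (done, max, i), one entry per step."""
    done, mx, i = False, None, 0   # A1's initial condition
    trace = [(done, mx, i)]
    while not done:
        if i == len(inputs):            # IF i = Len(input) THEN done' = TRUE
            done = True
        else:                           # ELSE maybe bump max, advance i
            if mx is None or inputs[i] > mx:
                mx = inputs[i]
            i += 1
        trace.append((done, mx, i))
    return trace

def satisfies_A2(trace, inputs):
    """A2 read as a property: always (done => max = Max(input))."""
    return all(mx == max(inputs) for done, mx, i in trace if done)
```

Every trace of this A1 satisfies A2's guarantee, which is what A1 => A2 claims.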

So two very central concepts in typed-PFP, namely monads and types, are artificial constructs that essentially just mean refinement. Not only is refinement a single concept, it is a far simpler concept to learn than either monads or types. In fact, once you learn the single idea of an "action" in TLA, which is how state machines are encoded as logic (it is not trivial for beginners, but relatively easy to pick up), refinement is plain old familiar logical implication.

So I've just removed two complicated concepts and replaced them with a single one that requires little more than high-school math, all without losing an iota of expressivity or proving power.


So how do you write the type "forall a. a -> a -> a" in TLA?


That depends what it is that you want to specify: a function or a computation (i.e., a machine). In TLA+ as in "reality" computations are not functions, and functions are not computations.

We'll start with a function. It's basically `[a -> [a -> a]]` or `[a × a -> a]`. More completely (to show that `a` is abstract):

    CONSTANT a, f
    ASSUME f ∈ [a -> [a -> a]]
`CONSTANT` means that `a` can be anything (formally, any set).

If you want to specify a computation that, when it terminates[1], returns a result in `a`, the "type" would be `[](done => result ∈ a)` (where [] stands for the box operator, signifying "always"), or more fully:

    CONSTANT a, Input
    ASSUME Input ∈ a × a
    VARIABLES done, result
    f ≜ ... \* the specification of the computation
    THEOREM f => [](done => result ∈ a)
So the first type is purely static: a set membership. The second has a modality, which says that whenever (i.e., at any step) `done` is true, `result` must be in `a`.
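To illustrate the difference, a hedged Python sketch (my own approximation: it checks the two kinds of "type" on finite sets and bounded runs rather than proving anything):

```python
def check_static(f, a):
    """The static type: f ∈ [a -> [a -> a]], approximated by
    exhaustively checking membership over the finite set `a`."""
    return all(f(x)(y) in a for x in a for y in a)

def check_temporal(step, init, a, limit=100):
    """The temporal type [](done => result ∈ a): at every state of
    the run, if `done` holds then `result` must be in `a`."""
    state = init  # a dict with at least the key 'done'
    for _ in range(limit):
        if state['done']:
            return state['result'] in a
        state = step(state)
    return False  # did not terminate within the bound
```

For example, `check_static(lambda x: lambda y: x, {1, 2, 3})` holds, mirroring the curried first-projection inhabiting `[a -> [a -> a]]`.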

------

[1]: Let's say that we define termination as setting the variable `done` to TRUE. BTW, in TLA as in other similar formalisms, a terminating computation is a special case of a non-terminating one: one that at some finite time stutters forever, i.e., doesn't change state; formally, <>[](x' = x), or the sugared <>[](UNCHANGED x), namely, eventually `x` is always the same (<> stands for diamond and [] for box; for some reason HN doesn't display those characters).


Actually, I've thought of something that will be an even better example. You mentioned "reduce" here

https://news.ycombinator.com/item?id=11910858

Could you link to its implementation?


I'll quote my own:

    Reduce(Op(_, _), x, seq) ==
        LET RECURSIVE Helper(_, _)
        Helper(i, y) ==
            IF i > Len(seq) THEN y
                            ELSE Helper(i + 1, Op(y, seq[i]))
        IN Helper(1, x)
A couple of things. First, the need for the helper is just a limitation of the language, which doesn't allow recursive operators to take operators as arguments. Second, I used indexing, but you could use the Head and Tail operators; it's just that I use this often, and I care about model-checking performance. Finally, Reduce is an operator, not a function (it is not a function in the theory). Hence, its domain (for all arguments that aren't operators) is "all sets", something that would be illegal for a "proper" function (Russell's paradox). Operators, like functions, can be higher order, but they're not "first-class", i.e., they can't be considered "data". Functions require a proper domain (a set).
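For readers more at home in a programming language: the quoted Reduce is just a left fold, i.e., the shape of Python's `functools.reduce` (a sketch of the shape only, not of TLA+'s semantics):

```python
from functools import reduce as py_reduce

def Reduce(op, x, seq):
    """Left fold: op(...op(op(x, seq[0]), seq[1])..., seq[-1])."""
    y = x
    for e in seq:          # mirrors Helper(i, y) walking i = 1 .. Len(seq)
        y = op(y, e)
    return y
```

So `Reduce(lambda a, b: a + b, 0, [1, 2, 3])` sums the sequence, exactly as the TLA+ operator would with `Op(a, b) == a + b`.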


> I'll quote my own: ...

Thanks

> Operators, like functions, can be higher order, but they're not "first-class", i.e., they can't be considered "data".

Do you mean you can't pass an operator to a function (or make it the argument of a computation)?

EDIT: Or more generally, what are the consequences of it not being "first-class", i.e., not being considered "data"?


> Do you mean you can't pass an operator to a function

Right. Functions are objects in the theory, and must have a domain which is a set. Operators are outside the theory, and aren't in any sets. You also can't return an operator as a result of an operator.

A function can, of course, return a function, but functions can't be polymorphic within a single specification. So they must have a particular (though perhaps unspecified) set as their domain. Why? Because there is no such thing as a polymorphic function in math. Polymorphism is a feature of a certain language or "calculus", and TLA+ is about algorithms, not about a specific linguistic representation of them. "Polymorphism" of the kind I've shown does make sense, because you determine the level of the specification, and you can say that you want to reason about the use of a function (or a computation) without assuming anything more specific than your explicit assumptions about a certain set. But that is not to say that you can't have an operator that "creates" functions generically. E.g.:

    Foo(a) ≜ [x ∈ a × a ↦ x[1]]   /or/  Foo(a) ≜ LET f[x ∈ a × a] ≜ x[1] IN f
Which you can then use like so: Foo(Nat)[1, 2]. Foo must be an operator because its argument `a` ranges over all sets, which is not a proper domain (Russell, etc.)
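A rough Python analogue of the generic Foo (a sketch; Python functions don't carry a domain, so the domain `a × a` is approximated by an assertion):

```python
def Foo(a):
    """Given any set `a`, return the first-projection function on a × a."""
    def f(pair):
        x, y = pair
        assert x in a and y in a, "argument outside a × a"
        return x
    return f
```

Then `Foo(set(range(10)))((1, 2))` yields 1, mirroring `Foo(Nat)[1, 2]`: the factory `Foo` is generic over `a`, while each `f` it returns works on one concrete domain.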

> or make it the argument of a computation

Ah, that actually you can do, and it's quite common as it's very useful. A constant can be an operator (a variable can't because a variable is state, and state must be an object in the theory). For example, it's useful for taking a relation as input (although a relation can also be defined as a set of pairs, so it’s an object in the theory). If in the polymorphic example you want to say that the set `a` is partially ordered (because you're specifying a sorting algorithm), you can write[1]:

    CONSTANTS a, _ ⊑ _
    ASSUME ∀x, y, z ∈ a :
              ∧ x ⊑ x
              ∧ x ⊑ y ∧ y ⊑ x => x = y
              ∧ x ⊑ y ∧ y ⊑ z => x ⊑ z
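For a finite `a`, those three assumptions can be checked directly by enumeration; a hedged Python sketch (my own, with `leq` standing in for ⊑):

```python
def is_partial_order(a, leq):
    """Check reflexivity, antisymmetry, and transitivity of `leq`
    over the finite set `a` -- the three conjuncts of the ASSUME."""
    refl = all(leq(x, x) for x in a)
    antisym = all(not (leq(x, y) and leq(y, x)) or x == y
                  for x in a for y in a)
    trans = all(not (leq(x, y) and leq(y, z)) or leq(x, z)
                for x in a for y in a for z in a)
    return refl and antisym and trans
```

E.g., `<=` on numbers and divisibility on positive integers both pass; strict `<` fails reflexivity.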
All of this, BTW, is actually the "+" in TLA+ (which is a bit technical, and trips me up occasionally), and not where the core concepts lie, which is the TLA part. Lamport would say you're focusing on the boring algebraic stuff rather than the interesting algorithmic stuff… I guess you could come up with another TLA language, with a different "+", but the choice of this particular "+" (i.e., set theory) was made after experimentation and work with engineers back in the '90s (Lamport says that early versions were typed). There is actually a type inferencer for TLA+, which is used when proofs are passed to the Isabelle backend (or intended to be used, I don't know; the proof system is under constant development at INRIA). So you get rich mathematical reasoning using simple math — no monads or dependent types (nor any types) necessary. Those things can be useful (for automatic code extraction, maybe, or for deep mathematical theories of languages — not algorithms), but for mathematically reasoning about programs, this is really all you need. And it works well, in the industry, for programs as small as a sorting routine or as large as a complex distributed database.

[1]: The aligned conjunctions are a syntactic convenience to avoid parentheses.


Are those actually polymorphic? It seems like you've assumed a specific, concrete `a`.

EDIT: On a second look, "CONSTANT" seems more like declaring a variable than requiring a concrete type.

EDIT 2: Hmm, still not completely sure about this ...


Constants are external inputs to a specification. All you know about them is what you assert in the assumptions. When you use the proof system, the assumptions are your axioms, and if you don't have any, then it's really polymorphic. If you use the model checker, you'll need to supply it with a concrete -- and finite[1] -- set for all constants.

[1]: The one supplied with TLA+, called TLC, is powerful enough to model-check an expressive language like TLA+, but its algorithm is rather primitive; it's an "old-school" explicit-state model checker. A more modern, state-of-the-art model checker is under research: http://forsyte.at/research/apalache/. BTW, modern model checkers are amazing. Some can even check infinite models.



