Joachim Breitner

Less parentheses

Published 2017-09-10 in sections English, Haskell.

Yesterday, at the Haskell Implementers Workshop 2017 in Oxford, I gave a lightning talk titled ”syntactic musings”, where I presented three possibly useful syntactic features that one might want to add to a language like Haskell.

The talked caused quite some heated discussions, and since the Internet likes heated discussion, I will happily share these ideas with you

Context aka. Sections

This is probably the most relevant of the three proposals. Consider a bunch of related functions, say analyseExpr and analyseAlt, like these:

analyseExpr :: Expr -> Expr
analyseExpr (Var v) = change v
analyseExpr (App e1 e2) =
  App (analyseExpr e1) (analyseExpr e2)
analyseExpr (Lam v e) = Lam v (analyseExpr flag e)
analyseExpr (Case scrut alts) =
  Case (analyseExpr scrut) (analyseAlt <$> alts)

analyseAlt :: Alt -> Alt
analyseAlt (dc, pats, e) = (dc, pats, analyseExpr e)

You have written them, but now you notice that you need to make them configurable, e.g. to do different things in the Var case. You thus add a parameter to all these functions, and hence an argument to every call:

type Flag = Bool

analyseExpr :: Flag -> Expr -> Expr
analyseExpr flag (Var v) = if flag then change1 v else change2 v
analyseExpr flag (App e1 e2) =
  App (analyseExpr flag e1) (analyseExpr flag e2)
analyseExpr flag (Lam v e) = Lam v (analyseExpr (not flag) e)
analyseExpr flag (Case scrut alts) =
  Case (analyseExpr flag scrut) (analyseAlt flag <$> alts)

analyseAlt :: Flag -> Alt -> Alt
analyseAlt flag (dc, pats, e) = (dc, pats, analyseExpr flag e)

I find this code problematic. The intention was: “flag is a parameter that an external caller can use to change the behaviour of this code, but when reading and reasoning about this code, flag should be considered constant.”

But this intention is neither easily visible nor enforced. And in fact, in the above code, flag does “change”, as analyseExpr passes something else in the Lam case. The idiom is indistinguishable from the environment idiom, where a locally changing environment (such as “variables in scope”) is passed around.

So we are facing exactly the same problem as when reasoning about a loop in an imperative program with mutable variables. And we (pure functional programmers) should know better: We cherish immutability! We want to bind our variables once and have them scope over everything we need to scope over!

The solution I’d like to see in Haskell is common in other languages (Gallina, Idris, Agda, Isar), and this is what it would look like here:

type Flag = Bool
section (flag :: Flag) where
  analyseExpr :: Expr -> Expr
  analyseExpr (Var v) = if flag then change1 v else change2v 
  analyseExpr (App e1 e2) =
    App (analyseExpr e1) (analyseExpr e2)
  analyseExpr (Lam v e) = Lam v (analyseExpr e)
  analyseExpr (Case scrut alts) =
    Case (analyseExpr scrut) (analyseAlt <$> alts)

  analyseAlt :: Alt -> Alt
  analyseAlt (dc, pats, e) = (dc, pats, analyseExpr e)

Now the intention is clear: Within a clearly marked block, flag is fixed and when reasoning about this code I do not have to worry that it might change. Either all variables will be passed to change1, or all to change2. An important distinction!

Therefore, inside the section, the type of analyseExpr does not mention Flag, whereas outside its type is Flag -> Expr -> Expr. This is a bit unusual, but not completely: You see precisely the same effect in a class declaration, where the type signature of the methods do not mention the class constraint, but outside the declaration they do.

Note that idioms like implicit parameters or the Reader monad do not give the guarantee that the parameter is (locally) constant.

More details can be found in the GHC proposal that I prepared, and I invite you to raise concern or voice support there.

Curiously, this problem must have bothered me for longer than I remember: I discovered that seven years ago, I wrote a Template Haskell based implementation of this idea in the seal-module package!

Less parentheses 1: Bulleted argument lists

The next two proposals are all about removing parentheses. I believe that Haskell’s tendency to express complex code with no or few parentheses is one of its big strengths, as it makes it easier to visualy parse programs. A common idiom is to use the $ operator to separate a function from a complex argument without parentheses, but it does not help when there are multiple complex arguments.

For that case I propose to steal an idea from the surprisingly successful markup language markdown, and use bulleted lists to indicate multiple arguments:

foo :: Baz
foo = bracket
        • some complicated code
          that is evaluated first
        • other complicated code for later
        • even more complicated code

I find this very easy to visually parse and navigate.

It is actually possible to do this now, if one defines (•) = id with infixl 0 •. A dedicated syntax extension (-XArgumentBullets) is preferable:

  • It only really adds readability if the bullets are nicely vertically aligned, which the compiler should enforce.
  • I would like to use $ inside these complex arguments, and multiple operators of precedence 0 do not mix. (infixl -1 • would help).
  • It should be possible to nest these, and distinguish different nesting levers based on their indentation.

Less parentheses 1: Whitespace precedence

The final proposal is the most daring. I am convinced that it improves readability and should be considered when creating a new language. As for Haskell, I am at the moment not proposing this as a language extension (but could be convinced to do so if there is enough positive feedback).

Consider this definition of append:

(++) :: [a] -> [a] -> [a]
[]     ++ ys = ys
(x:xs) ++ ys = x : (xs++ys)

Imagine you were explaining the last line to someone orally. How would you speak it? One common way to do so is to not read the parentheses out aloud, but rather speak parenthesised expression more quickly and add pauses otherwise.

We can do the same in syntax!

(++) :: [a] -> [a] -> [a]
[]   ++ ys = ys
x:xs ++ ys = x : xs++ys

The rule is simple: A sequence of tokens without any space is implicitly parenthesised.

The reaction I got in Oxford was horror and disgust. And that is understandable – we are very used to ignore spacing when parsing expressions (unless it is indentation, of course. Then we are no longer horrified, as our non-Haskell colleagues are when they see our code).

But I am convinced that once you let the rule sink in, you will have no problem parsing such code with ease, and soon even with greater ease than the parenthesised version. It is a very natural thing to look at the general structure, identify “compact chunks of characters”, mentally group them, and then go and separately parse the internals of the chunks and how the chunks relate to each other. More natural than first scanning everything for ( and ), matching them up, building a mental tree, and then digging deeper.

Incidentally, there was a non-programmer present during my presentation, and while she did not openly contradict the dismissive groan of the audience, I later learned that she found this variant quite obvious to understand and more easily to read than the parenthesised code.

Some FAQs about this:

  • What about an operator with space on one side but not on the other? I’d simply forbid that, and hence enforce readable code.
  • Do operator sections still require parenthesis? Yes, I’d say so.
  • Does this overrule operator precedence? Yes! a * b+c == a * (b+c).
  • What is a token? Good question, and I am not not decided. In particular: Is a parenthesised expression a single token? If so, then (Succ a)+b * c parses as ((Succ a)+b) * c, otherwise it should probably simply be illegal.
  • Can we extend this so that one space binds tighter than two spaces, and so on? Yes we can, but really, we should not.
  • This is incompatible with Agda’s syntax! Indeed it is, and I really like Agda’s mixfix syntax. Can’t have everything.
  • Has this been done before? I have not seen it in any language, but Lewis Wall has blogged this idea before.

Well, let me know what you think!

Comments

It is not the first time you talk about sections and I find them a reasonable and useful addition. I wouldn't use the word section though, or we'll get in a mess when discussing expressions like (1+)!

I don't particularly feel the need for "bulleted arguments" (maybe because we already put the "very complicated code" in a where clause? I don't know). The "enforce alignment" doesn't sound a feature to me when tweaking stuff of debugging.

The "whitespace precedence" looks so easy to get and visually appealing! I am so amazed at its simplicity and wrote some Haskell in ghci off the top of my head to see if it "holds" (as in, if it doesn't bite back in more complex examples). To my surprise it worked flawlessly and I really hope to see this implemented one day: I am far for proficient in Haskell, but if you need a helping hand, drop me a line!

#1 Francesco Ariis am 2017-09-10

I wrote section in the blog post just to vary the concrete (because it matters less than the concept) and to get Coq users on board. The ghc proposal uses the Isar-inspired syntax context fixes … where ….

Alignment enforcement is already ubiquituous in Haskell (top level definitions, do blocks, pattern matches) etc. I think it would go well.

Glad to hear that the whitespace thing stuff works so well :)

#2 Joachim Breitner am 2017-09-10

The language "Fortress" distinguishes between "tight" and "loose" operands when discussing mathematical operators. Fortress is a (now defunct) language that tried to revive the Fortran heritage for modern software design and high-performance computing, and as such, had a number of very interesting features that are now slowly being rediscovered.

Apart from distinguishing between tight and loose operands, it also introduced a latex-like alternative syntax for unicode that can make code both concise on a graphical terminal as well as readable in plain ASCII. And its module system (modules are called "fortresses") offers features that sit somewhere in between Haskell Stack and Haskell Backpack.

See section 16.2 (page 101) on the Fortress language standard for details.

Regarding the original discussion -- a compiler warning for (potentially) misleading use or lack of white space would be much appreciated.

#3 Erik Schnetter am 2017-09-11

I used the last idea in a Lisp dialect of mine. (search for 'infix'). Ross Angle has pointed out to me that an earlier language called Merd has used it as well.

I think it's a neat idea, worth having in a language designer's toolbox. But I wouldn't go past distinguishing between operators with spaces and operators without. Operators with 2 spaces vs 1 (or 7 vs 6) would be the devil.

#4 Kartik Agaram am 2017-09-12

I support your idea for using absent internal whitespace as implicit bounding parentheses

It is a notation I have been using for a couple years in my personal mindmapping.

#5 Jeffrey Brown am 2017-09-14

Have something to say? You can post a comment by sending an e-Mail to me at <mail@joachim-breitner.de>, and I will include it here.