Joachim Breitner

A different Maybe maybe

Published 2007-03-07 in sections English, Haskell.

This is a post about the implementation of data types in the programming language Haskell. If you haven’t heard about that, you can proabably safely skip it.

For a while I have been thinking: Isn’t there a way to get rid of the intermediate Maybe construct in a common expression like fromMaybe default . lookup. It seems that a way to do that would be to pass more information to the Maybe-generating function: What to do with a Just-Value, and what to return in case of Nothing. This leads to a new definion of the Maybe data type as a function. Later I discovered that this seems to work for any algebraic data type.

This is probably nothing new, but I was offline at the time of writing, so I didn’t check. This means that this might also be total rubbish. Enjoy.

Update: Several responses in the “haskell-cafe”-list told me that I just re-invented Curch-Encoding. Nice :-)

Comments

This is, in fact, the standard way of encoding datatypes in the lambda calculus (which has only functions). It's quite flexible, as you'll see if you start playing around.
#1 Dylan Thurston am 2007-03-08
Another interesting thing about this is that if you think about it a bit, the standard treatment of algebraic datatypes in Haskell/ML is a pretty thin layer on this. Typically, to use values from an algebraic datatype, you have to unpack the data using some sort of a casing construct (either a function call or a "case" statement). In a lambda-calculus reduction, this is just syntactic sugar for building a new pile of closures and passing them to the encoding function of the structure.





Probably a more fruitful way to optimize Maybe (and one that I would hope the compiler already does!) is to use pointer tagging-type tricks. You might be familiar with this already -- essentially, you use the fact that all pointers on a 32-bit architecture have the last two bits set to zero. So, if you set (say) the low bit of a possibly-pointer value to 1, that means that it's something other than a pointer. You can use this to give common constants like [] and Nothing a unique encoding.
#2 Daniel Burrows am 2007-03-10

Have something to say? You can post a comment by sending an e-Mail to me at <mail@joachim-breitner.de>, and I will include it here.