\documentclass[a4paper,DIV11]{scrartcl} \usepackage[utf8]{inputenc} \usepackage[english]{babel} \usepackage{hyperref} \hypersetup{ pdftitle=Algebraic Data Type Done Differently, pdfauthor=Joachim Breitner, pdfpagelayout=Onecolumn, } \usepackage{listings} \lstnewenvironment{code}{}{} \lstset{ basicstyle=\ttfamily, keywordstyle=\normalfont\bfseries, identifierstyle=\bfseries, commentstyle=\rmfamily, texcl=true, language=Haskell, flexiblecolumns=false, basewidth={0.5em,0.45em}, extendedchars=true, % frame=leftline, % numbers=left, } \renewcommand{\l}{\lstinline} \title{Algebraic Data Types Done Differently} \author{Joachim Breitner} \date{3.6.2006} \begin{document} \maketitle \begin{abstract} For a while I have been thinking: Isn’t there a way to get rid of the intermediate \l-Maybe- construct in a commen expression like \l-fromMaybe default . lookup-. It seems that a way to do that would be to pass more information to the \l-Maybe--generating function: What to do with a \l-Just--Value, and what to return in case of \l-Nothing-. This leads to a new definion of the \l-Maybe- data type as a function. Later I discovered that this seems to work for any algebraic data type. This is probably nothing new, but I am offline at the moment, so I can’t check. This means that this might also be total rubbish. Enjoy. \end{abstract} \tableofcontents \section{Preparation} Assuming we have a function \l-f :: Maybe a- that returns a \l-Maybe- containing a variable of type \l-a-. Usually, we do not really care about this encapsulation, but we rather care to handle two cases – one with data and one without – separately. If we want to get rid of the \l-Maybe-, we obviously need to pass a default return value to the function. Therefore, we have \l_f :: a -> a_. But that is not sufficient: We still want to be able to distinguish between \l-Nothing- and \l-Just default-. So we move the processing of the \l-Just- value into the function. Because the result type of that processing does not matter to \l-f-, the type becomes \l_f :: forall b. (a -> b) -> b -> b_. Is this equivalent to \-Prelude.Maybe-? We can check by trying to implement the various functions. \section{Implementation of our \l-Maybe-} We first hide some functions from \l-Prelude- that we are going to redefine, and import some standard modules. The \l-forall- construct is not standard Haskell98, so we tell ghc to not be so picky. \begin{code} {-# OPTIONS_GHC -fglasgow-exts #-} import Prelude hiding (Maybe, Bool, maybe, not) import Control.Monad (MonadPlus, mzero, mplus) \end{code} Now let’s define the type. We could have used \l-type- instead, and omitted the \l-M- constructor, but that would prevent us from declaring type class instansnces for our \l-Maybe-, so we have to take the long road with \l-newtype- and a constructor. \begin{code} newtype Maybe a = M (forall b. (a -> b) -> b -> b) \end{code} We also need replacements for the standard constructors \l_Just_ and \l_Maybe_. These are now mere functions, so we have to write them in lower case. \l_just_ just applies the given function to the stored value, ignoring the default. \l_nothing_ just returns the default value, ignoring the function. \begin{code} just :: a -> Maybe a just x = M (\f d -> f x) nothing :: Maybe a nothing = M (\f d -> d) \end{code} We also want to be able to retrieve a value from \l_Maybe_, so we implement the \l_maybe_ function as seen in the \l_Prelude_. \begin{code} maybe d f (M a) = a f d \end{code} Because of the similarity between \l_maybe_ and our definition of \l_Maybe_, some of the following code could be written using this \l_maybe_ function. I chose not to, to keep the structure visible. On the other hand this means that any implementation of \l_Maybe_ just needs \l_just_, \l_nothing_ and \l_maybe_, and all the other functions can be done based on that, independently of the implementation. Currently, our \l_Maybe_s are just uncomprehensible functions. We want to make them visible, so we define an instance for \l_Show_. \begin{code} instance Show a => Show (Maybe a) where show (M a) = a (("Just "++) . show) "Nothing" \end{code} Monads are alwasy interesting, so we implement that instance as well. I use the \l_maybe_ function once out of convenience, otherwise this might be an ugly \l_let_ expression. \begin{code} instance Monad Maybe where return = just fail _ = nothing (M a) >>= b = M (\f d -> a (maybe d f . b) d) \end{code} Is this really a monad? We can find out by proving the monad laws, but the proofs are ugly. See the end of this document if you want to know. Just for fun we can define other class instances. \begin{code} instance MonadPlus Maybe where mzero = nothing mplus (M a) (M b) = M (\f d -> a f (b f d)) instance Functor Maybe where fmap f' (M a) = M (\f d -> a (f.f') d) \end{code} A demonstration of how to implement a “regular” \l_Maybe_ function here, see \l_maybeToList_: \begin{code} maybeToList (M a) = a (:[]) [] \end{code} \section{Same with \l_Bool_} After having done this, I wonder: Can we also implement other data types this way? And it seems we can. Let’s try \l_Bool_. We usually work with boolean values to later on decide which of two possible values we want. So why not pass these values directly to our \l_Bool_ and skip the middle man? So we get: \begin{code} newtype Bool = B (forall a. a -> a -> a) \end{code} Again, we implement the constructors as functions. Just for fun I’ll use point-free style here, instead of lambda-notation. A true value will return the first argument, a false value the second. \begin{code} true :: Bool true = B const false :: Bool false = B (flip const) \end{code} Some logical operations: \begin{code} not (B a) = B (flip a) (B a) && (B b) = B (\t f -> a (b t f) f) (B a) || (B b) = B (\t f -> a t (b t f)) \end{code} Usually a \l_Bool_ ends up in an \l_if_ expression. But the standard \l_if_ requires, hardcodedly, a \l_Bool_. If the \l_if_ construct were a regular function, we could re-define that. But it is not, so we create our own \l_if'_. \begin{code} if' (B a) t f = a t f \end{code} It seems that \l_if'_ is to our \l_Bool_ what \l_maybe_ is to our \l_maybe_. Also, all implementations of an boolean value would need to only provide \l_true_, \l_false_ and some kind of \l_if_. The instance to show these values is straight forward \begin{code} instance Show Bool where show (B a) = a "True" "False" \end{code} To make our \l_Bool_s compareable, we need to implement \l_Eq_. Unfortnately, this hard-codes the result type \l_Prelude.Bool_, so we are leaving our world here. We could define our own \l_Eq'_ with \l_(==)_ returning our \l_Bool_. \begin{code} instance Eq Bool where (B a) == (B b) = a (b True False) (b False True) \end{code} To combine our two types, we implement \l_isJust_ and \l_isNothing_. \begin{code} isJust (M a) = B (\t f -> a (const t) f) isNothing = not.isJust \end{code} \section{Other data types} It seems that all algebraic data types can be represented this way. There will be one parameter per constructor which itself takes as many parameters as the constructor takes arguments. So examples might be: \begin{code} newtype Either a b = E (forall c. (a -> c) -> (b -> c) -> c) newtype (,) a b = E (forall c. (a -> b -> c) -> c) \end{code} \section{And what is it good for?} No idea. I was hoping for higher efficiency because there is no need to pack the data in a \l_Maybe_ or similar, but intuitively this implementation will be much more expensive because of all the functions to pass around. I also expect functions whose result is needed more than once to be run more often than necesary, although the compiler could optimize that away. \l_Maybe_. \appendix \section{Proofs of the Monad law} Sorry for being lazy and not commenting every step. And yes, they are ugly \begin{lstlisting} unM (M a) = a -- shortcut unM . M == \x -> unM (M x) == \x -> x == id -- properties M . unM == id :: Maybe a -> Maybe a return a >>= k == just a >>= k -- def. return == M (\f d -> f a) >>= k -- def. just == M (\f d -> (\f' d' -> f a) (maybe d f . b) d) -- def. bind == M (\f d -> (maybe d f . b) a) -- apply lambda == M (\f d -> maybe d f (b a)) -- def. (.) == M (\f d -> maybe d f (M (unM (b a)))) -- shortcut == M (\f d -> unM (b a) f d) -- def. maybe == M (unM (b a)) -- eta reduction (?) == b a -- shortcut (M m) >>= return == (M m) >>= just == (M m) >>= (\a -> just a)) == (M m) >>= (\a -> M (\f d -> f a)) == M (\f d -> m (maybe d f . (\a -> M (\f' d' -> f' a)) d)) == M (\f d -> m (\a -> maybe b f ( M (\f' d' -> f' a)) d)) == M (\f d -> m (\a -> (\f' d' -> f' a) f d) d)) == M (\f d -> m (\a -> f a) d) == M (\f d -> m f d) == M m (M m) >>= (\x -> k x >>= h) == M (\f d -> m (maybe d f . (\x -> k x >>= h)) d) == M (\f d -> m (\a -> (maybe d f . (\x -> k x >>= h)) a) d) == M (\f d -> m (\a -> (maybe d f ((\x -> k x >>= h) a))) d) == M (\f d -> m (\a -> (maybe d f (k a >>= h))) d) == M (\f d -> m (\a -> (maybe d f (M (unM (k a >>= h))))) d) == M (\f d -> m (\a -> (unM (k a >>= h)) f d) d) == M (\f d -> m (\a -> (let M k' = (k a >>= h) in unM (M k')) f d) d) == M (\f d -> m (\a -> (let M k' = (k a >>= h) in k') f d) d) == M (\f d -> m (\a -> (let M k' = M (\f' d' -> unM (k a) (maybe d' f' . h)) in k') f d) d) == M (\f d -> m (\a -> (\f' d' -> unM (k a) (maybe d' f' . h) d') f d) d) == M (\f d -> m (\a -> unM (k a) (maybe d f . h) d) d) == M (\f d -> m (\a -> unM (k a) (maybe d f . h) d) d) == M (\f d -> m (\a -> maybe d (maybe d f . h) (M (unM (k a)))) d) == M (\f d -> m (\a -> maybe d (maybe d f . h) (k a)) d) == M (\f d -> m (\a -> (maybe d (maybe d f . h) . k) a) d) == M (\f d -> m (maybe d (maybe d f . h) . k) d) == M (\f d -> (\f' d' -> m (maybe d' f' . k) d') (maybe d f . h) d) == M (\f d -> let M mk' = M (\f' d' -> m (maybe d' f' . k) d') in mk' (maybe d f . h) d) == M (\f d -> let M mk' = (M m >>= k) in mk' (maybe d f . h) d) == M (\f d -> let M mk' = (M m >>= k) in unM (M mk') (maybe d f . h) d) == M (\f d -> unM ((M m) >>= k) (maybe d f . h) d) == ((M m) >>= k) >>= (M h) \end{lstlisting} \end{document}