\lettrine[nindent=0ex]{F}{ormal} semantics are the basic building block of all rigorous programming language research. Not only do they force us to think our work through in all detail -- without a precise definition of the meaning of programs, we cannot conduct any proofs. Therefore, as I do want to be able to prove theorems about my work, I need a suitable semantics, and must also implement it in Isabelle. Furthermore, semantics provide a common ground for the research community: If the same semantics are used, then results can easily be compared and combined. Therefore, I should not just define a semantics that happens to suit me, but preferably choose an existing, well-established semantics to build on.

One such semantics is John Launchbury’s “Natural Semantics for Lazy Evaluation” \cite{launchbury}, which has several important traits: It is simple, as it has only four rules. It is detailed enough to model lazy evaluation. It is abstract enough to not model unnecessary details. And it is widely accepted as a standard semantics.

Using a standard denotational semantics, Launchbury underpins his natural semantics by claiming correctness (evaluation in the natural semantics preserves denotation) and adequacy (all programs with a denotation have a derivation in the natural semantics). While he proves correctness in sufficient detail, he only outlines the adequacy proof -- an omission that resisted fixing, despite the popularity of the semantics, and despite serious attempts to follow his proof sketch (e.g.\@ \cite{sanchezlaunchbury}).

In this chapter, I reproduce Launchbury's semantics, including subsequent improvements by Sestoft \cite{sestoft} and modernisations to how name binding is handled. This yields a definition that is suitable for formalisation in Isabelle. The original correctness proof was almost directly usable in the mechanisation and required only minor adjustments, which I discuss. I then provide a full adequacy proof, where I do not follow Launchbury's outline directly, but find a more elegant and direct proof. Parts of this chapter, in particular the adequacy proof, have been submitted to the Journal of Functional Programming \cite{launchbury-adequacy-preprint}.

Dedicated sections explicate the differences to Launchbury's work, serving two purposes: The reasons for deviation can be educational to someone attempting a similar formalisation. Furthermore, they serve as checklists when combining this work with other Launchbury-based developments.

Finally, in preparation for \cref{ch:provingcallarity}, I extend the semantics and the proofs with a simple base type, and introduce a corresponding small-step semantics.

\section{Launchbury's semantics}
\label{sec:syntax}

Launchbury defines a semantics for the simple untyped lambda calculus given in \cref{fig:launchbury_syntax}, consisting of variables, lambda abstractions, applications and mutually recursive bindings.

\begin{figure}
\abovedisplayskip=0pt
\belowdisplayskip=0pt
\begin{alignat*}{2}
x,y,z,w &\in \sVar \\
e &\in \sExp &{} \Coloneqq {}& \sLam x e \\
& &{} \mid {}& \sApp e x \\
& &{} \mid {}& x \\
& &{} \mid {}& \sLet {\xeng} e
\end{alignat*}
\caption{Launchbury’s core lambda calculus}
\index{Var@\sVar\idxexpl{type of variables}}
\index{Exp@\sExp\idxexpl{type of lambda expressions}}
\label{fig:launchbury_syntax}
\end{figure}

The set of free variables of an expression $e$ is denoted by $\fv e$; I overload this notation and use $\text{fv}$ with arguments of other types that may contain variable names.
For example, for tuples (or, equivalently, multiple arguments), we have $\fv{\Gamma, e} = \fv \Gamma \cup \fv e$. A variable $x$ is \emph{fresh} with regard to an expression $e$ (or a similar object) if $x\notin\fv e$. The expression $e$ with every free occurrence of $x$ replaced by $y$ is written as $\subst e x y$.
\index{fv@$\fv{\idxdummy}$\idxexpl{free variables}}
\index{fresh}

I equate alpha-equivalent lambda abstractions ($\sLam x x = \sLam y y$) and the bound variable is not part of the set of free variables ($\fv{\sLam x {\sApp y x}} = \{y\}$). \keyword{let} bindings are handled likewise. The theoretical foundation used is nominal logic (see \cref{sec:nominal}). This does impose a few well-formedness side conditions, such as equivariance of definitions over expressions. I skip them in this presentation, and do so with a good conscience, as they have been covered in the machine-checked proofs.

Note that the term on the right hand side of an application has to be a variable. A general lambda term of the form $\sApp{e_1}{e_2}$ would have to be pre-processed to $\sLet{x = e_2}{\sApp{e_1}x}$ before it can be handled by my semantics. This restriction simplifies the semantics, as all bindings on the heap are created by a \keyword{let} expression and we do not have to ensure separately that the evaluation of a function’s argument is shared. This is a standard trick applied by Launchbury \cite{launchbury} and others \cite{sestoft,sands,wwmif}. In some of the less formal parts of this thesis, e.g.\ in examples, I occasionally use expressions as arguments in the interest of readability. This should be understood as a shorthand for the proper, let-bound form.

\subsection{Natural semantics}

\begin{figure}
\abovedisplayskip=0pt
\belowdisplayskip=0pt
\begin{mathpar}
\inferrule
{ }
{\sred{\Gamma}{\sLam xe}L{\Gamma}{\sLam xe}}
\sRule{Lam}
\and
\inferrule
{\sred{\Gamma}e L{\Delta}{\sLam y {e'}}\\
\sred{\Delta}{\subst{e'}yx}L{\Theta}{v}}
{\sred\Gamma{\sApp e x}L\Theta v}
\sRule{App}
\and
\inferrule
{\sred\Gamma e {L\cup\{x\}}\Delta v}
{\sred{x \mapsto e,\Gamma} x L {x \mapsto v,\Delta}{v}}
\sRule{Var}
\and
\inferrule
{\dom\Delta \cap \fv{\Gamma,L} = \{\}\\
\sred{\Delta,\Gamma} e L \Theta v}
{\sred{\Gamma}{\sLet{\Delta}e} L \Theta v}
\sRule{Let}
\end{mathpar}
\index{Lam@\sRule{Lam}\idxexpl{natural semantics rule for lambda abstraction}}
\index{App@\sRule{App}\idxexpl{natural semantics rule for application}}
\index{Var@\sRule{Var}\idxexpl{natural semantics rule for variables}}
\index{Let@\sRule{Let}\idxexpl{natural semantics rule for \keyword{let} bindings}}
\index{.R5red@$\sred{\idxdummy}{\idxdummy}{\idxdummy}{\idxdummy}{\idxdummy}$\idxexpl{Launchbury’s natural semantics}}
\caption{Launchbury natural semantics, as revised by Sestoft}
\label{fig:natsem}
\end{figure}

Launchbury gives meaning to this language by way of a natural semantics. I present his semantics with minor adjustments due to Sestoft and myself, and explain these differences in \cref{sec:modifications1}. The semantics is given by a relation
\[\sred \Gamma e L \Delta v\]
with the intuition that the expression $e$ within the heap $\Gamma$ reduces to the value $v$, modifying the heap to $\Delta$, while avoiding the names in the set $L$.
The relation is defined inductively by the rules in \cref{fig:natsem}, which obey the following naming conventions:
\index{Heap@\sHeap\idxexpl{type of heaps, bindings}}
\index{Val@\sVal\idxexpl{type of value expressions, subset of \sExp}}
\index{.R2rightharpoon@$\idxdummy\pfun\idxdummy$\idxexpl{type of partial functions}}
%
\begin{alignat*}{2}
\Gamma, \Delta, \Theta &\in \sHeap &&= \sVar \pfun \sExp \displaybreak[1]\\
v &\in \sVal &&\Coloneqq \sLam x e
\end{alignat*}
A heap is a partial function from variables to expressions ($\sVar \pfun \sExp$), and usually represented by $\Gamma, \Delta$ or $\Theta$. The same type is used for the list of bindings in a \keyword{let}. The domain of a heap $\Gamma$, written $\dom\Gamma$, is the set of variables bound by the heap. \index{dom@$\dom{\idxdummy}$\idxexpl{domain}!of a heap} In contrast to expressions, \emph{heaps} are not alpha-equated, so we have $\dom\Gamma \subseteq \fv \Gamma$. I write $x \mapsto e$ for the singleton heap and use commas to combine heaps with distinct domains.

A $v$ represents a \emph{value}, i.e.\ an expression in weak head normal form. So far, the only values are lambda abstractions; this will change when I add Booleans in \cref{sec:addingbooleans}. I use the predicate $\isVal e$ to denote that the expression $e$ is a value. \index{isVal@$\isVal{\idxdummy}$}

\medskip

The first rule, \sRule{Lam}, does not actually “do” anything: Expression and heap on the left and on the right are the same. This rule thus states that to evaluate an expression that is already a value, nothing has to be done.

The second rule, \sRule{App}, handles evaluation of an application. As we want to model lazy evaluation, first the called expression $e$ is evaluated. The argument $x$, which by our syntactic restriction is just a variable, is then substituted into the resulting lambda-abstracted expression, and evaluation continues with that. Observe that the argument itself is not necessarily evaluated.

Rule \sRule{Var} takes care of evaluating a variable $x$. This is only possible if it is mentioned in the heap. During the evaluation of the expression $e$, the binding $x\mapsto e$ is removed from the heap: This way, if the evaluation of $e$ would itself require the evaluation of $x$, the \sRule{Var} rule does not apply over and over again, but rather the inference is stuck. An inference algorithm derived from these rules would exhibit the same behaviour as a runtime for lazy functional programs that sports \emph{blackholing}\index{blackholing}, where a thunk under evaluation is replaced by a so-called blackhole which, if evaluated, aborts the program \cite{stg}.

This rule also implements \emph{sharing}: After having evaluated $e$ to a value $v$, this is not only returned as the result of the computation, but also added to the resulting heap as the new binding for $x$. This \emph{updating} of $x$ ensures that any further evaluation of $x$ will immediately return with its once evaluated value.

The final rule, \sRule{Let}, implements let-bindings, which may be mutually recursive, simply by moving them to the heap. The let-expression itself represents an alpha-equivalency class and hence does not have names for the bound values, so it is the application of this rule that actually determines $\dom\Delta$, and the first assumption of the rule ensures that these variables do not clash with existing ones.
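As a small example of how the rules interact, consider the heap $x \mapsto \sLam yy$ and the expression $\sApp xx$ (writing $[]$ for the empty heap): rule \sRule{App} first evaluates the function position $x$, for which \sRule{Var} temporarily removes the binding for $x$ and extends the set of names to avoid to $L \cup \{x\}$; after \sRule{Lam} returns the abstraction unchanged, the binding is written back. Substituting the argument $x$ for $y$ in the body then yields $x$ again, which is evaluated in the same way:
\begin{mathpar}
\inferrule
{\inferrule
  {\inferrule{ }{\sred{[]}{\sLam yy}{L \cup \{x\}}{[]}{\sLam yy}}}
  {\sred{x \mapsto \sLam yy}{x}{L}{x \mapsto \sLam yy}{\sLam yy}}
 \\
 \inferrule
  {\inferrule{ }{\sred{[]}{\sLam yy}{L \cup \{x\}}{[]}{\sLam yy}}}
  {\sred{x \mapsto \sLam yy}{x}{L}{x \mapsto \sLam yy}{\sLam yy}}
}
{\sred{x \mapsto \sLam yy}{\sApp xx}{L}{x \mapsto \sLam yy}{\sLam yy}}
\end{mathpar}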
The set $L$ was not present in Launchbury's rules. It was added by Sestoft \cite{sestoft} to keep track of variables that must be avoided when choosing new names in the \sRule{Let} rule, but would otherwise not be present in the judgement any more, because they were blackholed by \sRule{Var}. I explain this modification in greater detail in \cref{sec:naming}.

\medskip

The semantics has a few noteworthy properties, which I describe in the following lemmas.

Evaluation does not forget bindings:\nopagebreak
\begin{lemma}
If $\sred \Gamma e L \Delta v$ then $\dom \Gamma \subseteq \dom \Delta$.
\label{lem:doesnotforget}
\end{lemma}
\begin{proof}
by induction on the derivation of $\sred \Gamma e L \Delta v$.
\end{proof}

Furthermore, names that appear as new bindings on the heap do not clash with any names in the set $L$:
\begin{lemma}
If $\sred \Gamma e L \Delta v$ then $(\dom \Delta \setminus \dom\Gamma) \cap L = \{\}$.
\label{lem:avoidsL}
\end{lemma}
\begin{proof}
by induction on the derivation of $\sred \Gamma e L \Delta v$. In the case for let expressions, we use that the names chosen for the bound variables are fresh with regard to $L$, as $\dom\Delta \cap \fv{\Gamma,L} = \{\}$.
\end{proof}

I consider a judgement $\sred \Gamma e L \Delta v$ to be \emph{closed} if $\fv{\Gamma, e} \subseteq \dom \Gamma \cup L$, i.e.\@ all occurring names are either bound in the heap, or explicitly listed in the set $L$ of names to avoid. \index{closed judgement} Note that this property is preserved by the semantics in the following sense:
\begin{lemma}
If $\sred \Gamma e L \Delta v$ holds and $\fv{\Gamma, e} \subseteq \dom \Gamma \cup L$, then $\fv{\Delta, v} \subseteq \dom \Delta \cup L$.
\label{lem:closedpreserved}
\end{lemma}
\begin{proof}
In light of \cref{lem:doesnotforget}, this follows from the following statement: If $\sred \Gamma e L \Delta v$ holds, and $x'$ is fresh with regard to $\Gamma$ and $e$, then $x'$ is either also fresh with regard to $\Delta$ and $v$, or $x'$ appears in $\dom \Delta$, in which case $x'\notin L$ must hold. I prove this by induction on the derivation.

\case{\sRule{Lam}} This case is trivial.

\case{\sRule{App}}
By the first induction hypothesis, there are two subcases to consider:
\begin{itemize}
\item The variable $x'$ is still fresh with regard to $\Delta$ and $\sLam y {e'}$. In order to invoke the second induction hypothesis, we need to show that $x'$ is fresh with regard to $\subst{e'}y{x}$. This is the case, as $x'\ne x$ by the assumption, and either $x'=y$, or $x'$ is fresh with regard to $e'$.
\item The variable $x'$ appears in $\dom\Delta$ and is not in $L$. By \cref{lem:doesnotforget}, it then also appears in $\dom\Theta$.
\end{itemize}

\case{\sRule{Var}}
As $x'$ is fresh with regard to $(x\mapsto e, \Gamma)$ and $x$, we have $x'\ne x$. Furthermore, $x'$ is also fresh with regard to $\Gamma$ and $e$, so we can invoke the induction hypothesis.
\begin{itemize}
\item If $x'$ is still fresh with regard to $\Delta$ and $v$, then it is also fresh with regard to $(x\mapsto v, \Delta)$, as $x' \ne x$.
\item If $x'\in \dom\Delta$ and $x'\notin L\cup\{x\}$, we obviously also have that \mbox{$x'\in\dom(x\mapsto v, \Delta)$} and $x' \notin L$.
\end{itemize}

\case{\sRule{Let}}
As this case introduces new names on the heap, it decides which side of the disjunction in the proposition $x'$ ends up on.
\begin{itemize}
\item If $x'\in\dom \Delta$, then $x'\in\dom(\Delta,\Gamma)$, and by \cref{lem:doesnotforget}, $x' \in \dom\Theta$. Also, by the freshness condition on \sRule{Let}, $x'\notin L$.
\item Otherwise, $x'$ is fresh with regard to $(\Delta,\Gamma)$ and $e$ by the assumption, so we can invoke the induction hypothesis.
\end{itemize}
\end{proof}

\subsection{Denotational semantics}
\label{sec:denotational_semantics}

\index{Value@\sValue\idxexpl{semantic domain}}
In order to show that the natural semantics behaves as expected, Launchbury defines a standard denotational semantics for expressions and heaps, following Abramsky \cite{abramsky}. The semantic domain $\sValue$ is the initial solution to the domain equation
\[ \sValue = (\sValue \to \sValue)_\bot, \]
in the category of pointed chain-complete partial orders with continuous functions. In this domain, we can distinguish $\bot$ from $\lambda x. \bot$.
\index{Fn@$\sFn{\idxdummy}$}
The injection
\[ \sFn \_ \colon (\sValue \to \sValue) \to \sValue \]
turns values of type $\sValue \to \sValue$ into non-bottom values of $\sValue$, while the projection
\index{FnProj@$\sFnProj{\idxdummy}{\idxdummy}$}
\[ \sFnProj{\_}{\_} \colon \sValue \to \sValue \to \sValue \]
does the converse, defined as
\[ \sFnProj{v_1}{v_2} = \begin{cases} f\,v_2 &\text{if } v_1 = \sFn f \\ \bot &\text{otherwise.} \end{cases} \]
\index{.R2sqsubset@$\idxdummy\sqsubseteq\idxdummy$!on $\sValue$}
The partial order $\sqsubseteq$ on $\sValue$ is the usual one, with $\sFn f \sqsubseteq \sFn g$ if $\forall x.\,f\,x \sqsubseteq g \, x$ for $f,g \in \sValue \to \sValue$, and $\bot$ below everything.

\index{Env@\sEnv\idxexpl{semantic environment}}
A semantic environment maps variables to values,
\begin{alignat*}{2}
\rho \in \sEnv &= \sVar \to \sValue,
\end{alignat*}
and the initial environment $\bot$ maps all variables to $\bot$. Environments are ordered by lifting the order on $\sValue$ pointwise.
\index{.R2sqsubset@$\idxdummy\sqsubseteq\idxdummy$!on environments}
\index{dom@$\dom{\idxdummy}$\idxexpl{domain}!of an environment}
With the \emph{domain of an environment $\rho$}, written $\dom \rho$, I denote the set of variables that are not mapped to $\bot$.
\index{.O2restr@$\idxdummy"|_\idxdummy$!on environments}
The environment $\rho|_S$, where $S$ is a set of variables, is the restriction of $\rho$ to $S$:
\[ (\rho |_S)\, x = \begin{cases} \rho\, x, & \text{if } x \in S\\ \bot, & \text{if } x \not\in S. \end{cases} \]
\index{.O2setminus@$\idxdummy\setminus\idxdummy$!on environments}
The environment $\rho\setminus S$ is defined as the restriction of $\rho$ to the complement of $S$, i.e.\ \mbox{$\rho\setminus S \coloneqq \rho|_{\sVar \setminus S}$}.

\begin{figure}
\abovedisplayskip=0pt
\belowdisplayskip=0pt
\begin{align*}
\dsem{\sLam x e}\rho &\coloneqq \sFn{\lambda v. \dsem e {\rho \sqcup [x \mapsto v]}}\\
\dsem{\sApp e x}\rho &\coloneqq \sFnProj {\dsem e \rho}{\rho\,x}\\
\dsem{x}\rho &\coloneqq \rho\,x\\
\dsem{\sLet{\Delta}e}\rho &\coloneqq \dsem e {\esem{\Delta}\rho}.
\end{align*}
\index{.Sdsem@$\dsem\idxdummy\idxdummy$\idxexpl{denotational semantics of expressions}}
\caption{The standard denotational semantics}
\label{fig:dendef}
\end{figure}

The semantics of expressions and heaps are mutually recursive. The meaning of an expression $e \in \sExp$ in an environment $\rho \in \sEnv$ is written as $\dsem e \rho \in \sValue$ and is defined by the equations in \cref{fig:dendef}.
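To give a small example: choosing the representative $x$ of the abstraction such that $\rho\,x = \bot$, we get $\dsem{\sLam x x}\rho = \sFn{\lambda v.\, v}$. Hence, for an environment $\rho$ with $\rho\,z = \sFn{\lambda v.\,v}$,
\[
\dsem{\sApp z y}\rho = \sFnProj{\rho\,z}{\rho\,y} = (\lambda v.\,v)\,(\rho\,y) = \rho\,y,
\]
while for $\rho\,z = \bot$, the projection, and hence the whole application, denotes $\bot$.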
We can map this function over a heap to obtain an environment:
\begin{align*}
%\dsem{\xen}\rho \coloneqq [x_1 \mapsto \dsem{e_1}{\rho}, \ldots, x_n \mapsto \dsem{e_n}{\rho}]
\dsem{\Gamma}\rho ~ x \coloneqq
\begin{cases}
\dsem{e}{\rho}, & \text{if } (x\mapsto e)\in \Gamma \\
\bot, & \text{if } x \notin \dom\Gamma.
\end{cases}
\end{align*}
The semantics of a heap $\Gamma \in \sHeap$ in an environment $\rho$, written $\esem \Gamma \rho \in \sEnv$, is then obtained as a least fixed point:
\index{.Sesem@$\esem\idxdummy\idxdummy$\idxexpl{denotational semantics of heaps}}
\begin{align*}
\esem{\Gamma}\rho &= (\mu \rho'.\, \rho \addon{\dom \Gamma} \dsem{\Gamma}{\rho'})
\end{align*}
where
\[ (\rho \addon{S} \rho')~x \coloneqq \begin{cases} \rho~x, &\text{if } x \notin S \\ \rho'~x, &\text{if } x \in S \end{cases} \]
is a restricted update operator.
\index{.O3addon@$\idxdummy \addon{\idxdummy}\idxdummy$\idxexpl{restricted update operator on environments}}

The least fixed point exists, as all involved operations are monotone and continuous, and by unrolling the fixed point once, we can see
\begin{lemma}[Application of the heap semantics]
\lemdisplayunskip
\[
\label{lem:esem_this}
\label{lem:esem_other}
(\esem{\Gamma}\rho)\,x =
\begin{cases}
\dsem{e}{\esem{\Gamma}\rho}, & \text{if } (x\mapsto e)\in \Gamma \\
\rho\, x, & \text{if } x \notin \dom\Gamma.
\end{cases}
\]
\end{lemma}

The following substitution lemma plays an important role in finding a more direct proof of adequacy, but is also required for the correctness proof, as performed by Launchbury:
\begin{lemma}[Semantics of substitution]
\lemdisplayunskip
\[
\label{lem:subst}
\dsem{e}{\rho (y \mapsto \rho~x)} = \dsem{ \subst{e}{y}{x}}{\rho}.
\]
\end{lemma}
\begin{proof}
We first show $\forall\rho.\, \rho\,x = \rho\,y \implies \dsem{e}\rho = \dsem{\subst{e}{y}{x}}\rho$ by induction on $e$, using parallel fixed-point induction in the case for \keyword{let}. This allows us to calculate
\begin{conteq}
\dsem{e}{\rho (y \mapsto \rho\,x)} \\
= \dsem{\subst{e}{y}{x}}{\rho (y \mapsto \rho\,x)} & as $\rho (y \mapsto \rho\,x)\,x=\rho (y \mapsto \rho\,x)\,y$ \\
= \dsem{\subst{e}{y}{x}}{\rho} & as $y\notin \fv{\subst{e}{y}{x}}$.
\end{conteq}
\end{proof}

I sometimes write $\esem{\Gamma}$ instead of $\esem{\Gamma}{\bot}$. In an expression $\esem{\Gamma}(\esem{\Delta}\rho)$ I omit the parentheses and write $\esem{\Gamma}\esem{\Delta}\rho$.

\subsection{Discussions of modifications}
\label{sec:modifications1}

It is rare that a formal system developed with pen and paper can be formalised to the letter, partly because of vagueness (what, exactly, is a ``completely'' fresh variable?), partly because of formalisation convenience, and partly because the stated facts -- even if morally correct -- are wrong when read scrupulously. Launchbury’s work is no exception. This section discusses the required divergence from Launchbury's work.

\subsubsection{Naming}
\label{sec:naming}

Getting the naming right is one of the major hurdles when formalising anything involving bound variables. In Launchbury's work, the names are manifestly part of the syntax, i.e.\@ $\sLam x x \ne \sLam y y$, and his rules involve explicit renaming of bound variables to fresh ones in the rule \sRule{Var}. His definition of freshness is a global one, so the validity of a derivation using \sRule{Var} depends on everything around it. This is morally what we want, but very impractical.
Sestoft \cite{sestoft} noticed this problem and fixed it by adding a set $L$ of variables to the judgement, so that every variable to be avoided occurs somewhere in $\Gamma$, $e$, or $L$. Instead of renaming all bound variables in the rule \sRule{Var}, he chooses fresh names for the new heap bindings in the rule \sRule{Let}.

I build on that, but go one step further and completely avoid bound names in the expressions, i.e.\ $\sLam x x = \sLam y y$. I still have them in the syntax, of course, but these are just representatives of an $\alpha$-equivalency class. Nominal logic (cf.\@ \cref{sec:nominal}) forms the formal foundation for this. So in my rule \sRule{Let} I do not have to rename the variables, but simply may assume that the variables used in the representation of the \keyword{let}-expression are sufficiently fresh. The names of bindings on the heap are not abstracted away in that manner; this follows \cite{nameless}.

\subsubsection{Closed judgements}

Launchbury deliberately allows non-closed configurations in his derivations, i.e.\@ configurations with free variables in the terms that have no corresponding binding on the heap. This is a necessity, as rule \sRule{Var} models blackholing by removing a binding from the heap during its evaluation.

With the addition of the set of variables to avoid, which will always contain such variables, the question of whether non-closed configurations should be allowed can be revisited. And indeed, Sestoft defines the notion of \emph{$L$-good configurations}, where all free variables are either bound on the heap, or contained in $L$. He shows that this property is preserved by the operational semantics and subsequently considers only $L$-good configurations. I follow this example with my definition of \emph{closed} judgements. Threading the closedness requirement through a proof by rule induction is a typical chore contributing to the overhead of a machine-checked formalisation.

\subsubsection{Join vs.\@ update}

\index{.O2sqcup@$\idxdummy\sqcup\idxdummy$!on environments}
Launchbury specifies his denotational semantics using a binary operation $\sqcup$ on environments:
\begin{align*}
\esem{\Gamma}\rho &= (\mu \rho'.\, \rho \sqcup \dsem{\Gamma}{\rho'})
\end{align*}
He does not define it explicitly, but the statements in his Section 5.2.1 leave no doubt that he indeed intended this operation to denote the least upper bound of its arguments, as one would expect.

Unfortunately, with this definition, his Theorem 2 is false. The proposition of the theorem (which corresponds to \cref{thm:thm2} in this document) is
\[ \sred \Gamma e L \Delta v \implies \forall \rho \in \sEnv.\,\dsem{e}{\esem{\Gamma}{\rho}} = \dsem{v}{\esem{\Delta}{\rho}} \]
and a counter-example is given by
\begin{align*}
e &= x,\\
v &= (\sLam{a}{\sLet{b = b}b}),\\
\Gamma = \Delta &= (x \mapsto v),\text{ and }\\
\rho &= (x \mapsto \sFn{\lambda \_. \sFn{\lambda x.x}}).
\end{align*}
Note that the denotation of $v$ is $\sFn{\lambda \_. \bot}$ in every environment. We have $\sred\Gamma e{\{\}}\Delta v$, so according to the theorem, $\dsem{e}{\esem{\Gamma}{\rho}} = \dsem{v}{\esem{\Delta}{\rho}}$ should hold, but the following calculation shows that it does not:
\begin{conteq}
\dsem{e}{\esem\Gamma\rho} \\
= \big(\esem \Gamma\rho\big)\,x \\
= \rho\,x \sqcup \dsem{v}{\esem{\Gamma}\rho} \\
= \sFn{\lambda \_. \sFn{\lambda x.x}} \sqcup \sFn{\lambda \_. \bot} \\
= \sFn{\lambda \_. \sFn{\lambda x.x} \sqcup \bot} \\
= \sFn{\lambda \_. \sFn{\lambda x.x}} \\
\ne \sFn{\lambda \_. \bot} \\
= \dsem{v}{\esem{\Delta}{\rho}}.
\end{conteq}
The crucial property of the counter-example is that $\rho$ contains compatible, but better information for a variable also bound in $\Gamma$.
%
The mistake in his correctness proof is in the step $(\esem{x \mapsto v, \Delta}\rho)\,x = \dsem{v}{\esem{x \mapsto v, \Delta}\rho}$ in the case for \sRule{Var}, which should be $(\esem{x \mapsto v, \Delta}\rho)\,x = \dsem{v}{\esem{x \mapsto v, \Delta}\rho} \sqcup \rho\,x$.

\medskip

Intuitively, such rogue $\rho$ are not relevant for a proof of the main \cref{thm:main}. Nevertheless, this issue needs to be fixed before attempting a formal proof. One possible fix is to replace $\sqcup$ by a right-sided update operation that just throws away information from the left argument for those variables bound on the right. The syntax $\rho \addon S \rho'$ denotes this operation. If that is used instead of the least upper bound, then the proof goes through in full rigour.

It is slightly annoying to have to specify the set $S$ in this operation explicitly, as it is usually clear ``from the context'': Morally, it is the set of variables that the object on the right talks about. But as environments, i.e.\ total functions $\sVar \to \sValue$, do not distinguish between variables not mentioned at all and variables mentioned, but bound to $\bot$, this information is not easily exploitable in a formal setting. For the same reason \cref{thm:thm2} uses the more explicit equality between restricted environments instead of Launchbury’s ordering $\le$ on environments. I elaborate on this in \cref{sec:modifications2}.

\pagebreak
\section{Correctness}
\label{sec:correctness}

The main correctness theorem for the natural semantics is
\begin{theorem}[Correctness]
If $\sred \Gamma e L \Delta v$ holds and is closed, then
\label{thm:main}
\[ \dsem{e}{\esem{\Gamma}} = \dsem{v}{\esem{\Delta}}. \]
\end{theorem}

A proof by rule induction requires the following generalisation:
\begin{theorem}[Correctness, generalized]
If $\sred \Gamma e L \Delta v$ holds and is closed, then for all environments $\rho \in \sEnv$, we have
\label{thm:thm2}
\[ \dsem{e}{\esem{\Gamma}{\rho}} = \dsem{v}{\esem{\Delta}{\rho}} \quad \text{and} \quad (\esem\Gamma\rho)|_{\dom\Gamma} = (\esem\Delta\rho)|_{\dom\Gamma}. \]
\end{theorem}

The proof follows Launchbury's steps, but differs in some details. In the interest of a self-contained presentation, I give the full proof here. Two technical lemmas used in the proof are stated and proved subsequently. For clarity, $\rho \eqon{S} \rho'$ abbreviates $\rho|_S = \rho'|_S$.
\index{.R2=on@$\idxdummy\eqon\idxdummy\idxdummy$\idxexpl{restricted equality on semantic environments}}

\begin{proof}
by induction on the derivation of $\sred \Gamma e L \Delta v$. Note that in such a derivation, all occurring judgements are closed.

\case{\sRule{Lam}} This case is trivial.

\case{\sRule{App}}
The induction hypotheses are $\dsem{e}{\esem{\Gamma}\rho} = \dsem{\sLam y {e'}}{\esem{\Delta}\rho}$ and $\esem{\Gamma}\rho \eqon{\dom\Gamma} \esem{\Delta}\rho$ as well as $\dsem{\subst{e'}{y}{x}}{\esem{\Delta}\rho} = \dsem{v}{\esem{\Theta}\rho}$ and $\esem{\Delta}\rho \eqon{\dom\Delta} \esem{\Theta}\rho$.

We have $\esem{\Gamma}{\rho}~x = \esem{\Delta}{\rho}~x$: If $x\in \dom \Gamma$, this follows from the induction hypothesis. Otherwise, we know $x\in L$, as the judgement is closed, and the new names bound in $\Delta$ avoid $L$, so we have $\rho\, x$ on both sides.
While the second part follows from the corresponding induction hypotheses and $\dom\Gamma \subseteq \dom\Delta$ (\cref{lem:doesnotforget}), the first part is a simple calculation:
\begin{conteq}[explline]
\dsem{\sApp{e}{x}}{\esem{\Gamma}\rho} \\
= \sFnProj{\dsem{e}{\esem{\Gamma}\rho}}{{\esem{\Gamma}{\rho}}\, x} & by the denotation of application \\
= \sFnProj{\dsem{\sLam y {e'}}{\esem{\Delta}\rho}}{\esem{\Gamma}{\rho}~x} & by the induction hypothesis \\
= \sFnProj{\dsem{\sLam y {e'}}{\esem{\Delta}\rho}}{{\esem{\Delta}{\rho}}~x} & see above \\
= \dsem{e'}{(\esem{\Delta}\rho)(y \mapsto {\esem{\Delta}{\rho}}~x)} & by the denotation of lambda abstraction \\
= \dsem{\subst{e'}{y}{x}}{\esem{\Delta}\rho} & by \cref{lem:subst} \\
= \dsem{v}{\esem{\Theta}\rho} & by the induction hypothesis
\end{conteq}

\case{\sRule{Var}}
We know that $\dsem{e}{\esem{\Gamma}\rho'}=\dsem{v}{\esem{\Delta}\rho'}$ and $\esem{\Gamma}{\rho'} \eqon{\dom\Gamma} \esem{\Delta}{\rho'}$ for all environments $\rho'$. We begin with the second part:
\begin{conteq}[explline]
\esem{x \mapsto e,\Gamma}{\rho} \\
= \mu \rho'.\,(\rho \addon{\dom\Gamma} \esem{\Gamma}{\rho'})[x \mapsto \dsem{e}{\esem{\Gamma}\rho'}] & by the following \cref{lem:iter} \\
= \mu \rho'.\,(\rho \addon{\dom\Gamma} \esem{\Gamma}{\rho'})[x \mapsto \dsem{v}{\esem{\Delta}\rho'}] & %\aexpl{by the induction hypothesis, invoked for $\rho'$!} \\
\parbox{\widthof{by the induction hypothesis. Note that}}{\raggedright by the induction hypothesis. Note that we invoke it for $\rho'$ with $\rho' \ne \rho$!} \\
\eqon{\dom{(x \mapsto e,\Gamma)}} \mu \rho'.\,(\rho \addon{\dom\Delta} \esem{\Delta}{\rho'})[x \mapsto \dsem{v}{\esem{\Delta}\rho'}] & by the induction hypothesis; see below \\
= \esem{x \mapsto v, \Delta}{\rho} & again by \cref{lem:iter}
\end{conteq}
The second-to-last step is quite technical, as the $|_{\dom{(x \mapsto e,\Gamma)}}$ operator needs to commute with the fixed-point operator. This goes through by parallel fixed-point induction if we first generalise it to $|_{\sVar \setminus \dom\Delta \mathop\cup \dom{(x \mapsto e,\Gamma)}}$, the restriction to the complement of the new variables added to the heap during evaluation of $x$.

The first part now follows from the second part:
\begin{conteq}[explline]
\dsem{x}{\esem{x \mapsto e, \Gamma}\rho} \\
= (\esem{x \mapsto e, \Gamma}\rho)\,x %\aexpl{by the denotation of variables}
\\ = (\esem{x \mapsto v, \Delta}\rho)\,x & by the second part and $x\in\dom(x \mapsto e,\Gamma)$ \\
= \dsem{v}{\esem{x \mapsto v, \Delta}\rho} & by \cref{lem:esem_this}.
\end{conteq}

\case{\sRule{Let}}
We know that $\dsem{e}{\esem{\Delta,\Gamma}\rho} = \dsem{v}{\esem{\Theta}{\rho}}$ and $\esem{\Delta,\Gamma}\rho \eqon{\dom{(\Delta,\Gamma)}} \esem{\Theta}{\rho}$.

For the first part we have
\begin{conteq}
\dsem{\sLet{\Delta}e}{\esem{\Gamma}\rho} \\
= \dsem{e}{\esem{\Delta}{\esem{\Gamma}\rho}} & by the denotation of let-expressions \\
= \dsem{e}{\esem{\Delta,\Gamma}{\rho}} & by the following \cref{lem:esem-merge} \\
= \dsem{v}{\esem{\Theta}{\rho}} & by the induction hypothesis
\end{conteq}
and for the second part we have
\begin{conteq}
\esem{\Gamma}{\rho} \\
\eqon{\dom\Gamma} \esem{\Delta}{\esem{\Gamma}\rho} & because the names in $\dom\Delta$ are fresh \\
= \esem{\Delta,\Gamma}\rho & again by \cref{lem:esem-merge} \\
\eqon{\dom{(\Delta,\Gamma)}} \esem{\Theta}\rho & by the induction hypothesis.
\end{conteq}
\end{proof}

In the case for \sRule{Var}, I switched from the usual, simultaneous definition of the heap semantics to an iterative one, in order to be able to make use of the induction hypothesis:
\nopagebreak
\begin{lemma}[Iterative definition of the heap semantics]
\lemdisplayunskip
\[
\esem{x \mapsto e, \Gamma}\rho = \mu \rho'.\, \big((\rho \addon{\dom\Gamma} {\esem{\Gamma}{\rho'}})[x \mapsto \dsem{e}{\esem{\Gamma}\rho'}]\big).
\label{lem:iter}
\]
\end{lemma}
A corresponding lemma can be found in Launchbury \cite{launchbury}, but without proof. As the proof involves some delicate juggling of fixed points, I include it here in detail:
\newcommand{\domxG}{\dom{(x\mapsto e,\Gamma)}}
\begin{proof}
Let
\begin{align*}
L &= \lambda \rho'.\, \big(\rho \addon{\domxG} \dsem{x\mapsto e, \Gamma}{\rho'}\big)
\intertext{be the functional of the fixed point on the left hand side, and}
R &= \lambda \rho'.\, \big((\rho \addon{\dom\Gamma} {\esem{\Gamma}{\rho'}})[x \mapsto \dsem{e}{\esem{\Gamma}\rho'}]\big)
\end{align*}
the functional of the fixed point on the right hand side. By \cref{lem:esem_this}, we have
\begin{align*}
(\mu L)\, y &= \dsem{e'}{\mu L} &&\text{for }y\mapsto e'\in\Gamma, \tag{1}\\
(\mu L)\, x &= \dsem{e}{\mu L}, \tag{2}\\
(\mu L)\, y &= \rho\, y &&\text{for }y \notin \domxG. \tag{3}
\intertext{%
Similarly, by unrolling the fixed points, we have
}
(\mu R)\, y &= \dsem{e'}{\esem{\Gamma}{(\mu R)}} &&\text{for } y \mapsto e'\in\Gamma,\tag4\\
(\mu R)\, x &= \dsem{e}{\esem{\Gamma}{(\mu R)}}, \tag5\\
(\mu R)\, y &= \rho\, y && \text{for }y \notin \domxG, \tag6
\intertext{%
and also for $\rho' \in \sEnv$ (in particular for $\rho' = (\mu L)$ and $\rho' = (\mu R)$), again using \cref{lem:esem_this},
}
(\esem{\Gamma}{\rho'})\,y &= \dsem{e'}{\esem{\Gamma}{\rho'}} && \text{for } y \mapsto e' \in \Gamma, \tag 7 \\
(\esem{\Gamma}{\rho'})\,y &= \rho'\, y && \text{for }y \notin \dom\Gamma. \tag 8
\end{align*}
\medskip
\noindent We obtain
\begin{align*}
\esem{\Gamma}{(\mu R)} &= (\mu R) \tag 9
\intertext{%
from comparing (4)--(6) with (7) and (8). We can also show
}
\esem{\Gamma}{(\mu L)} &= (\mu L), \tag{10}
\end{align*}
by antisymmetry of $\sqsubseteq$ and using that least fixed points are least pre-fixed points:
\begin{itemize}
\item[$\sqsubseteq$:] We need to show that $(\mu L) \addon{\dom\Gamma} \dsem{\Gamma}{(\mu L)} \sqsubseteq (\mu L)$, which follows from (1).
\item[$\sqsupseteq$:] We need to show that
\[ \esem{\Gamma}{(\mu L)} \addon{\domxG} \dsem{x\mapsto e, \Gamma}{\esem{\Gamma}{(\mu L)}} \sqsubseteq \esem{\Gamma}{(\mu L)}. \]
For $\dom\Gamma$, this follows from (7), so we show $\dsem{e}{\esem{\Gamma}{(\mu L)}} \sqsubseteq (\mu L)\, x = \dsem{e}{(\mu L)}$, which follows from the monotonicity of $\dsem{e}{\_}$ and the case $\sqsubseteq$.
\end{itemize}

To show the conclusion of the lemma, i.e.\ $(\mu L) = (\mu R)$, we again use antisymmetry and the leastness of least fixed points:
\begin{itemize}
\item[$\sqsubseteq$:] We need to show that $L\, (\mu R) = \mu R$, i.e.
\begin{compactitem}
\item $\rho\,y = (\mu R)\, y$ for $y \notin \domxG$, which follows from (6),
\item $\dsem{e'}{\mu R} = (\mu R)\, y$ for $y \mapsto e' \in \Gamma$, which follows from (4) and (9) and
\item $\dsem{e}{\mu R} = (\mu R)\, x$, which follows from (5) and (9).
\end{compactitem}
\item[$\sqsupseteq$:] Now we have to show that $R\ (\mu L) = (\mu L)$, i.e.
\begin{compactitem}
\item $\rho\,y = (\mu L)\, y$ for $y \notin \domxG$, which follows from (3),
\item $\dsem{e'}{\esem{\Gamma}{(\mu L)}} = (\mu L)\, y$ for $y \mapsto e' \in \Gamma$, which follows from (1) and (10), and
\item $\dsem{e}{\esem{\Gamma}{(\mu L)}} = (\mu L)\, x$, which follows from (2) and (10).
\end{compactitem}
\end{itemize}
\end{proof}

The final lemma required for the correctness proof shows that the denotation of a set of bindings with only fresh variables can be merged with the heap it was defined over:
\begin{lemma}[Merging the heap semantics]
\label{lem:esem-merge}
If $\dom \Delta$ is fresh with regard to $\Gamma$ and $\rho$, then
\[ \esem{\Delta}{\esem{\Gamma}\rho} = \esem{\Delta, \Gamma}\rho. \]
\end{lemma}
\begin{proof}
%First note that
%\begin{align*}
%\esem{\Gamma}\rho \preceq \esem{\Delta}{\esem{\Gamma}\rho}, \tag{$\ast$}
%\end{align*}
%as the variables bound in $\Delta$ are fresh and existing bindings in $\esem{\Gamma}\rho$ keep their semantics.
We use the antisymmetry of $\sqsubseteq$, and the leastness of least fixed points.
\begin{itemize}
\item[$\sqsubseteq$:] We need to show that $\esem{\Gamma}\rho \addon{\dom\Delta} \dsem\Delta{\esem{\Delta,\Gamma}\rho} = \esem{\Delta,\Gamma}\rho$, which we verify pointwise.
\begin{compactitem}
\item For $x \in \dom\Delta$, this follows directly from \cref{lem:esem_this}.
\item For $x\notin \dom\Delta$, this holds as the variables bound in $\Delta$ are fresh, so the bindings in $\esem{\Gamma}\rho$ keep their semantics.
\end{compactitem}
\item[$\sqsupseteq$:] We need to show that $\rho \addon{\dom{(\Delta,\Gamma)}} \dsem{\Delta, \Gamma}{\esem{\Delta}{\esem{\Gamma}\rho}} = \esem{\Delta}{\esem{\Gamma}\rho}$.
\begin{compactitem}
\item For $x\in \dom\Delta$, this follows from unrolling the fixed point on the right hand side once.
\item For $x\mapsto e \in \Gamma$ (and hence $x\notin \dom \Delta$), we have
\begin{conteq}[onecolumn]
(\rho \addon{\dom{(\Delta,\Gamma)}} \dsem{\Delta, \Gamma}{\esem{\Delta}{\esem{\Gamma}\rho}})\, x \\
= \dsem{e}{\esem{\Delta}{\esem{\Gamma}\rho}} & by \cref{lem:esem_this} \\
= \dsem{e}{\esem{\Gamma}\rho} & because $\dom\Delta$ is fresh with regard to $e$ \\
= (\esem\Gamma\rho)\, x & by unrolling the fixed point \\
= (\esem{\Delta}{\esem\Gamma\rho})\, x & because $x\notin \dom\Delta$ and \cref{lem:esem_this}.
\end{conteq}
\item For $x\notin \dom (\Delta,\Gamma)$, we have $\rho\, x$ on both sides.
\end{compactitem}
\end{itemize}
\end{proof}

\subsection{Discussions of modifications}
\label{sec:modifications2}

My main \cref{thm:main} and the generalisation in \cref{thm:thm2} differ from Launchbury's corresponding Theorem 2. The additional requirement that the judgements are closed has already been discussed in \cref{sec:modifications1}.

Furthermore, the second part of \cref{thm:thm2} is phrased differently. Launchbury states $\esem\Gamma\rho \le \esem\Delta\rho$ where $\rho \le \rho'$ is defined as
\[ \forall x.\, \rho\,x \ne \bot \implies \rho\,x = \rho'\, x, \]
i.e.\ $\rho'$ agrees with $\rho$ on all variables that have a meaning in $\rho$.
%
The issue with this definition is that there are two reasons why $(\esem\Gamma\rho)\,x=\bot$ can hold: Either $x \notin \dom\Gamma$, or $x \in \dom\Gamma$, but bound to a diverging value. Only the first case is intended here, and actually $\le$ is used as if only that case can happen, e.g.\ in the treatment of \sRule{Var} in the correctness proof.
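For example, for $\Gamma = (x \mapsto \sLet{b = b}b)$ we have $(\esem\Gamma\rho)\,x = \bot$ although $x \in \dom\Gamma$, so the inequality $\esem\Gamma\rho \le \rho'$ places no constraint whatsoever on $\rho'\,x$, even though the heap clearly ``means'' the binding for $x$.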
I therefore avoid the problematic $\le$ relation and explicitly show $\esem\Gamma\rho \eqon{\dom\Gamma} \esem\Delta\rho$.

\section{Adequacy}

A correctness theorem for a natural semantics is not worth much on its own. Imagine a mistake in the side condition of the \sRule{Let} rule that accidentally prevents any judgement from being derived for programs with a \keyword{let} -- the correctness theorem would still hold. It is therefore desirable to prove that all programs that have a meaning, in this case according to the denotational semantics, indeed have a derivation:
\begin{theorem}[Adequacy]
For all expressions $e$, heaps $\Gamma$ and sets of variables $L$, if $\dsem{e}{\esem{\Gamma}} \ne \bot$, then there exists a heap $\Delta$ and a value $v$ so that $\sred \Gamma e L \Delta v$.
\label{thm:adequacy}
\end{theorem}

The proof uses a modified denotational semantics that keeps track of the number of steps required to determine the non-bottomness of $e$, which I introduce in the next subsection. I will then show that the natural semantics is adequate with regard to the modified denotational semantics, and make the connection by showing how the two denotational semantics relate.

\subsection{The resourced denotational semantics}

\index{.R2sqsubset@$\idxdummy\sqsubseteq\idxdummy$!on $\sC$}
\index{C@$\sC$\idxexpl{domain of resources}}
\index{C@$C$\idxexpl{injection function of domain $\sC$}}
The domain used to count the resources is a solution to the equation $\sC = \sC_\bot$. The lifting is done by the injection function $C \colon \sC \to \sC$, so the elements are
\[ \bot \sqsubset C~\bot \sqsubset C~(C~\bot) \sqsubset \cdots \sqsubset C^n~\bot \sqsubset \cdots \sqsubset C^\infty. \]
This is isomorphic to the extended naturals. I use $r$ for variables ranging over $\sC$.

The notation $\Crestr{f}{r}$ restricts a function $f$ with domain $\sC$ to take at most $r$ resources: $(\Crestr f r)~r'\coloneqq f~(r \sqcap r')$.
\index{.O2restr@$\idxdummy"|_\idxdummy$!on functions taking a $\sC$}

The resourced semantics $\dsemr{e}{\sigma}$ of an expression $e$ in environment $\sigma$ is now a function which takes an additional argument $r$ of type $\sC$ to indicate the number of steps the semantics is still allowed to perform: Every recursive call in the definition of $\dsemr{e}{\sigma}~r$ peels one application of $C$ off $r$ until none are left. The type of the environment changes as well: It is now $\sVar \to (\sC \to \sCValue)$. I use $\sigma$ for variables ranging over such \emph{resourced environments}.

The intuition is that if we pass in an infinite number of resources, the two semantics coincide:
\[ \forall x.\, \rho~x = \sigma~x~C^\infty \implies \dsem{e}{\rho} = \dsemr{e}{\sigma}~C^\infty, \]
as Launchbury puts it. While this is intuitively true, it cannot be stated that naively: Because the semantics of an expression is now a function taking a $\sC$, this needs to be reflected in the domain equation, which therefore constructs a different domain, as observed by {\lydiaetal} \cite{functionspaces}:
\index{CValue@\sCValue\idxexpl{resourced semantic domain}}
\[ \sCValue = ((\sC \to \sCValue) \to (\sC \to \sCValue))_\bot \]
The lifting and the projection functions are hence
\index{CFn@$\sCFn{\idxdummy}$}
\index{CFnProj@$\sCFnProj{\idxdummy}{\idxdummy}$}
\begin{align*}
\sCFn \_ &\colon ((\sC \to \sCValue) \to (\sC \to \sCValue)) \to \sCValue \\
\sCFnProj{\_}{\_} &\colon \sCValue \to (\sC\to\sCValue) \to (\sC \to \sCValue).
\end{align*}

\begin{figure}
\abovedisplayskip=0pt
\belowdisplayskip=0pt
\begin{align*}
\dsemr{e}\sigma~ \bot &\coloneqq \bot\\
\dsemr{\sLam x e}\sigma ~ (C~r) &\coloneqq \sCFn{\lambda v. \Crestr{\dsemr e {\sigma \sqcup [x \mapsto v]}} r}\\
\dsemr{\sApp e x}\sigma ~ (C~r) &\coloneqq (\sCFnProj {(\dsemr e \sigma~r)}{\Crestr{(\sigma~x)} r})~r\\
\dsemr{x}\sigma ~ (C~r) &\coloneqq \sigma~x~r\\
\dsemr{\sLet{\Delta}e}\sigma ~ (C~r) &\coloneqq \dsemr e {\esemr{\Delta}\sigma}~r
\end{align*}
\index{.Sdsemr@$\dsemr\idxdummy\idxdummy$\idxexpl{resourced denotational semantics of expressions}}
\index{Ndsemr@$\dsemr\idxdummy\idxdummy$\idxexpl{resourced denotational semantics of expressions}}
\caption{The resourced denotational semantics}
\label{fig:resdendef}
\end{figure}

\index{.Sesemr@$\esemr\idxdummy\idxdummy$\idxexpl{resourced denotational semantics of heaps}}
\index{Nesemr@$\esemr\idxdummy\idxdummy$\idxexpl{resourced denotational semantics of heaps}}
The definition of the resourced semantics, given in \cref{fig:resdendef}, resembles the definition of the standard semantics with some additional resource bookkeeping. The semantics of the heap is defined as before:
\begin{align*}
\esemr{\Gamma}\sigma &\coloneqq (\mu \sigma'.\, \sigma \addon{\dom \Gamma} \dsemr{\Gamma}{\sigma'}).
\end{align*}

Given the similarity between this semantics and the standard semantics, it is not surprising that \cref{lem:esem_this,lem:subst,lem:iter,lem:esem-merge} hold as well. In fact, in the formal development, they are stated and proved abstractly, using \emph{locales} \cite{locales} as a modularisation tool, and then simply instantiated for both variants of the semantics. I describe this approach in \cref{sec:abstracting-over-semantics}.

The correctness lemma needs some adjustments, as a more evaluated expression requires fewer resources. It therefore only provides an inequality:
\begin{lemma}[Correctness, resourced]
If $\sred \Gamma e L \Delta v$ holds and is closed, then for all resourced environments $\sigma$ we have $\dsemr{e}{\esemr{\Gamma}{\sigma}} \sqsubseteq \dsemr{v}{\esemr{\Delta}{\sigma}}$ and $(\esemr\Gamma\sigma)|_{\dom\Gamma} \sqsubseteq (\esemr\Delta\sigma)|_{\dom\Gamma}$.%
\label{lem:resourced_correctness}
\end{lemma}
\begin{proof}
Analogously to the proof of \cref{thm:thm2}.
\end{proof}

\subsection{Denotational black holes}
\label{sec:denblackhole}

The major difficulty in proving computational adequacy is the blackholing behaviour of the operational semantics: During the evaluation of a variable $x$ the corresponding binding is removed from the heap. Operationally, this is desirable: If the variable is called again during its own evaluation, we would have an infinite loop anyway. But obviously, the variable is still mentioned in the current configuration, and simply removing the binding will change the denotation of the configuration in unwanted ways: There is no hope of proving $\dsemr{e}{\esemr{x\mapsto e, \Gamma}} = \dsemr{e}{\esemr{\Gamma}}$.
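Indeed, already $e = \sLam y x$ and $\Gamma = []$ provide a counter-example: on the left-hand side, the inner $x$ is bound to the abstraction itself, while on the right-hand side it denotes $\bot$.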
But a weaker statement holds, which reflects the idea of ``not using $x$ during its own evaluation'' more closely:
\begin{lemma}[Denotational blackholing]
\lemdisplayunskip
\[
\label{lem:denblackhole}
\dsemr{e}{\esemr{x\mapsto e, \Gamma}} r \ne \bot \implies \dsemr{e}{\esemr{\Gamma}} r \ne \bot
\]
\end{lemma}

This is a consequence of the following lemma, which states that during the evaluation of an expression using finite resources, only fewer resources will be passed to the members of the environment (which are of type $\sC \to \sCValue$):
\begin{lemma}
\lemdisplayunskip
\[
\label{lem:denrestr}
\Crestr{\dsemr{e}\sigma}{C~r} = \Crestr{\dsemr{e}{(\Crestr{\sigma}{r})}}{C~r}
\]
where $\Crestr{\sigma}{r}$ is an abbreviation for $\lambda x. \Crestr{(\sigma~x)}{r}$.
\end{lemma}
\begin{proof}
by induction on the expression $e$. In order to show $\Crestr{\dsemr{e}\sigma}{C~r} = \Crestr{\dsemr{e}{(\Crestr{\sigma}{r})}}{C~r}$ it suffices to show that $\dsemr{e}\sigma~(C~r') = \dsemr{e}{(\Crestr{\sigma}{r})}~(C~r')$ for any $r'\sqsubseteq r$. The critical case is the one for variables, where $e = x$. We have
\begin{conteq}[oneline]
\dsemr{x}\sigma~(C~r') \\
= \sigma~x~r' \\
= (\Crestr{\sigma~x}{r})~r' & as $r'\sqsubseteq r$ \\
= \dsemr{x}{(\Crestr{\sigma}{r})}~(C~r')
\end{conteq}
In the other cases, the result follows from the fact that nested expressions are evaluated with $r'$ resources or, in the case of lambda abstraction, wrapped inside a $\Crestr \strut {r'}$ restriction operator. For the case of \keyword{let}, a related lemma for heaps needs to be proven by parallel fixed-point induction, namely $\forall r.\, \Crestr{(\esemr{\Gamma}\sigma)}r = \Crestr{(\esemr{\Gamma}{(\Crestr\sigma r)})}r$.
\end{proof}

Equipped with this lemma, we can begin the
\begin{proof}[of \cref{lem:denblackhole}]
Let $r'$ be the least resource such that $\dsemr{e}{\esemr{x\mapsto e, \Gamma}} (C~r') \ne \bot$. Such an $r'$ exists by the assumption, and $C~r' \sqsubseteq r$, and by the continuity of the semantics $r' \ne C^\infty$. In particular, $\dsemr{e}{\esemr{x\mapsto e, \Gamma}} r' = \bot$.

We first show
\begin{equation}
\Crestr{\esemr{x\mapsto e, \Gamma}}{r'} \sqsubseteq \esemr{\Gamma} \tag{$\ast$}
\end{equation}
by bounded fixed-point induction. So given an arbitrary environment $\sigma \sqsubseteq \esemr{x\mapsto e, \Gamma}$, we may assume $\Crestr{\sigma}{r'} \sqsubseteq \esemr{\Gamma}$ and have to prove $\Crestr{\dsemr{x\mapsto e, \Gamma}{\sigma}}{r'} \sqsubseteq \esemr{\Gamma}$, which we do point-wise: For $y\mapsto e'\in\Gamma$, this follows from
\begin{conteq}
\Crestr{\dsemr{x\mapsto e, \Gamma}{\sigma}}{r'}~y \\
= \Crestr{\dsemr{e'}{\sigma}}{r'} \\
= \Crestr{\dsemr{e'}{\Crestr{\sigma}{r'}}}{r'} & by \cref{lem:denrestr} \\
\sqsubseteq \dsemr{e'}{\Crestr{\sigma}{r'}} \\
\sqsubseteq \dsemr{e'}{\esemr{\Gamma}} & by the induction hypothesis \\
= \esemr{\Gamma}~y & by \cref{lem:esem_this}
\end{conteq}
while for $x$, this follows from
\begin{conteq}
\Crestr{\dsemr{x\mapsto e, \Gamma}{\sigma}}{r'}~x \\
= \Crestr{\dsemr{e}{\sigma}}{r'} \\
\sqsubseteq \Crestr{\dsemr{e}{\esemr{x\mapsto e, \Gamma}}}{r'} & using $\sigma \sqsubseteq \esemr{x\mapsto e, \Gamma}$ \\
= \bot & by the choice of $r'$ \\
= \esemr{\Gamma}~x & as $x \notin \dom\Gamma$.
\end{conteq}

\noindent
So we can conclude the proof with
\begin{conteq}
\bot \\
\sqsubset \dsemr{e}{\esemr{x\mapsto e, \Gamma}} (C~r') & by the choice of $r'$ \\
= \dsemr{e}{\Crestr{\esemr{x\mapsto e, \Gamma}}{r'}} (C~r') & by \cref{lem:denrestr} \\
\sqsubseteq \dsemr{e}{\esemr{\Gamma}} (C~r') & by $(\ast)$ \\
\sqsubseteq \dsemr{e}{\esemr{\Gamma}} r & as $C~r'\sqsubseteq r$
\end{conteq}
\end{proof}

\subsection{Resourced adequacy}
\label{sec:resourced_adequacy}

Now the necessary tools to handle blackholing are in place for the adequacy proof with regard to the resourced semantics.
\begin{lemma}[Resourced semantics adequacy]
\label{lem:resad}
For all $e$, $\Gamma$, $L$ and $r$, if $\dsemr{e}{\esemr{\Gamma}}~r \ne \bot$, then there exists $\Delta$ and $v$ so that $\sred \Gamma e L \Delta v$.
\end{lemma}
\begin{proof}
Because the semantics is continuous, it suffices to show this for $r = C^n~\bot$, and perform induction on this $n$, with arbitrary $e$, $\Gamma$ and $L$.

The case $r=C^0~\bot = \bot$ is vacuously true, as $\dsemr{e}{\esemr{\Gamma}}~\bot = \bot$. For the inductive case assume that the lemma holds for $r$, and that $\dsemr{e}{\esemr{\Gamma}}~(C~r) \ne \bot$. We proceed by case analysis on the expression $e$.

\case{$e = x$.}
From the assumption we know that $\Gamma = x\mapsto e',\Gamma'$ for some $e'$ and $\Gamma'$, as otherwise the denotation would be bottom, and furthermore that $\dsemr{e'}{\esemr{x\mapsto e',\Gamma'}}~r \ne\bot$. With \cref{lem:denblackhole} this implies $\dsemr{e'}{\esemr{\Gamma'}}~r \ne\bot$, so the induction hypothesis applies and provides $\Delta$ and $v$ with $\sred {\Gamma'} {e'} {L\cup\{x\}} \Delta v$. This implies $\sred {x\mapsto e',\Gamma'} {x} {L} {x\mapsto v, \Delta} v$ by rule \sRule{Var}, as desired.

\case{$e = e'~x$.}
Assume that $\fv{\Gamma, e'} \subseteq L$. No generality is lost here: If a derivation in the natural semantics with a larger set of variables to avoid than required holds, then the same derivation is also valid with the required set $L$.

From the assumption we know $\big(\sCFnProj {\dsemr {e'} {\esemr{\Gamma}}~r}{\Crestr{({\esemr{\Gamma}}~x)}{r}}\big)~r \ne \bot$. In particular $(\dsemr {e'} {\esemr{\Gamma}}~r)\ne \bot$, so by the induction hypothesis we have $\Delta$, $y$ and $e''$ with $\sred{\Gamma}{e'}L\Delta {\sLam y {e''}}$, the first hypothesis of \sRule{App}.

This judgement is closed by the extra assumption, so \cref{lem:resourced_correctness} ensures that $\dsemr{e'}{\esemr{\Gamma}} \sqsubseteq \dsemr{\sLam y {e''}}{\esemr{\Delta}}$ and $\esemr{\Gamma} \sqsubseteq \esemr{\Delta}$.
We can insert that into the inequality above to calculate
\begin{conteq}
\bot \\
\sqsubset \big(\sCFnProj {\dsemr {e'} {\esemr{\Gamma}}~r}{\Crestr{({\esemr{\Gamma}}~x)}{r}}\big)~r \\
\sqsubseteq \big(\sCFnProj {\dsemr {\sLam y {e''}} {\esemr{\Delta}}~r }{\Crestr{({\esemr{\Delta}}~x)}{r}}\big)~r \\
\sqsubseteq \big(\sCFnProj {\dsemr {\sLam y {e''}} {\esemr{\Delta}}~r }{\esemr{\Delta}~x}\big)~r \\
= \big(\sCFnProj {\sCFn{\lambda v.\, \Crestr{\dsemr {e''} {\esemr{\Delta} \sqcup [y \mapsto v]}}{r}}}{\esemr{\Delta}~x}\big)~r \\
\sqsubseteq \big(\sCFnProj {\sCFn{\lambda v.\, \dsemr {e''} {\esemr{\Delta} \sqcup [y \mapsto v]}}}{\esemr{\Delta}~x}\big)~r \\
= \dsemr {e''} {\esemr{\Delta} \sqcup [y \mapsto (\esemr{\Delta}~x)]}~r \\
= \dsemr {\subst{e''}{y}{x}} {\esemr{\Delta} }~r & by \cref{lem:subst}
\end{conteq}
which, using the induction hypothesis again, provides us with $\Theta$ and $v$ so that the second hypothesis of \sRule{App}, $\sred{\Delta}{\subst{e''}{y}{x}}L\Theta v$, holds, concluding this case.

\case{$e=\sLam y {e'}$}
This case follows immediately from rule \sRule{Lam} with $\Delta=\Gamma$ and $v = \sLam y{e'}$.

\case{$e=\sLet \Delta {e'}$}
We have
\begin{conteq}
\bot \\
\sqsubset \dsemr{\sLet \Delta {e'}}{\esemr{\Gamma}}~(C~r) \\
= \dsemr{e'}{\esemr{\Delta}{\esemr{\Gamma}}}~r \\
= \dsemr{e'}{\esemr{\Delta,\Gamma}}~r & by \cref{lem:esem-merge}
\end{conteq}
so we have $\Theta$ and $v$ with $\sred{\Delta,\Gamma}{e'}L\Theta v$ and hence $\sred{\Gamma}{\sLet \Delta {e'}}L\Theta v$ by rule \sRule{Let}, as desired.
\end{proof}

\subsection{Relating the denotational semantics}
\label{sec:relating_the_semantics}

\cref{lem:resad} is almost what we want, but it talks about the resourced denotational semantics. In order to obtain that result for the standard denotational semantics, we need to relate these two semantics.

We cannot simply equate them, as they have different denotational domains $\sValue$ and $\sC\to\sCValue$. So we are looking for a relation $\dsim$ between $\sValue$ and $\sCValue$ that expresses the intuition that they behave the same, if the latter is given infinite resources. In particular, it is specified by the two properties
\index{.R2sim@$\idxdummy\dsim\idxdummy$\idxexpl{relating $\sValue$ and $\sC \to \sCValue$}}
\[ \bot \dsim \bot \]
and
\[ (\forall x\,y.\, x \dsim y~C^\infty \implies f~ x \dsim g~y~C^\infty) \iff \sFn f \dsim \sCFn g. \]

Unfortunately, this is not admissible as an inductive definition, as it is self-referential in a non-monotone way, so the construction of this relation is non-trivial. This was observed and performed by {\lydiaetal} \cite{functionspaces}, and I have subsequently implemented this construction in Isabelle.

\index{.R2simheap@$\idxdummy\dsimheap\idxdummy$\idxexpl{relating $\sHeap$ and $\sVar \to (\sC \to \sCValue)$}}
I lift this relation to environments $\rho \in \sEnv$ and resourced environments $\sigma \in \sVar \to (\sC \to \sCValue)$ by
\[ \rho \dsimheap \sigma \iff \forall x.\, \rho~x \dsim \sigma~x~C^\infty. \]

\goodbreak
This allows us to state precisely how the two denotational semantics are related:
\begin{lemma}[The denotational semantics are related]
\label{lem:denrel}
For all environments $\rho \in \sEnv$ and $\sigma \in \sVar \to (\sC \to \sCValue)$ with $\rho \dsimheap \sigma$, we have
\[ \dsem e \rho \dsim \dsemr e \sigma~{C^\infty}.
\]
\end{lemma}
\begin{proof}
Intuitively, the proof is obvious: As we are only concerned with infinite resources, all the resource counting added to the denotational semantics becomes moot and the semantics are obviously related. A more rigorous proof can be found in \cite{functionspaces} and in my formal verification.
\end{proof}

\begin{corollary}
For all heaps $\Gamma$, we have $\esem{\Gamma}\dsimheap \esemr{\Gamma}$.
\label{lem:denrelheap}
\end{corollary}
\begin{proof}
by parallel fixed-point induction and \cref{lem:denrel}.
\end{proof}

I describe my Isabelle formalisation of \cite{functionspaces} in \cref{sec:relatingdomains}, including the mistakes in the original work that I found and fixed.

\subsection{Concluding the adequacy}

With this in place, I can give the
\begin{proof}[of \cref{thm:adequacy}]
By \cref{lem:denrelheap} we have $\esem{\Gamma}\dsimheap \esemr{\Gamma}$, and with \cref{lem:denrel} this implies $\dsem{e}{\esem{\Gamma}} \dsim \dsemr{e}{\esemr{\Gamma}}~C^\infty$. With the assumption $\dsem{e}{\esem{\Gamma}} \ne \bot$ and the definition of $\dsim$ this ensures that $\dsemr{e}{\esemr{\Gamma}}~C^\infty \ne \bot$, and we can apply \cref{lem:resad}, as desired.
\end{proof}

\subsection{Discussions of modifications}
\label{sec:modifications3}

My adequacy proof diverges quite a bit from Launchbury's. As it is the first rigorous proof, I discuss the differences in greater detail.

\begin{figure}
\abovedisplayskip=0pt
\belowdisplayskip=0pt
\begin{mathpar}
\inferrule
{\sred{\Gamma}e L{\Delta}{\sLam y e'}\\
\sred{y\mapsto x, \Delta}{e'}L{\Theta}{v}}
{\sred\Gamma{\sApp e x}L\Theta v}
\sRule{App'}
\and
\inferrule
{\sred{x\mapsto e, \Gamma} e L \Delta v}
{\sred{x\mapsto e, \Gamma} x L \Delta v}
\sRule{Var'}
\end{mathpar}
\index{App'@\sRule{App'}\idxexpl{alternative natural semantics rule for application}}
\index{Var'@\sRule{Var'}\idxexpl{alternative natural semantics rule for variables}}
\caption{Launchbury’s alternative natural semantics}
\label{fig:ans}
\end{figure}

Launchbury performs the adequacy proof by introducing an alternative natural semantics (ANS) that is closer to the denotational semantics than the original natural semantics (NS). He replaces the rules \sRule{App} and \sRule{Var} with the two rules given in \cref{fig:ans}. There are three differences to be spotted:
\begin{enumerate}
\item In the rule for applications, instead of substituting the argument $x$ for the parameter $y$, the variable $y$ is added to the heap, bound to $x$, adding an \emph{indirection}.
\item In the rule for variables, no update is performed: Even after $x$ has been evaluated to the value $v$, the binding $x$ on the heap is not modified at all.
\item Also in the rule for variables, no blackholing is performed: The binding for $x$ stays on the heap during its evaluation.
\end{enumerate}

Without much ado, Launchbury states that the original natural semantics and the alternative natural semantics are equivalent, which is intuitively convincing. Unfortunately, it turned out that a rigorous proof of this fact is highly non-trivial, as the actual structure of the heaps during evaluation differs a lot: The modification to the application rule causes many indirections, which need to be taken care of. Furthermore, the lack of updates in the variable rules causes possibly complex, allocating expressions to be evaluated many times, each time adding further copies of already existing expressions to the heap.
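For example, with \sRule{Var'} a binding $x \mapsto \sLet{y = e'}y$ stays on the heap unchanged, so every subsequent evaluation of $x$ unfolds the \keyword{let} anew, and each unfolding allocates a fresh copy of $e'$.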
On the other hand, the updates in the original semantics further obscure the relationship between the heaps in the original and the alternative semantics. On top of all that comes the technical difficulty caused by naming issues: Variables that are fresh in one derivation might not be fresh in the other, and explicit renamings need to be carried along.

{\lydiaetal} have attempted to perform this proof. They broke it down into two smaller steps, going from the original semantics to one with only the variable rule changes (called No-update natural semantics, NNS), and from there to the ANS. So far, they have performed the second step, the equivalence between NNS and ANS, in a pen-and-paper proof \cite{indirections}, while the relation between NS and NNS has so far resisted a proper proof \cite{sanchezlaunchbury}.

Considering these difficulties, I took a different path, and bridged the differences not on the side of the natural semantics, but on the denotational side, which turned out to work well:
\begin{enumerate}
\item The denotational semantics for lambda expressions changes the environment ($\dsem{\sLam x e}\rho \coloneqq \sFn{\lambda v. \dsem e {\rho \sqcup [x \mapsto v]}}$), while the natural semantics uses substitution into the expression: $\subst{e}{y}{x}$.

This difference is easily bridged on the denotational side by the substitution \cref{lem:subst}, which we need anyway for the correctness proof. See the last line of the application case in the proof of \cref{lem:resad} for this step.

\item The removal of updates surprisingly had no effect on the adequacy proof: The main chore of the adequacy proof is to produce evidence for the \emph{assumptions} of the corresponding natural semantics inference rule, which is then, in the last step, applied to produce the desired judgement. The removal of updates only changes the \emph{conclusion} of the rule, so the adequacy proof is unchanged.

Of course updates are not completely irrelevant, and they do affect the adequacy proof indirectly. The adequacy proof uses the resourced correctness theorem (\cref{lem:resourced_correctness}), and there the removal of updates from the semantics would make a noticeable difference.

\item Finally, and most trickily, there is the issue of blackholing. I explain my solution in \cref{sec:denblackhole}, which works due to a small modification to the resourced denotational semantics.\goodbreak

My proof relies on the property that when we calculate the semantics $\dsemr{e}\sigma\,r$, we never pass more than $r$ resources to the values bound in $\sigma$ (\cref{lem:denrestr}). This concurs with the intuition about resources. In Launchbury’s original definition of the resourced semantics, this lemma does not hold: The equation for lambda expressions ignores the resources passed to it and returns a function involving the semantics of the body:
\[ \dsemr{\sLam x e}\sigma ~ (C~r) \coloneqq \sCFn{\lambda v. \dsemr e {\sigma \sqcup [x \mapsto v]}} \]
With that definition, $\dsemr{\sLam x y}\sigma~(C~\bot) = \sCFn{\lambda\_.\,\sigma~y}$, which depends on $\sigma~y~r$ for all $r$, contradicting \cref{lem:denrestr}. Therefore, I restrict the argument of $\sCFn \_$ to cap any resources passed to it at $r$. Analogously I adjust the equation for applications to cap any resources passed to the value of the argument in the environment, $\sigma~x$.
These modifications do not affect the proof relating the two denotational semantics (\cref{lem:denrel}), as there we always pass infinite resources, and $|_{C^\infty}$ is the identity function.
\end{enumerate}
\section{Data type encodings and base values}
Launchbury's semantics is a typical core calculus used for research: as minimalistic as possible. Lambda abstraction, application and variables are enough to have a full functional programming language, and \keyword{let} expressions are added to talk explicitly about sharing and recursion.
\subsection{Data types via Church encoding}
Many other features of a typical programming language are omitted, and that is fine, because they can often be modelled with these primitive building blocks. For example, data constructors and case analysis are expressible using a suitable encoding, such as the Church encoding. Consider tuples:\goodbreak The constructor of a product type can be implemented as
\begin{align*}
\vname{Pair} &= \sLam{x\,y\, z}\sApp{\sApp z x} y
\intertext{ with the projection functions }
\vname{fst} &= \sLam{p}{\sApp p {(\sLam{x\,y}x)}}\\
\vname{snd} &= \sLam{p}{\sApp p {(\sLam{x\,y}y)}}.
\end{align*}
For the purposes of analysing lazy evaluation, these encodings sufficiently capture the behaviour of constructors. In particular, the semantics is set up so that the arguments of such a constructor are evaluated at most once, matching the expected behaviour of “real” constructors in a lazy language.
\begin{example}
Consider the following code, which stores an unevaluated expression $\sApp fx$ in the $\vname{Pair}$ constructor, extracts it later and evaluates it twice.
\[
\begin{pmboxed}
\> \keyword{let} \>~f \> = \sLam{x\,y}y\\
\> \>~p \> = \sApps{\vname{Pair} {(\sApp fx)} (\sApp fy)} \\
\> \keyword{in}~\sApps{(\sApp{\vname{fst}}p) (\sApp{\vname{fst}}p) x}
\end{pmboxed}
\]
We expect the redex $(\sApp fx)$ to be evaluated only once, and indeed, this is the case. But note that the example code does not actually follow my syntax, because we have non-trivial expressions as arguments. By the preprocessing mentioned above, the code should actually be
\[
\begin{pmboxed}
\>\keyword{let}\>~f \> = \sLam{x\,y}y\\
\> \>~ p \> = (\sLet{y_2 = \sApp fy}{\sLet{y_1 = \sApp fx}{{\sApps{\vname{Pair} {y_1} {y_2}}}}}) \\
\>\keyword{in}~\sLet{y_3 = \sApp{\vname{fst}}p}{\sApps{\vname{fst} p {y_3} x}}
\end{pmboxed}
\]
If this expression is called $e$, then we have $\sred {[]} e {\{\}} \Delta v$ where
\begin{align*}
\Delta = f &\mapsto \sLam{x\,y}y, \\
p &\mapsto \sLam{z}{\sApps{z {y_1} {y_2}}}, \\
y_3 &\mapsto \sLam yy, \\
y_2 &\mapsto \sApp fy, \\
y_1 &\mapsto \sLam yy,
\end{align*}
and $v = \sApps{x}$, and the pair now stores its first argument in its evaluated form, while the second argument is still unevaluated. Tracing the complete derivation of this judgement (which is too large to be reproduced here), we see that $(\sApp f x)$ is indeed evaluated only once.
\end{example}
There is, however, a trait of “real” constructors and case analysis that is not easily modelled by the Church encoding: With the Haskell code
\begin{hcode}
\> !case b !of \> True \> → do this \\
\> \> False \> → do that
\end{hcode}
it is obvious that either \hi{do this} or \hi{do that} is executed, but not both.
The same cannot be said for the corresponding code in a Church encoding of Booleans:
\begin{align*}
\vname{True} &= \sLam{x\,y}{x}\\
\vname{False} &= \sLam{x\,y}{y}\\
\vname{ifThenElse} &= \sLam{p\,x\,y}{\sApps{p x y}}
\end{align*}
where the code $\sApps{\vname{ifThenElse} b {(\sApp{\vname{do}}{\vname{this}})} {(\sApp{\vname{do}}{\vname{that}})}}$ may well evaluate both branches -- there is no guarantee that $b$ is one of the well-behaved expressions $\vname{True}$ or $\vname{False}$.
In later chapters, I model an analysis that makes use of exactly this property: A case analysis evaluates at most one of its branches (and exactly one, unless the scrutinee diverges). To prove that analysis correct, I need the language at hand to include a built-in case analysis operator that exhibits this behaviour -- the above Church encoding is not enough. But as only this feature is needed, I add just what is required to model it, and no more: Two constructors without arguments, and an \hi{!if-!then-!else}-construct. I deliberately do not add more complex data types that can carry parameters: As just explained, that would not add anything of value to the semantics.
\subsection{Adding Booleans}
\label{sec:addingbooleans}
\index{C@$\sCon \idxdummy$\idxexpl{Boolean constructors}}
\index{.C3ifthenelse@$\sITE\idxdummy\idxdummy\idxdummy$\idxexpl{\hi{!if-!then-!else} construct}}
The extended language sports three additional syntactic constructs, two of which are also values:
\begin{align*}
e \in \sExp &\Coloneqq \ldots \mid \sCon{\bT} \mid \sCon{\bF} \mid \sITE{e}{e_\bT}{e_\bF}\\
v \in \sVal &\Coloneqq \ldots \mid \sCon{\bT} \mid \sCon{\bF}
\end{align*}
The notation $\sITE{e}{e_\bT}{e_\bF}$, taken from the ternary operator in C-like languages, is a succinct way to write an \hi{!if-!then-!else} construct, and the use of $\bT$ and $\bF$ in variable indices avoids repeating rules for the two cases, as the metavariable $b$ can be used to represent either of them. This can be seen in the two additional rules for the natural semantics:
\index{Con@\sRule{Con}\idxexpl{natural semantics rule for constructors}}
\index{IfThenElse@\sRule{IfThenElse}\idxexpl{natural semantics rule for \hi{!if-!then-!else} constructs}}
\begin{mathpar}
\inferrule
{ }
{\sred{\Gamma}{\sCon b}L{\Gamma}{\sCon b}}
\sRule{Con}
\and
\inferrule
{\sred{\Gamma}e L{\Delta}{\sCon b}\\ \sred{\Delta}{e_b}L{\Theta}{v}}
{\sred\Gamma{\sITE e {e_\bT}{e_\bF}}L\Theta v}
\sRule{IfThenElse}
\end{mathpar}
As the other rules of the semantics are unchanged, the proofs by induction performed in this chapter only need to be extended by the two additional cases. The required changes to the denotational semantics are a bit more involved, as the semantic domain changes: Besides functions, the semantics can also return Booleans, and the equation becomes
\[
\sValue = ((\sValue \to \sValue) + 2)_\bot,
\]
where $+$ is the disjoint sum and $2$ the discrete two-element domain with the (conveniently named) elements $\{\bT, \bF\}$.
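For a functional programmer, a rough intuition for this domain is the hypothetical Haskell declaration \hi{data Value = Fn (Value → Value) | B Bool}, where the $\bot$ element added by the lifting corresponds to non-termination. This is only an analogy, however: The proofs require the proper domain-theoretic construction, which is carried out via HOLCF in \cref{sec:relatingdomains}.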
In addition to the existing injection and projection functions
\begin{align*}
\sFn \_ &\colon (\sValue \to \sValue) \to \sValue \\
\sFnProj{\_}{\_} &\colon \sValue \to \sValue \to \sValue
\end{align*}
there is the additional injection function
\index{B@$\sB{\idxdummy}$}
\begin{align*}
\sB \_ &\colon 2 \to \sValue
\end{align*}
and the deconstruction function
\index{B@$\sBProj\idxdummy\idxdummy\idxdummy$}
\begin{align*}
\sBProj{\_}{\_}{\_} &\colon \sValue \to \sValue \to \sValue \to \sValue\\
\intertext{where}
\sBProj{v}{v_1}{v_2} &= \begin{cases}
v_1 &\text{if } v = \sB\bT\\
v_2 &\text{if } v = \sB\bF\\
\bot &\text{otherwise}.
\end{cases}
\end{align*}
The partial order $\sqsubseteq$ on $\sValue$ relates neither values of the form $\sFn f$ with $\sB b$ nor $\sB \bT$ with $\sB\bF$.
The denotation of the new syntactic constructs is given by
\begin{align*}
\dsem{\sCon{b}}\rho &\coloneqq \sB b\\
\dsem{\sITE e {e_\bT}{e_\bF}}\rho &\coloneqq \sBProj{\dsem{e}\rho}{\dsem{e_\bT}\rho}{\dsem{e_\bF}\rho}
\end{align*}
The natural semantics extended in this way is still correct with regard to the denotational semantics:
\nopagebreak
\begin{proof}[of \cref{thm:thm2}]
Two additional cases need to be handled.
\case{\sRule{Con}}
This case is trivial, just like case \sRule{Lam}.
\case{\sRule{IfThenElse}}
The induction hypotheses are $\dsem{e}{\esem{\Gamma}\rho} = \dsem{\sCon b}{\esem{\Delta}\rho}$ and $\esem{\Gamma}\rho \eqon{\dom\Gamma} \esem{\Delta}\rho$ as well as $\dsem{e_b}{\esem{\Delta}\rho} = \dsem{v}{\esem{\Theta}\rho}$ and $\esem{\Delta}\rho \eqon{\dom\Delta} \esem{\Theta}\rho$.
We have $\dsem{e_b}{\esem{\Gamma}\rho} = \dsem{e_b}{\esem{\Delta}\rho}$: Because the judgement is closed, i.e.\ $\fv{e_b} \subseteq \dom\Gamma \cup L$, it suffices to show $\esem\Gamma\rho \eqon{\dom\Gamma \cup L} \esem\Delta\rho$. The induction hypothesis provides the equality on $\dom\Gamma$, and for $x\in L\setminus \dom\Gamma$ we also have $x \notin \dom\Delta$ by \cref{lem:avoidsL}, so we have $\rho\,x$ on both sides.
As in case \sRule{App}, the second part follows from the corresponding induction hypotheses and $\dom \Gamma \subseteq \dom \Delta$. The first part can be calculated:
\begin{conteq}[explline]
\dsem{\sITE{e}{e_\bT}{e_\bF}}{\esem{\Gamma}\rho} \\
= \sBProj{\dsem{e}{\esem{\Gamma}\rho}}{\dsem{e_\bT}{\esem{\Gamma}\rho}}{\dsem{e_\bF}{\esem{\Gamma}\rho}} & by the denotation of the \hi{!if-!then-!else} construct \displaybreak \\
= \sBProj{\dsem{\sCon b}{\esem{\Delta}\rho}}{\dsem{e_\bT}{\esem{\Gamma}\rho}}{\dsem{e_\bF}{\esem{\Gamma}\rho}} & by the induction hypothesis\\
= \sBProj{\sB b}{\dsem{e_\bT}{\esem{\Gamma}\rho}}{\dsem{e_\bF}{\esem{\Gamma}\rho}} & by the denotation of the constructor \\
= \dsem{e_b}{\esem{\Gamma}\rho} & by the definition of $\sBProj{\_}{\_}{\_}$ \\
= \dsem{e_b}{\esem{\Delta}\rho} & see above \\
= \dsem{v}{\esem{\Theta}\rho} & by the induction hypothesis
\end{conteq}
\end{proof}
The adequacy proof can also be recovered, after extending the resourced domain $\sCValue$ to contain $\{\bT, \bF\}$, i.e.
\[
\sCValue = \big(((\sC \to \sCValue) \to (\sC \to \sCValue)) + 2\big)_\bot
\]
with the analogous injection function and deconstructor
\begin{align*}
\sCB \_ &\colon 2 \to \sCValue\\
\sCBProj{\_}{\_}{\_} &\colon \sCValue \to \sCValue \to \sCValue \to \sCValue,
\end{align*}
which allow the definition of the resourced denotational semantics to be extended by
\begin{align*}
\dsemr{\sCon{b}}\rho &\coloneqq \sCB b\\
\dsemr{\sITE e {e_\bT}{e_\bF}}\rho &\coloneqq \sCBProj{\dsemr{e}\rho}{\dsemr{e_\bT}\rho}{\dsemr{e_\bF}\rho}.
\end{align*}
\begin{proof}[of \cref{lem:resad}]
We need to extend the case analysis on $e$:
\case{$e= \sCon b$}
follows immediately from the rule \sRule{Con}.
\case{$e= \sITE{e'}{e_\bT}{e_\bF}$}
The assumption $\dsemr{\sITE{e'}{e_\bT}{e_\bF}}{\esemr\Gamma} \ne\bot$ resolves to
\[
\sCBProj{\dsemr{e'}{\esemr\Gamma}}{\dsemr{e_\bT}{\esemr\Gamma}}{\dsemr{e_\bF}{\esemr\Gamma}}\ne\bot.
\]
From this, we can conclude that $\dsemr{e'}{\esemr\Gamma} = \sCB b$ for some $b\in\{\bT,\bF\}$, and $\dsemr{e_b}{\esemr\Gamma} \ne \bot$. We can therefore apply the first induction hypothesis and obtain $\Delta$ and $v$ so that $\sred\Gamma{e'}{L'}\Delta v$, where $L' = L \cup \fv{\Gamma,e'}$ -- the extended set of variables ensures that the judgement is closed.
By the correctness of the resourced denotational semantics (\cref{lem:resourced_correctness}, the proof of which can be extended analogously to \cref{thm:thm2}), we have $\dsemr{v}{\esemr\Delta}\sqsubseteq\dsemr{e'}{\esemr\Gamma} = \sCB b$, so necessarily $v = \sCon b$.
The correctness lemma also states $\esemr\Gamma \sqsubseteq \esemr\Delta$, so $\dsemr{e_b}{\esemr\Delta} \ne \bot$ and the induction hypothesis provides $\Theta$ and $v'$ with $\sred\Delta{e_b}{L'}\Theta {v'}$.
By rule \sRule{IfThenElse}, this shows $\sred\Gamma{\sITE{e'}{e_\bT}{e_\bF}}{L'}\Theta {v'}$ and hence, as $L\subseteq L'$, we have $\sred\Gamma{\sITE{e'}{e_\bT}{e_\bF}}{L}\Theta {v'}$ as desired.
\end{proof}
The three semantics and the corresponding proofs thus allowed for a modular extension by Booleans: I just added new cases to the syntax, the rules of the natural semantics, the denotational domains and the various functions, but left the overall structure of the proofs and the other cases as they were. I had to be careful not to make use of the lemma “If $\sred\Gamma e L \Delta v$, then $v$ is a lambda abstraction,” which no longer holds in the extended version; this comes up in the proof of \cref{lem:resad}.
\section{A small-step semantics}%
\label{sec:sestoftssemantics}%
One important feature of Launchbury's natural big-step semantics is that the stack is implicit: During the evaluation of an application, for example, the argument is not stored anywhere in the configuration, but lives only in the rules. This is elegant and convenient if during a proof the stack does not need to be taken into account, e.g.\@ in the adequacy proof, but causes headaches when the stack \emph{is} relevant to the discussion at hand, which will be the case in \cref{ch:provingcallarity}. For such purposes, a semantics with an explicit stack in the configuration is better suited.
As explained at the beginning of the chapter, I need to refrain from just building my own semantics that happens to suit me\footnote{I tried that, and it did not go well.}, but rather build on existing, well-received definitions. Sestoft has derived a small-step semantics with an explicit stack from Launchbury's semantics, called the \emph{mark-1 abstract machine}, and proved it to be equivalent to Launchbury's semantics. I follow that path and pave it by formalising it in Isabelle. I include my addition of Booleans (\cref{sec:addingbooleans}) in the treatment, but it is a modular extension: Simply ignore the cases related to Booleans and you obtain the plain semantics.
\subsection{Sestoft's mark-1 abstract machine}
\begin{figure}%
\abovedisplayskip=0pt
\belowdisplayskip=0pt
\begin{flalign*}
\noprecondition
(\Gamma, \sApp e x, S) &\sest (\Gamma, e, \cons{\arg x}S) \tag*{\sRule{app$_1$}} \\
\noprecondition
(\Gamma, \sLam y e, \cons{\arg x}S) &\sest (\Gamma, \subst e y x, S) \tag*{\sRule{app$_2$}} \\
\shortprecondition{(x \mapsto e) \in \Gamma}
(\Gamma, x, S) &\sest (\Gamma \setminus x, e, \cons{\upd x}S) \tag*{\sRule{var$_1$}} \\
\shortprecondition{\isVal e}
(\Gamma, e, \cons{\upd x}S) &\sest (\Gamma[x \mapsto e], e, S) \tag*{\sRule{var$_2$}} \\
\noprecondition
(\Gamma, (\sITE e {e_\bT} {e_\bF}), S) &\sest (\Gamma, e, \cons{\alts{e_\bT}{e_\bF}}S) \tag*{\sRule{if$_1$}} \\
\shortprecondition{b \in \{\bT, \bF\}}
(\Gamma, \sCon{b}, \cons{\alts{e_\bT}{e_\bF}}S) &\sest (\Gamma, e_b, S) \tag*{\sRule{if$_2$}} \\
\precondition{\dom \Delta \cap \fv{\Gamma, S} = \{\}}
(\Gamma, \sLet{\Delta}e, S) &\sest (({\Delta},{\Gamma}), e, S) \tag*{\sRule{let$_1$}}
\end{flalign*}%
\index{.R2sest@$\idxdummy\sest\idxdummy$\idxexpl{small step reduction (one step)}}
\index{App1@\sRule{App$_1$}, \sRule{App$_2$}\idxexpl{small step rules for application}}
\index{Var1@\sRule{Var$_1$}, \sRule{Var$_2$}\idxexpl{small step rules for variable}}
\index{If1@\sRule{If$_1$}, \sRule{If$_2$}\idxexpl{small step rules for case analysis}}
\index{Let1@\sRule{Let$_1$}\idxexpl{small step rule for \keyword{let} bindings}}
\caption{The small-step semantics, due to Sestoft \cite{sestoft}}
\label{fig:sestoft}
\end{figure}
Sestoft's semantics operates on \emph{configurations} $(\Gamma, e, S)$ that consist of the heap $\Gamma$, the control $e$ (i.e.\@ the expression currently under evaluation) and the stack $S$. The stack is constructed from
\begin{compactitem}
\item the empty stack, $[]$, \index{.C0emptystack@$[]$\idxexpl{empty stack}} \index{configuration\idxexpl{heap, expression and stack (small-step semantics)}}
\item arguments, written $\cons{\arg{x}}S$ and put on the stack during the evaluation of an application, \index{.C1arg@$\arg\idxdummy$\idxexpl{argument (on the stack)}}
\item update markers, written $\cons{\upd{x}}S$ and put on the stack during the evaluation of a variable's right-hand side, and \index{.C1upd@$\upd\idxdummy$\idxexpl{update marker (on the stack)}}
\item alternatives, written \mbox{$\cons{\alts{e_\bT}{e_\bF}}S$} and put on the stack during the evaluation of the scrutinee of an \hi{!if-!then-!else}-construct. \index{.C1alts@$\alts\idxdummy\idxdummy$\idxexpl{alternatives (on the stack)}}
\end{compactitem}
Throughout this work, we assume all configurations to be \emph{good}, i.e.\ $\dom \Gamma$ and $\upd S \coloneqq \{ x \mid \upd x \in S\}$ are disjoint and the update markers on the stack are distinct.
The relation $\sest$, given in \cref{fig:sestoft}, defines the semantics of a one-step reduction. As usual, $\sests$ denotes the reflexive transitive closure of this relation.
\index{.R2sests@$\steps\idxdummy\idxdummy$\idxexpl{small step reduction (many steps)}}
In the interest of naming hygiene, the names for the new bindings in the \sRule{let$_1$} rule have to be fresh with regard to what is already present on the heap and stack, as ensured by the side-condition.
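To illustrate the machine, here is the complete evaluation of the expression $\sLet{x = \sLam y y}{\sApp x x}$, starting with an empty heap and an empty stack; note how rule \sRule{var$_1$} removes the binding for $x$ from the heap and leaves an update marker behind, which rule \sRule{var$_2$} then uses to restore the binding:
\begin{conteq}
([], \sLet{x = \sLam y y}{\sApp x x}, []) \\
\sest (x \mapsto \sLam y y, \sApp x x, []) & by \sRule{let$_1$} \\
\sest (x \mapsto \sLam y y, x, \cons{\arg x}{[]}) & by \sRule{app$_1$} \\
\sest ([], \sLam y y, \cons{\upd x}{\cons{\arg x}{[]}}) & by \sRule{var$_1$} \\
\sest (x \mapsto \sLam y y, \sLam y y, \cons{\arg x}{[]}) & by \sRule{var$_2$} \\
\sest (x \mapsto \sLam y y, x, []) & by \sRule{app$_2$} \\
\sest ([], \sLam y y, \cons{\upd x}{[]}) & by \sRule{var$_1$} \\
\sest (x \mapsto \sLam y y, \sLam y y, []) & by \sRule{var$_2$}
\end{conteq}
The execution ends in a configuration with a value as the control and the initial, empty stack.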
Note that the semantics takes good configurations to good configurations.
\subsection{Relating Sestoft's and Launchbury's semantics}
Sestoft's small-step and Launchbury's big-step semantics are closely related; this section explicates that relationship. I follow \cite{sestoft} here, making minor adjustments to ease the implementation in Isabelle.
\begin{lemma}[Small-step simulates big-step]
If $\sred\Gamma e L \Delta v$ and $\fv{\Gamma, e, S} \subseteq \dom \Gamma \cup L$, then $(\Gamma, e, S) \sests (\Delta, v, S)$.
\label{lem:sestoft_simulates_launchbury}
\end{lemma}
\begin{proof}
By induction on the derivation of $\sred\Gamma e L \Delta v$, with $S$ arbitrary.
\case{\sRule{Lam} and \sRule{Con}}
are trivial, as $\sests$ is reflexive.
\case{\sRule{App}}
The side condition of the first induction hypothesis follows from the assumption by $\fv{\Gamma, \sApp e x, S} = \fv{\Gamma, e, \cons{\arg x}{S}}$. From that follows the side condition of the second induction hypothesis, i.e.\ $\fv{\Delta, \subst{e'}yx, S}\subseteq \dom \Delta \cup L$, using $\dom\Gamma\subseteq\dom\Delta$ (\cref{lem:doesnotforget}) and the fact that the natural semantics preserves closedness. It remains to do a simple calculation:
\begin{conteq}
(\Gamma, \sApp e x,S) \\
\sest (\Gamma, e,\cons{\arg x}{S}) & by \sRule{app$_1$} \\
\sests (\Delta, \sLam y {e'}, \cons{\arg x}{S}) & first induction hypothesis \\
\sest (\Delta, \subst{e'}{y}{x}, S) & by \sRule{app$_2$} \\
\sests (\Theta, v, S) & second induction hypothesis
\end{conteq}
\case{\sRule{Var}}
follows from this calculation:
\begin{conteq}
((x\mapsto e,\Gamma), x, S) \\
\sest (\Gamma, e, \cons{\upd x}S) & by \sRule{var$_1$} \\
\sests (\Delta, v, \cons{\upd x}S) & induction hypothesis \\
\sest ((x\mapsto v,\Delta), v, S) & by \sRule{var$_2$}
\end{conteq}
The side condition of the induction hypothesis, i.e.\@ the inclusion $\fv{(\Gamma, e, \cons{\upd x}S)} \subseteq \dom\Gamma \cup (L\cup\{x\})$, follows directly from the given assumption $\fv{(x\mapsto e,\Gamma), x, S} \subseteq \dom(x\mapsto e,\Gamma)\cup L$.
\case{\sRule{Let}}
The calculation is short:
\begin{conteq}
(\Gamma, \sLet\Delta e, S) \\
\sest ((\Delta,\Gamma), e, S) & by \sRule{let$_1$} \\
\sests (\Theta,v,S) & by induction hypothesis
\end{conteq}
where the side-condition of \sRule{let$_1$} follows from the side-condition of \sRule{Let}.
\case{\sRule{IfThenElse}}
resembles the case for \sRule{App}, with a similar proof for the side conditions, and the calculation
\begin{conteq}
(\Gamma, \sITE{e}{e_\bT}{e_\bF}, S) \\
\sest (\Gamma, e, \cons{\alts{e_\bT}{e_\bF}}S) & by \sRule{if$_1$} \\
\sests (\Delta, \sCon b, \cons{\alts{e_\bT}{e_\bF}}S) & induction hypothesis \\
\sest (\Delta, e_b, S) & by \sRule{if$_2$} \\
\sests (\Theta, v, S) & induction hypothesis
\end{conteq}
\end{proof}
The proof of the other direction, i.e.\ that an evaluation in the small-step semantics has a corresponding derivation in the big-step semantics, is a bit more involved, as we need to recover the tree structure of the big-step semantics from the flat sequence of configurations in the small-step semantics.
\index{balanced execution}
To that end, we use the notion of a \emph{balanced execution}: An execution $c_1 \sest \cdots \sest c_n$, $n\ge 1$, is balanced if the stack of each intermediate configuration $c_i,\, i\in\{1,\ldots,n-1\}$ is an extension of the stack of $c_1$, and the stack of $c_n$ equals the stack of $c_1$. We write $c_1 \sestsb c_2$ for such a balanced execution.
As every rule of the semantics pushes or pops at most one stack element, balanced executions can be broken into smaller parts, which are still balanced, as shown by the following “intermediate value theorem”:
\begin{lemma}
\label{lem:bal_consE}
Given a balanced execution $c_1 \sest c_2 \sest \cdots \sest c_5$ where the stack of $c_2$ is the stack of $c_1$ with one element pushed, there are intermediate states $c_3$ and $c_4$ so that
\[
c_1 \sest c_2 \sestsb c_3 \sest c_4 \sestsb c_5.
\]
\end{lemma}
\begin{proof}
Because the execution is balanced, $c_5$ and $c_1$ have the same stack. In particular, the stack of $c_5$ does \emph{not} extend the stack of $c_2$. Let $c_4$ be the first configuration in the execution whose stack does not extend the stack of $c_2$, and $c_3$ be the configuration preceding $c_4$. I claim that $c_2 \sests c_3$ and $c_4 \sests c_5$ are indeed balanced.
Every stack in $c_2 \sests c_3$ extends the stack of $c_2$ by construction. Furthermore, the stack of $c_3$ is equal to the stack of $c_2$: If it were not, then the stack of the configuration following $c_3$, namely $c_4$, would still be an extension of $c_2$'s stack, contradicting the choice of $c_4$.
The stack of $c_4$ is equal to the stack of $c_5$: As it follows an extension of $c_2$'s stack, but itself is not an extension of that, it must be $c_2$'s stack with the top element popped. By assumption, that is $c_1$'s stack. So $c_4$ and $c_5$ have the same stack, and all intermediate states have a stack that is an extension of that.
\end{proof}
\begin{example}
This execution is balanced and fulfils the assumptions of \cref{lem:bal_consE}, as the second stack equals the first with one element pushed:
\begin{conteq}
(\Gamma, \sApp x y, S) \\
\sest (\Gamma, x,\cons{\arg y}S) \\
\sest ([],(\sLam y{(\sLam z z)}), \cons{\upd x}{\cons{\arg y}S}) \\
\sest (\Gamma, (\sLam y{(\sLam z z)}),\cons{\arg y}S) \\
\sest (\Gamma, (\sLam z z),S)
\end{conteq}
where $\Gamma = x\mapsto (\sLam y{(\sLam z z)})$. \Cref{lem:bal_consE} decomposes this sequence into the two balanced executions
\[
(\Gamma, x,\cons{\arg y}S) \sestsb (\Gamma, (\sLam y{(\sLam z z)}),\cons{\arg y}S)
\]
and
\[
(\Gamma, (\sLam z z),S) \sestsb (\Gamma, (\sLam z z),S),
\]
where the second balanced execution does not actually do any steps.
\end{example}
\begin{lemma}[Big-step simulates small-step]
Let $(\Gamma, e, S) \sestsb (\Delta, v, S)$ with $\isVal v$. Then $\sred\Gamma e {\upd S} \Delta v$.
\label{lem:launchbury_simulates_sestoft}
\end{lemma}
\begin{proof}
By complete induction on the number of steps in $(\Gamma, e, S) \sestsb (\Delta, v, S)$.
If there are no steps, then $\Gamma = \Delta$, $e = v$ and we have $\sred\Gamma v {\upd S} \Gamma v$ either by \sRule{Lam} or \sRule{Con}.
Otherwise, we proceed by case analysis on the first rule applied in the execution. This rule cannot be \sRule{app$_2$}, \sRule{var$_2$} or \sRule{if$_2$}, as these pop an element off the stack, in contradiction to the execution being balanced.
\case{\sRule{app$_1$}}
We have $e = \sApp{e'}x$ and using \cref{lem:bal_consE}, we can decompose the execution as follows:
\[
(\Gamma, e' x, S) \sest (\Gamma, e', \cons{\arg x}S) \sestsb (\Delta', e_3, \cons{\arg x}S) \sest (\Delta'', e_4, S) \sestsb (\Delta, v, S).
\]
As only rule \sRule{app$_2$} pops an argument marker $\arg x$ off the stack, we obtain
\begin{multline*}
(\Gamma, e' x, S) \sest (\Gamma, e', \cons{\arg x}S) \sestsb (\Delta', \sLam y {e''}, \cons{\arg x}S)\\
\sest (\Delta', \subst{e''}yx, S) \sestsb (\Delta, v, S).
\end{multline*}
By induction, the first balanced execution yields $\sred\Gamma {e'} {\upd S} {\Delta'} {\sLam y {e''}}$, while the second yields $\sred{\Delta'}{\subst{e''}yx}{\upd S}{\Delta}v$, which, by \sRule{App}, concludes this case.
\case{\sRule{var$_1$}}
By an analogous decomposition using \cref{lem:bal_consE} we find $e = x$ and
\begin{multline*}
(\Gamma, x, S) \sest (\Gamma\setminus\{x\}, e', \cons{\upd x}S) \sestsb (\Delta', z, \cons{\upd x}S) \\
\sest ((x\mapsto z,\Delta'), z, S) \sestsb (\Delta, v, S)
\end{multline*}
with $(x\mapsto e')\in \Gamma$ and $\isVal z$.
\goodbreak
The balanced execution $((x\mapsto z,\Delta'), z, S) \sestsb (\Delta, v, S)$ is actually empty: With a value as the control, only rules \sRule{app$_2$}, \sRule{var$_2$} and \sRule{if$_2$} can apply. But these pop an element off the stack, so they cannot begin a balanced execution. Therefore, $\Delta = x\mapsto z,\Delta'$ and $z = v$.
Using the induction hypothesis on the other balanced sub-execution, we obtain $\sred{\Gamma\setminus\{x\}}{e'}{\upd S \cup \{x\}}{\Delta'}v$ which, by \sRule{Var}, concludes this case.
\case{\sRule{if$_1$}}
Starting as before, we find $ e = \sITE{e'}{e_\bT}{e_\bF}$ and
\begin{multline*}
(\Gamma, \sITE{e'}{e_\bT}{e_\bF}, S) \sest (\Gamma, e', \cons{\alts{e_\bT}{e_\bF}}S) \sestsb (\Delta', \sCon{b}, \cons{\alts{e_\bT}{e_\bF}}S) \\
\sest (\Delta', e_b, S) \sestsb (\Delta, v, S).
\end{multline*}
Using the induction hypothesis on the two balanced sub-executions, we obtain $\sred{\Gamma}{e'}{\upd S}{\Delta'}{\sCon{b}}$ and $\sred{\Delta'}{e_b}{\upd S}\Delta{v}$ which, by \sRule{IfThenElse}, conclude this case.
\case{\sRule{let$_1$}}
As this rule does not modify the stack, we have
\[
(\Gamma, \sLet{\Delta'}{e'}, S) \sest ((\Delta',\Gamma), e', S) \sestsb (\Delta, v, S).
\]
Using the induction hypothesis on the balanced sub-execution, we obtain $\sred{(\Delta',\Gamma)}{e'}{\upd S}{\Delta}{v}$ which, by \sRule{Let}, concludes this case.
\end{proof}
\subsection{Discussions of modifications}
Sestoft's paper \cite{sestoft} is already on the rigorous side and quite suitable to be brought into a machine-checkable form.
One difference is, of course, due to my choice of nominal logic to implement name binding: While Sestoft's rule for \keyword{let} expressions renames the let-bound variables to fresh ones as they enter the heap, my rule \sRule{let$_1$} simply assumes them to already be fresh. Intuitively, this is equivalent, but the practical benefit of not having to push the renaming into the expressions by substitution is great.
\subsubsection{Constructors}
This section already includes the addition of Booleans and the \hi{!if-!then-!else}-construct to the language. Sestoft, following Launchbury, initially only has variables, application, lambda abstraction and mutually recursive \keyword{let}-bindings. As before, my addition is modular: One can simply ignore the extra cases and obtain a formalisation of Sestoft's machine.
He introduces constructors and \keyword{case} expressions in a separate chapter of his paper. His constructors support parameters, but otherwise the design is equivalent to mine.
\subsubsection{Fusing a proof}
The definition of a balanced execution is the same as Sestoft's, and the proof of \cref{lem:launchbury_simulates_sestoft} follows his idea, but is structured differently.
Sestoft first notices, by use of \cref{lem:bal_consE} and rule inversion on the small-step rules, that every balanced execution is of one of these forms:
\begin{compactitem}
\item It is empty.
\item It consists of rule \sRule{app$_1$}, followed by a balanced execution, followed by \sRule{app$_2$}, followed by another balanced execution.
\item It consists of rule \sRule{var$_1$}, followed by a balanced execution, followed by \sRule{var$_2$}.
\item It consists of rule \sRule{if$_1$}, followed by a balanced execution, followed by \sRule{if$_2$}, followed by another balanced execution.
\item It consists of rule \sRule{let$_1$}, followed by a balanced execution.
\end{compactitem}
This describes a (context-free) grammar of balanced executions, and his proof of \cref{lem:launchbury_simulates_sestoft} proceeds by induction on the productions of that grammar.
I could have followed this path by defining another predicate for balanced executions, as an inductively defined predicate following these rules, then proving that all balanced executions are contained in that grammar and finally performing the proof of \cref{lem:launchbury_simulates_sestoft} by induction on that predicate. But that would be considerably more work for little gain, as long as this grammar of balanced executions is not used again, so I chose to fuse\footnote{It is interesting to see that the ideas behind list fusion, i.e.\@ fusing generators and consumers of inductive data types, carry over to transforming proofs so well. The Curry-Howard correspondence at its best.} these two steps of the proof into one, by using complete induction on the length of the execution and recovering the tree structure “on the fly” using \cref{lem:bal_consE}.
\section{The Isabelle formalisation}
\label{sec:launchbury-formalisation}
A distinguishing feature of this dissertation’s treatment of Launchbury’s and Sestoft’s semantics is that I have implemented the definitions, theorems and proofs in the interactive theorem prover Isabelle \cite{isabelle}. Nevertheless, I chose to write most of this thesis in the classical style of hand-written mathematics, addressing the reader who is interested in my constructions, results and proofs and who, although happy to know that everything is machine-checked, is not interested in the Isabelle formalisation itself.
In contrast, the following section addresses the reader who also wonders how I implemented this in Isabelle, what techniques I used, and why. The section also serves as a map to find one's way around the Isabelle theories, and draws the connection between the artefacts in the thesis and the Isabelle development.
\subsection{Employing nominal logic}
\label{sec:using-nominal}
In \cref{sec:syntax}, I introduce the syntax of the lambda calculus and state that I consider expressions to be equal up to alpha-conversion. In the Isabelle formalisation, I use the nominal package (cf.\@ \cref{sec:nominal}) to create a data type for expressions in my syntax:
\isainput{exp-definition}
\pagebreak[3]
The annotation \isa{\isacommand{binds}} indicates where in the syntax tree binders are, and what their scope is. The command \isa{\isacommand{nominal{\isacharunderscore}datatype}} then takes care of constructing the data type with the desired equalities.
The command does not support nested recursion, so it is not possible to simply write
\begin{isablock}%
{\isacharbar}\ Let\ {\isasymGamma}{\isacharcolon}{\isacharcolon}({\isacharparenleft}var\ {\isasymtimes}\ exp{\isacharparenright}\ list)\ body{\isacharcolon}{\isacharcolon}exp\ \isakeyword{binds}\ {\isachardoublequoteopen}domA\ {\isasymGamma}{\isachardoublequoteclose}\ \isakeyword{in}\ body\ {\isasymGamma}
\end{isablock}%
Instead, I have to effectively re-define the list type along with the expression type and simultaneously define the function that collects all the binders. Luckily, the resulting type \isa{assn} is indeed isomorphic to \isa{{\isacharparenleft}var\ {\isasymtimes}\ exp{\isacharparenright}\ list}, so subsequently I define conversion functions between these two types and define the function \isa{Let} with the desired type.
A definitory command such as \isa{\isacommand{nominal{\isacharunderscore}datatype}} produces a number of definitions and lemmas, such as distinctness of constructors, size lemmas and induction rules. I re-state all of these in terms of \isa{Let} instead of the original constructor \isa{LetA}, which is slightly tedious, but from then on I can use \isa{Let} exclusively, including in function definitions and inductive proofs, just as if \isa{\isacommand{nominal{\isacharunderscore}datatype}} supported nested recursion directly.
\subsection{The type of environments}
\label{sec:type-of-envs}
The type for environments used here is \isa{var {\isasymRightarrow} Value}, as one would expect. But it was non-trivial to actually implement it this way, and an earlier version went a different route that, although eventually abandoned, is worth describing.
The defining equation for the semantics of lambda abstractions is
\[
\isa{{\isasymlbrakk}\ {\isasymlambda}x{\isachardot}\ e\ {\isasymrbrakk}\isactrlbsub {\isasymrho}\isactrlesub \ {\isacharequal}\ Fn{\isasymcdot}{\isacharparenleft}{\isasymLambda}\ v{\isachardot}\ {\isasymlbrakk}\ e\ {\isasymrbrakk}\isactrlbsub {\isasymrho}{\isacharparenleft}x\ {\isacharcolon}{\isacharequal}\ v{\isacharparenright}\isactrlesub {\isacharparenright}}.
\]
Note that the argument on the left-hand side is the representative of an equivalence class (defined using the Nominal package), so this definition is only allowed if the right-hand side is indeed independent of the actual choice of \isa{x}. The \isa{\isacommand{nominal{\isacharunderscore}function}} command requires the user to discharge that proof obligation before the function is actually defined.
This is shown most commonly and easily if \isa{x} is fresh in all the other arguments (\isa{x\ {\isasymnotin}\ fv\ {\isasymrho}}), and indeed the Nominal package allows me to specify this as a side condition to the defining equation, which is what I did in the first version of \cite{launchbury-isa}.
But this convenience comes at a price: Such side-conditions are only allowed if the argument has finite support (otherwise there might be no variable fulfilling \isa{x\ {\isasymnotin}\ fv\ {\isasymrho}}). More precisely: The type of the argument must be a member of the \isa{fs} typeclass provided by the Nominal package (cf.\@ \cref{sec:nominal-isa}). The type \isa{var {\isasymRightarrow} Value} cannot be made a member of this class, as there obviously are elements that have infinite support.
\index{fmap@\isa{fmap}\idxexpl{type of partial functions with finite domain}}
My fix -- inspired by HOLCF’s handling of continuity using a dedicated type -- was to introduce a new type constructor, \isa{fmap}, for partial functions with \emph{finite} domain. This is fine: Only functions with finite domain matter in my formalisation.
The introduction of \isa{fmap} had further consequences. The main type class of the HOLCF package, which we use to define domains and continuous functions on them, is the class \isa{cpo} of chain-complete partial orders. With the usual ordering on partial functions, \isa{(var, Value) fmap} cannot be a member of this class: As there is an infinite supply of variables, there exists a chain of partial functions of ever increasing domain, and the limit of that chain would necessarily have an infinite domain, and hence would no longer be in \isa{fmap}.
The fix here is to use a different ordering on \isa{fmap} and only let elements be comparable that have the same domain. In my formalisation, the domain is always known (e.g.\ all variables bound on some heap), so this seemed to work out. But not without causing yet another issue: With this ordering, \isa{(var, Value) fmap} is a \isa{cpo}, but lacks a bottom element, i.e.\ it is no longer a \isa{pcpo}, and HOLCF's built-in operator \isa{{\isasymmu}\ x{\isachardot}\ f\ x} for expressing least fixed points, as they occur in the semantics of heaps, is not available. Furthermore, \isa{\isasymsqunion} is not a total function, as it is only defined if the arguments have the same domain. In the end, I had to define a rather convoluted set of theories that formalise functions that are continuous on a specific set, fixed points on such sets, etc. Eventually, I finished all proofs using that approach, but it amounted to an unreasonable amount of extra work and awkward proofs infested with statements about the domains of environments.
\medskip
In a later refinement, I found a way to solve this problem much more elegantly. Using a small trick, I defined the semantics functions so that
\[
\isa{{\isasymlbrakk}\ {\isasymlambda}x{\isachardot}\ e\ {\isasymrbrakk}\isactrlbsub {\isasymrho}\isactrlesub \ {\isacharequal}\ Fn{\isasymcdot}{\isacharparenleft}{\isasymLambda}\ v{\isachardot}\ {\isasymlbrakk}\ e\ {\isasymrbrakk}\isactrlbsub {\isasymrho}{\isacharparenleft}x\ {\isacharcolon}{\isacharequal}\ v{\isacharparenright}\isactrlesub {\isacharparenright}}
\]
holds unconditionally.
Technically, the definition is
\[
\isa{{\isasymlbrakk}\ {\isasymlambda}x{\isachardot}\ e\ {\isasymrbrakk}\isactrlbsub {\isasymrho}\isactrlesub \ {\isacharequal}\ Fn{\isasymcdot}{\isacharparenleft}{\isasymLambda}\ v{\isachardot}\ {\isasymlbrakk}\ e\ {\isasymrbrakk}\isactrlbsub {\isasymrho}{\isacharbar}\isactrlbsub fv\ {\isacharparenleft}{\isasymlambda}x{\isachardot}\ e{\isacharparenright}\isactrlesub {\isacharparenleft}x\ {\isacharcolon}{\isacharequal}\ v{\isacharparenright}\isactrlesub {\isacharparenright}}
\]
where the right-hand side can be shown to be independent of the choice of \isa{x}, as \isa{x\ {\isasymnotin}\ fv\ {\isacharparenleft}{\isasymlambda}x{\isachardot}\ e{\isacharparenright}}. This definition can more easily be shown to be well-formed, and once the function is defined, \isa{{\isasymlbrakk}\ e\ {\isasymrbrakk}\isactrlbsub {\isasymrho}\isactrlesub \ {\isacharequal}\ {\isasymlbrakk}\ e\ {\isasymrbrakk}\isactrlbsub {\isasymrho}{\isacharbar}\isactrlbsub fv\ e\isactrlesub \isactrlesub } can be proved by induction. With its help, I can then prove the desired equation for \mbox{\isa{{\isasymlbrakk}\ {\isasymlambda}x{\isachardot}\ e\ {\isasymrbrakk}\isactrlbsub{\isasymrho}\isactrlesub}} as a lemma. The same trick is applied to the equation for let bindings.
This allows me to use the type \isa{var {\isasymRightarrow} Value} for the semantic environments and considerably simplifies the formalisation compared to the initial version of \cite{launchbury-isa}.
\subsection{Abstracting over the denotational semantics}
\label{sec:abstracting-over-semantics}
I have defined two denotational semantics in this chapter: The standard semantics ($\dsem{e}\rho$, see \cref{fig:dendef}) and the resourced denotational semantics ($\dsemr{e}\sigma$, see \cref{fig:resdendef}). The definitions are quite similar, and a number of lemmas hold for both of them. Moreover, both definitions are mutually recursive with the definition of the respective heap semantics ($\esem{\Gamma}\rho$ resp.\@ $\esemr{\Gamma}\sigma$), which is defined identically in both cases.
In the Isabelle theories, I therefore abstracted over the differences in order to define a generic semantics function once and instantiate it twice. Given the rather large and annoying proofs required for a function definition over nominal terms, this pays off.
I define the heap semantics within a locale that abstractly assumes the presence of some denotation function for some type of expressions:
\isainput{has-ESem-locale}
At this point, the concrete type of expressions, variables and semantic values is left open (the initial apostrophe in \isa{'value} denotes a type variable). No further assumptions about \isa{ESem} are required to define the heap semantics, besides those encoded in the type of \isa{ESem}:
\begin{itemize}
\item The expressions contain variables (\isa{pt}, provided by the Nominal package, as explained in \cref{sec:nominal-isa}).
\index{pt@\isa{pt}\idxexpl{class of values with names (Nominal2)}}
\item The type of variables is a base value in terms of the Nominal package (\isa{at{\isacharunderscore}base}). In our setting, $\isa{var}$ is the only such type.
\index{at-base@\isa{at{\isacharunderscore}base}\idxexpl{class of types representing names (Nominal2)}}
\item The semantics is continuous in the environment (use of \isa{\isasymrightarrow} instead of \isa{\isasymRightarrow}, provided by the HOLCF package).
\item The type of the semantic values is oblivious to names (type class \isa{pure}, provided by the Nominal package) and it forms a pointed chain-complete partial order (type class \isa{pcpo}, provided by the HOLCF package).
\index{pure@\isa{pure}\idxexpl{class of types with no names (Nominal2)}}
\end{itemize}
The former restriction on \isa{'value} makes it easier to prove the functions to be equivariant, while the latter is a natural requirement for the fixed-point based definition of the heap semantics, which is
\isainput{HSem-definition}
The Isabelle command \isa{\isacommand{definition}} allows one to define regular functions (type constructor \isa{\isasymRightarrow}) by giving the parameters on the left, but it does not know anything about HOLCF’s type of continuous functions (type constructor \isa{\isasymrightarrow}). Therefore, the second argument is consumed by a lambda-abstraction using HOLCF’s continuous lambda operator \isa{\isasymLambda}.
\index{Labscont@$\isa{\isasymLambda\idxdummy. \idxdummy}$\idxexpl{HOLCF’s continuous lambda operator}}
The two denotational semantics differ in the concrete domain (\sValue{} vs.\@ $(\sC \to \sCValue)$), and therefore the injection and projection functions are different. Furthermore, the resourced denotational semantics needs to keep track of the consumed resources. In order to abstractly define the semantics of expressions, I define a locale that provides these components:
\isainput{abstract-denotational-locale}
The locale parameter \isa{tick} is used to count the resources as they are consumed. The type class \isa{pcpo{\isacharunderscore}pt} combines the classes \isa{pcpo} (for pointed chain-complete partial orders) with \isa{pt} (for types that may contain names), additionally ensuring that the permutation of names is continuous.
Within this locale, I define the abstract denotational semantics:
\isainput{abstract-denotational-definition}
Note that this definition has a non-trivial recursion pattern: It uses nested recursion via the heap semantics defined in the \isa{has{\isacharunderscore}ESem} locale, to which I therefore have to pass the expression semantics \isa{ESem} -- the very thing that I am defining here -- as an argument.
From the abstract denotational semantics I can produce the concrete ones by \emph{interpretation}, where the parameters of the locale are specified. For the standard denotational semantics, no resource accounting takes place, so the last parameter is the (continuous) identity:
\isainput{denotational-interpretation}
The arguments to pass for the resourced denotational semantics are not simply the injection and projection functions themselves, as I instantiate the locale’s type parameter \isa{'value} with \isa{C {\isasymrightarrow} CValue} and the resource argument needs to be passed along:
\isainput{resourced-denotational-interpretation}
The case analysis function on \isa{C}, which was produced by the HOLCF package when I defined the domain \isa{C}, happens to have the right type \isa{(C {\isasymrightarrow} 'a) {\isasymrightarrow} C {\isasymrightarrow} 'a} to serve as the \isa{tick} argument to the locale.
In order to convince myself that, despite all this abstraction and these definitional detours, I have defined the semantics that I claim to have defined, I stated the intended equations as a lemma and proved them.
For the standard denotational semantics, this reads:
\isainput{denotational-equations}
\subsection{Relating the domains {\sValue} and {\sCValue}}
\label{sec:relatingdomains}
In order to relate the two denotational semantics, I defined the relation $\dsim$ in \cref{sec:relating_the_semantics}, closely following the work of {\lydiaetal} in \cite{functionspaces}. Their domain $D$ corresponds to my \sValue, their domain $E$ is my type $\sC \to \sCValue$ and $A$ is $\sCValue$.
While {\lydiaetal} construct their domain “by hand”, by a series of domain approximations $D_n$ and $E_n$, respectively, I can use Isabelle's HOLCF package to construct the domain directly from its domain equation (which already includes Booleans; as mentioned in \cref{sec:addingbooleans} this addition is modular and can be ignored to obtain a formalisation closer to the work of {\lydiaetal})
\isainput{value-domain}
\isainput{cvalue-domain}
In my formalisation, the approximations are just subsets of the full domain, and the \emph{n-injection} $\phi_n^E \colon E_n \to E$ is simply the identity. The projections in \cite{functionspaces} correspond to the \emph{take-functions} generated by the HOLCF package, which produce finite approximations of their arguments. For example, $\psi_n^D \colon D \to D_n$ becomes \isa{Value{\isacharunderscore}take} with type \isa{nat\ {\isasymRightarrow}\ Value\ {\isasymrightarrow}\ Value}. The Isabelle theories introduce the former notation as abbreviations for the latter to better match the presentation in \cite{functionspaces}.
Section 2.3 of \cite{functionspaces} contains the following two equations without proof:
\begin{align*}
\psi^E_n((\sCFnProj{e}{a})\,c) &= (\sCFnProj{\psi^E_{n+1}(e)}{\psi^A_n(a)})\,c \tag{2}\\
\psi^D_n(\sFnProj{d}{d'}) &= \sFnProj{\psi^D_{n+1}(d)}{\psi^D_n(d')} \tag{3}
\end{align*}
Unfortunately, these equations do not hold in general. A counter-example to (3) can be given by
\begin{align*}
d &= \sFn{\lambda e. (\sFnProj e \bot)}, \\
d' &= \sFn{\lambda \_. \sFn{\lambda \_ .\bot}}\text { and } \\
n &= 1.
\end{align*}
In this case, the left-hand side of the equation simplifies to $\sFn{\lambda \_. \bot}$, while the right-hand side is simply $\bot$. A counter-example to (2) can be constructed analogously.
The critical property of $d'$ is that it is ``two levels deep''. On the left-hand side, $\sFnProj{d}{d'}$ passes one argument to $d'$ and hence returns a result that is one level deep, which goes through $\psi^D_1$ unaltered, while on the right-hand side, $\psi^D_1(d')$ cuts off the structure of $d'$ after one level and returns $\sFn{\lambda \_ .\bot}$.
Therefore, in order for the equation to hold, the argument to $d$ on the left-hand side needs to be at most one level deep. An extra invocation of $\psi^D_n$ on the left-hand side can ensure this:
\[
\psi^D_n(\sFnProj{d}{\psi^D_n(d')}) = \sFnProj{\psi^D_{n+1}(d)}{\psi^D_n(d')}
\]
This lemma can already be found in \cite{fullabstraction}, equation 4.3.5~(1). The problematic equations are used in the proof of the only-if direction of Proposition 9 in \cite{functionspaces}. I fixed this by applying take-induction, which inserts the extra call to $\psi^D_n$ in the right spot and allows me to proceed using the corrected lemma.
\section{Related work}
A large number of developments on formal semantics of functional programming languages in the last two decades build on Launchbury’s work; here is a short selection:
Van Eekelen \& de~Mol \cite{mixed} add strictness annotations to the syntax and semantics of Launchbury’s work.
Nakata \& Hasegawa \cite{nakata} define a small-step semantics for call-by-need and relate it to a Launchbury-derived big-step semantics. Nakata \cite{nakata_blackhole} modifies the denotational semantics to distinguish direct cycles from looping recursion.
S{\'a}nchez-Gil {\em et~al.} \cite{distributed} extend Launchbury's semantics with distributed evaluation. Baker-Finch {\em et~al.} \cite{parallel} create a semantics for parallel call-by-need based on Launchbury’s.
While many of them implicitly or explicitly rely on the correctness and adequacy proof as spelled out by Launchbury, some stick with the original definition of the heap semantics using $\sqcup$, for which the proofs do not go through \cite{mixed, nakata, distributed, parallel-tr}, while others use right-sided updates, without further explanation \cite{nakata_blackhole, parallel}. The work by Baker-Finch {\em et~al.}\ is particularly interesting, as they switched from the original to the fixed definition between the earlier tech report and the later ICFP publication, unfortunately without motivating that change.
Such disagreement about the precise definition of the semantics is annoying, as it creates avoidable incompatibilities between these publications. I hope that my fully rigorous treatment will resolve this confusion and allow future work to standardise on the ``right'' definition.
Furthermore, none of these works discuss the holes in Launchbury’s adequacy proof, even those that explicitly state the adequacy of their extended semantics. My adequacy proof is better suited for such extensions, as it is rigorous and furthermore avoids the intermediate natural semantics.
\medskip
This list is just a small collection of many more Launchbury-like semantics. Often the relation to a denotational semantics is not stated, but nevertheless they stand on the foundations laid by Launchbury. Therefore, it is not surprising that others have worked on formally fortifying these foundations as well: In particular, {\lydiaetal} have worked towards rigorously proving Launchbury's semantics correct and adequate. They noted that the relation between the standard and the resourced denotational semantics is not as trivial as it seemed at first, and worked out a detailed pen-and-paper proof \cite{functionspaces}. I have formalised this, fixing mistakes in the proof, and build on their result here (\cref{lem:denrel}). They also bridged half the gap between Launchbury's natural and alternative natural semantics \cite{indirections}, and plan to bridge the other half. I avoided these very tedious proofs by bridging the difference on the denotational side (\cref{sec:modifications3}).
As a step towards a mechanisation of their work in Coq, they address the naming issues and suggest a mixed representation, using de Bruijn indices for locally bound variables and names for free variables \cite{nameless}. This approach corresponds to my treatment of names in the formal development, using the Nominal logic machinery \cite{nominal2} locally but not for names bound in heaps, and can be found in other formalisation works as well \cite{biernacki}.
\medskip
We have used these semantics to prove that the compiler analysis and transformation ``Call Arity'' \cite{tfp}, which is implemented in the Haskell compiler GHC, is semantics-preserving and does not degrade performance \cite{arity-afp}. We measure performance on a sufficiently abstract level by counting allocations on the heap, for which Launchbury’s semantics provides just the right level of detail.
\medskip
The aim of this development is to be able to formally prove properties of the language or the compiler, but not so much to prove individual functional programs to be correct; there are better ways to do that. In the context of using Isabelle to prove properties of Haskell programs, noteworthy approaches include Haskabelle \cite{haskabelle}, which transforms Haskell code into Isabelle code, but punts on issues of laziness; Isabelle’s code generation facilities \cite{codegen}, which go the other way; and HOLCF-Prelude \cite{hart}, which models Haskell’s lazy semantics, albeit without observable sharing, using HOLCF function definitions in Isabelle.