\lettrine[nindent=0ex]I{t} is a pleasure to create programs in functional programming languages, as they allow for a very high-level style of programming, using abstraction and composition. It suits the human brain that is trying to solve a problem, instead of accommodating the machine that has to implement the instructions. But such an abstraction comes at a cost: Generally, programs written in such a high-level style perform worse than manually tweaked low-level code. Therefore, we reach out to optimising compilers, with the hope that they can remedy this overhead, at least to some extent. An optimising compiler necessarily needs to be conservative in how it changes the programs: We would not be happy if the program becomes faster, but suddenly computes wrong results. Naturally, this limits the compiler’s latitude in applying fancy and far-reaching transformations. Conversely, the more declarative the language is -- i.e.\@ the fewer low-level details it specifies -- and the fewer side-effects can occur, the more possibilities for optimising transformations arise. This explains why compilers of pure, lazy functional programming languages, such as Haskell, can pull quite astonishing tricks on the code. A prime example of such a far-reaching transformation is \emph{list fusion}: This technique transforms a program built from smaller components, each producing and/or consuming a list of values, into one combined loop. This not only avoids having to allocate, traverse and deallocate the list structure of each intermediate value, it also puts the actual processing code of the individual components next to each other, allowing for local optimisations to work on code that was originally far apart. Unfortunately, there are many instances of code where the sufficiently smart compiler could do something clever -- and often the uninitiated user actually expects that clever thing to happen -- but the real compilers out there just do not do it yet. This thesis is about one such instance: A large number of list processing functions, including common combinators such as \hi{sum} and \hi{length}, were not set up to take part in list fusion. This is not because including them in the technique is difficult to do, but because the code that results from such a transformation would perform very badly. I devised a new program analysis, called \emph{Call Arity}, which gives the compiler enough information to further transform that problematic code into nice, straightforward and efficient code -- code that is roughly what a programmer would write manually, if he chose to program such low-level code. I motivate and describe the analysis and its impact on program performance, determined empirically. The same language traits -- purity and laziness -- that make it easier for the compiler to transform programs also ease a rigorous, formal discussion of the artefacts at hand. I therefore evaluate Call Arity not only empirically, but also prove that it is correct (i.e.\@ does not change the meaning of the program) and safe (i.e.\@ does not make the program's performance worse). While the former is common practice in this field of research, the latter is rarely done with such rigour. What makes me so confident that my proof deserves to be called rigorous? If I just did a pen-and-paper proof, I would not trust it to that extent. For that reason, I implemented the syntax, the semantics and the compiler transformations in the theorem prover Isabelle and performed all proofs therein.
Occasionally, this required deviations from the pen-and-paper presentation given in this thesis. I discuss these differences, and other noteworthy facts about the formalisation, in dedicated sections in the following chapters. Such a formal proof requires a formal semantics for the programming language at hand, and in order to make statements about an operational property, the semantics has to be sufficiently detailed. Launchbury’s natural semantics for lazy evaluation \cite{launchbury} is such a semantics, and can be considered a standard semantics in lazy functional programming language research. I implemented this semantics in Isabelle, including proofs of the two fundamental properties: correctness and adequacy (with regard to a standard denotational semantics). As no rigorous proof of adequacy existed before, I present my proof in detail. \medskip I have structured this thesis as follows: This chapter contains a brief introduction to the Haskell compiler GHC, in particular its intermediate language \emph{Core}, its evaluation strategy and list fusion; it introduces the central notion of \emph{arity}, and describes nominal logic and the interactive theorem prover Isabelle. \Cref{ch:launchbury} lays the foundation by formally introducing the syntax and the various semantics, and contains rigorous correctness and adequacy proofs for Launchbury’s semantics. \Cref{ch:call-arity} motivates, describes and empirically evaluates Call Arity. \Cref{ch:provingcallarity} builds on the previous two chapters and contains the formal proof that Call Arity is safe. \Cref{ch:formal-stuff} contains the Isabelle formulation of the main results and relevant definitions. \Cref{ch:listings} lists the Haskell implementation of Call Arity. The bibliography and an index of used symbols and terms, including short explanations, follow. \Cref{fig:overview} contains a map of the main artefacts and how they relate to each other. In the interest of readability I omit elaborate definitions and descriptions in the figure; if necessary, consult the index.
\begin{figure} \centering \newcommand{\figref}[1]{{\footnotesize (\cref{#1})}} { \hypersetup{hidelinks} \begin{tikzpicture} \node(launchbury) at (0,0) {$\sred {\Gamma}{e}{L}{\Delta}{v}$}; \node(den) at (6,1.5) {$\dsem{e}\rho$}; \node(resden) at (6,-1.5) {$\dsemr{e}\rho$}; \draw[->, bend left=20] (launchbury) to node[midway,above,sloped] {correct} node[midway,below,sloped] {\figref{thm:main}} (den); \draw[<->] (den) to node[midway,above,sloped] {equivalent} node[midway,below,sloped] {\figref{lem:denrel}} (resden); \draw[->, bend left=20] (launchbury) to node[midway,above,sloped] {correct} node[midway,below,sloped] {\figref{lem:resourced_correctness}} (resden); \draw[->, bend left=20] (resden) to node[midway,above,sloped] {adequate} node[midway,below,sloped] {\figref{lem:resad}} (launchbury); \node (sestoft) at (0,3.5) {$\steps{(\Gamma,e,S)}{(\Delta, v, S')}$}; \draw[->, bend right=25] (sestoft) to node[midway,above,sloped] {simulates} node[midway,below,sloped] {\figref{lem:sestoft_simulates_launchbury}} (launchbury); \draw[->, bend right=25] (launchbury) to node[midway,above,sloped] {simulates} node[midway,below,sloped] {\figref{lem:launchbury_simulates_sestoft}} (sestoft); \node (G1) at (1,11) {$\ccexp e \alpha$}; \node [base right=0pt of G1, inner sep=0pt] (p2) {$+$}; \node [base right=0pt of p2] (A1) {$\aexp e \alpha$}; \node [base right=0pt of A1, inner sep=0pt] (p1) {$+$}; \node [base right=0pt of p1] (T1) {$\trans e \alpha$}; \draw [decorate,decoration={brace,amplitude=5pt}] (G1.north west) to node [midway,above] (ca) {Call Arity} (T1.north east); \node (tracetree) at (0,8.5) {$\fexp e \alpha$}; \draw[->] (G1) to node[midway,above,sloped] {refines} node[midway,below,sloped] {\figref{lem:cocall-safety}} (tracetree); \node (card) at (-1,6) {$\cheap \Gamma e \alpha$}; \draw[->] (tracetree) to node[midway,above,sloped] {refines} node[midway,below,sloped] {\figref{lem:tree-safety}} (card); \node [base right=0pt of card, inner sep=0pt] (p3) {$+$}; \node [base right=0pt of p3] (A2) {$\aexp e \alpha$}; \node [base right=0pt of A2, inner sep=0pt] (p4) {$+$}; \node [base right=0pt of p4] (T2) {$\trans e \alpha$}; \draw [decorate,decoration={brace,amplitude=5pt,mirror}] (card.south west) to node [midway,below] (x) {} (T2.south east); \draw[->] (sestoft) to node[midway,above,sloped] {safe} node[midway,below,sloped] {\figref{lem:card-arity-safety}} (x); \draw [decorate,decoration={brace,amplitude=5pt,mirror}] (A1.south west) to node [midway,below] (xx) {} (T1.south east); \draw[->,bend right=10] (den) to node[midway,above,sloped] {semantics preserving} node[midway,below,sloped] {\figref{thm:call-arity-correct}} (xx); \draw[dotted] ([yshift=-2pt]sestoft.north -| current bounding box.west) -- ([yshift=-2pt]sestoft.north -| current bounding box.east) node[near end, right, above] {\figref{ch:provingcallarity}} node[near end, right, below] {\figref{ch:launchbury}}; \draw[dotted] ([yshift=-2pt]ca.south -| current bounding box.west) -- ([yshift=-2pt]ca.south -| current bounding box.east) node[very near end, right, above] {\figref{ch:call-arity}} node[very near end, right, below] {\figref{ch:provingcallarity}}; \end{tikzpicture} } \caption{Main artefacts of this thesis and their relationship} \label{fig:overview} \end{figure} \section{Notation and conventions} \label{sec:notation} I use mostly standard mathematical notation in this text, and any custom notation is introduced upon its first use. 
The index at the end of the thesis also includes symbols and notations, together with a short description of each entry. Proofs are concluded by a black square ($_\blacksquare$); definitions and examples extend to the next diamond ($\diamond$). When a function argument is just a single symbol, possibly with sub- or superscripts, I usually omit the parentheses for better readability: $\fv{e_1}$ instead of $\fv{(e_1)}$. Variables printed with a dot (e.g.\@ $\dot\alpha$) refer to lists whose elements are usually referred to by the plain variable (e.g.\ $\alpha$). Variables printed with a bar (e.g.\ $\bar \alpha$) refer to objects that are partial or total maps from variable names to whatever the plain variable usually stands for. The same notation is used to distinguish related functions: If $\mathcal T$ is a plain function with one argument of a certain type, then $\dot{\mathcal T}$ is a function that expects a list of elements of that type and $\overline{\mathcal T}$ expects a map from variable names to values of that type. Source code listings and code fragments within the text are typeset using a proportional sans-serif font, with language keywords highlighted by heavy type: \hi{!if p a !then f 1 !else f 2}. When writing Haskell code, I use some Unicode syntax instead of the more common ASCII representation, in particular a lambda instead of a backslash for lambda abstractions, and a proper arrow instead of \hi{->}, e.g.\@ \mbox{\hi{(λx → x) :: a → a}}. The Isabelle code snippets are produced by Isabelle from the sources, and printed in the usual {\LaTeX} style of Isabelle’s document generation facilities. The name of the Isabelle file containing the snippet is given in the top-right corner, unless it is the same as for the preceding snippet. \medskip There are various schools of writing concerning the use of “we”, “I” and “the author”. Since a dissertation thesis is necessarily more tied to the person than a paper, even if it were a single-author paper, I decided to use the first person singular whenever I describe what I have done or not done, and why I have done so. Nevertheless, large parts of the text, especially the proofs, are an invitation to you, the reader, to follow my train of thought. Optimistically assuming that you follow this invitation, I will commonly use “we” in these parts, referring to you and me, just as if we were standing in front of a blackboard where I walk you through my proof. I do not avoid the passive voice as fundamentally as other authors would: It is used whenever I believe readability is best served this way. \pagebreak[3] \section{Reproducibility and artefacts} This thesis describes a few artefacts that cannot be included in their entirety in the document, or that will evolve further in the future and thus diverge from what is discussed here. This includes the Call Arity implementation, which is part of the GHC source tree, and the Isabelle formalisations of Launchbury’s semantics \cite{launchbury-isa} and of the safety of Call Arity \cite{arity-afp}. Furthermore, I have conducted performance measurements of which only a summary is included in this text (\cref{sec:measurements}), but neither the raw data nor the tools that produced them. In the interest of reproducibility and verifiability, I have collected all these artefacts on \url{http://www.joachim-breitner.de/thesis}.
In particular, there you will find: \begin{itemize} \item The Isabelle sources of both developments, in precisely the version that is described in this document, in three formats: The plain \texttt{.thy} file, a browsable HTML version and the Isabelle-generated {\LaTeX} output. \item Scripts to fetch and build GHC in the version discussed in this thesis (7.10.3). \item Patches to that version of GHC to produce the various variants compared in the benchmark sections. \item Scripts to run the benchmark suite, collect the results and produce \cref{tab:nofib,tab:compiletimes} in the benchmark sections. \item Code that I have created to check claims in this thesis, e.g.\ about the performance cost of unsaturated function calls (\cref{sec:stg}) and the effect of Call Arity on difference lists (\cref{tab:dlist}). \item The {\LaTeX} sources of the thesis document itself. \item Errata, if necessary. \end{itemize} \section{Lazy evaluation} \label{sec:lazyevaluation} With the title of this thesis sporting the term \emph{lazy evaluation}\index{lazy evaluation} so prominently, it seems prudent to briefly introduce it in general terms.\goodbreak Consider the function \begin{hcode} \> writeErrorToLog e = \\ \> \hind \> writeToLog ("Error " ++ errNum e ++ ": " ++ errDesc e) \end{hcode} which turns an error, given as an element of a structured data type, into a readable text and uses a hypothetical \hi{writeToLog} function to write the text to a log file. In most programming languages, \mbox{\hi{writeErrorToLog e}} would first calculate the text and only then call \hi{writeToLog}. But assume that in the application at hand, logging is optional, and actually turned off: \hi{writeToLog} would have to discard the text passed to it, and the calculation would have been useless. This behaviour is called \emph{strict evaluation} or \emph{call-by-value}. In a programming language with lazy evaluation, the function \hi{writeErrorToLog} would not actually assemble the text, but defer this calculation, by creating a \emph{thunk}\index{thunk\idxexpl{deferred calculation during lazy evaluation}} that serves as a placeholder. If \hi{writeToLog} decides that no log file is to be written, it will discard the thunk and the useless calculation never happens. On the other hand, if \hi{writeToLog} does write to the log file, this will eventually require the actual value of the argument and only then trigger the evaluation of the thunk. We also say that its evaluation is \emph{forced}. \medskip The point of lazy evaluation is not just to avoid useless computation: One of its main benefits is that code can be refactored much more easily. Consider the following plausible implementation of \hi{writeToLog}: \begin{hcode} \> writeToLog txt = \\ \> \hind !if logLevel >= 1 \> !then \> appendFile "error.log" txt \\ \> \> !else \> return () \end{hcode} This function does two things: It decides whether logging is actually required, and if so, it performs the logging. Likely there are more places where we need to decide whether logging is required, so it is desirable to abstract over this procedure and implement it in a definition of its own: \begin{hcode} \> ifLogging action = \\ \> \hind !if logLevel >= 1 \> !then \> action \\ \> \> !else \> return () \\ \> \\ \> writeToLog txt = ifLogging (appendFile "error.log" txt) \end{hcode} In a programming language with strict evaluation, this will not work as intended: The argument \hi{appendFile "error.log" txt} would be evaluated before \hi{ifLogging} gets a chance to check the log level. 
Our refactoring just broke the program! Note that even in strict languages, the \hi{!if-!then-!else}-construct evaluates the two branches lazily, but this is a built-in special case for this syntactic construct, and not available for the programmer to abstract over. In a programming language with lazy evaluation, however, this refactoring is valid. This way, lazy evaluation allows the programmer to define custom control structures. \medskip Another aspect of lazy evaluation that is crucial to my work is \emph{sharing}\index{sharing}: Although the argument to a function is not evaluated until it is used for the first time, it will not be evaluated a second time. For example, the code \mbox{\hi{map (2$^\wedge$b *) xs}}, which multiplies every element of the list \hi{xs} by a certain power of two, will not actually calculate \hi{2$^\wedge$b} if the list \hi{xs} is empty. But even if \hi{xs} has more than one element, \hi{2$^\wedge$b} is calculated only once, and the result is shared between the various uses. This feature distinguishes lazy evaluation, also called \emph{call-by-need}, from \emph{call-by-name} evaluation. According to the latter scheme, which is of less practical relevance, the calculation of an argument is also deferred until it is needed, but it would be re-evaluated repeatedly if used more than once. A common way to implement sharing is to add code to every thunk that, after the evaluation of the thunk has been triggered and its value has been calculated, replaces the thunk by this value, so that every existing reference to the thunk now references the value. This mechanism is called \emph{updating}.\index{updating} \section{The GHC Haskell compiler} \label{sec:ghc} The programming language Haskell was created in the 1990s by a committee with the aim of overcoming the wild growth of lazy functional programming languages at the time. The committee produced a series of language specifications, including the final Haskell 98 language report \cite{report}.\footnote{After a long phase of stability, a revision was published in 2010 and is now the most recent Haskell specification \cite{report2010}. With regard to this thesis, the differences are irrelevant.} This standardisation allowed a number of Haskell compilers to emerge. Several of these are still actively developed these days, but most of them are dedicated to special purposes or research; only one compiler is of practical relevance: the Glasgow Haskell Compiler (GHC).\index{GHC} In order for my work to have an impact on actual users using Haskell to solve real problems, I implemented Call Arity within GHC. It was first shipped with GHC-7.10, released on March 27th 2015\footnote{The coincidence with my birthday is, well, coincidental.}. GHC’s internal structure necessarily influenced the design and implementation of Call Arity, so I will outline its relevant features here. \subsection{GHC Core} \label{sec:ghc-core} Speaking in terms of syntax, Haskell is a large language: As of version 7.10.3 of GHC, the data types used to represent the abstract syntax tree of a Haskell expression have 79 constructors, and 24 more are required to express the Haskell types. Therefore GHC -- like most compilers -- transforms the source language into a smaller intermediate language. In this case, the intermediate language is \emph{GHC Core}\index{GHC!Core} and uses only the 15 constructors given in \cref{fig:core} to represent expressions. %, together with 12 constructors for types.
\begin{figure} \begin{hcode} \column{..}{@{}c@{ }} \> !data Expr b \> = \> Var \> Id \\ \> \> | \> Lit \> Literal \\ \> \> | \> App \> (Expr b) (Expr b) \\ \> \> | \> Lam \> b (Expr b) \\ \> \> | \> Let \> (Bind b) (Expr b) \\ \> \> | \> Case \> (Expr b) b Type [(AltCon, [b], Expr b)] \\ \> \> | \> Cast \> (Expr b) Coercion \\ \> \> | \> Tick \> (Tickish Id) (Expr b) \\ \> \> | \> Type \> Type \\ \> \> | \> Coercion \> Coercion \\ \> \\ \> !data AltCon \> = \> DataAlt \> DataCon \\ \> \> | \> LitAlt \> Literal \\ \> \> | \> DEFAULT \> \\ \> \\ \> !data Bind b \> = \> NonRec b (Expr b) \\ \> \> | \> Rec [(b, (Expr b))] \end{hcode} \caption{The data type representing GHC Core expressions} \label{fig:core} \end{figure} The translation from Haskell to Core is not just a matter of simple syntactic desugaring, as the type systems differ noticeably: Haskell has features in the type system that have a computational meaning; most prominently type classes. Therefore, GHC has to type-check the full Haskell program, and as a side-effect of type-checking the compiler produces the code that implements these features. In the case of type classes the compiler generates dictionaries\footnote{From an operational point of view, these might better be called tuples, as they are single-constructor data types and the members are at fixed, statically known positions. There is no runtime string-based lookup as in “dictionaries” in dynamic languages.} for each instance and passes them around as regular function arguments. Nevertheless, Core does have a type system, and Core terms are explicitly typed. This is used as an effective quality assurance tool \cite{aos}: The internal type checker (called linter) would complain if the Core generated from the Haskell source is not well-typed, or if any of the further processing steps breaks the typing. The type system is relatively small (12 constructors) but powerful enough to support all features of the Haskell type system, including fancy extensions like GADTs \cite{gadts} and type families \cite{typefamilies}. The theory behind Core is System $F_C$, an explicitly typed lambda calculus with explicit type abstraction and application as well as type equality witnesses called coercions \cite{systemfc}. The latter add another 15 constructors to the count. Core and its theoretical counterpart, System $F_C$, are continuously refined, recently by a stratification of the coercions into roles \cite{generative-type-abstraction,safe-coercions}. \index{System FC@System $F_C$} % 47 (expr) + 7 (do statements) + 20 (patterns) + 5 (bindings) % 24 (HsType) % 6 (Type) + 8 (TyCon) + 15 (coercions) Most of the published research around Core and System $F_C$ revolves around the type system: How to make it more expressive and more powerful. There is, however, a lack of operational treatments of Core in the literature. The extended version of \cite{safe-coercions} contains a small-step semantics of System $F_C$. It serves not as a description of Core’s operational behaviour but rather as a tool to prove type safety of System $F_C$ and punts on let-bindings completely. Eisenberg also maintains a small-step semantics for full Core \cite{core}, which is call-by-name. There is no description of how Core implements lazy evaluation besides the actual implementation in GHC, i.e.\@ the Core-to-STG transformation. This lack contributed to the breadth of the formalisation gap of this work (\cref{sec:formalisationgap-semantics}). 
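Before moving on, a small hedged Haskell sketch may help to make the dictionary translation mentioned above more concrete; the names \hi{ShowDict}, \hi{showBool} and \hi{print'} are invented for this illustration and do not correspond to the actual Core that GHC emits. The idea is that a constraint such as \hi{Show a} becomes an additional value argument, and each instance becomes an ordinary value that is passed explicitly at the call site:
\begin{hcode}
\> !data ShowDict a = ShowDict (a → String) \\
\> \\
\> showBool :: ShowDict Bool \\
\> showBool = ShowDict (λb → !if b !then "True" !else "False") \\
\> \\
\> print' :: ShowDict a → a → IO () \\
\> print' (ShowDict sh) x = putStrLn (sh x)
\end{hcode}
A call that a Haskell programmer would write as \hi{print True} then turns into something like \hi{print' showBool True}, and the method is selected from the dictionary by ordinary pattern matching, at a statically known position.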
Almost all of the optimisations performed by GHC are Core-to-Core transformations; Call Arity is no exception. But as not all features of Core are relevant in the description and discussion of Call Arity, the trimmed down lambda calculus introduced in \cref{sec:syntax} serves to take the role of Core; I discuss this simplification in \cref{sec:formalisationgap-syntax}. \subsection{Rewrite rules and list fusion} When we teach functional programming, we often use equational reasoning to explain when two programs are the same, or to derive more specialised or faster programs from specifications or existing programs, e.g.\ as Bird does \cite{bird}. Such equational reasoning is especially powerful in pure, lazy languages, as more equalities hold here: For example, bindings may be floated out of or into expressions, or inlined completely; common code patterns can be abstracted into higher-order functions, etc. But instead of expecting the programmer to apply such equalities, we can actually teach the compiler to do that. This mechanism, called \emph{rewrite rules}, lets the author of a software library specify rules that contain a code pattern (the left-hand side of the rule) and a replacement (the right-hand side of the rule), with free variables that will be matched by any code \cite{rewriterules}. \index{rewrite rule} For example, the code \begin{hcode} \> \{-\# !RULES \\ \> ~\> "map/map" !forall f g xs. map f (map g xs) = map (f . g) xs \\ \> \> \#-\} \end{hcode} allows the compiler to make use of the functoriality of \hi{map} and replace code like \begin{hcode} \> sum (map (+1) (map (*2) [0..10])) \end{hcode} by \begin{hcode} \> sum (map ((+1) . (*2)) [0..10]), \end{hcode} which calls \hi{map} only once, and hence avoids the allocation, traversal and deallocation of one intermediate list. \iffalse % Probably not needed If the right-hand side has a more specific type than the left hand side, then GHC will only apply the rule when the code in question actually has the more specific type. This is used extensively to specialise or completely get rid of numeric conversion functions: \begin{hcode} \> \{-\# !RULES \\ \> ~ \> "fromIntegral/Word8→Word8"\\ \> \> \hind \> fromIntegral = id :: Word8 → Word8 \\ \> \> "fromIntegral/Word8→Integer" \\ \> \> \> fromIntegral = toInteger :: Word8 → Integer \\ \> \> \#-\} \end{hcode} \fi \label{listlesscode} What about the other intermediate lists in that code? Can we get rid of them as well? After all, the code could well be written completely list-lessly:% \footnote{Due to the excessive use of the stack, this is not an efficient way to sum the elements of a list, and a real implementation would use a strict accumulator and tail recursion. For the sake of this explanation, please bear with me here.} \begin{hcode} \> go 0 \\ \> \hwhere \\ \> \hind \> go n \> | n > 10 \> = 0 \\ \> \> \> | otherwise \> = (n*2 + 1) + go (n + 1) \end{hcode} \index{list fusion} This feat is done by \emph{list fusion} \cite{deforestation}, which is essentially a set of rewrite rules that tell the compiler how to transform the high-level code with lists into the nice code above. The central idea is that instead of allocating the list constructors (\hi{:} and \hi{[]}), the producer of a list passes the head of the list and the (already processed) tail of the list to a function provided by the consumer. Thus a list producer is expected to use the following \hi{build} function to produce a list, instead of using the constructors directly: \begin{hcode} \> build :: !forall a. (!forall b.
(a → b → b) → b → b) → [a] \\ \> build g = g (:) [] \end{hcode} The higher rank type signature ensures that \hi{g} is consistent in using the argument provided by \hi{build} to produce the result: By requiring the argument \hi{g} to build a result of an arbitrary type \hi{b}, it has no choice but to use the given arguments (here \hi{(:)} and \hi{[]}) to construct it. A list producer implemented using \hi{build} is called a \emph{good producer}. \index{good producer} For example, instead of defining the enumeration function naively as \begin{hcode} \> \relax[n..m] = go n m \\ \> \hwhere \\ \> \hind \> go n m \> | n > m \> = [] \\ \> \> \> | otherwise \> = n : go (n+1) m \end{hcode} it can be defined in terms of \hi{build}, and thus the actual code in \hi{go} is abstract in the list constructors: \index{build@\hi{build}\idxexpl{list fusion interface for list producers}} \begin{hcode} \> \relax[n..m] = build (go n m) \\ \> \hwhere \\ \> \hind \> go n m cons nil \> | n > m \> = nil \\ \> \> \> | otherwise \> = n `cons` go (n+1) m cons nil \end{hcode} \goodbreak The \hi{build} function has a counterpart that is to be used by list consumers; it is the well-known right-fold: \index{foldr@\hi{foldr}\idxexpl{list fusion interface for list consumers}} \begin{hcode} \> foldr :: (a → b → b) → b → [a] → b \\ \> foldr k z = go \\ \> \hwhere \\ \> \hind \> go [] \> = z \\ \> \> go (y:ys) \> = y `k` go ys \end{hcode} Any list consumer implemented via \hi{foldr} is called a \emph{good consumer}. \index{good consumer} It is a typical exercise for beginners to write a list consuming function like \hi{sum} in terms of \hi{foldr}:\footnote{As mentioned in the previous footnote, this is \emph{not} a good and practical definition for summation. In your code, please do use \hi{sum = foldl' (+) 0} instead!} \begin{hcode} \> sum :: [Int] → Int \\ \> sum xs = foldr (+) 0 xs \end{hcode} After rewriting as many list producers as possible in terms of \hi{build}, and as many list consumers as possible in terms of \hi{foldr}, what have we gained? The benefit comes from one single and generally applicable rewrite rule \begin{hcode} \> \{-\# !RULES \\ \> ~\> "fold/build" !forall k z g. foldr k z (build g) = g k z\\ \> \> \#-\} \end{hcode} which fuses a good producer with a good consumer. It makes the producer use the consumer’s combinators instead of the actual list constructors, and thus eliminates the intermediate list. Simplifying our example a bit, we can see that \hi{sum [0..10]} would, after some inlining, become \begin{hcode} \> foldr (+) 0 (build (go 0 10)) \\ \> \hwhere \\ \> \hind \> go n m cons nil \> | n > m \> = nil \\ \> \> \> | otherwise \> = n `cons` go (n+1) m cons nil \end{hcode} where the rewrite rule is applicable, and GHC rewrites this to \begin{hcode} \> go 0 10 (+) 0\\ \> \hwhere \\ \> \hind \> go n m cons nil \> | n > m \> = nil \\ \> \> \> | otherwise \> = n `cons` go (n+1) m cons nil \end{hcode} which can further be simplified (by a constant propagation and dropping unused arguments) to \begin{hcode} \> go 0 10\\ \> \hwhere \\ \> \hind \> go n m \> | n > m \> = 0 \\ \> \> \> | otherwise \> = n + go (n+1) m \end{hcode} which is roughly the code we would write by hand. 
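The same recipe applies to other list functions. As a hedged sketch -- mirroring the definitions above rather than the exact code in the standard libraries -- \hi{filter} can be written with \hi{foldr}, making it a good consumer, and wrapped in \hi{build}, making it a good producer at the same time:
\begin{hcode}
\> filter :: (a → Bool) → [a] → [a] \\
\> filter p xs = build (λcons nil → foldr (λx ys → !if p x !then x `cons` ys !else ys) nil xs)
\end{hcode}
With such a definition, an expression like \hi{sum (filter even [0..10])} fuses into a single loop in just the same way as the example above.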
A function like \hi{map} is both a list consumer and a list producer, but it poses no problem to make it both a good consumer and a good producer: \begin{hcode} \> map :: (a → b) → [a] → [b] \\ \> map f xs = build (λcons nil → foldr (λ x ys → f x `cons` ys) nil xs) \end{hcode} With this definition for \hi{map}, the compiler will indeed transform the expression \hi{sum (map (+1) (map (*2) [0..10]))} into the nice list-less code on page \pageref{listlesscode}. It is remarkable that list fusion does not have to be a built-in feature of the compiler, but can be completely defined by library code using rewrite rules. List fusion based on \hi{foldr}/\hi{build} is but one of several techniques to eliminate intermediate data structures; there are \hi{unfoldr}/\!\hi{destroy} \cite{unfoldr} and stream fusion \cite{streamfusion}; they differ in what functions can be efficiently turned into good producers and consumers \cite{practicalfusion}. I focus on \hi{foldr}/\hi{build} as that is the technique used for the list data type in the Haskell standard libraries. \pagebreak \subsection{Evaluation and function arities} \label{sec:stg} \index{STG}\index{GHC!STG} When GHC is done optimising the program at the Core stage, it transforms it to machine code via yet another intermediate language: GHC Core is translated to the language of the Spineless Tagless G-Machine (STG) \cite{stg}. Although still a functional language based on the untyped lambda calculus, it already determines many low-level details of the eventual execution: In particular, allocation of data and of function closures is explicit, the memory layout of data structures is known and all functions have a particular arity, i.e.\@ number of parameters. So although it is not machine code yet, together with the runtime system (which is implemented in C), most details of the runtime behaviour are known by now. The function arity at this stage has an important effect on performance, as a mismatch between the number of arguments a function expects and the number of arguments it is called with causes significant overhead during execution. \begin{figure} \abovedisplayskip=0pt \belowdisplayskip=0pt \begin{hcode} \> multA :: Int → Int → Int \\ \> multA \=a 0 \> y = 0 \\ \> multA \=a x \> y = x * y \\ \> \\ \> multB :: Int → Int → Int \\ \> multB \=b 0 \> = λ\_ \> → 0 \\ \> multB \=b x \> = λy \> → x * y \end{hcode} \caption{Semantically equal functions with different arities} \label{fig:short-circuit-mult} \end{figure} Consider the two functions in \cref{fig:short-circuit-mult}, which both implement a short-circuiting multiplication operator. The first has an arity of 2, while the second has an arity of 1. This matters: Evaluating the expression \mbox{\hi{multB 1 2}} is more than 25\% slower than evaluating \hi{multA 1 2}! Why is that so? For the former, the compiler sees that enough arguments are given to \hi{multA} to satisfy its arity, so it puts them in registers and simply calls the code of \hi{multA}. For the latter, the code first pushes onto the stack a continuation that will, eventually, apply its argument to \hi{2}. Then it calls \hi{multB} with only the first argument in a register. \hi{multB} then evaluates this argument and checks that it is not zero. It then allocates, on the heap, a function closure capturing \hi{x}, and passes it to the continuation on the stack.
This continuation, implemented generically in the runtime, analyses the function closure to see that it indeed expects one more argument, so it finally passes the second argument, and the actual computation can happen. This example demonstrates why it is important for good performance to have functions expect as many arguments as they are being called with. Could the compiler simply always make a function expect as many arguments as possible? No! Compare the expression \hi{sum (map (multA n) [1..1000])} with the expression \hi{sum (map (multB n) [1..1000])}. The former will call \hi{multA} one thousand times and thus perform the check \hi{n == 0} over and over again, while the latter calls \hi{multB} once, hence performs the check once, and then re-uses the returned function a thousand times. In this example the check is rather cheap, but even then, for \hi{n=0}, the latter code is 20\% faster. With different, more expensive checks, the performance difference can become arbitrarily large. More details about how GHC implements function calls, and why it does it that way, can be found in \cite{fastcurry}. \section{Arities and eta-expansion} \label{sec:arity} The notion of arity is central to this thesis, and deserves a more abstract definition in terms of eta-expansion. This definition formally builds on the syntax and semantics introduced later, but can be understood on its own. \emph{Eta-expansion} replaces an expression $e$ by $(\sLam{z}{\sApp ez})$, where $z$ is fresh with regard to $e$. More generally, the $n$-fold eta-expansion is described by \[ \etaex{e}{n} \coloneqq (\sLam {z_1 \ldots z_n} {\sApp{\sApp{e}{z_1}\ldots} {z_n}}), \] where the $z_i$ are distinct and fresh with regard to $e$. \index{eta@$\etaex{\idxdummy}{\idxdummy}$\idxexpl{eta-expansion}} \index{eta-expansion} We intuitively consider an expression $e$ to have \emph{arity} $\alpha \in \mathbb N$ if we can replace it by $\etaex{e}{\alpha}$ without a negative effect on performance -- whatever that means precisely. Analogously, for a variable bound by $\keyword{let}~{x = e}$, its arity $x_\alpha$ is the arity of $e$. \index{arity} \index{N@$\mathbb N$\idxexpl{natural numbers including 0; used to represent arities}} \begin{example} The Haskell function \begin{hcode} \> !let f x = !if x \> !then \> λ y → y + 1 \\ \> \> !else \> λ y → y - 1 \end{hcode} can be considered to have arity 2: If we eta-expand its right-hand side, and apply some mild simplifications, we get \begin{hcode} \> !let f x y = !if x \> !then \> y + 1 \\ \> \> !else \> y - 1 \end{hcode} which should in general perform better than the original code. Note that in a lazy language, \hi{x} will be evaluated at most once. \end{example} In this example, I determined the arity of an expression based on its definition and obtained its \emph{internal} arity. Such an analysis has been part of GHC for a while and is described in \cite{arityanalysis}. For the rest of this work, however, I treat $e$ as a black box and instead look at how it is being used, i.e.\@ its context, to determine its \emph{external} arity. For that, I can give an alternative definition: An expression $e$ has arity $\alpha$ if upon every evaluation of $e$, there are at least $\alpha$ arguments on the stack.
\begin{example} In the Haskell code \begin{hcode} \> !let f x = !if g x \> !then \> λ y → y + 1 \\ \> \> !else \> λ y → y - 1 \\ \> !in f 1 2 + f 3 4 \end{hcode} the function \hi{f} has arity 2: Because it is always called with two arguments, the eta-expansion itself has no effect, but it allows for subsequent optimisations that improve the code to \begin{hcode} \> !let f x y = !if g x \> !then \> y + 1 \\ \> \> !else \> y - 1 \\ \> !in f 1 2 + f 3 4. \end{hcode} The internal arity is insufficient to justify this, as in a different context, this transformation could wreak havoc: Assume the function is passed to a higher-order function such as \hi{map (f 1) [1..1000]}. If \hi{f} were now eta-expanded, the possibly costly call to \hi{g 1} would no longer be shared, but instead be repeated a thousand times. \end{example} If an expression has arity $\alpha$, then it also has arity $\alpha'$ for $\alpha' \le \alpha$; every expression has arity $0$. The arities can thus be arranged to form a lattice: \[ \cdots \sqsubset 3 \sqsubset 2 \sqsubset 1 \sqsubset 0. \] For convenience, I set $0 - 1 = 0$. As mentioned in \cref{sec:notation}, $\bar \alpha$ is a partial map from variable names to arities, and $\dot \alpha$ is a list of arities.% \index{.R2sqsubset@$\idxdummy\sqsubseteq\idxdummy$!on arities} \section{Nominal logic} \label{sec:nominal} In pen-and-paper proofs about programming languages, it is customary to consider alpha-equivalent terms as equal, i.e.\@ $\sLam x x = \sLam y y$. The human brain is relatively good at following that reasoning, keeping track of the scope of variables and implicitly making the right assumptions about which names in a proof may be equal to one another. For example, in a proof by induction on the formation of terms, it often goes without saying that in the case for $\sLam x e$, the $x$ is fresh and not related to any name occurring outside the scope of this lambda. Such loose reasoning stands in the way of a rigorous and formal treatment. If the formalisation introduces terms as raw terms where the name of the bound variable contributes to the identity of the object, i.e.\@ $\sLam x x \ne \sLam y y$, then in every inductive proof one would have to worry about the bound variable possibly being equal to some name in the context, and if that poses a problem, one has to explicitly alpha-rename the lambda abstraction, which in turn requires a proof that the statement of the lemma indeed respects alpha-equivalence. \index{de-Bruijn index} One alternative is to use nameless representations such as de-Bruijn indices. With these, every term has a unique representation and the issue of alpha-equivalence disappears. The downsides of such an approach are the need for two different syntactic constructors for variables -- one for the index of a bound name, and one for the name of a free variable -- and the relatively unnatural syntax, which stands in the way of readability. A way out is provided by \emph{nominal logic}, as devised by Pitts \cite{nominal}. This formalism allows us to use names as usual in binders and terms, while still equating alpha-equivalent terms, and it provides induction principles that allow us to assume bound names to be as fresh as we intuitively want them to be. This section gives a shallow introduction to nominal logic. I took inspiration from \cite{nominal_in_isabelle}, simplified some details and omitted the proofs.
\index{nominal logic} In the main body of the thesis I present my definitions and proofs in the intuitive and somewhat loose way, without making use of concepts specific to nominal logic. In particular I do not bother to state the equivariance of my definitions and predicates. Having a machine-checked formalisation, where all these slightly annoying and not very enlightening details have been taken care of, gives me the certainty that no problems lurk here. \subsection{Permutation sets} A core idea in nominal logic is that the effect of \emph{permuting} names in an object describes its binding structure. Full nominal logic supports an infinite number of distinct sorts of names, or \emph{atoms}, but as I do not need this expressiveness, I restrict this exposition to one sort of atoms, here suggestively named $\sVar$. \index{atom\idxexpl{a name (nominal logic)}} We are concerned with sets that admit swapping names: \index{pset\idxexpl{set with a permutation action (nominal logic)}} \begin{definition}[PSets] \index{.O2bullet@$\idxdummy\bullet\idxdummy$\idxexpl{permutation operation on psets (nominal logic)}} A pset is a set $X$ with an action $\bullet$ of the group $\Sym(\sVar)$ on $X$. \end{definition} Deciphering the group theory language, this means that there is an operation $\bullet$ that satisfies, for every $x\in X$, \begin{compactitem}[~~-] \item $() \bullet x = x$ and \item $(\pi_1 \cdot \pi_2) \bullet x = \pi_1 \bullet (\pi_2 \bullet x)$ for all permutations $\pi_1, \pi_2$ \end{compactitem} where $()$ is the identity permutation, and $\cdot$ the usual composition of permutations. The set of atoms, \sVar, is naturally a pset, with the standard action of the permutation group. Any set can be turned into a pset using the trivial operation, i.e.\@ $\pi \bullet x = x$ for all elements $x$ of the set. This way, objects that do not “contain names”, e.g. the set of natural numbers, or the Booleans, can be elegantly part of the formalism. Such a pset is called \emph{pure}. \index{pure\idxexpl{sets with empty carrier (nominal logic)}} Products and sums of psets are psets, with the permutations acting on the components. Similarly, the set of lists with elements in a pset is a pset. Functions from psets to psets are psets, with the action defined as \[ \pi \bullet f = \lambda x. \pi \bullet (f (\pi^{-1} \bullet x)). \] Note that the permutation acting on the argument has to be inverted. \subsection{Support and freshness} Usually, when discussing names and binders, one of the first definitions is that of $\fv{e}$, the set of free variables of some term $e$. Intuitively, it is the set of variables occurring in $e$ that are not hidden behind some binder. But this intuition gets us only so far: Consider the identity function $\operatorname{id} \colon \sVar\to\sVar$. On the one hand, it does not operate on any variables, it just passes them through. On the other hand, its graph mentions all variables. So what should its set of free variables be – nothing ($\{\}$) or everything ($\sVar$)? Nominal logic avoids this problem by giving a general and abstract definition of the set of free variables\footnote{This is commonly called the \emph{support}. 
I use the term \emph{free variables} in this introduction, as the notions coincide in all cases relevant to this thesis.} of an element of any pset: \begin{definition}[Free and fresh variables] The set of free variables of an element $x$ of some pset $X$ is defined as \index{fv@$\fv{\idxdummy}$\idxexpl{free variables}!b@\idxexpl{free variables (nominal logic)}} \[ \fv{x} = \{ a \mid \operatorname{card}\{b \mid (a\, b)\bullet x \ne x\} = \infty\}. \] A variable $v$ is \emph{fresh} with regard to $x$ if $v \notin \fv x$. \index{fresh} \end{definition} Spelled out, this says that a variable $a$ is free in $x$ if there are infinitely many other variables $b$ such that swapping these two affects $x$. Or, more vaguely, $a$ matters to $x$. \goodbreak From this definition, many useful and expected equalities about $\fv{}$ can be derived:\nopagebreak \begin{compactitem} \item $\fv{v} = \{v\}$ for $v \in \sVar$. \item $\fv{(x,y)} = \fv x \cup \fv y$. \item $\fv{x} = \{\}$ if $x$ is from a pure pset. \item $\fv{\operatorname{id}} = \{\}$, as $\pi\bullet\operatorname{id} = \operatorname{id}$ for all permutations $\pi$. \item If $a,b \notin \fv{x}$, then $(a\, b)\bullet x = x$. \end{compactitem} \medskip When talking about programming languages, we are used to having “enough” variables, i.e.\@ there is always one that is fresh with regard to everything else around. This is not true in general. For example, let $f \colon \sVar \to \mathbb{N}$ be a bijection, then $\fv f = \sVar$, as every transposition $(a\,b)$ changes $f$. If such an object were to appear during a proof, we would not be able to say “let $x$ be a variable that is fresh with regard to $f$”. But in practice, such objects do not occur, and there is always a fresh variable. This is captured by the following definition. \begin{definition}[Finite support] A pset $X$ is said to have \emph{finite support} if $\fv x$ is finite for all $x\in X$. \end{definition} Since $\sVar$ is infinite, it immediately follows that for every $x$ from a pset with finite support, there is a variable $a$ that is fresh with regard to $x$. The pset $\sVar$, as well as every pure pset, is a set with finite support. Products, sums and lists of psets with finite support have themselves finite support. Sets of functions from psets with finite support, or from an infinite set to a pset with finite support, do not in general have finite support. This can be slightly annoying, as discussed in \cref{sec:type-of-envs}. \subsection{Abstractions} The point of nominal logic is to provide a convenient way to work with \emph{abstractions}. Formally, a nominal abstraction over a pset $X$ is any operation $[\_].\_ \colon \sVar \to X \to X$ that fulfils \begin{enumerate}[~~(i)] \item $\pi \bullet ([a].x) = [\pi \bullet a].(\pi \bullet x)$ and \item $[a].x_1 = [b].x_2 \iff x_1 = (a\, b) \bullet x_2 \wedge (a = b \vee a \notin \fv{{x_2}})$. \end{enumerate} For a pset $X$ with finite support, this implies \[ \fv{[a].x} = \fv x \setminus \{a\}, \] which further shows that this notion of free variables coincides with our intuition and expectation. This notion of abstraction can be extended to multiple binders, e.g.\ to represent mutually recursive let-expressions \cite{nominal2}. \subsection{Strong induction rules} \label{sec:strong-induction} My use case for nominal logic is to model the syntax of the lambda calculus, and to get better induction principles. Consider this inductive definition of lambda expressions: \[ e \in \sExp \Coloneqq x \mid e~e \mid \lambda x.
e \] where $x$ is a meta-variable referring to elements of $\sVar$. This would yield the following induction rule \begin{flalign*} &(\forall x \in \sVar.\, P(x)) \implies \\ &(\forall e_1, e_2 \in \sExp.\, P(e_1) \implies P(e_2) \implies P(e_1~e_2)) \implies \\ &(\forall x\in\sVar, e \in \sExp.\, P(e) \implies P(\lambda x. e)) \implies & \mathllap{P(e)} \end{flalign*} where in the case for lambda expressions, the proof obligation is to be discharged for \emph{any} variable $x$, even if that variable is part of the context (i.e.\@ mentioned in $P$). This can be a major hurdle during a proof. If one had $\sExp$ as a permutation set such that $\lambda x. e$ is a proper nominal abstraction, then it would be possible to prove a stronger induction rule: \begin{flalign*} &(\forall s\in X, x \in \sVar.\, P(s,x)) \implies \\ &(\forall s\in X, e_1, e_2 \in \sExp.\, P(s, e_1) \implies P(s,e_2) \implies P(s,e_1~e_2)) \implies \\ &(\forall s\in X, x\in\sVar, e \in \sExp.\, x \notin \fv s \implies P(s,e) \implies P(s,\lambda x. e)) \implies \\ && \mathllap{P(s,e)} \end{flalign*} Here the proposition $P$ explicitly specifies its “context” in its first parameter, which may be of any pset $X$ with finite support. In the case for the lambda abstraction, we may additionally, and without any manual naming or renaming, assume the variable $x$ to be fresh with regard to that context. The construction of $\sExp$ as a permutation set with a nominal abstraction is not trivial and is described in \cite{nominal_in_isabelle}. Luckily, we do not have to worry about this: The implementation of nominal logic in Isabelle takes care of it (cf.\ \cref{sec:nominal-isa}). \subsection{Equivariance} The last concept from nominal logic that I need to introduce at this point is that of \emph{equivariance}.\index{equivariance\idxexpl{commutes with name permutations (nominal logic)}} In order to systematically construct inductively defined types as psets, and then to define functions over terms of such types by giving equations for each of these “constructors”, the involved operations and functions need to be well-behaved, i.e.\@ oblivious to the concrete names involved. This intuition is captured by the following definition: \begin{definition}[Equivariance] A function $f \colon X_1\to X_2\to \cdots \to X_n \to X$, $n\ge 0$, between psets is called \emph{equivariant} if \[ \pi \bullet f(x_1,x_2,\ldots,x_n) = f(\pi \bullet x_1, \pi \bullet x_2, \ldots, \pi\bullet x_n). \] \end{definition} Most common operations, such as tupling, list concatenation, the constructors of {\sExp} etc. are equivariant, and this ability to freely move permutations around is crucial to, for example, being able to prove \[ (\lambda x.\, e~x) = (\lambda y.\, e~y). \] \section{Isabelle} \label{sec:isabelle} \index{Isabelle} This work has been formalised in the interactive theorem prover Isabelle \cite{isabelle}. Roughly speaking, an interactive theorem prover has the appearance of a text editor that allows the user to write mathematics (definitions, theorems, proofs), with the very peculiar feature that it understands what is written, and either points out problems to the user, or confirms the correctness of the math. There are a number of such systems in use, with Coq \cite{coq} and Isabelle being the most prominent examples. One distinguishing feature of Isabelle is its genericity: It provides a meta-logical framework that can be instantiated with different concrete logics.
\index{Isabelle!HOL} I build on the logic Isabelle/HOL, which implements a typed higher-order logic of total functions, in contrast to, for example, Isabelle/ZF, which builds on untyped set theory à la Zermelo-Fraenkel. Although I, like -- I presume -- most mathematicians, have been taught mathematics assuming set theory as the foundation of all math, all the actual math that we commonly do happens in an implicitly typed setting, and the choice of Isabelle/HOL over Isabelle/ZF is indeed natural. Furthermore, the tooling provided by Isabelle -- libraries of existing formalisations, conservative extensions, proof automation -- is much more comprehensive for Isabelle/HOL.\footnote{In Isabelle 2016, the HOL directory is more than 13 times the size of the ZF directory, measured in lines of code.} This thesis builds on and refers to the Isabelle 2016 release. \subsection{The prettiness of Isabelle code} \index{Isabelle!Isar} Another distinguishing feature of Isabelle is its proof language \emph{Isar} \cite{isar}, which has a somewhat legible syntax with keywords in English and allows for proofs that are nicely structured and readable. Furthermore, Isabelle supports generating \LaTeX{} code from its theory files. So the question arises whether I could have avoided re-writing everything in the hand-written style, by generating all the definitions, proofs and theorems of this thesis out of my Isabelle theories. \begin{figure} \abovedisplayskip=0pt \belowdisplayskip=0pt \begin{conteq}[explline] \dsem{\sApp{e}{x}}{\esem{\Gamma}\rho} \\ = \sFnProj{\dsem{e}{\esem{\Gamma}\rho}}{{\esem{\Gamma}{\rho}}\, x} & by the denotation of application \\ = \sFnProj{\dsem{\sLam y {e'}}{\esem{\Delta}\rho}}{\esem{\Gamma}{\rho}~x} & by the induction hypothesis \\ = \sFnProj{\dsem{\sLam y {e'}}{\esem{\Delta}\rho}}{{\esem{\Delta}{\rho}}~x} & see above \\ = \dsem{e'}{(\esem{\Delta}\rho)(y \mapsto {\esem{\Delta}{\rho}}~x)} & by the denotation of lambda abstraction \\ = \dsem{\subst{e'}{y}{x}}{\esem{\Delta}\rho} & by \cref{lem:subst} \\ = \dsem{v}{\esem{\Theta}\rho} & by the induction hypothesis \end{conteq} %\vspace{-2ex} \strut\hfill\hrulefill\hrulefill\hfill\strut \vspace{.5ex} \isainput{correctness-case-app} \caption{A hand-written proof and the corresponding Isabelle code} \label{fig:nice-isabelle} \end{figure} For some parts, this would certainly be a viable option. Consider the hand-written proof and the corresponding fragment of the Isabelle theory in \cref{fig:nice-isabelle}, taken from the case for application in the proof of \cref{thm:thm2}. To a reader who knows some Isabelle syntax, it is pleasing to see how similar the hand-written proof and the Isabelle formalisation are. However, even this carefully selected fragment still has its warts: \begin{itemize} \item The syntax does not quite match up. For example, the abstract syntax tree node for an application is written explicitly using the \isa{App} constructor, whereas it is nicer to simply write $\sApp{e}{x}$. This is not possible in Isabelle -- juxtaposition cannot be overloaded.\footnote{At least not within reasonable use of the system.} In some cases, I can define custom syntax in Isabelle that comes very close to what I want. The $\sFnProj\_\_$ operator is a good example of that. Unfortunately, this often comes at the cost of extra inconvenience when entering these symbols.
In antiquotations, where Isabelle is asked to produce a certain existing term, such as the conclusion of a previously proven lemma, Isabelle can make use of such fancy syntax automatically, and hence for free, but regular Isabelle theories will be converted to \LaTeX{} as they are entered, so in order to get fancy syntax, fancy syntax needs to be typed in. \item An Isabelle formalisation will almost always contain some technicalities that I would rather not have pervade the presentation. A good example of this is the seemingly stray centre dot after \isa{Fn}: My formalisation uses the HOLCF package \cite{holcf}, which has a type dedicated to \emph{continuous} functions. This design choice avoids having to explicitly state continuity as a side condition, but it also means that normal juxtaposition cannot be used to apply such functions, and a dedicated binary operator has to be used explicitly -- this is the “\isa{\isasymcdot}” seen in some of the Isabelle listings in this thesis. \item While Isabelle commands are chosen so that a theory is reminiscent of a proper English text, it is not a great pleasure to read. Many Isabelle commands (such as \isa{\isacommand{by} simp}) are only relevant to the system, but should be omitted when addressing a human reader, and other bits of technical syntax (e.g.\ invoking the induction hypothesis as \isa{Application{\isachardot}hyps{\isacharparenleft}{\isadigit{9}}{\isacharparenright}{\isacharbrackleft}OF\ prem{\isadigit{1}}{\isacharbrackright}}) would be out of place. There are ways to hide any part of the Isabelle code from the generated {\LaTeX}, but these markers would in turn clutter the Isabelle source code, and defeat the purpose of having a faithful representation of the proof in print. \end{itemize} Other parts of the development are even further away from a clean and easy-to-digest presentation, so I chose to keep most of the Isabelle development separate from the dissertation thesis. \Cref{ch:formal-stuff} contains a few snippets of the development, namely the main theorems and all definitions that are involved in them. The full formalisation is published in the Archive of Formal Proofs \cite{launchbury-isa,arity-afp}.% and, frozen in the precise version corresponding to this publication, on \url{http://www.joachim-breitner.de/thesis}. \subsection{Nominal logic in Isabelle} \label{sec:nominal-isa} I have outlined the concepts of nominal logic in \cref{sec:nominal} in general terms. In my formalisation, I did not implement this machinery myself, but rather built on the Nominal2 package for Isabelle by Christian Urban and others \cite{nominal_in_isabelle,nominal2}, which provides all the basic concepts of nominal logic, together with tools to work with them. \index{Nominal2\idxexpl{Isabelle package for nominal logic}} Permutation sets are modelled as types within the type class \isa{pt}, which fixes the permutation action $\bullet$. In the context of this type class, the package provides general definitions for support (\isa{supp}) and freshness (\isa{fresh}, or written infix as \isa{{\isasymsharp}}). Type classes that extend \isa{pt} with additional requirements are \isa{fs} for permutation sets with finite support and \isa{pure} for pure permutation sets. I define the function \isa{fv} as the support, restricted to one sort of atoms: \isainput{def-fv} Nominal2 provides the proof method \isa{perm{\isacharunderscore}simp}, which simplifies proof goals involving permutations by pushing them inside expressions as far as possible.
It maintains a list of equivariance theorems that the user can extend with equivariance lemmas about newly defined constants. The command \isa{\isacommand{nominal{\isacharunderscore}datatype}} allows the user to conveniently construct a permutation set corresponding to a usual, inductive definition with binding structure annotated. See \cref{sec:using-nominal} for an example. The constructors of such a data type cannot be used as constructors with Isabelle tools like \isa{\isacommand{fun}}, because they do not completely behave as such. For example, they are not necessarily injective. Therefore, Nominal2 provides the separate command \isa{\isacommand{nominal{\isacharunderscore}function}} to define functions over a nominal data type. It is not completely automatic and requires the user to discharge a number of proof obligations, such as equivariance of the function’s graph and representation independence of the equations. Similarly, Nominal2 provides the command \isa{\isacommand{nominal{\isacharunderscore}inductive}}, which can be used, after defining an inductive predicate as usual with \isa{\isacommand{inductive}}, to specify which free variables of a rule should not clash with the context during a proof by induction. It requires the user to prove that the variable is fresh with regard to the conclusion of the rule, and in return generates a stronger induction rule akin to the one shown in \cref{sec:strong-induction}. The proof method \isa{nominal{\isacharunderscore}induct}, which can be used instead of the usual \isa{induct} method, supports the additional option \isa{avoiding} and instantiates the strong induction rule so that the desired additional freshness assumptions become available. \subsection{Domain theory and the \isa{HOLCF} package} \label{sec:domaintheory} Applications of domain theory, i.e.\@ the mathematical field that studies certain partial orders, pervade programming language research: They are used to give semantics to recursive functions and to recursive types; they structure program analysis results and tell us how to find fixpoints. As my use of domain theory in this thesis is quite standard, I will elide most of the technicalities and usually state just the partial order used. My domains are of the pointed, chain-complete kind. I consider only $\omega$-chains, i.e.\@ sequences $(a_i)_{i\in\mathbb N}$ with $a_i \sqsubseteq a_{i+1}$; completeness of the domain implies that every such chain has a least upper bound $\bigsqcup_{i\in\mathbb N} a_i$. A domain is called \emph{pointed} if it has a least element, written $\bot$. \index{pointed domain\idxexpl{domain with a least element}} \index{HOLCF\idxexpl{Isabelle package for domain theory}}\index{Isabelle!HOLCF} This choice is motivated by my use of the Isabelle package HOLCF \cite{holcf}, which is a comprehensive suite of definitions and tools for working with domain theory in Isabelle. In particular, it allows me to define possibly complex recursive domains such as the domain used by the resourced denotational semantics in \cref{sec:resourced_adequacy}, with one command: \isainput{cvalue-domain} \goodbreak This will not only define the type \isa{CValue}, but also the two injection functions \isa{CFn} and \isa{CB}, corresponding projection functions and induction principles. The command \isa{\isacommand{fixrec}} can then define functions over such domains. The type \isa{CValue} is then automatically made a member of a number of type classes that come with HOLCF.
The type classes most relevant for us are
\begin{itemize}
\item \isa{po} for types equipped with a partial order, whose operators and relations are written with square symbols, e.g.\@ \isa{\isasymsqsubseteq}, \index{po@\isa{po}\idxexpl{type class for partial orders (HOLCF)}}
\item \isa{cpo} for complete partial orders, i.e.\@ types in \isa{po} where additionally every $\omega$-chain has a least upper bound, and \index{cpo@\isa{cpo}\idxexpl{type class for complete partial orders (HOLCF)}}
\item \isa{pcpo} for pointed complete partial orders; it extends \isa{cpo} by the requirement that a least element \isa{\isasymbottom} exists. \index{pcpo@\isa{pcpo}\idxexpl{type class for pointed cpos (HOLCF)}}
\end{itemize}

\index{.R2rightarrow@\isa{\idxdummy\isasymrightarrow\idxdummy}\idxexpl{type of continuous functions (HOLCF)}}
HOLCF introduces a type dedicated to continuous functions, written \mbox{\isa{\isacharprime a {\isasymrightarrow} \isacharprime b}}, which is separate from Isabelle’s regular function type, written \mbox{\isa{\isacharprime a {\isasymRightarrow} \isacharprime b}}. Encoding the continuity of functions in the types avoids having to explicitly assume functions to be continuous in the various lemmas. This is particularly important when some definition is only well-defined if its arguments are continuous, as is the case for the fixed-point operator \isa{fix {\isacharcolon} (\isacharprime a \isasymrightarrow \isacharprime a) \isasymrightarrow \isacharprime a} (with \isa{\isacharprime a\isacharcolon\isacharcolon pcpo}, i.e.\@ the type \isa{\isacharprime a} has an instance of the type class \isa{pcpo}). Without this trick, \isa{fix} would not be a total function, and working with partial functions in Isabelle is always annoying to some degree.

\index{.O2contapp@\isa{\idxdummy\isasymcdot\idxdummy}\idxexpl{application operator for continuous functions (HOLCF)}}
The downside of this design choice is that such continuous functions cannot be applied directly. Therefore, HOLCF introduces an explicit function application operator \isa{\_\isasymcdot\_ {\isacharcolon} (\isacharprime a \isasymrightarrow \isacharprime b) {\isasymRightarrow} \isacharprime a {\isasymRightarrow} \isacharprime b}. I advise the reader to simply ignore this operator when reading Isabelle code that uses HOLCF.

The custom type has further consequences: Existing tools to define new functions, such as \isa{\isacommand{definition}}, \isa{\isacommand{fun}} and the Nominal-specific command \isa{\isacommand{nominal{\isacharunderscore}function}}, know how to define normal functions, but are unable to produce values of type \mbox{\isa{\isacharprime a {\isasymrightarrow} \isacharprime b}}. In these cases, I have to resort to defining the function using the -- again HOLCF-specific -- lambda abstraction for continuous functions, written \isa{({\isasymLambda} x. e)}, on the right-hand side of the definition. I can still prove the intended function equations, with the argument on the left-hand side, manually afterwards, as long as the function definition is indeed continuous.
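As an aside, the behaviour of \isa{fix} can be illustrated directly in Haskell, where -- with respect to the usual domain semantics -- every definable function is continuous, so no dedicated function space or application operator is needed. This is only a sketch: the functional \hi{factF} is a made-up example, and the definition of \hi{fix} coincides with the one in \hi{Data.Function}.
\begin{verbatim}
-- Illustrative sketch only: the least-fixed-point operator in Haskell.
-- Here continuity is implicit, so 'fix' works at the ordinary function
-- type; in HOLCF it requires the continuous function space and a pcpo.
fix :: (a -> a) -> a
fix f = f (fix f)

-- Example: the factorial, obtained as the fixed point of a
-- non-recursive functional.
factF :: (Integer -> Integer) -> Integer -> Integer
factF rec n = if n <= 0 then 1 else n * rec (n - 1)

factorial :: Integer -> Integer
factorial = fix factF   -- factorial 5 == 120
\end{verbatim}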
\medskip
The standard proof principle for functions defined in terms of the aforementioned \isa{fix} is \emph{fixed-point induction}\index{fixed-point induction}: In order to prove that a predicate $P$ holds for \isa{fix{\isasymcdot}F}, where the functorial \isa{F} is of type \isa{{\isacharprime}a\isasymrightarrow{\isacharprime}a} with \isa{\isacharprime a\isacharcolon\isacharcolon pcpo}, it suffices to prove that
\begin{compactitem}
\item the predicate $P$ is admissible, i.e.\@ if it holds for all elements of a chain, then it holds for the least upper bound of the chain,
\item $P$ holds for $\bot$, and
\item $P$ holds for any \isa{F{\isasymcdot}x}, given that $P$ holds for $x$.
\end{compactitem}
A derived proof principle is that of \emph{parallel fixed-point induction}, which can be used to establish that a binary predicate $P$ (usually an equality or inequality) holds for \isa{fix{\isasymcdot}F} and \isa{fix{\isasymcdot}G}. This requires a proof that
\begin{compactitem}
\item the predicate $P$, understood as a predicate on tuples, is admissible,
\item \isa{P {\isasymbottom} {\isasymbottom}} holds, and
\item \isa{P (F{\isasymcdot}x) (G{\isasymcdot}y)} holds, given that \isa{P x y} holds.
\end{compactitem}
Both principles are provided by HOLCF as lemmas, and an extensible set of syntax-directed lemmas helps to discharge the admissibility proof obligations.
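For reference, the two principles can be summarised as inference rules; this is merely a restatement of the conditions listed above, writing $\mathrm{adm}\ P$ for the admissibility of $P$:
\[
\frac{\mathrm{adm}\ P \qquad P\ \bot \qquad \forall x.\ P\ x \Rightarrow P\ (F \cdot x)}
     {P\ (\mathrm{fix} \cdot F)}
\]
\[
\frac{\mathrm{adm}\ P \qquad P\ \bot\ \bot \qquad \forall x\, y.\ P\ x\ y \Rightarrow P\ (F \cdot x)\ (G \cdot y)}
     {P\ (\mathrm{fix} \cdot F)\ (\mathrm{fix} \cdot G)}
\]
The first rule is sound because, for a continuous $F$, $\mathrm{fix} \cdot F$ is the least upper bound of the chain $\bot \sqsubseteq F \cdot \bot \sqsubseteq F \cdot (F \cdot \bot) \sqsubseteq \dots$: the second and third premise establish $P$ for every element of this chain, and admissibility transfers it to the least upper bound. The parallel rule follows analogously by considering both chains simultaneously.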