nomeata’s mind shares

Extrinsic termination proofs for well-founded recursion in Lean

mail@joachim-breitner.de (Joachim Breitner) — Mon, 10 Mar 2025 18:47:59 +0100

A few months ago I explained that one reason why this blog has become more quiet is that all my work on Lean is covered elsewhere.

This post is an exception, because it is an observation that is (arguably) interesting, but does not lead anywhere, so where else to put it than my own blog…

Want to share your thoughts about this? Please join the discussion on the Lean community zulip!

Background

When defining a function recursively in Lean that has nested recursion, e.g. a recursive call that is in the argument to a higher-order function like List.map, then extra attention used to be necessary so that Lean can see that xs.map applies its argument only to elements of the list xs. The usual idiom is to write xs.attach.map instead, where List.attach attaches to the list elements a proof that they are in that list. You can read more about this in my Lean blog post on recursive definitions and in our new shiny reference manual; look for the example “Nested Recursion in Higher-order Functions”.
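To make the attach idiom concrete, here is a minimal sketch. The Tree type with a single node constructor is a hypothetical illustration, and I have not checked this against a particular Lean version. The anonymous-constructor pattern brings the membership proof into scope, which is what the default termination tactic needs:

```lean
-- Hypothetical example type: a rose tree with a list of children.
inductive Tree (α : Type u) where
  | node : α → List (Tree α) → Tree α

-- `cs.attach : List { t // t ∈ cs }`, so the lambda receives each child `t`
-- together with a proof `_h : t ∈ cs`. That proof is what lets the default
-- termination tactic show `sizeOf t < sizeOf (Tree.node v cs)`.
def Tree.map (f : α → β) : Tree α → Tree β
  | .node v cs => .node (f v) (cs.attach.map (fun ⟨t, _h⟩ => t.map f))
```

With a plain cs.map (fun t => t.map f), the goal would mention t without any fact tying it to cs, which is why the extra attention used to be needed.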

To make this step less tedious I taught Lean to automatically rewrite xs.map to xs.attach.map (where suitable) within the construction of well-founded recursion, so that nested recursion just works (issue #5471). We already do such a rewriting to change if c then … else … to the dependent if h : c then … else …, but the attach-introduction is much more ambitious (the rewrites are not definitionally equal, there are higher-order arguments etc.) Rewriting the terms in a way that we can still prove the connection later when creating the equational lemmas is hairy at best. Also, we want the whole machinery to be extensible by the user, setting up their own higher order functions to add more facts to the context of the termination proof.
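The simpler of the two rewrites, from if to the dependent if, can be seen in a small sketch (the function name countdown is made up for illustration, and I am assuming a reasonably recent Lean where the default termination tactic can use omega):

```lean
-- Written with a plain `if`. During the well-founded recursion construction
-- Lean treats it like `if h : n = 0 then … else …`, so the termination goal
-- `n - 1 < n` for the recursive call has `h : ¬ n = 0` in its context.
def countdown (n : Nat) : List Nat :=
  if n = 0 then [] else n :: countdown (n - 1)
```

Without the hypothesis the goal n - 1 < n would be unprovable, since it fails for n = 0.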

I implemented it like this (PR #6744) and it ships with 4.18.0, but in the course of this work I thought about a quite different and maybe better™ way to do this, and well-founded recursion in general:

A simpler `fix`

Recall that to use WellFounded.fix

WellFounded.fix : (hwf : WellFounded r) (F : (x : α) → ((y : α) → r y x → C y) → C x) (x : α) : C x

we have to rewrite the functorial of the recursive function, which naturally has type

F : ((y : α) →  C y) → ((x : α) → C x)

to the one above, where all recursive calls take the termination proof r y x. This is a fairly hairy operation, mangling the type of matcher’s motives and whatnot.

Things are simpler for recursive definitions using the new partial_fixpoint machinery, where we use Lean.Order.fix

Lean.Order.fix : [CCPO β] (F : β → β) (hmono : monotone F) : β

so the functorial’s type is unmodified (here β will be ((x : α) → C x)), and everything else is in the propositional side-condition monotone F. For this predicate we have a syntax-guided compositional tactic, and it’s easily extensible, e.g. by

theorem monotone_mapM (f : γ → α → m β) (xs : List α) (hmono : monotone f) :
    monotone (fun x => xs.mapM (f x))

Once given, we don’t care about the content of that proof. In particular proving the unfolding theorem only deals with the unmodified F that closely matches the function definition as written by the user. Much simpler!
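For readers who have not seen partial_fixpoint yet, a minimal sketch of such a definition could look as follows (findFrom is a made-up name; this assumes a Lean version that ships partial_fixpoint):

```lean
-- A possibly non-terminating search. No termination proof is needed; instead
-- Lean has to show that the functorial is monotone with respect to the order
-- on `Option Nat`, which the compositional tactic does automatically here.
def findFrom (p : Nat → Bool) (i : Nat) : Option Nat :=
  if p i then some i else findFrom p (i + 1)
partial_fixpoint
```

The functorial that Lean works with here is essentially the body as written, with the recursive call abstracted out.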

Isabelle has it easier

Isabelle also supports well-founded recursion, and has great support for nested recursion. And it’s much simpler!

There, all you have to do to make nested recursion work is to define a congruence lemma of the form, for List.map something like our List.map_congr_left

List.map_congr_left : (h : ∀ a ∈ l, f a = g a) :
    List.map f l = List.map g l

This is because in Isabelle, too, the termination proof is a side-condition that essentially states “the functorial F calls its argument f only on smaller arguments”.

Can we have it easy, too?

I had wished we could do the same in Lean for a while, but that form of congruence lemma just isn’t strong enough for us.

But maybe there is a way to do it, using an existential to give a witness that F can alternatively be implemented using the more restrictive argument. The following callsOn P F predicate can express that F calls its higher-order argument only on arguments that satisfy the predicate P:

section setup

variable {α : Sort u}
variable {β : α → Sort v}
variable {γ : Sort w}

def callsOn (P : α → Prop) (F : (∀ y, β y) → γ) :=
  ∃ (F': (∀ y, P y → β y) → γ), ∀ f, F' (fun y _ => f y) = F f

variable (R : α → α → Prop)
variable (F : (∀ y, β y) → (∀ x, β x))

local infix:50 " ≺ " => R

def recursesVia : Prop := ∀ x, callsOn (· ≺ x) (fun f => F f x)

noncomputable def fix (wf : WellFounded R) (h : recursesVia R F) : (∀ x, β x) :=
  wf.fix (fun x => (h x).choose)

def fix_eq (wf : WellFounded R) h x :
    fix R F wf h x = F (fix R F wf h) x := by
  unfold fix
  rw [wf.fix_eq]
  apply (h x).choose_spec

This allows nice compositional lemmas to discharge callsOn predicates:

theorem callsOn_base (y : α) (hy : P y) :
    callsOn P (fun (f : ∀ x, β x) => f y) := by
  exists fun f => f y hy
  intros; rfl

@[simp]
theorem callsOn_const (x : γ) :
    callsOn P (fun (_ : ∀ x, β x) => x) :=
  ⟨fun _ => x, fun _ => rfl⟩

theorem callsOn_app
    {γ₁ : Sort uu} {γ₂ : Sort ww}
    (F₁ :  (∀ y, β y) → γ₂ → γ₁) -- can this also support dependent types?
    (F₂ :  (∀ y, β y) → γ₂)
    (h₁ : callsOn P F₁)
    (h₂ : callsOn P F₂) :
    callsOn P (fun f => F₁ f (F₂ f)) := by
  obtain ⟨F₁', h₁⟩ := h₁
  obtain ⟨F₂', h₂⟩ := h₂
  exists (fun f => F₁' f (F₂' f))
  intros; simp_all

theorem callsOn_lam
    {γ₁ : Sort uu}
    (F : γ₁ → (∀ y, β y) → γ) -- can this also support dependent types?
    (h : ∀ x, callsOn P (F x)) :
    callsOn P (fun f x => F x f) := by
  exists (fun f x => (h x).choose f)
  intro f
  ext x
  apply (h x).choose_spec

theorem callsOn_app2
    {γ₁ : Sort uu} {γ₂ : Sort ww}
    (g : γ₁ → γ₂ → γ)
    (F₁ :  (∀ y, β y) → γ₁) -- can this also support dependent types?
    (F₂ :  (∀ y, β y) → γ₂)
    (h₁ : callsOn P F₁)
    (h₂ : callsOn P F₂) :
    callsOn P (fun f => g (F₁ f) (F₂ f)) := by
  apply_rules [callsOn_app, callsOn_const]

With this setup, we can have the following, possibly user-defined, lemma expressing that List.map calls its arguments only on elements of the list:

theorem callsOn_map (δ : Type uu) (γ : Type ww)
    (P : α → Prop) (F : (∀ y, β y) → δ → γ) (xs : List δ)
    (h : ∀ x, x ∈ xs → callsOn P (fun f => F f x)) :
    callsOn P (fun f => xs.map (fun x => F f x)) := by
  suffices callsOn P (fun f => xs.attach.map (fun ⟨x, h⟩ => F f x)) by
    simpa
  apply callsOn_app
  · apply callsOn_app
    · apply callsOn_const
    · apply callsOn_lam
      intro ⟨x', hx'⟩
      dsimp
      exact (h x' hx')
  · apply callsOn_const

end setup

So here is the (manual) construction of a nested map for trees:

section examples

structure Tree (α : Type u) where
  val : α
  cs : List (Tree α)

-- essentially
-- def Tree.map (f : α → β) : Tree α → Tree β :=
--   fun t => ⟨f t.val, t.cs.map (Tree.map f)⟩
noncomputable def Tree.map (f : α → β) : Tree α → Tree β :=
  fix (sizeOf · < sizeOf ·) (fun map t => ⟨f t.val, t.cs.map map⟩)
    (InvImage.wf (sizeOf ·) WellFoundedRelation.wf) <| by
  intro ⟨v, cs⟩
  dsimp only
  apply callsOn_app2
  · apply callsOn_const
  · apply callsOn_map
    intro t' ht'
    apply callsOn_base
    -- ht' : t' ∈ cs -- !
    -- ⊢ sizeOf t' < sizeOf { val := v, cs := cs }
    decreasing_trivial

end examples

This makes me happy!

All details of the construction are now contained in a proof that can proceed by a syntax-driven tactic and that’s easily (and likely robustly) extensible by the user. It also means that we can share a lot of code paths (e.g. everything related to equational theorems) between well-founded recursion and partial_fixpoint.

I wonder if this construction is really as powerful as our current one, or if there are certain (likely dependently typed) functions where this doesn’t fit, but the β above is dependent, so it looks good.

With this construction, functions defined by well-founded recursion will reduce even worse in the kernel, I assume. This may be a good thing.

The cake is a lie

What unfortunately kills this idea, though, is the generation of the functional induction principles, which I believe is not (easily) possible with this construction: The functional induction principle is proved by massaging F to return a proof, but since the extra assumptions (e.g. for ite or List.map) only exist in the termination proof, they are not available in F.

Oh wey, how anticlimactic.

PS: Path dependencies

Curiously, if we didn’t have functional induction at this point yet, then very likely I’d change Lean to use this construction, and then we’d either not get functional induction, or it would be implemented very differently, maybe a more syntactic approach that would re-prove termination. I guess that’s called path dependence.

Coding on my eInk Tablet

mail@joachim-breitner.de (Joachim Breitner) — Sun, 02 Feb 2025 16:07:35 +0100

For many years I wished I had a setup that would allow me to work (that is, code) productively outside in the bright sun. It’s winter right now, but summer will come again, and this weekend I got closer to that goal.

TL;DR: Using code-server on a beefy machine seems to be quite neat.

Personal history

Looking back at my own old blog entries I find one from 10 years ago describing how I bought a Kobo eBook reader with the intent of using it as an external monitor for my laptop. It seems that I got a proof-of-concept setup working, using VNC, but it was tedious to set up, and I never actually used that. I subsequently noticed that the eBook reader is rather useful to read eBooks, and it has been in heavy use for that ever since.

Four years ago I gave this old idea another shot and bought an Onyx BOOX Max Lumi. This is an A4-sized tablet running Android and had the very promising feature of an HDMI input. So hopefully I’d attach it to my laptop and it just works™. Turns out that this never worked as well as I hoped: Even if I set the resolution to exactly the tablet’s screen’s resolution I got blurry output, and it also drained the battery a lot, so I gave up on this. I subsequently noticed that the tablet is rather useful to take notes, and it has been in sporadic use for that.

Going off on this tangent: I later learned that the HDMI input of this device appears to the system like a camera input, and I don’t have to use Boox’s “monitor” app but could use other apps like FreeDCam as well. This somehow managed to fix the resolution issues, but the setup still wasn’t convenient enough to be used regularly.

I also played around with pure terminal approaches, e.g. SSH’ing into a system, but since my usual workflow was never purely text-based (I was at least used to using a window manager instead of a terminal multiplexer like screen or tmux) that never led anywhere either.

VSCode, working remotely

Since these attempts I have started a new job working on the Lean theorem prover, and working on or with Lean basically means using VSCode. (There is a very good neovim plugin as well, but I’m using VSCode nevertheless, if only to make sure I am dogfooding our default user experience).

My colleagues have said good things about using VSCode with the remote SSH extension to work on a beefy machine, so I gave this a try now as well, and while it’s not a complete game changer for me, it does make certain tasks (rebuilding everything after switching branches, running the test suite) very convenient. And it’s a bit spooky to run these workloads without the laptop’s fan spinning up.

In this setup, the workspace is remote, but VSCode still runs locally. But it made me wonder about my old goal of being able to work reasonably efficiently on my eInk tablet. Can I replicate this setup there?

VSCode itself doesn’t run on Android directly. There are projects that run it in a Linux chroot or in Termux on the Android system, and you then connect to it via VNC (e.g. Andronix)… but that did not seem promising. It seemed fiddly, and I probably should take it easy on the tablet’s system.

code-server, running remotely

A more promising option is code-server. This is a fork of VSCode (actually of VSCodium) that runs completely on the remote machine, and the client machine just needs a browser. I set that up this weekend and found that I was able to do a little bit of work reasonably.

Access

With code-server one has to decide how to expose it safely enough. I decided against the tunnel-over-SSH option, as I expected that to be somewhat tedious to set up (both initially and for each session) on the Android system, and I liked the idea of being able to use any device to work in my environment.

I also decided against the more involved “reverse proxy behind proper hostname with SSL” setups, because they involve a few extra steps, and some of them I cannot do as I do not have root access on the shared beefy machine I wanted to use.

That left me with the option of using code-server’s built-in support for self-signed certificates and a password:

$ cat .config/code-server/config.yaml
bind-addr: 1.2.3.4:8080
auth: password
password: xxxxxxxxxxxxxxxxxxxxxxxx
cert: true

With trust-on-first-use this seems reasonably secure.

Update: I noticed that the browsers would forget that I trust this self-signed cert after restarting the browser, and also that I cannot “install” the page (as a Progressive Web App) unless it has a valid certificate. But since I don’t have superuser access to that machine, I can’t just follow the official recommendation of using a reverse proxy on port 80 or 443 with automatic certificates. Instead, I pointed a hostname that I control to that machine, obtained a certificate manually on my laptop (using acme.sh) and copied the files over, so the configuration now reads as follows:

bind-addr: 1.2.3.4:3933
auth: password
password: xxxxxxxxxxxxxxxxxxxxxxxx
cert: .acme.sh/foobar.nomeata.de_ecc/foobar.nomeata.de.cer
cert-key: .acme.sh/foobar.nomeata.de_ecc/foobar.nomeata.de.key

(This is getting very specific to my particular needs and constraints, so I’ll spare you the details.)

Service

To keep code-server running I created a systemd service that’s managed by my user’s systemd instance:

~ $ cat ~/.config/systemd/user/code-server.service
[Unit]
Description=code-server
After=network-online.target

[Service]
Environment=PATH=/home/joachim/.nix-profile/bin:/nix/var/nix/profiles/default/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
ExecStart=/nix/var/nix/profiles/default/bin/nix run nixpkgs#code-server

[Install]
WantedBy=default.target

(I am using nix as a package manager on a Debian system there, hence the additional PATH and complex ExecStart. If you have a more conventional setup then you do not have to worry about Environment and can likely use ExecStart=code-server.)

For this to survive me logging out I had to ask the system administrator to run loginctl enable-linger joachim, so that systemd allows my jobs to linger.

Git credentials

The next issue to be solved was how to access the git repositories. The work is all on public repositories, but I still need a way to push my work. With the classic VSCode-SSH-remote setup from my laptop, this is no problem: My local SSH key is forwarded using the SSH agent, so I can seamlessly use that on the other side. But with code-server there is no SSH key involved.

I could create a new SSH key and store it on the server. That did not seem appealing, though, because SSH keys on Github always have full access. It wouldn’t be horrible, but I still wondered if I can do better.

I thought of creating fine-grained personal access tokens that only allow me to push code to specific repositories, and nothing else, and just storing them permanently on the remote server. Still a neat and convenient option, but creating PATs for our org requires approval and I didn’t want to bother anyone on the weekend.

So I am experimenting with GitHub’s git-credential-manager now. I have configured it to use git’s credential cache with an elevated timeout (36000 seconds, i.e. ten hours), so that once I log in, I don’t have to do so again for one workday.

$ nix-env -iA nixpkgs.git-credential-manager
$ git-credential-manager configure
$ git config --global credential.credentialStore cache
$ git config --global credential.cacheOptions "--timeout 36000"

To log in, I have to open https://github.com/login/device on an authenticated device (e.g. my phone) and enter an 8-character code. Not too shabby in terms of security. I only wish that webpage would not require me to press Tab after each character…

This still grants rather broad permissions to the code-server, but at least only temporarily.

Android setup

On the client side I could now open https://host.example.com:8080 in Firefox on my eInk Android tablet, click through the warning about self-signed certificates, log in with the fixed password mentioned above, and start working!

I switched to a theme that supposedly is eInk-optimized (eInk by Mufanza). It’s not perfect (e.g. git diffs are unhelpful because it is not possible to distinguish deleted from added lines), but it’s a start. There are more eInk themes on the official Visual Studio Marketplace, but because code-server is a fork it cannot use that marketplace, and for example this theme isn’t on Open-VSX.

For some reason the F11 key doesn’t work, but going fullscreen is crucial, because screen estate is scarce in this setup. I can go fullscreen using VSCode’s command palette (Ctrl-P) and invoking the command there, but Firefox often jumps out of the fullscreen mode, which is annoying. I still have to pay attention to when that’s happening; maybe it’s the Esc key, which I am of course using a lot because I use vim bindings.

A more annoying problem was that on my Boox tablet, sometimes the on-screen keyboard would pop up, which is seriously annoying! It took me a while to track this down: The Boox has two virtual keyboards installed: The usual Google AOSP keyboard, and the Onyx Keyboard. The former is clever enough to stay hidden when there is a physical keyboard attached, but the latter isn’t. Moreover, pressing Shift-Ctrl on the physical keyboard rotates through the virtual keyboards. Now, VSCode has many keyboard shortcuts that require Shift-Ctrl (especially on an eInk device, where you really want to avoid using the mouse). And the limited settings exposed by the Boox Android system do not allow you to configure that or disable the Onyx keyboard! To solve this, I had to install the KISS Launcher, which would allow me to see more Android settings, and in particular allow me to disable the Onyx keyboard. So this is fixed.

I was hoping to improve the experience even more by opening the web page as a Progressive Web App (PWA), as described in the code-server FAQ. Unfortunately, that did not work. Firefox on Android did not recognize the site as a PWA (even though it recognizes a PWA test page). And I couldn’t use Chrome either because (unlike Firefox) it would not consider a site with a self-signed certificate as a secure context, and then code-server does not work fully. Maybe this is just some bug that gets fixed in later versions.

Now that I use a proper certificate, I can use it as a Progressive Web App, and with Firefox on Android this starts the app in full-screen mode (no system bars, no location bar). The F11 key still doesn’t work, and using the command palette to enter fullscreen does nothing visible, but then Esc leaves that fullscreen mode and I suddenly have the system bars again. But maybe if I just don’t do that I get the full screen experience. We’ll see.

I did not work enough with this yet to assess how much the smaller screen estate, the lack of colors and the slower refresh rate will bother me. I probably need to hide Lean’s InfoView more often, and maybe use the Error Lens extension, to avoid having to split my screen vertically.

I also cannot easily work on a park bench this way, with a tablet and a separate external keyboard. I’d need at least a table, or some additional piece of hardware that turns tablet + keyboard into some laptop-like structure that I can put on my, well, lap. There are cases for Onyx products that include a keyboard, and maybe they work on the lap, but they don’t have the Trackpoint that I have on my ThinkPad TrackPoint Keyboard II, and how can you live without that?

Conclusion

After this initial setup chances are good that entering and using this environment is convenient enough for me to actually use it; we will see when it gets warmer.

A few bits could be better. In particular logging in and authenticating GitHub access could be both more convenient and more safe – I could imagine that when I open the page I confirm that on my phone (maybe with a fingerprint), and that temporarily grants access to the code-server and to specific GitHub repositories only. Is that easily possible?

Do surprises get larger?

mail@joachim-breitner.de (Joachim Breitner) — Sun, 30 Jun 2024 15:28:31 +0200

The setup

Imagine you are living on a riverbank. Every now and then, the river swells and you have high water. The first few times this may come as a surprise, but soon you learn that such floods are a recurring occurrence at that river, and you make suitable preparation. Let’s say you feel well-prepared against any flood that is no higher than the highest one observed so far. The more floods you have seen, the higher that mark is, and the better prepared you are. But of course, eventually a higher flood will occur that surprises you.

Of course such new record floods are happening rarer and rarer as you have seen more of them. I was wondering though: By how much do the new records exceed the previous high mark? Does this excess decrease or increase over time?

A priori both could be. When the high mark is already rather high, maybe new record floods will just barely pass that mark? Or maybe, simply because new records are such rare events, when they do occur, they can be surprisingly bad?

This post is a leisurely mathematical investigation of this question, which of course isn’t restricted to high waters; it could be anything that produces a measurement repeatedly and (mostly) independently – weather events, sport results, dice rolls.

The answer of course depends on the distribution of results: How likely is each possible result?

Dice are simple

With dice rolls the answer is rather simple. Let our measurement be how often you can roll a die until it shows a 6. This simple game we can repeat many times, and keep track of our record. Let’s say the record happens to be 7 rolls. If in the next run we roll the die 7 times, and it still does not show a 6, then we know that we have broken the record, and every further roll increases by how much we beat the old record.

But note that how often we will now roll the die is completely independent of what happened before!

So for this game the answer is: The excess with which the record is broken is always the same.

Mathematically speaking this is because the distribution of “rolls until the die shows a 6” is memoryless. Such distributions are rather special, it’s essentially just the example we gave (a geometric distribution), or its continuous analogue (the exponential distribution, for example the time until a radioactive particle decays).
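As a short worked check of the memoryless property for a fair die (writing X for the number of rolls until the first 6, and using that for a positive-integer-valued variable the expectation is the sum of its tail probabilities):

```latex
\begin{align*}
P(X > n) &= \left(\tfrac{5}{6}\right)^{n}, \\
P(X > a + k \mid X > a) &= \frac{(5/6)^{a+k}}{(5/6)^{a}}
  = \left(\tfrac{5}{6}\right)^{k} = P(X > k), \\
E(X - a \mid X > a) &= \sum_{k \ge 0} P(X > a + k \mid X > a)
  = \sum_{k \ge 0} \left(\tfrac{5}{6}\right)^{k} = 6 .
\end{align*}
```

So whatever the old record a was, the expected excess is 6 rolls, the same as the expected length of a fresh game.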

Mathematical formulation

With this out of the way, let us look at some other distributions, and for that, introduce some mathematical notations. Let X be a random variable with probability density function φ(x) and cumulative distribution function Φ(x), and a be the previous record. We are interested in the behavior of

Y(a) = X − a ∣ X > a

i.e. by how much X exceeds a under the condition that it did exceed a. How does Y change as a increases? In particular, how does the expected value of the excess e(a) = E(Y(a)) change?

Uniform distribution

If X is uniformly distributed between, say, 0 and 1, then a new record will appear uniformly distributed between a and 1, and as that range gets smaller, the excess must get smaller as well. More precisely,

e(a) = E(X − a ∣ X > a) = E(X ∣ X > a) − a = (1 − a)/2

This not very interesting linear line is plotted in blue in this diagram:

The orange line with the logarithmic scale on the right tries to convey how unlikely it is to surpass the record value a: it shows how many attempts we expect before the record is broken. This can be calculated by n(a) = 1/(1 − Φ(a)).
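For the uniform distribution on [0, 1], where Φ(a) = a, both quantities can be spelled out explicitly:

```latex
\begin{align*}
E(X \mid X > a) &= \frac{1}{1-a}\int_a^1 x \, dx
  = \frac{1 - a^2}{2(1-a)} = \frac{1+a}{2}, \\
e(a) &= \frac{1+a}{2} - a = \frac{1-a}{2}, \\
n(a) &= \frac{1}{1 - \Phi(a)} = \frac{1}{1-a}, \qquad 0 \le a < 1 .
\end{align*}
```

For example, with a record of a = 0.9 we expect to wait n = 10 attempts for the next record, which then exceeds the old one by e = 0.05 on average.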

Normal distribution

For the normal distribution (with mean 0 and standard deviation 1, to keep things simple), we can look up the expected value of the one-sided truncated normal distribution and obtain

e(a) = E(X ∣ X > a) − a = φ(a)/(1 − Φ(a)) − a

Now is this growing or shrinking? We can plot this and have a quick look:

Indeed it is, too, a decreasing function!

(As a sanity check we can see that e(0) = √(2/π), which is the expected value of the half-normal distribution, as it should.)
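The truncated-normal formula used above can also be derived directly instead of looked up. The key fact is that the standard normal density satisfies φ′(x) = −xφ(x):

```latex
\begin{align*}
\int_a^\infty x \, \varphi(x) \, dx
  &= \Bigl[ -\varphi(x) \Bigr]_a^\infty = \varphi(a), \\
E(X \mid X > a) &= \frac{1}{1 - \Phi(a)} \int_a^\infty x \, \varphi(x) \, dx
  = \frac{\varphi(a)}{1 - \Phi(a)}, \\
e(0) &= \frac{\varphi(0)}{1/2} = \frac{2}{\sqrt{2\pi}} = \sqrt{\frac{2}{\pi}} \approx 0.798 .
\end{align*}
```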

Could it be any different?

This settles my question: It seems that each new surprisingly high water will tend to be less surprising than the previous one – assuming high waters were uniformly or normally distributed, which is unlikely to be helpful.

This does raise the question, though, if there are probability distributions for which e(a) is increasing?

I can try to construct one, and because it’s a bit easier, I’ll consider a discrete distribution on the positive natural numbers, and look at e(0) = E(X) and e(1) = E(X − 1 ∣ X > 1). What does it take for e(1) > e(0)? Using E(X) = p + (1 − p)E(X ∣ X > 1) for p = P(X = 1) we find that in order to have e(1) > e(0), we need E(X) > 1/p.
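The step to E(X) > 1/p is short but easy to get wrong, so here it is spelled out. I write e(0) = E(X) and e(1) = E(X − 1 ∣ X > 1) for the two expected excesses being compared, and assume 0 < p < 1:

```latex
\begin{align*}
E(X) &= p + (1-p)\, E(X \mid X > 1)
  \quad\Longrightarrow\quad
  E(X \mid X > 1) = \frac{E(X) - p}{1-p}, \\
e(1) &= E(X \mid X > 1) - 1 = \frac{E(X) - 1}{1-p}, \\
e(1) > e(0)
  &\iff E(X) - 1 > (1-p)\, E(X)
  \iff p \, E(X) > 1
  \iff E(X) > \frac{1}{p} .
\end{align*}
```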

This is plausible because we get equality when E(X) = 1/p, as is precisely the case for the geometric distribution. And it is also plausible that it helps if p is large (so that the first record is likely just 1) and if, nevertheless, E(X) is large (so that if we do get an outcome other than 1, it’s much larger).

Starting with the geometric distribution, where P(X > n ∣ X ≥ n) = p_n = p (the probability of again not rolling a six) is constant, it seems that if these p_n are increasing, we get the desired behavior. So let p₁ < p₂ < p₃ < … be an increasing sequence of probabilities, and define X so that P(X = n) = p₁ ⋅ ⋯ ⋅ p_{n−1} ⋅ (1 − p_n) (imagine the die wears off and the more often you roll it, the less likely it shows a 6). Then for this variation of the game, every new record tends to exceed the previous more than previous records. As the p_n increase, we get a flatter long end in the probability distribution.
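The “it seems” can be backed up with a short calculation, under the extra assumption (mine, not stated above) that the sums below converge. With P(X > n) = p₁ ⋯ p_n, the expected excess is again a sum of conditional tail probabilities:

```latex
\begin{align*}
P(X > a + k \mid X > a) &= \prod_{j=a+1}^{a+k} p_j, \\
e(a) = E(X - a \mid X > a) &= \sum_{k \ge 0} \; \prod_{j=a+1}^{a+k} p_j, \\
e(a+1) - e(a) &= \sum_{k \ge 1} \Bigl( \prod_{j=a+2}^{a+k+1} p_j - \prod_{j=a+1}^{a+k} p_j \Bigr) > 0 .
\end{align*}
```

Each product for a + 1 uses the factors of the corresponding product for a shifted up by one index, and since the p_j are strictly increasing, every term of the difference is positive. For constant p_j = p the difference vanishes, which is the memoryless case again.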

Gamma distribution

To get a nice plot, I’ll take the intuition from this and turn to continuous distributions. The Wikipedia page for the exponential distribution says it is a special case of the gamma distribution, which has an additional shape parameter α, and it seems that it could influence the shape of the distribution and make it have a longer end. Let’s play around with β = 2 and α = 0.5, 1 and 1.5:

For α = 1 (dotted) this should just be the exponential distribution, and we see that e(a) is flat, as predicted earlier.
For larger α (dashed) the graph does not look much different from the one for the normal distribution – not a surprise, as for α → ∞, the gamma distribution turns into the normal distribution.
For smaller α (solid) we get the desired effect: e(a) is increasing. This means that new records tend to break records more impressively.

The orange line shows that this comes at a cost: for a given old record a, new records are harder to come by with smaller α.
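As a reference point for the flat α = 1 curve, here is the calculation, assuming that β denotes the rate parameter of the gamma distribution (if it were the scale parameter, the constant would be β instead of 1/β):

```latex
\begin{align*}
f(x) &= \frac{\beta^{\alpha}}{\Gamma(\alpha)} \, x^{\alpha - 1} e^{-\beta x}, \qquad x > 0, \\
\alpha = 1: \quad f(x) &= \beta e^{-\beta x}, \qquad P(X > a) = e^{-\beta a}, \\
e(a) &= \int_0^\infty P(X > a + t \mid X > a) \, dt
  = \int_0^\infty e^{-\beta t} \, dt = \frac{1}{\beta} = \frac{1}{2}
  \quad \text{for } \beta = 2 .
\end{align*}
```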

Conclusion

As usual, it all depends on the distribution. Otherwise, not much, it’s late.

Blogging on Lean

mail@joachim-breitner.de (Joachim Breitner) — Fri, 31 May 2024 13:47:06 +0100

This blog has become a bit quiet since I joined the Lean FRO. One reason is of course that I can now improve things about Lean, rather than blog about what I think should be done (which, by contraposition, means I shouldn’t blog about what can be improved…). A better reason is that some of the things I’d otherwise write here are now published on the official Lean blog, in particular two lengthy technical posts explaining aspects of Lean that I worked on:

It would not be useful to re-publish them here because Verso, the technology behind the Lean blog, created by my colleague David Thrane Christiansen, enables fancy features like type-checked code snippets, including output and lots of information on hover. So I’ll be content with just cross-linking my posts from here.

Convenient sandboxed development environment

mail@joachim-breitner.de (Joachim Breitner) — Mon, 11 Mar 2024 21:39:58 +0100

I like using one machine and setup for everything, from serious development work to hobby projects to managing my finances. This is very convenient, as often the lines between these are blurred. But it is also scary if I think of the large number of people who I have to trust to not want to extract all my personal data. Whenever I run a cabal install, or a fun VSCode extension gets updated, or anything like that, I am running code that could be malicious or buggy.

In a way it is surprising and reassuring that, as far as I can tell, this commonly does not happen. Most open source developers out there seem to be nice and well-meaning, after all.

Convenient or it won’t happen

Nevertheless I thought I should do something about this. The safest option would probably be to use dedicated virtual machines for the development work, with very little interaction with my main system. But knowing me, that did not seem likely to happen, as it sounded like a fair amount of hassle. So I aimed for a viable compromise between security and convenience, and one that does not get too much in the way of my current habits.

For instance, it seems desirable to have the project files accessible from my unconstrained environment. This way, I could perform certain actions that need access to secret keys or tokens, but are unlikely to run code (e.g. git push, git pull from private repositories, gh pr create) from “the outside”, and the actual build environment can do without access to these secrets.

The user experience I thus want is a quick way to enter a “development environment” where I can do most of the things I need to do while programming (network access, running command line and GUI programs), with access to the current project, but without access to my actual /home directory.

I initially followed the blog post “Application Isolation using NixOS Containers” by Marcin Sucharski and got something working that mostly did what I wanted, but then a colleague pointed out that tools like firejail can achieve roughly the same with a less “global” setup. I tried to use firejail, but found it to be a bit too inflexible for my particular whims, so I ended up writing a small wrapper around the lower level sandboxing tool https://github.com/containers/bubblewrap.

Selective bubblewrapping

This script, called dev and included below, builds a new filesystem namespace with minimal /proc and /dev directories and its own /tmp directory. It then bind-mounts some directories to make the host’s NixOS system available inside the container (/bin, /usr, the nix store including domain socket, stuff for OpenGL applications). My user’s home directory is taken from ~/.dev-home and some configuration files are bind-mounted for convenient sharing. I intentionally don’t share most of the configuration – for example, a direnv enable in the dev environment should not affect the main environment. The X11 socket for graphical applications and the corresponding .Xauthority file are made available. And finally, if I run dev in a project directory, this project directory is bind-mounted writable, and the current working directory is preserved.

The effect is that I can type dev on the command line to enter “dev mode” rather conveniently. I can run development tools, including graphical ones like VSCode, and especially the latter with its extensions is part of the sandbox. To do a git push I either exit the development environment (Ctrl-D) or open a separate terminal. Overall, the inconvenience of switching back and forth seems worth the extra protection.

Clearly, this isn’t going to hold against a determined and maybe targeted attacker (e.g. access to the X11 and the nix daemon socket can probably be used to escape easily). But I hope it will help against a compromised dev dependency that just deletes or exfiltrates data, like keys or passwords, from the usual places in $HOME.

Rough corners

There is more polishing that could be done.

In particular, clicking on a link inside VSCode in the container will currently open Firefox inside the container, without access to my settings and cookies etc. Ideally, links would be opened in the Firefox running outside. This is a problem that has a solution in the world of applications that are sandboxed with Flatpak, and involves a bunch of moving parts (an xdg-desktop-portal user service, a filtering dbus proxy, exposing access to that proxy in the container). I experimented with that for a bit longer than I should have, but could not get it to work to satisfaction (even without a container involved, I could not get xdg-desktop-portal to heed my default browser settings…). For now I will live with manually copying and pasting URLs; we’ll see how long this lasts.

With this setup (and unlike the NixOS container setup I tried first), the same applications are installed inside and outside. It might be useful to separate the sets of installed programs: there is simply no point in running evolution or firefox inside the container, and if I did not even have VSCode or cabal available outside, I would be less likely to forget to enter dev before using these tools.

It shouldn’t be too hard to cargo-cult some of the NixOS Containers infrastructure to be able to have a separate system configuration that I can manage as part of my normal system configuration and make available to bubblewrap here.

So likely I will refine this some more over time. Or get tired of typing dev and go back to what I did before…

The script

The dev script (at the time of writing)

#!/usr/bin/env bash

extra=()
if [[ "$PWD" == /home/jojo/build/* ]] || [[ "$PWD" == /home/jojo/projekte/programming/* ]]
then
    extra+=(--bind "$PWD" "$PWD" --chdir "$PWD")
fi

if [ -n "$1" ]
then
    cmd=( "$@" )
else
    cmd=( bash )
fi

# Caveats:
# * access to all of `/etc`
# * access to `/nix/var/nix/daemon-socket/socket`, and is trusted user (but needed to run nix)
# * access to X11

exec bwrap \
  --unshare-all \
\
`# blank slate` \
  --share-net \
  --proc /proc \
  --dev /dev \
  --tmpfs /tmp \
  --tmpfs /run/user/1000 \
\
`# Needed for GLX applications, in particular alacritty` \
  --dev-bind /dev/dri /dev/dri \
  --ro-bind /sys/dev/char /sys/dev/char \
  --ro-bind /sys/devices/pci0000:00 /sys/devices/pci0000:00 \
  --ro-bind /run/opengl-driver /run/opengl-driver \
\
  --ro-bind /bin /bin \
  --ro-bind /usr /usr \
  --ro-bind /run/current-system /run/current-system \
  --ro-bind /nix /nix \
  --ro-bind /etc /etc \
  --ro-bind /run/systemd/resolve/stub-resolv.conf /run/systemd/resolve/stub-resolv.conf \
\
  --bind ~/.dev-home /home/jojo \
  --ro-bind ~/.config/alacritty ~/.config/alacritty  \
  --ro-bind ~/.config/nvim ~/.config/nvim  \
  --ro-bind ~/.local/share/nvim ~/.local/share/nvim  \
  --ro-bind ~/.bin ~/.bin \
\
  --bind /tmp/.X11-unix/X0 /tmp/.X11-unix/X0 \
  --bind ~/.Xauthority ~/.Xauthority \
  --setenv DISPLAY :0 \
\
  --setenv container dev \
  "${extra[@]}" \
  -- \
  "${cmd[@]}"

GHC Steering Committee Retrospective

mail@joachim-breitner.de (Joachim Breitner) — Thu, 25 Jan 2024 01:21:41 +0100

After seven years of service as member and secretary on the GHC Steering Committee, I have resigned from that role. So this is a good time to look back and retrace the formation of the GHC proposal process and committee.

ghc-proposals Github repository with a sketch of a process and sent out a call for nominations on the GHC user’s mailing list, which I replied to. The Simons picked the first set of members, and in the fall of 2016 we discussed the committee’s by-laws and procedures. As so often, Richard was an influential shaping force here.

Three ingredients

For example, it was he who suggested that for each proposal we have one committee member be the “Shepherd”, overseeing the discussion. I believe this was one ingredient for the effectiveness of the process: There is always one person in charge, and thus we avoid the delays incurred when any one of a non-singleton set of volunteers has to do the next step (and everyone hopes someone else does it).

The next ingredient was that we do not usually require a vote among all members (again, not easy with volunteers with limited bandwidth and occasional phases of absence). Instead, the shepherd makes a recommendation (accept/reject), and if the other committee members do not complain, this silence is taken as consent, and we come to a decision. It seems this idea can also be traced back to Richard, who suggested that “once a decision is requested, the shepherd [generates] consensus. If consensus is elusive, then we vote.”

At the end of the year we agreed and wrote down these rules, created the mailing list for our internal, but publicly archived committee discussions, and began accepting proposals, starting with Adam Gundry’s OverloadedRecordFields.

At that point, there was no “secretary” role yet, so how did I become one? It seems that in February 2017 I started to clean up and refine the process documentation, fixing “bugs in the process” (like requiring authors to set Github labels when they don’t even have permissions to do that). This in particular meant that someone from the committee had to manually handle submissions and so on, and by the aforementioned principle that at every step there ought to be exactly one person in charge, the role of a secretary followed naturally. In the email in which I described that role I wrote:

Simon already shoved me towards picking up the “secretary” hat, to reduce load on Ben.

So when I merged the updated process documentation, I already listed myself as “secretary”.

It wasn’t just Simon’s shoving that put me into the role, though. I dug out my original self-nomination email to Ben, and among other things I wrote:

I also hope that there is going to be clear responsibilities and a clear workflow among the committee. E.g. someone (possibly rotating), maybe called the secretary, who is in charge of having an initial look at proposals and then assigning it to a member who shepherds the proposal.

So it is hardly a surprise that I became secretary, when it was dear to my heart to have a smooth continuous process here.

I am rather content with the result: These three ingredients – single secretary, per-proposal shepherds, silence-is-consent – helped the committee to be effective throughout its existence, even as every once in a while individual members dropped out.

Ulterior motivation

I must admit, however, there was an ulterior motivation behind me grabbing the secretary role: Yes, I did want the committee to succeed, and I did want authors to receive timely, good and decisive feedback on their proposals – but I did not really want to have to do that part.

I am, in fact, a lousy proposal reviewer. I am too generous when reading proposals, and more likely to mentally fill gaps in a specification than to spot them. I always optimistically assume that the authors surely know what they are doing, rather than critically assessing the impact, the implementation cost and the interaction with other language features.

And, maybe more importantly: why should I know which changes are good and which are not so good in the long run? Clearly, the authors cared enough about a proposal to put it forward, so there is some need… and I do believe that Haskell should stay an evolving and innovating language… but how does this help me decide about this or that particular feature?

I even, during the formation of the committee, explicitly asked that we write down some guidance on “Vision and Guideline”; do we want to foster change or innovation, or be selective gatekeepers? Should we accept features that are proven to be useful, or should we accept features so that they can prove to be useful? This discussion, however, did not lead to a concrete result, and the assessment of proposals relied on the sum of each member’s personal preference, expertise and gut feeling. I am not saying that this was a mistake: It is hard to come up with a general guideline here, and even harder to find one that does justice to each individual proposal.

So the secret motivation for me to grab the secretary post was that I could contribute without having to judge proposals. Being secretary allowed me to assign most proposals to others to shepherd, and only once in a while take care of a proposal myself, when it seemed to be very straightforward. Sneaky, ain’t it?

7 Years later

For years to come I happily played secretary: When an author finished their proposal and public discussion ebbed down they would ping me on GitHub, I would pick a suitable shepherd among the committee and ask them to judge the proposal. Eventually, the committee would come to a conclusion, usually by implicit consent, sometimes by voting, and I’d merge the pull request and update the metadata thereon. Every few months I’d summarize the current state of affairs to the committee (what happened since the last update, which proposals are currently on our plate), and once per year I gathered the data for Simon Peyton Jones’ annual GHC Status Report. Sometimes some members needed a nudge or two to act. Some would eventually step down, and I’d send around a call for nominations and, when the nominations came in, distribute them off-list among the committee and tally the votes.

Initially, that was exciting. For a long while it was a pleasant and rewarding routine. Eventually, it became a mere chore. I noticed that I didn’t quite care so much anymore about some of the discussion, and there was a decent amount of navel-gazing, meta-discussions and some wrangling about claims of authority that was probably useful and necessary, but wasn’t particularly fun.

I also began to notice weaknesses in the processes that I helped shape: We could really use some more automation for showing proposal statuses, notifying people when they have to act, and nudging them when they don’t. The whole silence-is-assent approach is good for throughput, but not necessarily great for quality, and maybe the committee members need to be pushed more firmly to engage with each proposal. Like GHC itself, the committee processes deserve continuous refinement and refactoring, and since I could not muster the motivation to change my now well-trod secretarial ways, it was time for me to step down.

Luckily, Adam Gundry volunteered to take over, and that makes me feel much less bad for quitting. Thanks for that!

And although I am for my day job now enjoying a language that has many of the things out of the box that for Haskell are still only language extensions or even just future proposals (dependent types, BlockArguments, do notation with (← foo) expressions and 💜 Unicode), I’m still around, hosting the Haskell Interlude Podcast, writing on this blog and hanging out at ZuriHac etc.

The Haskell Interlude Podcast

mail@joachim-breitner.de (Joachim Breitner) — Fri, 22 Dec 2023 10:04:42 +0100

It was pointed out to me that I have not blogged about this, so better now than never:

Since 2021 I am – together with four other hosts – producing a regular podcast about Haskell, the Haskell Interlude. Roughly every two weeks two of us interview someone from the Haskell Community, and we chat for approximately an hour about how they came to Haskell, what they are doing with it, why they are doing it and what else is on their mind. Sometimes we talk to very famous people, like Simon Peyton Jones, and sometimes to people who maybe should be famous, but aren’t quite yet.

For most episodes we also have a transcript, so you can read the interviews instead, if you prefer, and you should find the podcast on most podcast apps as well. I do not know how reliable these statistics are, but supposedly we regularly have around 1300 listeners. We don’t get much feedback, however, so if you like the show, or dislike it, or have feedback, let us know (for example on the Haskell Discourse, which has a thread for each episode).

At the time of writing, we released 40 episodes. For the benefit of my (likely hypothetical) fans, or those who want to train an AI voice model for nefarious purposes, here is the list of episodes co-hosted by me:

Can’t decide where to start? The one with Ryan Trinkle might be my favorite.

Thanks to the Haskell Foundation and its sponsors for supporting this podcast (hosting, editing, transcription).

Joining the Lean FRO

mail@joachim-breitner.de (Joachim Breitner) — Wed, 01 Nov 2023 21:47:06 +0100

Tomorrow is going to be a new first day in a new job for me: I am joining the Lean FRO, and I’m excited.

What is Lean?

Lean is the new kid on the block of theorem provers.

It’s a pure functional programming language (like Haskell, with and on which I have worked a lot), but it’s dependently typed (which Haskell may be evolving to be as well, but rather slowly and carefully). It has a refreshing syntax, built on top of a rather good (I have been told, not an expert here) macro system.

As a dependently typed programming language, it is also a theorem prover, or proof assistant, and there exists already a lively community of mathematicians who started to formalize mathematics in a coherent library, creatively called mathlib.

What is a FRO?

A Focused Research Organization has the organizational form of a small start up (small team, little overhead, a few years of runway), but its goals and measure for success are not commercial, as funding is provided by donors (in the case of the Lean FRO, the Simons Foundation International, the Alfred P. Sloan Foundation, and Richard Merkin). This allows us to build something that we believe is a contribution for the greater good, even though it’s not (or not yet) commercially interesting enough and does not fit other forms of funding (such as research grants) well. This is a very comfortable situation to be in.

Why am I excited?

To me, working on Lean seems to be the perfect mix: I have been working on language implementation for about a decade now, and always with a preference for functional languages. Add to that my interest in theorem proving, where I have used Isabelle and Coq so far, and played with Agda and others. So technically, clearly up my alley.

Furthermore, the language isn’t too old, and plenty of interesting things are simply still to do, rather than tried before. The ecosystem is still evolving, so there is a good chance to have some impact.

On the other hand, the language isn’t too young either. It is no longer an open question whether we will have users: we have them already, they hang out on zulip, so if I improve something, someone is likely going to be happy about it, which is great. And the community seems to be welcoming and full of nice people.

Finally, this library of mathematics that these users are building is itself an amazing artifact: Lots of math in a consistent, machine-readable, maintained, documented, checked form! With a little bit of optimism I can imagine this changing how math research and education will be done in the future. It could be for math what Wikipedia is for encyclopedic knowledge and OpenStreetMap for maps – and the thought of facilitating that excites me.

With this new job I find that when I am telling friends and colleagues about it, I do not hesitate or hedge when asked why I am doing this. This is a good sign.

What will I be doing?

We’ll see what main tasks I’ll get to tackle initially, but knowing myself, I expect I’ll get broadly involved.

To get up to speed I started playing around with a few things already, and for example created Loogle, a Mathlib search engine inspired by Haskell’s Hoogle, including a Zulip bot integration. This seems to be useful and quite well received, so I’ll continue maintaining that.

Expect more about this and other contributions here in the future.

Squash your Github PRs with one click

mail@joachim-breitner.de (Joachim Breitner) — Sun, 29 Oct 2023 22:46:56 +0100

TL;DR: Squash your PRs with one click at https://squasher.nomeata.de/.

Very recently I got this response from the project maintainer at a pull request I contributed: “Thanks, approved, please squash so that I can merge.”

It’s nice that my contribution can go in, but why did the maintainer not just press the “Squash and merge” button, and instead add this unnecessary roundtrip to the process? Anyways, maintainers make the rules, so I play by them. But unlike the maintainer, who can squash-and-merge with just one click, squashing the PR’s branch is surprisingly laborious: Github does not allow you to do that via the Web UI (and hence on mobile), and it seems you are expected to go to your computer and juggle with git rebase --interactive.

I found this rather annoying, so I created Squasher, a simple service that will squash your branch for you. There is no configuration, just paste the PR url. It will use the PR title and body as the commit message (which is obviously the right way™), and create the commit in your name.

If you find this useful, or found it to be buggy, let me know. The code is at https://github.com/nomeata/squasher if you are curious about it.

Left recursive parser combinators via sharing

mail@joachim-breitner.de (Joachim Breitner) — Sun, 10 Sep 2023 17:16:08 -0700

At this year’s ICFP in Seattle I gave a talk about my rec-def Haskell library, which I have blogged about before here. While my functional pearl paper focuses on a concrete use-case and the tricks of the implementation, in my talk I put the emphasis on the high-level idea: it behooves a declarative lazy functional language like Haskell that recursive equations just work whenever they describe a (unique) solution. Like in the paper, I used equations between sets as the running example, and only conjectured that it should also work for other domains, in particular parser combinators.

Naturally, someone called my bluff and asked if I actually tried it. I had not, but I should have, because it works nicely and is actually more straight-forward than with sets. I wrote up a prototype and showed it off a few days later as a lightning talk at Haskell Symposium; here is the write up that goes along with that.

Parser combinators

Parser combinators are libraries that provide little functions (combinators) that you compose to define your parser directly in your programming language, as opposed to using external tools that read some grammar description and generate parser code, and are quite popular in Haskell (e.g. parsec, attoparsec, megaparsec).

Let us define a little parser that recognizes sequences of as:

ghci> let aaa = tok 'a' *> aaa <|> pure ()
ghci> parse aaa "aaaa"
Just ()
ghci> parse aaa "aabaa"
Nothing

The same language can also be described by a left-recursive grammar, but then the parser loops:

ghci> let aaa = aaa <* tok 'a' <|> pure ()
ghci> parse aaa "aaaa"
^CInterrupted.

Nicolas Wu’s overview paper), all the common parser combinator libraries cannot handle it and the usual advice is to refactor your grammar to avoid left recursion.

But there are some libraries that can handle left recursion, at least with a little help from the programmer. I found two variations:

The library provides an explicit fix point combinator, and as long as that is used, left-recursion works. This is for example described by Frost, Hafiz and Callaghan, and (of course) Oleg Kiselyov has an implementation of this too.
The library expects explicit labels on recursive productions, so that the library can recognize left-recursion. I found an implementation of this idea in the Agda.Utils.Parser.MemoisedCPS module in the Agda code, the gll library seems to follow this style and Jaro discusses it as well.

I took the module from the Agda source and simplified it a bit for the purposes of this demonstration (Parser.hs). Indeed, I can make the left-recursive grammar work:

ghci> let aaa = memoise ":-)" $ aaa <* tok 'a' <|> pure ()
ghci> parse aaa "aaaa"
Just ()
ghci> parse aaa "aabaa"
Nothing

data Parser k tok a -- k is type of keys, tok type of tokens (e.g. Char)
instance Functor (Parser k tok)
instance Applicative (Parser k tok)
instance Alternative (Parser k tok)
instance Monad (Parser k tok)
parse :: Parser k tok a -> [tok] -> Maybe a
sat :: (tok -> Bool) -> Parser k tok tok
tok :: Eq tok => tok -> Parser k tok tok
memoise :: Ord k => k -> Parser k tok a -> Parser k tok a

import qualified Parser as P

newtype Parser tok a = MkP { unP :: P.Parser Unique tok a }

parse :: Parser tok a -> [tok] -> Maybe a
parse (MkP p) = P.parse p

sat :: Typeable tok => (tok -> Bool) -> Parser tok tok
sat p = MkP (P.sat p)

tok :: Eq tok => tok -> Parser tok tok
tok t = MkP (P.tok t)

So far, nothing interesting had to happen, because so far I cannot build recursive parsers. The first interesting combinator that allows me to do that is <*> from the Applicative class, so I should use memoise there. The question is: Where does the unique key come from?

Proprioception

As with the rec-def library, pure code won’t do, and I have to get my hands dirty: I really want a fresh unique label out of thin air. To that end, I define the following combinator, with naming aided by Richard Eisenberg:

propriocept :: (Unique -> a) -> a
propriocept f = unsafePerformIO $ f <$> newUnique

A thunk defined with propriocept will know about its own identity, and will be able to tell itself apart from other such thunks. This gives us a form of observable sharing, precisely what we need. But before we return to our parser combinators, let us briefly explore this combinator.

Using propriocept I can define an operation cons :: [Int] -> [Int] that records (the hash of) this Unique in the list:

ghci> let cons xs = propriocept (\x -> hashUnique x : xs)
ghci> :t cons
cons :: [Int] -> [Int]

ghci> cons (cons (cons []))
[1,2,3]

ghci> cons (cons (cons []))
[4,5,6]

ghci> take 20 (acyclic 0)
[7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26]

ghci> let cyclic = cons cyclic
ghci> take 20 cyclic
[27,27,27,27,27,27,27,27,27,27,27,27,27,27,27,27,27,27,27,27]

fix from Data.Function:

ghci> import Data.Function
ghci> take 20 (fix cons)
[28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28]

series of screencasts.

So with propriocept we can distinguish different heap objects, and also recognize when we come across the same heap object again.

With that we return to our parser. We define a smart constructor for the new Parser that passes the unique from propriocept to the underlying parser’s memoise function:

withMemo :: P.Parser Unique tok a -> Parser tok a
withMemo p = propriocept $ \u -> MkP $ P.memoise u p

instance Functor (Parser tok) where
  fmap f p = withMemo (fmap f (unP p))

instance Applicative (Parser tok) where
  pure x = MkP (pure x)
  p1 <*> p2 = withMemo (unP p1 <*> unP p2)

instance Alternative (Parser tok) where
  empty = MkP empty
  p1 <|> p2 = withMemo (unP p1 <|> unP p2)

instance Monad (Parser tok) where
  return = pure
  p1 >>= f = withMemo $ unP p1 >>= unP . f

RParser.hs for the full code):

ghci> let aaa = aaa <* tok 'a' <|> pure ()
ghci> parse aaa "aaaa"
Just ()
ghci> parse aaa "aabaa"
Nothing

type Ident = String
type RuleRhs = [Seq]
type Seq = [Atom]
data Atom = Lit String | NonTerm Ident deriving Show
type Rule = (Ident, RuleRhs)
type BNF = [Rule]

numExp :: String
numExp = unlines
    [ "term   := sum;"
    , "pdigit := '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9';"
    , "digit  := '0' | pdigit;"
    , "pnum   := pdigit | pnum digit;"
    , "num    := '0' | pnum;"
    , "prod   := atom | atom '*' prod;"
    , "sum    := prod | prod '+' sum;"
    , "atom   := num | '(' term ')';"
    ]

type P = Parser Char

snoc :: [a] -> a -> [a]
snoc xs x = xs ++ [x]

l :: P a -> P a
l p = p <|> l p <* sat isSpace
quote :: P Char
quote = tok '\''
quoted :: P a -> P a
quoted p = quote *> p <* quote
str :: P String
str = some (sat (not . (== '\'')))
ident :: P Ident
ident = some (sat (\c -> isAlphaNum c && isAscii c))
atom :: P Atom
atom = Lit     <$> l (quoted str)
   <|> NonTerm <$> l ident
eps :: P ()
eps = void $ l (tok 'ε')
sep :: P ()
sep = void $ some (sat isSpace)
sq :: P Seq
sq = []   <$ eps
 <|> snoc <$> sq <* sep <*> atom
 <|> pure <$> atom
ruleRhs :: P RuleRhs
ruleRhs = pure <$> sq
      <|> snoc <$> ruleRhs <* l (tok '|') <*> sq
rule :: P Rule
rule = (,) <$> l ident <* l (tok ':' *> tok '=') <*> ruleRhs <* l (tok ';')
bnf :: P BNF
bnf = pure <$> rule
  <|> snoc <$> bnf <*> rule

ghci> parse bnf numExp
^CInterrupted.

What a pity, it does not work! What went wrong?

The underlying library can handle left-recursion if it can recognize it by seeing a memoise label passed again. This works fine in all the places where we re-use a parser definition (e.g. in bnf), but it really requires that values are shared!

If we look carefully at our definition of l (which parses a lexeme, i.e. something possibly followed by whitespace), then it recurses via a fresh function call, and the program will keep expanding the definition – just like the acyclic above:

l :: P a -> P a
l p = p <|> l p <* sat isSpace

If the recursion instead goes through a local value p', that value is shared and the knot is tied on the heap:

l :: P a -> P a
l p = p'
  where p' = p <|> p' <* sat isSpace

With this definition, parsing the grammar terminates:

ghci> parse bnf numExp
Just [("term",[[NonTerm "sum"]]),("pdigit",[[Lit "1"],…

interp :: BNF -> P String
interp bnf = parsers M.! start
  where
    start :: Ident
    start = fst (head bnf)

    parsers :: M.Map Ident (P String)
    parsers = M.fromList [ (i, parseRule i rhs) | (i, rhs) <- bnf ]

    parseRule :: Ident -> RuleRhs -> P String
    parseRule ident rhs = trace <$> asum (map parseSeq rhs)
      where trace s = ident ++ "(" ++ s ++ ")"

    parseSeq :: Seq -> P String
    parseSeq = fmap concat . traverse parseAtom

    parseAtom :: Atom -> P String
    parseAtom (Lit s) = traverse tok s
    parseAtom (NonTerm i) = parsers M.! i

BNFEx.hs):

ghci> Just bnfp = parse bnf numExp
ghci> :t bnfp
bnfp :: BNF
ghci> parse (interp bnfp) "12+3*4"
Just "term(sum(prod(atom(num(pnum(pnum(pdigit(1))digit(pdigit(2))))))+sum(prod(atom(num(pnum(pdigit(3))))*prod(atom(num(pnum(pdigit(4)))))))))"
ghci> parse (interp bnfp) "12+3*4+"
Nothing

It is worth noting that the numExp grammar is also left-recursive, so implementing interp with a conventional parser combinator library would not have worked. But thanks to our propriocept trick, it does! Again, the sharing is important; in the code above it is the map parsers that is defined in terms of itself, and will ensure that the left-recursive productions will work.

Closing thoughts

I am using unsafePerformIO, so I need to justify its use. Clearly, propriocept is not a pure function, and its type is a lie. In general, using it will break the nice equational properties of Haskell, as we have seen in our experiments with cons.

In the case of our parser library, however, we use it in specific ways, namely to feed a fresh name to memoise. Assuming the underlying parser library’s behavior does not observably depend on where and with which key memoise is used, this results in a properly pure interface, and all is well again. (NB: I did not investigate if this assumption actually holds for the parser library used here, or if it may for example affect the order of parse trees returned.)

I also expect that this implementation, which will memoise every parser involved, will be rather slow. It seems plausible to analyze the graph structure and figure out which memoise calls are actually needed to break left-recursion (at least if we drop the Monad instance or always memoise >>=).

the paper about rec-def, watch one of my talks about it (MuniHac, BOBKonf, ICFP23; the presentation evolved over time), or if you just want to see more about how things are laid out on Haskell’s heap, go to my screen casts exploring the Haskell heap.

Generating bibtex bibliographies from DOIs via DBLP

mail@joachim-breitner.de (Joachim Breitner) — Wed, 12 Jul 2023 14:32:19 +0200

I sometimes write papers and part of paper writing is assembling the bibliography. In my case, this is done using BibTeX. So when I need to add another citation, I have to find suitable data in BibTeX format.

Often I copy snippets from .bib files from earlier papers.

Or I search for the paper on DBLP, which in my experience has the highest-quality BibTeX entries and the best coverage of computer science related publications, copy it to my .bib file, and change the key to whatever I want to refer to the paper by.

But in the days of pervasive use of DOIs (digital object identifiers) for almost all publications, manually keeping the data in BibTeX files seems outdated. Instead I’d rather just write down the two pieces of data I care about: the key that I want to use for citation, and the DOI. The rest I do not want to be bothered with.

So I wrote a small script that takes a .yaml file like

entries:
  unsafePerformIO: 10.1007/10722298_3
  dejafu: 10.1145/2804302.2804306
  runST: 10.1145/3158152
  quickcheck: 10.1145/351240.351266
  optimiser: 10.1016/S0167-6423(97)00029-4
  sabry: 10.1017/s0956796897002943
  concurrent: 10.1145/237721.237794
  launchbury: 10.1145/158511.158618
  datafun: 10.1145/2951913.2951948
  observable-sharing: 10.1007/3-540-46674-6_7
  kildall-73: 10.1145/512927.512945
  kam-ullman-76: 10.1145/321921.321938
  spygame: 10.1145/3371101
  cocaml: 10.3233/FI-2017-1473
  secrets: 10.1017/S0956796802004331
  modular: 10.1017/S0956796817000016
  longley: 10.1145/317636.317775
  nievergelt: 10.1145/800152.804906
  runST2: 10.1145/3527326
  polakow: 10.1145/2804302.2804309
  lvars: 10.1145/2502323.2502326
  typesafe-sharing: 10.1145/1596638.1596653
  pure-functional: 10.1007/978-3-642-14162-1_17
  clairvoyant: 10.1145/3341718
subs:
  - replace: Peyton Jones
    with: '{Peyton Jones}'

and turns it into a nice .bib file:

$ ./doi2bib.py < doibib.yaml > dblp.bib
$ head dblp.bib
@inproceedings{unsafePerformIO,
  author       = {Simon L. {Peyton Jones} and
                  Simon Marlow and
                  Conal Elliott},
  editor       = {Pieter W. M. Koopman and
                  Chris Clack},
  title        = {Stretching the Storage Manager: Weak Pointers and Stable Names in
                  Haskell},
  booktitle    = {Implementation of Functional Languages, 11th International Workshop,
                  IFL'99, Lochem, The Netherlands, September 7-10, 1999, Selected Papers},

The last bit allows me to do some fine-tuning of the file, because unfortunately, not even DBLP BibTeX files are perfect, for example in the presence of two family names.

Now I have fewer moving parts to worry about, and a more consistent bibliography.

The script is rather small, so I’ll just share it here:

#!/usr/bin/env python3

import sys
import yaml
import requests
import requests_cache
import re

requests_cache.install_cache(backend='sqlite')

data = yaml.safe_load(sys.stdin)

for key, doi in data['entries'].items():
    bib = requests.get(f"https://dblp.org/doi/{doi}.bib").text
    bib = re.sub('{DBLP.*,', '{' + key + ',', bib)
    for subs in data['subs']:
        bib = re.sub(subs['replace'], subs['with'], bib)
    print(bib)
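
To see what the two re.sub steps in the script do without hitting the network, here is a small offline sketch. The sample entry and its DBLP key are made up for illustration (only the author names are borrowed from the output shown above), and the helper names rekey and apply_subs are hypothetical; the two patterns are the ones used in the script.

```python
import re

# Made-up entry in the shape DBLP returns; the key is NOT a real DBLP key.
SAMPLE = """@inproceedings{DBLP:conf/example/Key99,
  author       = {Simon L. Peyton Jones and
                  Simon Marlow},
  title        = {An Example Title},
}
"""

def rekey(bib, key):
    # Same pattern as in the script. "." does not match newlines, so the
    # greedy ".*," stops at the last comma of the first line, and only the
    # DBLP key is replaced by our own citation key.
    return re.sub('{DBLP.*,', '{' + key + ',', bib)

def apply_subs(bib, subs):
    # Each substitution is applied in order to the whole entry.
    for s in subs:
        bib = re.sub(s['replace'], s['with'], bib)
    return bib

result = apply_subs(
    rekey(SAMPLE, 'unsafePerformIO'),
    [{'replace': 'Peyton Jones', 'with': '{Peyton Jones}'}],
)
print(result)
```

Note that both strings are handed to re.sub unchanged, so the replace field is interpreted as a regular expression and backslashes in the with field would be treated as group references.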

dblpbibtex in C++ and dblpbib in Ruby. These allow direct use of \cite{DBLP:rec/conf/isit/BreitnerS20} in LaTeX, which is also nice, but for now I like to choose more descriptive citation keys myself.

ICFP Pearl preprint on rec-def

mail@joachim-breitner.de (Joachim Breitner) — Thu, 22 Jun 2023 18:21:11 +0200

I submitted a Functional Pearl to this year’s ICFP and it got accepted!

It is about the idea of using Haskell’s inherent ability to define recursive equations, and use them for more than just functions and lazy data structures. I blogged about this before (introducing the idea, behind the scenes, applications to program analysis, graph algorithms and minesweeper), but hopefully the paper brings out the idea even more clearly. The constructive feedback from a few friendly readers (Claudio, Sebastian, and also the anonymous reviewers) certainly improved the paper.

Abstract

Haskell’s laziness allows the programmer to solve some problems naturally and declaratively via recursive equations. Unfortunately, if the input is “too recursive”, these very elegant idioms can fall into the dreaded black hole, and the programmer has to resort to more pedestrian approaches.

It does not have to be that way: We built variants of common pure data structures (Booleans, sets) where recursive definitions are productive. Internally, the infamous unsafePerformIO is at work, but the user only sees a beautiful and pure API, and their pretty recursive idioms – magically – work again.

If you are curious, please have a look at the current version of the paper. Any feedback is welcome; even more if it comes before July 11, because then I can include it in the camera ready version.

There are still some open questions around this work. What bothers me maybe most is the lack of a denotational semantics that unifies the partial order underlying the Haskell fragment, and the partial order underlying the domain of the embedded equations.

The crux of the problem is maybe best captured by this question:

Imagine an untyped lambda calculus with constructors, lazy evaluation, and an operation rseq that recursively evaluates constructors, but terminates in the presence of cycles. So for example
rseq (let x    = 1 :: x    in x   ) ≡ ()
rseq (let x () = 1 :: x () in x ()) ≡ ⊥
In this language, knot tying is observable. What is the “nicest” denotational semantics?
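To make the operational behaviour of such an rseq concrete, here is a rough Python sketch. It is only an illustration of the two examples above, not the semantics asked for: the Cons class with a memoised tail thunk stands in for lazy evaluation, object identity stands in for sharing, and the max_nodes cut-off (which raises an error) is an assumption added so that the diverging case can be observed without running forever.

```python
class Cons:
    """A list cell whose tail is a thunk that is forced at most once."""

    def __init__(self, head, tail_thunk):
        self.head = head
        self._thunk = tail_thunk
        self._forced = False
        self._tail = None

    def tail(self):
        if not self._forced:
            self._tail = self._thunk()
            self._forced = True
            self._thunk = None
        return self._tail


def rseq(value, max_nodes=None):
    """Force all constructors reachable from value; stop at cells seen before."""
    seen = set()
    stack = [value]
    while stack:
        v = stack.pop()
        if not isinstance(v, Cons) or id(v) in seen:
            continue  # atoms and already visited cells need no further work
        seen.add(id(v))
        if max_nodes is not None and len(seen) > max_nodes:
            raise RuntimeError("gave up: stand-in for non-termination")
        stack.append(v.head)
        stack.append(v.tail())
    return ()


def shared_ones():
    # let x = 1 :: x in x -- the tail points back to the very same cell
    cell = Cons(1, lambda: cell)
    return cell


def fresh_ones():
    # let x () = 1 :: x () in x () -- every tail is a brand new cell
    return Cons(1, fresh_ones)


print(rseq(shared_ones()))  # the cycle is detected, so this returns ()
```

Calling rseq(fresh_ones()) without a cut-off would keep allocating new cells, which corresponds to the ⊥ in the second equation.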

Update: I made some progress via a discussion on the Haskell Discourse and started some rough notes on a denotational semantics.

The curious case of the half-half Bitcoin ECDSA nonces

mail@joachim-breitner.de (Joachim Breitner) — Wed, 07 Jun 2023 08:42:28 +0200

This is the week of the Gulaschprogrammiernacht, a yearly Chaos Computer Club event in Karlsruhe, so it was exactly a year ago that I sat in my AirBnB room and went over the slides for my talk “Lattice Attacks on Ethereum, Bitcoin, and HTTPS” that I would give there.

It reports on research done with Nadia Heninger while I was in Philadelphia, and I really liked giving that talk: At some point we look at some rather odd signatures we found on the bitcoin blockchain, and part of the signature (the “nonce”) happens to share some bytes with the secret key. A clear case of some buffer overlap in a memory unsafe language, which I, as a fan of languages like Haskell, am very happy to sneer at!

But last year, as I was going over the slides, I looked at the raw data again for some reason, and I found that we had overlooked something: Not only was the upper half of the nonce equal to the lower half of the secret key, but the lower half of the nonce was also equal to the upper half of the message hash!

This now looks much less like an accident to me, and more like an (overly) simple form of deterministic nonce creation… so much for my nice anecdote. (I still used the anecdote in my talk, followed up with an “actually”.)

When I told Nadia about this, she got curious as well, and quickly saw that from a signature with such a nonce, one can rather easily extract the secret key. So together with her student Dylan Rowe, we implemented this analysis and searched the bitcoin blockchain for more instances of such signatures. We did find a few, and were even able to trace them back to a somewhat infamous bitcoin activist going under the pseudonym Amaclin.
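Why such a nonce leaks the key can be seen from the standard ECDSA signing equation s = k⁻¹(h + r·d) mod n: if the nonce k is built from halves of the hash h and the key d as described above, the equation becomes a single linear congruence in the two unknown halves of d. The following toy sketch illustrates this under loud assumptions: it uses a 32-bit prime instead of the 256-bit group order, treats r as an arbitrary number instead of deriving it from a curve point, and simply tries all values for one half. It is not the method from the paper, and all names in it are made up for illustration.

```python
N = 4294967291          # toy prime "group order" (2**32 - 5), an assumption
HALF = 16               # half the bit length of keys, hashes and nonces here
MASK = (1 << HALF) - 1


def half_half_nonce(d, h):
    # Upper half of the nonce: lower half of the key.
    # Lower half of the nonce: upper half of the message hash.
    return ((d & MASK) << HALF) | (h >> HALF)


def sign_s(d, h, r):
    # ECDSA: s = k^-1 * (h + r*d) mod N (r is taken as given in this sketch).
    k = half_half_nonce(d, h)
    return pow(k, -1, N) * (h + r * d) % N


def recover_candidates(h, r, s):
    # With k = d_lo*2^16 + h_hi and d = d_hi*2^16 + d_lo, s*k = h + r*d gives
    #   d_lo * (s*2^16 - r) = h - s*h_hi + r*2^16*d_hi   (mod N).
    # Assumes (s*2^16 - r) is invertible mod N, which holds unless it is 0.
    h_hi = h >> HALF
    a_inv = pow((s * (1 << HALF) - r) % N, -1, N)
    candidates = []
    for d_hi in range(1 << HALF):
        d_lo = (h - s * h_hi + r * (1 << HALF) * d_hi) * a_inv % N
        if d_lo <= MASK:  # only small solutions can be a key half
            candidates.append((d_hi << HALF) | d_lo)
    return candidates


secret = 0x1234ABCD     # made-up key
digest = 0xDEADBEEF     # made-up message hash
r = 0x0BADF00D          # made-up r value
s = sign_s(secret, digest, r)
found = recover_candidates(digest, r, s)
print(secret in found, len(found))
```

At this toy size a few spurious candidates may survive alongside the real key; a real attack would check each candidate against the public key.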

This research and sleuthing turned into another paper, “The curious case of the half-half Bitcoin ECDSA nonces”, to be presented at AfricaCrypt 2023. Enjoy!

Giving back to OPLSS

mail@joachim-breitner.de (Joachim Breitner) — Sun, 04 Jun 2023 14:18:33 +0200

Nine years ago, when I was a PhD student, I attended the Oregon Programming Language Summer School in Eugene. I had a great time and learned a lot.

Learning some of the things I learned there, and meeting some of the people I met there, also led to me graduating, which led to me becoming a PostDoc at UPenn, which led to me later joining DFINITY to implement the Motoko programming language and help design and specify the public interface of their “Internet Computer”, including the response certification (video).

So when the ICDevs non-profit offered a development bounty for a Motoko library implementing the merkle trees involved in certification, this sounded like a fun little coding task, so I completed it; likely with less effort than it would have taken someone who first had to get into these topics.

The bounty was quite generous, at US$ 10k, and I was too vain to “just” have it donated to some large charity, as I recently did with a few coding and consulting gigs, and looked for something more personal. So, the ICDevs guys and I agreed to donate the money to this year’s OPLSS, where I heard it can cover the cost of about 8 students, and hopefully helps the PL cause.

(You will not find us listed as sponsors because for some reason, a “donation” instead of “sponsorship” comes with fewer strings attached for the organizers.)

More thoughts on a bootstrappable GHC

mail@joachim-breitner.de (Joachim Breitner) — Wed, 26 Apr 2023 07:41:50 +0200

The bootstrappable builds project tries to find ways of building all our software from source, without relying on binary artifacts. A noble goal, and one that is often thwarted by languages with self-hosting compilers, like GHC: In order to build GHC, you need GHC. A Pull Request against nixpkgs, adding first steps of the bootstrapping pipeline, reminded me of the issue with GHC, about which I have noted down some thoughts before, and I played around a bit more.

The most promising attempt to bootstrap GHC was done by rekado in 2017. He observed that Hugs is maybe the most recently maintained bootstrappable (since written in C) Haskell compiler, but noticed that “it cannot deal with mutually recursive module dependencies, which is a feature that even the earliest versions of GHC rely on. This means that running a variant of GHC inside of Hugs is not going to work without major changes.” He then tries to bootstrap another very old Haskell compiler (nhc) with Hugs, and makes good but incomplete progress.

This made me wonder: What if Hugs supported mutually recursive modules? Would that make a big difference? Anthony Clayden keeps advocating Hugs as a viable Haskell implementation, so maybe if that was the main blocker, then adding support to Hugs for that is probably not too hard (at least in a compile-the-strongly-connected-component-as-one-unit mode) and worthwhile?
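The “compile the strongly connected component as one unit” idea can be sketched as follows: treat the imports as a directed graph, group mutually recursive modules into components, and process the components with dependencies first. This is a minimal sketch using Tarjan’s algorithm; the module names are invented, and nothing here reflects how Hugs or GHC actually organise compilation.

```python
def strongly_connected_components(imports):
    """Group modules into components, emitted with dependencies first.

    imports maps each module name to the list of modules it imports.
    """
    index = {}        # discovery order of each visited module
    lowlink = {}      # smallest index reachable from the module
    stack = []
    on_stack = set()
    components = []
    counter = 0

    def visit(module):
        nonlocal counter
        index[module] = lowlink[module] = counter
        counter += 1
        stack.append(module)
        on_stack.add(module)
        for dep in imports.get(module, ()):
            if dep not in index:
                visit(dep)
                lowlink[module] = min(lowlink[module], lowlink[dep])
            elif dep in on_stack:
                lowlink[module] = min(lowlink[module], index[dep])
        if lowlink[module] == index[module]:
            # module is the root of a component: pop it off the stack
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == module:
                    break
            components.append(sorted(component))

    for module in imports:
        if module not in index:
            visit(module)
    return components


# Invented example: B and C import each other, A imports both, D imports A.
example = {"A": ["B", "C"], "B": ["C"], "C": ["B"], "D": ["A"]}
units = strongly_connected_components(example)
print(units)
```

Each inner list is one unit that would have to be compiled together; because Tarjan’s algorithm finishes a component only after everything it depends on, the list order is already a usable build order. The recursive formulation would need an explicit stack for very deep import chains.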

All of GHC in one file?

That reminded me of a situation I was in before, where I had to combine multiple Haskell modules into one: For my talk “Lock-step simulation is child’s play” I wrote a multi-player game, a simulation environment for it, and a presentation tool around it, all in the CodeWorld programming environment, which supports only a single module. So I hacked together a small tool, hs-all-in-one, that takes multiple Haskell modules and combines them into one, mangling the names to avoid name clashes.

This made me wonder: Can I turn all of GHC into one module, and compile that?

At this point I have probably left the direct path towards bootstrapping, but I kinda got hooked.

Using GHC’s hadrian/ghci tool, I got it to produce the necessary generated files (e.g. from happy grammars) and spit out the list of modules that make up GHC, which I could feed to hs-all-in-one.
It uses haskell-src-exts for parsing, and it was almost able to parse all of that. It has a different opinion about how MultiWayIf should be indented and whether EmptyCase needs {}, and it has issues pretty-printing some promoted values, but otherwise the round-tripping worked fine, and I was able to generate a large file (680,000 loc, 41 MB) that passes GHC’s parser.
It also uses haskell-names to resolve names.

This library is less up-to-date with various Haskell features, so I added support for renaming in some pragmas (ANN, SPECIALIZE), pattern signatures etc.

For my previous use-case I could just combine all the imports, knowing that I would not introduce conflicts. For GHC, this is far from true: Some modules import Data.Map.Strict, others Data.Map.Lazy, and yet others introduce names that clash with stuff imported from the prelude… so I had to change the tool to fully qualify all imported values. This isn’t so bad, I can do that using haskell-names, if I somehow know what all the modules in base, containers, transformers and array export.

The haskell-names library itself comes with a hard-coded database of base exports, but it is incomplete and doesn’t help me with, say, containers.

I then wrote a little parser for the .txt files that haddock produces for the benefit of hoogle, and that are conveniently installed alongside the packages (at least on nix). This would have been great, if these files didn’t simply omit all reexported entities! I added some manual hacks (Data.Map has the same exports as Data.IntMap; Prelude exports all entities as known by haskell-names, but those that are also exported from Data.List, use the symbol from there…)

I played this game of whack-a-mole for a while, solving many of the problems that GHC’s renamer reports, but eventually stopped to write this blog post. I am fairly confident that this could be pulled through, though.

Back to bootstrapping

So what if we could pull this through? We’d have a very large code file that GHC may be able to compile to produce a ghc binary without exhausting my RAM. But that doesn’t help with bootstrapping yet.

If lack of support for recursive modules is all that Hugs is missing, we’d be done indeed. But quite the contrary, it is probably the least of our worries, given that contemporary GHC uses many, many other features not supported by Hugs.

Some of them are syntactic and can easily be rewritten to more normal Haskell in a preprocessing step (e.g. MultiWayIf).

Others are deep and hard (GADTs, Pattern synonyms, Type Families), and prohibit attempting to compile a current version of GHC (even if it’s all one module) with Hugs. So one would certainly have to go back in time and find a version of GHC that is not yet using all these features. For example, the first use of GADTs was introduced by Simon Marlow in 2011, so this suggests going back to at least GHC 7.0.4, maybe earlier.

Still, being able to mangle the source code before passing it to Hugs is probably a useful thing. This poses the question whether Hugs can compile such a tool; in particular, is it capable of compiling haskell-src-exts, which I am not too optimistic about either. Did someone check this already?

So one plan of attack could be

Identify an old version of GHC that
- One can use to bootstrap subsequent versions until today.
- Is old enough to use as few features not supported by hugs as possible.
- Is still new enough so that one can obtain a compatible toolchain.
Wrangle the build system to tell you which files to compile, with which preprocessor flags etc.
Bootstrap all pre-processing tools used by GHC (cpphs or use plain cpp, happy, alex).
For every language feature not supported by Hugs, either
- Implement it in Hugs,
- Manually edit the source code to avoid compiling the problematic code, if it is optional (e.g. in an optimization pass)
- Rewrite the problematic code
- Write a pre-processing tool (like the one above) that compiles the feature away
Similarly, since Hugs probably ships a base that is different from what GHC, or the libraries used by GHC, expect, either adjust Hugs’ base, or modify the GHC code that uses it.

My actual plan, though, for now is to throw these thoughts out, maybe make some noise on Discourse, Mastodon, Twitter and lobste.rs, and then let it sit and hope someone else will pick it up.
