Joachim Breitner's Homepage
Blog
git-multisect
Because it seemed useful, and it seems it did not exist yet, I created a small tool called git-multisect
. You point it to a git repo, give it a version range and a command to run, and it will list those commits that affect the output of the command.
Here is an example (with a slightly complex invocation):
$ git-multisect.py \
-C ~/build/nixpkgs \
-f $(nix flake metadata ~/nixos --json|jq -r ".locks.nodes[.locks.nodes[.locks.root].inputs.nixpkgs].locked.rev") \
-t release-22.11 \
-c 'nix path-info --derivation '\
'~/nixos#nixosConfigurations.richard.config.system.build.toplevel '\
'--no-update-lock-file '\
'--override-input nixpkgs ~/"build/nixpkgs?rev=$REV"' \
--hide-stderr --log-option=--pretty=medium
Found 39 commits
[39 total, 0 relevant, 0 irrelevant, 0 skipped, 39 unknown] inspecing 2fb7d74 ...
[39 total, 0 relevant, 0 irrelevant, 0 skipped, 39 unknown] inspecing 569163e ...
[39 total, 0 relevant, 0 irrelevant, 0 skipped, 39 unknown] inspecing 5642ce8 ...
[39 total, 0 relevant, 0 irrelevant, 0 skipped, 39 unknown] inspecing e0c8cf8 ...
[39 total, 0 relevant, 1 irrelevant, 8 skipped, 30 unknown] inspecing 89d0361 ...
[39 total, 0 relevant, 1 irrelevant, 8 skipped, 30 unknown] inspecing d1c7e63 ...
[39 total, 0 relevant, 2 irrelevant, 9 skipped, 28 unknown] inspecing e6d5772 ...
[39 total, 0 relevant, 3 irrelevant, 9 skipped, 27 unknown] inspecing a099526 ...
[39 total, 1 relevant, 4 irrelevant, 9 skipped, 25 unknown] inspecing 854312a ...
[39 total, 1 relevant, 5 irrelevant, 10 skipped, 23 unknown] inspecing 95043dc ...
[39 total, 1 relevant, 6 irrelevant, 10 skipped, 22 unknown] inspecing 0cf4274 ...
[39 total, 2 relevant, 8 irrelevant, 29 skipped, 0 unknown] done
commit a0995268af8ba0336a81344a3bf6a50d6d6481b2
Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Date: Sat Feb 18 10:45:11 2023 -0800
linux_{5_15,6_1}: revert patch to fix Equinix Metal bonded networking with `ice` driver (#216955)
…
commit 0cf4274b5d06325bd16dbf879a30981bc283e58a
Merge: 95043dc713d 532f3aa6052
Author: Pierre Bourdon <delroth@gmail.com>
Date: Sun Feb 19 23:37:48 2023 +0900
Merge pull request #217121 from NixOS/backport-216463-to-release-22.11
[Backport release-22.11] sudo: 1.9.12p2 -> 1.9.13
As you can see it tries to be clever to not even look at commits that come between two commits where the output is equal (like git-bisect
).
Over at https://github.com/nomeata/git-multisect you’ll find more explanations and examples. Hope you’ll maybe remember this when you find you’ll need it, and that it will prove to be useful.
Pro-charity consulting
A few times last year I was asked to help out with some software project – debugging a particularly gnarly nix issue, helping with GHC internals or reviewing Canister code of applications running on the Internet Computer. Usually, these are relatively small jobs, at least reasonably interesting, and it was often friends that approached me.
So I do want to help.
But I did not want to help for free. Incentives matter, and I wanted to incentivize careful use of my time. And of course I like to see my work valued. So normally, one would just charge an hourly fee.
But having to go through all the necessary legalese (setting up a contract, maybe an NDA, thinking about taxes and liabilities, possibly getting permission from the current employer) tilts the scale and would not made it worth it for me anymore.
But instead of just turning it down, a few times now already I used the following model: pro-charity consulting, where you still pay for what you get, but you don’t pay me, but instead a charity.
The rules of pro-charity consulting
The rough rules are as follows:
- There is no contract. I trust you and you trust me, else we wouldn’t even be doing this.
- There is no liability. You probably have a rough idea of what I can deliver and you trust me, so you don’t get to sue me if I screw up.
- There is (usually) no NDA. Of course I’ll keep your secrets, but again you don’t get to sue me if I screw up.
- I tell you about my usual hourly rate if this was a commercial agreement. (This may depend a lot on the field I’m helping you in.)
- Afterwards I’ll also let you know how much time I have spend on helping you.
- You’ll make a donation on a charity of our choosing. We’ll pick something that we both find worth supporting.
- You can donate whatever the work was worth to you, but the expectation is that it shouldn’t be cheaper for you than if you’d have hired me commercially. (Unless I failed to fulfill your expectations.)
- Legally, the donation is by you (e.g. you get to keep any tax benefits).
- Morally, it is by me. For example if donations can come with a message (like on https://donate.openstreetmap.org/) it would be appreciated to mention me.
- According to the German saying “Tue Gutes und rede darüber!” we both get to brag about this donation (target charity, amount, client, kind of work), e.g. on Blogs, Mastodon, Twitter, unless we have a different agreement. See below for some examples.
I’m writing the rules out here so that I can just point to this blog post when I next have to explain this scheme.
Pro-charity gigs so far
- April 2022: 3000€ to the Haskell Foundation for helping the Active Group with a GHC Plugin issue (my tweet, their tweet)
- August 2022: 50€ to the NixOS Foundation for helping FinDeck with a Nix issue (my tweet)
- September 2022: 1400€ to Reporters without Borders for helping StalkedICP by reviewing their Motoko code (my tweet, their tweet)
- September 2022: 1100€ to Amnesty International for helping btcflower by reviewing their Motoko code (their tweet)
- January 2023: 1050$ to Mediciens sans Borders for helping finterest.ooo by reviewing their Motoko code.
- March 2023: 1046$ to UNICEF for helping CycleOps by reviewing their Motoko code.
Serverless WebRTC instead of websocket
For over two decades I have maintained a personal dedicated server, and hosted a fair number of websites, such as my homepage, various projects, sites of friends and family, and during its heyday even servers for DNS, email, mailing lists, Jabber, cloud storage…
But it turns out just because interests shift a bit, such running services don’t simply disappear, but keep running, just less well maintained, and then become a burdensome responsibility. So over the last year I have been moving them, one by one, off my server and elsewhere™.
Most websites are static and can easily be hosted on something convenient like Github Pages or netlify. Some are “mostly static” and require a bit more work (I might blog about that some other time), but still doable. But I had one website that I thought will be hard to host elsewhere:

On https://sumserum.nomeata.de/ you can play a small abstract two-player game. It’s implementing the mechanics of “Sim Serim” by Heinrich Glumpler, and I built it out of a whim 9 years ago, after gifting it away and then wanting to play it. I later did get my own copy of the game, when the author discovered the online version and sent me the real thing.
You can play Sum Serum locally, then it is simply a static webpage with a bit of JavaScript code. But you can also play it with a friend remotely. It connects via WebSockets to a small NodeJS service, which makes the connection and relays messages. It works nicely, but “running a WebSocket service” is usually not part of these simple static-website-from-git services, so this was getting into the way of (eventually) decomissioning my server.
But web technology has advanced, and WebRTC, which allows browsers to talk to each other, has become widely available. So indeed, I found the trystero JS libray which provides a very simple API to connect with someone else over the internet, and using instead of my little WebSocket service was a relatively small change. Now the Sum Serum code is a purely static webpage that I can very easily host on netlify! Neat!
Granted, there are still servers involved to make the connection, as the library is using the BitTorrent network to find the peers. But as far as I am concered, that is nothing I need to worry about. Furthermore, trystero is quite easy to use, so maybe I can create more simply fun multi-use interactive thingies, now that I know I don’t have to worry about hosting a backend.
rec-def: Minesweeper case study
I’m on the train back from MuniHac, where I gave a talk about the rec-def
library that I have excessively blogged about recently (here, here, here and here). I got quite flattering comments about that talk, so if you want to see if they were sincere, I suggest you watch the recording of “Getting recursive definitions off their bottoms” (but it’s not necessary for the following).
After the talk, Franz Thoma approached me and told me a story of how we was once implementing the game Minesweeper in Haskell, and in particular the part of the logic where, after the user has uncovered a field, the game would automatically uncover all fields that are next to a “neutral” field, i.e. one with zero adjacent bombs. He was using a comonadic data structure, which makes a “context-dependent parallel computation” such as uncovering one field quite natural, and was hoping that using a suitable fix-point operator, he can elegantly obtain not just the next step, but directly the result of recursively uncovering all these fields. But, much to his disappointment, that did not work out: Due to the recursion inherent in that definition, a knot-tying fixed-point operator will lead to a cyclic definition.

He was wondering if the rec-def
library could have helped him, and we sat down to find out, and this is the tale of this blog post. I will avoid the comonadic abstractions and program it more naively, though, to not lose too many readers along the way. Maybe read Chris Penner’s blog post and Finch’s functional pearl “Getting a Quick Fix on Comonads” if you are curious about that angle.
Minesweeper setup
Let’s start with defining a suitable data type for the grid of the minesweeper board. I’ll use the Array
data type, it’s Ix
-based indexing is quite useful for grids:
import Data.Array
type C = (Int,Int)
type Grid a = Array C a
The library lacks a function to generate an array from a generating function, but it is easy to add:
genArray :: Ix i => (i,i) -> (i -> e) -> Array i e
= listArray r $ map f $ range r genArray r f
Let’s also fix the size of the board, as a pair of lower and upper bounds (this is the format that the Ix
type class needs):
size :: (C,C)
= ((0,0), (3,3)) size
Now board is simply a grid of boolean values, with True
indicating that a bomb is there:
type Board = Grid Bool
board1 :: Board
= listArray size
board1 False, False, False, False
[ True, False, False, False
, True, False, False, False
, False, False, False, False
, ]
It would be nice to be able to see these board in a nicer way. So let us write A function that prints a grid, including a frame, given a function that prints something for each coordinate. Together with a function that prints a bomb (as *
), we can print the board:
pGrid :: (C -> String) -> String
= unlines
pGrid p concat [ p' (y,x) | x <- [lx-1 .. ux+1] ]
[ | y <- [ly-1 .. uy+1] ]
where
= size
((lx,ly),(ux,uy))
| inRange size c = p c
p' c = "#"
p' _
pBombs :: Board -> String
= pGrid $ \c -> if b ! c then "*" else " " pBombs b
The expression b ! c
looks up a the coordinate in the array, and is True
when there is a bomb at that coordinate.
So here is our board, with two bombs:
ghci> putStrLn $ pBombs board1
######
# #
#* #
#* #
# #
######
But that’s not what we want to show to the user: Every field should have have a number that indicates the number of bombs in the surrounding fields. To that end, we first define a function that takes a coordinate, and returns all adjacent coordinates. This also takes care of the border, using inRange
:
neighbors :: C -> [C]
=
neighbors (x,y)
[ c| (dx, dy) <- range ((-1,-1), (1,1))
/= (0,0)
, (dx, dy) let c = (x + dx, y + dy)
, inRange size c
, ]
With that, we can calculate what to display in each cell – a bomb, or a number:
data H = Bomb | Hint Int deriving Eq
hint :: Board -> C -> H
| b ! c = Bomb
hint b c = Hint $ sum [ 1 | c' <- neighbors c, b ! c' ] hint b c
With a suitable printing function, we can now see the full board:
hint :: Board -> C -> H
| b ! c = Bomb
hint b c = Hint $ sum [ 1 | c' <- neighbors c, b ! c' ]
hint b c
pCell :: Board -> C -> String
= case hint b c of
pCell b c Bomb -> "*"
Hint 0 -> " "
Hint n -> show n
pBoard :: Board -> String
= pGrid (pCell b) pBoard b
And here it is:
ghci> putStrLn $ pBoard board1
######
#11 #
#*2 #
#*2 #
#11 #
######
Next we have to add masks: We need to keep track of which fields the user already sees. We again use a grid of booleans, and define a function to print a board with the masked fields hidden behind ?
:
type Mask = Grid Bool
mask1 :: Mask
= listArray size
mask1 True, True, True, False
[ False, False, False, False
, False, False, False, False
, False, False, False, False
,
]
pMasked :: Board -> Mask -> String
= pGrid $ \c -> if m ! c then pCell b c else "?" pMasked b m
So this is what the user would see
ghci> putStrLn $ pMasked board1 mask1
######
#11 ?#
#????#
#????#
#????#
######
Uncovering some fields
With that setup in place, we now implement the piece of logic we care about: Uncovering all fields that are next to a neutral field. Here is the first attempt:
solve0 :: Board -> Mask -> Mask
= m1
solve0 b m0 where
m1 :: Mask
= genArray size $ \c ->
m1 ! c || or [ m0 ! c' | c' <- neighbors c, hint b c' == Hint 0 ] m0
The idea is that we calculate the new mask m1
from the old one m0
by the following logic: A field is visible if it was visible before (m0 ! c
), or if any of its neighboring, neutral fields are visible.
This works so far: I uncovered the three fields next to the one neutral visible field:
ghci> putStrLn $ pMasked board1 $ solve0 board1 mask1
######
#11 #
#?2 #
#????#
#????#
######
But that’s not quite what we want: We want to keep doing that to uncover all fields.
Uncovering all fields
So what happens if we change the logic to: A field is visible if it was visible before (m0 ! c
), or if any of its neighboring, neutral fields will be visible.
In the code, this is just a single character change: Instead of looking at m0
to see if a neighbor is visible, we look at m1
:
solve1 :: Board -> Mask -> Mask
= m1
solve1 b m0 where
m1 :: Mask
= genArray size $ \c ->
m1 ! c || or [ m1 ! c' | c' <- neighbors c, hint b c' == Hint 0 ] m0
(This is roughly what happened when Franz started to use the kfix
comonadic fixed-point operator in his code, I believe.)
Does it work? It seems so:
ghci> putStrLn $ pMasked board1 $ solve1 board1 mask1
######
#11 #
#?2 #
#?2 #
#?1 #
######
Amazing, isn’t it!
Unfortunately, it seems to work by accident. If I start with a different mask:
mask2 :: Mask
= listArray size
mask2 True, True, False, False
[ False, False, False, False
, False, False, False, False
, False, False, False, True
, ]
which looks as follows:
ghci> putStrLn $ pMasked board1 mask2
######
#11??#
#????#
#????#
#??? #
######
Then our solve1
function does not work, and just sits there:
ghci> putStrLn $ pMasked board1 $ solve1 board1 mask2
######
#11^CInterrupted.
Why did it work before, but now now?
It fails to work because as the code tries to figure out if a field, it needs to know if the next field will be uncovered. But to figure that out, it needs to know if the present field will be uncovered. With the normal boolean connectives (||
and or
), this does not make progress.
It worked with mask1
more or less by accident: None of the fields on in the first column don’t have neutral neighbors, so nothing happens there. And for all the fields in the third and forth column, the code will know for sure that they will be uncovered based on their upper neighbors, which come first in the neighbors
list, and due to the short-circuting properties of ||
, it doesn’t have to look at the later cells, and the vicious cycle is avoided.
rec-def to the rescue
This is where rec-def
comes in: By using the RBool
type in m1
instead of plain Bool
, the recursive self-reference is not a problem, and it simply works:
import qualified Data.Recursive.Bool as RB
solve2 :: Board -> Mask -> Mask
= fmap RB.get m1
solve2 b m0 where
m1 :: Grid RB.RBool
= genArray size $ \c ->
m1 ! c) RB.|| RB.or [ m1 ! c' | c' <- neighbors c, hint b c' == Hint 0 ] RB.mk (m0
Note that I did not change the algorithm, or the self-reference through m1
; I just replaced Bool
with RBool
, ||
with RB.||
and or
with RB.or
. And used RB.get
at the end to get a normal boolean out. And 🥁, here we go:
ghci> putStrLn $ pMasked board1 $ solve2 board1 mask2
######
#11 #
#?2 #
#?2 #
#?1 #
######
That’s the end of this repetition of “let’s look at a tying-the-knot-problem and see how rec-def
helps”, which always end up a bit anti-climatic because it “just works”, at least in these cases. Hope you enjoyed it nevertheless.
rec-def: Dominators case study
More ICFP-inspired experiments using the rec-def
library: In Norman Ramsey’s very nice talk about his Functional Pearl “Beyond Relooper: Recursive Translation of Unstructured Control Flow to Structured Control Flow”, he had the following slide showing the equation for the dominators of a node in a graph:

He said “it’s ICFP and I wanted to say the dominance relation has a beautiful set of equations … you can read all these algorithms how to compute this, but the concept is simple”.
This made me wonder: If the concept is simple and this formula is beautiful – shouldn’t this be sufficient for the Haskell programmer to obtain the dominator relation, without reading all those algorithms?
Before we start, we have to clarify the formula a bit: If a node is an entry node (no predecessors) then the big intersection is over the empty set, and that is not a well-defined concept. For these nodes, we need that big intersection to return the empty set, as entry nodes are not dominated by any other node. (Let’s assume that the entry nodes are exactly those with no predecessors.)
Let’s try, first using plain Haskell data structures. We begin by implementing this big intersection operator on Data.Set
, and also a function to find the predecessors of a node in a graph:
import qualified Data.Set as S
import qualified Data.Map as M
intersections :: [S.Set Int] -> S.Set Int
= S.empty
intersections [] = foldl1 S.intersection xs
intersections xs
preds :: [(Int,Int)] -> M.Map Int [Int]
= M.fromListWith (<>) $
preds edges | (v1, _) <- edges ] ++ -- to make the map total
[ (v1, []) | (v1, v2) <- edges ] [ (v2, [v1])
Now we can write down the formula that Norman gave, quite elegantly:
domintors1 :: [(Int,Int)] -> M.Map Int [Int]
= fmap S.toList doms
domintors1 edges where
doms :: M.Map Int (S.Set Int)
= M.mapWithKey
doms -> S.insert v (intersections [ doms M.! v' | v' <- vs]))
(\v vs (preds edges)
Does this work? It seems it does:
> domintors1 []
ghci
fromList []> domintors1 [(1,2)]
ghci1,[1]),(2,[1,2])]
fromList [(> domintors1 [(1,2),(1,3),(2,4),(3,4)]
ghci1,[1]),(2,[1,2]),(3,[1,3]),(4,[1,4])] fromList [(
But – not surprising if you have read my previous blog posts – it falls over once we have recursion:
> domintors1 [(1,2),(1,3),(2,4),(3,4),(4,3)]
ghci1,[1]),(2,[1,2]),(3,^CInterrupted. fromList [(
So let us reimplement it with Data.Recursive.Set
.
import qualified Data.Recursive.Set as RS
intersections :: [RS.RSet Int] -> RS.RSet Int
= RS.empty
intersections [] = foldl1 RS.intersection xs
intersections xs
domintors2 :: [(Int,Int)] -> M.Map Int [Int]
= fmap (S.toList . RS.get) doms
domintors2 edges where
doms :: M.Map Int (RS.RSet Int)
= M.mapWithKey
doms -> RS.insert v (intersections [ doms M.! v' | v' <- vs]))
(\v vs (preds edges)
The hope is that we can simply replace the operations, and that now it can suddenly handle cyclic graphs as well. Let’s see:
> domintors2 [(1,2),(1,3),(2,4),(3,4),(4,3)]
ghci1,[1]),(2,[1,2]),(3,[3]),(4,[4])] fromList [(
It does! Well, it does return a result… but it looks strange. Clearly node 3 and 4 are also dominated by 1, but the result does not reflect that.
But the result is a solution to Norman’s equation. Was the equation wrong? No, but we failed to notice that the desired solution is the largest, not the smallest. And Data.Recursive.Set
calculates, as documented, the least fixed point.
What now? Until the library has code for RDualSet a
, we can work around this by using the dual formula to calculate the non-dominators. To do this, we
- use union instead of intersection
- delete instead of insert,
S.empty
, use the set of all nodes (which requires some extra plumbing)- subtract the result from the set of all nodes to get the dominators
and thus the code turns into:
unions' :: S.Set Int -> [RS.RSet Int] -> RS.RSet Int
= mkR univ
unions' univ [] = foldl1 RS.union xs
unions' _ xs
domintors3 :: [(Int,Int)] -> M.Map Int [Int]
= fmap (S.toList . S.difference nodes . RS.get) nonDoms
domintors3 edges where
= S.fromList [v | (v1,v2) <- edges, v <- [v1,v2]]
nodes nonDoms :: M.Map Int (RS.RSet Int)
= M.mapWithKey
nonDoms -> RS.delete v (unions' nodes [ nonDoms M.! v' | v' <- vs]))
(\v vs (preds edges)
And with this, now we do get the correct result:
ghci> domintors3 [(1,2),(1,3),(2,4),(3,4),(4,3)]
fromList [(1,[1]),(2,[1,2]),(3,[1,3]),(4,[1,4])]
We worked a little bit on how to express the “beautiful formula” to Haskell, but at no point did we have to think about how to solve it. To me, this is the essence of declarative programming.
rec-def: Program analysis case study
At this week’s International Conference on Functional Programming I showed my rec-def
Haskell library to a few people. As this crowd appreciates writing compilers, and example from the realm of program analysis is quite compelling.
To Throw or not to throw
Here is our little toy language to analyze: It has variables, lambdas and applications, non-recursive (lazy) let bindings and, so that we have something to analyze, a way to throw and to catch exceptions:
type Var = String
data Exp
= Var Var
| Throw
| Catch Exp
| Lam Var Exp
| App Exp Exp
| Let Var Exp Exp
Given such an expression, we would like to know whether it might throw an exception. Such an analysis is easy to write: We traverse the syntax tree, remembering in the env
which of the variables may throw an exception:
canThrow1 :: Exp -> Bool
= go M.empty
canThrow1 where
go :: M.Map Var Bool -> Exp -> Bool
Var v) = env M.! v
go env (Throw = True
go env Catch e) = False
go env (Lam v e) = go (M.insert v False env) e
go env (App e1 e2) = go env e1 || go env e2
go env (Let v e1 e2) = go env' e2
go env (where
= M.singleton v (go env e1)
env_bind = M.union env_bind env env'
The most interesting case is the one for Let
, where we extend the environment env
with the information about the additional variable env_bind
, which is calculated from analyzing the right-hand side e1
.
So far so good:
ghci> someVal = Lam "y" (Var "y")
ghci> canThrow1 $ Throw
True
ghci> canThrow1 $ Let "x" Throw someVal
False
ghci> canThrow1 $ Let "x" Throw (App (Var "x") someVal)
True
Let it rec
To spice things up, let us add a recursive let to the language:
data Exp
…| LetRec [(Var, Exp)] Exp
How can we support this new constructor in canThrow1
? Let use naively follow the pattern used for Let
: Calculate the analysis information for the variables in env_bind
, extend the environment with that, and pass it down:
LetRec binds e) = go env' e
go env (where
= M.fromList [ (v, go env' e) | (v,e) <- binds ]
env_bind = M.union env_bind env env'
Note that, crucially, we use env'
, and not just env
, when analyzing the right-hand sides. It has to be that way, as all the variables are in scope in all the right-hand sides.
In a strict language, such a mutually recursive definition, where env_bind
uses env'
which uses env_bind
is basically unthinkable. But in a lazy language like Haskell, it might just work.
Unfortunately, it works only as long as the recursive bindings are not actually recursive, or if they are recursive, they are not used:
ghci> canThrow1 $ LetRec [("x", Throw)] (Var "x")
True
ghci> canThrow1 $ LetRec [("x", App (Var "y") someVal), ("y", Throw)] (Var "x")
True
ghci> canThrow1 $ LetRec [("x", App (Var "x") someVal), ("y", Throw)] (Var "y")
True
But with genuine recursion, it does not work, and simply goes into a recursive cycle:
ghci> canThrow1 $ LetRec [("x", App (Var "x") someVal), ("y", Throw)] (Var "x")
^CInterrupted.
That is disappointing! Do we really have to toss that code and somehow do an explicit fixed-point calculation here? Obscuring our nice declarative code? And possibly having to repeat work (such as traversing the syntax tree) many times that we should only have to do once?
rec-def to the rescue
Not with rec-def! Using RBool
from Data.Recursive.Bool
instead of Bool
, we can write the exact same code, as follows:
import qualified Data.Recursive.Bool as RB
canThrow2 :: Exp -> Bool
= RB.get . go M.empty
canThrow2 where
go :: M.Map Var RBool -> Exp -> R Bool
Var v) = env M.! v
go env (Throw = RB.true
go env Catch e) = RB.false
go env (Lam v e) = go (M.insert v RB.false env) e
go env (App e1 e2) = go env e1 RB.|| go env e2
go env (Let v e1 e2) = go env' e2
go env (where
= M.singleton v (go env e1)
env_bind = M.union env_bind env
env' LetRec binds e) = go env' e
go env (where
= M.fromList [ (v, go env' e) | (v,e) <- binds ]
env_bind = M.union env_bind env env'
And it works!
ghci> canThrow2 $ LetRec [("x", App (Var "x") someVal), ("y", Throw)] (Var "x")
False
ghci> canThrow2 $ LetRec [("x", App (Var "x") Throw), ("y", Throw)] (Var "x")
True
I find this much more pleasing than the explicit naive fix-pointing you might do otherwise, where you stabilize the result at each LetRec
independently: Not only is all that extra work hidden from the programmer, but now also a single traversal of the syntax tree creates, thanks to the laziness, a graph of RBool
values, which are then solved “under the hood”.
The issue with x=x
There is one downside worth mentioning: canThrow2
fails to produce a result in case we hit x=x
:
ghci> canThrow2 $ LetRec [("x", Var "x")] (Var "x")
^CInterrupted.
This is, after all the syntax tree has been processed and all the map lookups have been resolved, equivalent to
ghci> let x = x in RB.get (x :: RBool)
^CInterrupted.
which also does not work. The rec-def
machinery can only kick in if at least one of its function is used on any such cycle, even if it is just a form of identity (which I ~ought to add to the library~ since have added to the library):
ghci> idR x = RB.false ||| x
ghci> let x = idR x in getR (x :: R Bool)
False
And indeed, if I insert a call to idR
in the line
= M.fromList [ (v, idR (go env' e)) | (v,e) <- binds ] env_bind
then our analyzer will no longer stumble over these nasty recursive equations:
ghci> canThrow2 $ LetRec [("x", Var "x")] (Var "x")
False
It is a bit disappointing to have to do that, but I do not see a better way yet. I guess the def-rec
library expects the programmer to have a similar level of sophistication as other tie-the-know tricks with laziness (where you also have to ensure that your definitions are productive and that the sharing is not accidentally lost).
rec-def: Behind the scenes
A week ago I wrote about the rec-def
Haskell library, which allows you to write more recursive definitions, such as in this small example:
let s1 = RS.insert 23 s2
s2 = RS.insert 42 s1
in RS.get s1
This will not loop (as it would if you’d just used Data.Set
), but rather correctly return the set S.fromList [23,42]
. See the previous blog post for more examples and discussion of the user-facing side of this.
For quick reference, these are the types of the functions involved here:
insert :: Ord a => a -> RSet a -> RSet a
get :: RSet a -> Set a
The type of s1
and s2
above is not Set Int
, but rather RSet Int
, and in this post I’ll explain how RSet
works internally.
Propagators, in general
The conceptual model behind an recursive equation like above is
- There are a multiple cells that can hold values of an underlying type (here
Set
) - These cells have relations that explain how the values in the cells should relate to each other
- After registering all the relations, some form of solving happens.
- If the solving succeeds, we can read off the values from the cells, and they should satisfy the registered relation.
This is sometimes called a propagator network, and is a quite general model that can support different kind of relations (e.g. equalities, inequalities, functions), there can be various solving strategies (iterative fixed-points, algebraic solution, unification, etc.) and information can flow on along the edges (and hyper-edges) possibly in multiple directions.
For our purposes, we only care about propagator networks where all relations are functional, so they have a single output cell that is declared to be a function of multiple (possibly zero) input cells, without affecting these input cells. Furthermore, every cell is the output of exactly one such relation.
IO-infested propagator interfaces
This suggests that an implementation of such a propagator network could provide an interface with the following three operations:
- Functions to declare cells
- Functions to declare relations
- Functions to read values off cells
This is clearly an imperative interface, so we’ll see monads, and we’ll simply use IO
. So concretely for our small example above, we might expect
data Cell a
newCell :: IO (Cell a)
-- declareInsert x c1 c2 declares that the value of c1 ought to be S.insert x c2
declareInsert :: Ord a => a -> Cell a -> Cell a -> IO ()
getCell :: Cell a -> IO (Set a)
There is no need for an explicit “solve” function: solving can happen when declareInsert
or getCell
is called – as a User I do not care about that.
You might be curious about the implementation of newCell
, declareInsert
and getCell
, but I have to disappoint you: This is not the topic of this article. Instead, I want to discuss how to turn this IO-infested interface into the pure interface seen above?
Pure, but too strict
Obviously, we have to get rid of the IO
somehow, and have to use unsafePerformIO :: IO a -> a
somehow. This dangerous function creates a pure-looking value that, when used the first time, will run the IO-action and turn into that action’s result.
So maybe we can simply write the following:
type RSet a = Cell a
= unsafePerformIO $ do
insert x c2 <- newCell
c1
declareInsert x c1 c2return c1
= unsafePerformIO $
get c getCell c
Indeed, the types line up, but if we try to use that code, nothing will happen. Our insert
is too strict to be used recursively: It requires the value of c2
(as it is passed to declareInsert
, which we assume to be strict in its arguments) before it can return c1
, so the recursive example at the top of this post will not make any progress.
Pure, lazy, but forgetful
To work around this, maybe it suffices if we do not run declareInsert
right away, but just remember that we have to do it eventually? So let’s introduce a new data type for RSet a
that contains not just the cell (Cell a
), but also an action that we still have to run before getting a value:
data RSet a = RSet (Cell a) (IO ())
= unsafePerformIO $ do
insert x r2 <- newCell
c1 let todo = do
let (RSet c2 _) = r2
declareInsert x c1 c2return (RSet c1 todo)
RSet c todo) = unsafePerformIO $ do
get (
todo getCell c
This is better: insert
is now lazy in its arguments (for this it is crucial to pattern-match on RSet
only inside the todo
code, not in the pattern of insert
!) This means that our recursive code above does not get stuck right away.
Pure, lazy, but runs in circles
But it is still pretty bad: Note that we do not run get s2
in the example above, so that cell’s todo
, which would declareInsert 42
, will never run. This cannot work! We have to (eventually) run the declaration code from all involved cells before we can use getCell
!
We can try to run the todo action of all the dependencies as part of a cell’s todo action:
data RSet a = RSet (Cell a) (IO ())
= unsafePerformIO $ do
insert x r2 <- newCell
c1 let todo = do
let (RSet c2 todo2) = r2
declareInsert x c1 c2
todo2return (RSet c1 todo)
RSet c todo) = unsafePerformIO $ do
get (
todo getCell c
Now we certainly won’t forget to run the second cell’s todo action, so that is good. But that cell’s todo action will run the first cell’s todo action, and that again the second cell’s, and so on.
Pure, lazy, terminating, but not thread safe
This is silly: We only need (and should!) run that code once! So let’s keep track of whether we ran it already:
data RSet a = RSet (Cell a) (IO ()) (IORef Bool)
= unsafePerformIO $ do
insert x r2 <- newCell
c1 <- newIORef False
done let todo = do
<- readIORef done
is_done $ do
unless is_done True
writeIORef done let (RSet c2 todo2 _) = r2
declareInsert x c1 c2
todo2return (RSet c1 todo done)
RSet c todo _) = unsafePerformIO $ do
get (
todo getCell c
Ah, much better: It works! Our call to get c1
will trigger the first cell’s todo action, which will mark it as done before calling the second cell’s todo action. When that now invokes the first cell’s todo action, it is already marked done and we break the cycle, and by the time we reach getCell
, all relations have been correctly registered.
In a single-threaded world, this would be all good and fine, but we have to worry about multiple threads running get
concurrently, on the same or on different cells.
In fact, because we use unsafePerformIO
, we have to worry about this even when the program is not using threads.
And the above code has problems. Imagine a second call to get c1
while the first one has already marked it as done, but has not finished processing all the dependencies yet: It will call getCell
before all relations are registered, which is bad.
Recursive do-once IO actions
Making this thread-safe seems to be possible, but would clutter both the code and this blog post. So let’s hide that problem behind a nice and clean interface. Maybe there will be a separate blog post about its implementation (let me know if you are curious), or you can inspect the code in System.IO.RecThunk
module yourself). The interface is simply
data Thunk
thunk :: IO [Thunk] -> IO Thunk
force :: Thunk -> IO ()
and the idea is that thunk act
will defer the action act
until the thunk is passed to force
for the first time, and force
will not return until the action has been performed (possibly waiting if another thread is doing that at the moment), and also until the actions of all the thunks returned by act
have performed, recursively, without running into cycles.
We can use this in our definition of RSet
and get to the final, working solution:
data RSet a = RSet (Cell a) Thunk
= unsafePerformIO $ do
insert x r2 <- newCell
c1 <- thunk $ do
t1 let (RSet c2 t2) = r2
declareInsert x c1 c2return [t2]
return (RSet c1 t1)
RSet c t) = unsafePerformIO $ do
get (
force t getCell c
This snippet captures the essential ideas behind rec-def
:
- Use laziness to allow recursive definition to describe the propagator graph naturally
- Use a form of “explicit thunk” to register the propagator graph relations at the right time (not too early/strict, not too late)
And that’s all?
The actual implementation in rec-def
has a few more moving parts.
In particular, it tries to support different value types (not just sets), possibly with different implementations, and even mixing them (e.g. in member :: Ord a => a -> RSet a -> RBool
), so the generic code is in Data.Propagator.Purify
, and supports various propagators underneath. The type RSet
is then just a newtype around that, defined in Data.Recursive.Internal
to maintain the safety of the abstraction,
I went back and forth on a few variants of the design here, including one where there was a generic R
type constructor (R (Set a)
, R Bool
etc.), but then monomorphic interface seems simpler.
Does it really work?
The big remaining question is certainly: Is this really safe and pure? Does it still behave like Haskell?
The answer to these questions certainly depends on the underlying propagator implementation. But it also depends on what we actually mean by “safe and pure”? For example, do we expect the Static Argument Transformation be semantics preserving? Or is it allowed to turn undefined values into defined ones (as it does here)?
I am unsure myself yet, so I’ll defer this discussion to a separate blog post, after I hopefully had good discussions about this here at ICFP 2022 in Ljubljana. If you are around and want to discuss, please hit me up!
More recursive definitions
Haskell is a pure and lazy programming language, and the laziness allows us to write some algorithms very elegantly, by recursively referring to already calculated values. A typical example is the following definition of the Fibonacci numbers, as an infinite stream:
= 0 : 1 : zipWith (+) fibs (tail fibs) fibs
Elegant graph traversals
A maybe more practical example is the following calculation of the transitive closure of a graph:
import qualified Data.Set as S
import qualified Data.Map as M
type Graph = M.Map Int [Int]
transitive1 :: Graph -> Graph
= M.map S.toList sets
transitive1 g where
sets :: M.Map Int (S.Set Int)
= M.mapWithKey (\v vs -> S.insert v (S.unions [ sets M.! v' | v' <- vs ])) g sets
We represent graphs as maps from vertex to their successors vertex, and define the resulting map sets
recursively: The set of reachable vertices from a vertex v
is v
itself, plus those reachable by its successors vs
, for which we query sets
.
And, despite this apparent self-referential recursion, it works!
> transitive1 $ M.fromList [(1,[3]),(2,[1,3]),(3,[])]
ghci1,[1,3]),(2,[1,2,3]),(3,[3])] fromList [(
Cyclic graphs ruin it all
These tricks can be very impressive … until someone tries to use it on a cyclic graph and the program just hangs until we abort it:
> transitive1 $ M.fromList [(1,[2,3]),(2,[1,3]),(3,[])]
ghci1,fromList ^CInterrupted. fromList [(
At this point we are thrown back to implement a more pedestrian graph traversal, typically keeping explicit track of vertices that we have seen already:
transitive2 :: Graph -> Graph
= M.fromList [ (v, S.toList (go S.empty [v])) | v <- M.keys g ]
transitive2 g where
go :: S.Set Int -> [Int] -> S.Set Int
= seen
go seen [] :vs) | v `S.member` seen = go seen vs
go seen (v:vs) = go (S.insert v seen) (g M.! v ++ vs) go seen (v
I have written that seen
/todo
recursion idiom so often in the past, I can almost write it blindly And indeed, this code handles cyclic graphs just fine:
ghci> transitive2 $ M.fromList [(1,[2,3]),(2,[1,3]),(3,[])]
fromList [(1,[1,2,3]),(2,[1,2,3]),(3,[3])]
But this is a bit anticlimactic – Haskell is supposed to be a declarative language, and transitive1
declares my intent just fine!
We can have it all
It seems there actually is a way to write essentially the code in transitive1
, and still get the right result in all cases, and I have just published a possible implementation as rec-def
. In the module Data.Recursive.Set
we find an API that resembles that of Set
, with a type RSet a
, and in addition to conversion functions from and to sets, we find the two operations that we needed in transitive1
:
insert :: Ord a => a -> RSet a -> RSet a
unions :: Ord a => [RSet a] -> RSet a
get :: RSet a -> Set a
Let’s try that:
import qualified Data.Recursive.Set as RS
transitive2 :: Graph -> Graph
= M.map (S.toList . RS.get) sets
transitive2 g where
sets :: M.Map Int (RS.RSet Int)
= M.mapWithKey (\v vs -> RS.insert v (RS.unions [ sets M.! v' | v' <- vs ])) g sets
And indeed it works! Magic!
ghci> transitive2 $ M.fromList [(1,[3]),(2,[1,3]),(3,[])]
fromList [(1,[1,3]),(2,[1,2,3]),(3,[3])]
ghci> transitive2 $ M.fromList [(1,[2,3]),(2,[1,3]),(3,[])]
fromList [(1,[1,2,3]),(2,[1,2,3]),(3,[3])]
To show off some more, here are small examples:
ghci> let s = RS.insert 42 s in RS.get s
fromList [42]
ghci> :{
let s1 = RS.insert 23 s2
s2 = RS.insert 42 s1
in RS.get s1
:}
fromList [23,42]
How is that possible? Is it still Haskell?
The internal workings of the RSet a
type will be the topic of a future blog post; let me just briefly mention that it uses unsafe features under the hood, and just keeps applying the equations you gave until a fixed-point is reached. Because it starts with the empty set and all operations provided by Data.Recursive.Set
are monotonous (e.g. no difference
) it will eventually find the least fixed point.
Despite the unsafe machinery under the hood, I claim that Data.Recursive.Set
is itself nicely safe, and does not destroy Haskell’s nice properties like purity, referential transparency and equational reasoning. If you disagree, I’d like to hear about it (here, on Twitter, Reddit or Discourse)! There is a brief discussion at the end of the tutorial in Data.Recursive.Example
.
More than sets
The library also provides Data.Recursive.Bool
for recursive equations with booleans (preferring False
) and Data.Recursive.DualBool
(preferring True
), and some operations like member :: Ord a => a -> RSet a -> RBool
can actually connect different types. I plan to add other data types (natural numbers, maps, Maybe
, with suitable orders) as demand arises and as I come across nice small example use-cases for the documentation (e.g. finding shortest paths in a graph).
I believe this idiom is practically useful in a wide range of applications (which of course all have some underlying graph structure – but then almost everything in Computer Science is a graph). My original motivation was a program analysis. Imagine you want to find out from where in your program you can run into a division by zero. As long as your program does not have recursion, you can simply keep track of a boolean flag while you traverse the program, keeping track a mapping from function names to whether they can divide by zero – all nice and elegant. But once you allow mutually recursive functions, things become tricky. Unless you use RBool
! Simply use laziness, pass the analysis result down when analyzing the function’s right-hand sides, and it just works!
The Via Alpina red trail through Slovenia
This July my girlfriend and I hiked the Slovenian part of the Red Trail of the Via Alpina, from the edge of the Julian Alps to Trieste, and I’d like to share some observations and tips that we might have found useful before our trip.

Getting there
As we traveled with complete camping gear and wanted to stay in our tent, we avoided the high alpine parts of the trail and started just where the trail came down from the Alps and entered the Karst. A great way to get there is to take the night train from Zurich or Munich towards Ljubljana, get off at Jesenice, have breakfast, take the local train to Podbrdo and you can start your tour at 9:15am. From there you can reach the trail at Pedrovo Brdo within 1½h.
Finding the way
We did not use any paper maps, and instead relied on the OpenStreetMap data, which is very good, as well as the official(?) GPX tracks on Komoot, which are linked from the official route descriptions. We used OsmAnd.
In general, trails are generally very well marked (red circle with white center, and frequent signs), but the signs rarely tell you which way the Via Alpina goes, so the GPS was needed.
Sometimes the OpenStreetMap trail and the Komoot trail disagreed on short segments. We sometimes followed one and other times the other.
Variants
We diverged from the trail in a few places:
We did not care too much about the horses in Lipica and at least on the map it looked like a longish boringish and sun-exposed detour, so we cut the loop and hiked from Prelože pri Lokvi up onto the peak of the Veliko Gradišče (which unfortunately is too overgrown to provide a good view).
When we finally reached the top of Mali Kras and had a view across the bay of Trieste, it seemed silly to walk to down to Dolina, and instead we followed the ridge through Socerb, essentially the Alpe Adria Trail.
Not really a variant, but after arriving in Muggia, if one has to go to Trieste, the ferry is a probably nicer way to finish a trek than the bus.
Pitching a tent
We used our tent almost every night, only in Idrija we got a room (and a shower…). It was not trivial to find good camp spots, because most of the trail is on hills with slopes, and the flat spots tend to have housed built on them, but certainly possible. Sometimes we hid in the forest, other times we found nice small and freshly mowed meadows within the forest.
Water
Since this is Karst land, there is very little in terms of streams or lakes along the way, which is a pity.
The Idrijca river right south of Idrija was very tempting to take a plunge. Unfortunately we passed there early in the day and we wanted to cover some ground first, so we refrained.
As for drinking water, we used the taps at the bathrooms of the various touristic sites, a few (but rare) public fountains, and finally resorted to just ringing random doorbells and asking for water, which always worked.
Paths
A few stages lead you through very pleasant narrow forest paths with a sight, but not all. On some days you find yourself plodding along wide graveled or even paved forest roads, though.
Landscape and sights
The view from Nanos is amazing and, with this high peak jutting out over a wide plain, rather unique. It may seem odd that the trail goes up and down that mountain on the same day when it could go around, but it is certainly worth it.
The Karst is mostly a cultivated landscape, with lots of forestry. It is very hilly and green, which is pretty, but some might miss some craggedness. It’s not the high alps, after all, but at least they are in sight half the time.
But the upside is that there are few sights along the way that are worth visiting, in particular the the Franja Partisan Hospital hidden in a very narrow gorge, the Predjama Castle and the Škocjan Caves
Telegram bots in Python made easy
A while ago I set out to get some teenagers interested in programming, and thought about a good way to achieve that. A way that allows them to get started with very little friction, build something that’s relevant in their currently life quickly, and avoids frustration.
They were old enough to have their own smartphone, and they were already happily chatting with their friends, using the Telegram messenger. I have already experimented a bit with writing bots for Telegram (e.g. @Umklappbot
or @Kaleidogen
), and it occurred to me that this might be a good starting point: Chat bot interactions have a very simple data model: message in, response out, all simple text. Much simpler than anything graphical or even web programming. In a way it combines the simplicity of the typical initial programming exercises on the command-line with the impact and relevance of web programming.
But of course “real” bot programming is still too hard – installing a programming environment, setting up a server, deploying, dealing with access tokens, understanding the Telegram Bot API and mapping it to your programming language.

So I built a browser-based Python programming environments for Telegram bots that takes care of all of that. You simply write a single Python function, click the “Deploy” button, and the bot is live. That’s it!
This environment provides a much simpler “API” for the bots: Define a function like the following:
def private_message(sender, text):
return "Hello!"
This gets called upon a message, and if it returns a String
, that’s the response. That’s it! Not enough to build any kind of Telegram bot, but sufficient for many fun applications.

In fact, my nephew and niece use this to build a simple interactive fiction game, where the player says where they are going (“house”, ”forest”, “lake”) and thus explore the story, and in the end kill the dragon. And my girlfriend created a shopping list bot that we are using “productively”.
If you are curious, you can follow the instructions to create your own bot. There you can also find the source code and instructions for hosting your own instance (on Amazon Web Services).
Help with the project (e.g. improving the sandbox for running untrustworthy python code; making the front-end work better) is of course highly appreciated, too. The frontend is written in PureScript, and the backend in Python, building on Amazon lambda and Amazon DynamoDB.