Joachim Breitner

Blog

The curious case of the half-half Bitcoin ECDSA nonces

Published 2023-06-07 in sections English, Digital World.

This is the week of the Gulaschprogrammiernacht, a yearly Chaos Computer Club event in Karlsruhe, so it was exactly a year ago that I sat in my AirBnB room and went over the slides for my talk “Lattice Attacks on Ethereum, Bitcoin, and HTTPS” that I would give there.

It reports on research done with Nadia Heninger while I was in Philadelphia, and I really liked giving that talk: At some point we look at some rather odd signatures we found on the bitcoin blockchain, and part of the signature (the “nonce”) happens to share some bytes with the secret key. A clear case of some buffer overlap in a memory unsafe language, which I, as a fan of languages like Haskell, am very happy to sneer at!

A sneery slide

But last year, as I was going over the slides, I looked at the raw data again for some reason, and I found that we had overlooked something: Not only was the upper half of the nonce equal to the lower half of the secret key, but the lower half of the nonce was also equal to the upper half of the message hash!

This now looks much less like an accident to me, and more like an (overly) simple form of deterministic nonce creation… so much for my nice anecdote. (I still used the anecdote in my talk, followed up with an “actually”.)

When I told Nadia about this, she got curious as well, and quickly saw that from a signature with such a nonce, one can rather easily extract the secret key. So together with her student Dylan Rowe, we implemented this analysis and searched the bitcoin blockchain for more instances of such signatures. We did find a few, and were even able to trace them back to a somewhat infamous bitcoin activist going under the pseudonym Amaclin.

This research and sleuthing turned into another paper, “The curious case of the half-half Bitcoin ECDSA nonces”, to be presented at AfricaCrypt 2023. Enjoy!

Giving back to OPLSS

Published 2023-06-04 in sections English, Haskell.

Nine years ago, when I was a PhD student, I attended the Oregon Programming Language Summer School in Eugene. I had a great time and learned a lot.

The OPLSS’14

Some of the things I learned there, and some of the people I met there, also led to me graduating, which led to me becoming a PostDoc at UPenn, which led to me later joining DFINITY to implement the Motoko programming language and help design and specify the public interface of their “Internet Computer”, including the response certification (video).

So when the ICDevs non-profit offered a development bounty for a Motoko library implementing the merkle trees involved in certification, it sounded like a fun little coding task, and I completed it; likely with less effort than it would have taken someone who first had to get into these topics.

The bounty was quite generous, at US$ 10k, and I was too vain to “just” have it donated to some large charity, as I recently did with a few coding and consulting gigs, and looked for something more personal. So the ICDevs guys and I agreed to donate the money to this year’s OPLSS, where I hear it can cover the cost of about 8 students, and hopefully helps the PL cause.

(You will not find us listed as sponsors because, for some reason, a “donation” instead of a “sponsorship” comes with fewer strings attached for the organizers.)

More thoughts on a bootstrappable GHC

Published 2023-04-26 in sections English, Haskell.

The bootstrappable builds project tries to find ways of building all our software from source, without relying on binary artifacts. A noble goal, and one that is often thwarted by languages with self-hosting compilers, like GHC: In order to build GHC, you need GHC. A Pull Request against nixpkgs, adding first steps of the bootstrapping pipeline, reminded me of this issue with GHC, about which I have noted down some thoughts before, and I played around a bit more.

The most promising attempt to bootstrap GHC was done by rekado in 2017. He observed that Hugs is maybe the most recently maintained bootstrappable Haskell compiler (since it is written in C), but noticed that “it cannot deal with mutually recursive module dependencies, which is a feature that even the earliest versions of GHC rely on. This means that running a variant of GHC inside of Hugs is not going to work without major changes.” He then tried to bootstrap another very old Haskell compiler (nhc) with Hugs, and made good but incomplete progress.

This made me wonder: What if Hugs supported mutually recursive modules? Would that make a big difference? Anthony Clayden keeps advocating Hugs as a viable Haskell implementation, so if that is the main blocker, then adding support for it to Hugs (at least in a compile-the-strongly-connected-component-as-one-unit mode) is probably not too hard, and worthwhile?
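To make concrete what Hugs is missing, here is a minimal (and hypothetical) example of the kind of import cycle that GHC’s own source code contains; GHC itself breaks such cycles with .hs-boot files and {-# SOURCE #-} imports, a mechanism Hugs does not have:

module A where
import B (g)

f :: Int -> Int
f 0 = 0
f n = g (n - 1)  -- A calls into B …

module B where
import {-# SOURCE #-} A (f)  -- … and B calls back into A (via A.hs-boot)

g :: Int -> Int
g n = f (n `div` 2)

A compile-the-strongly-connected-component-as-one-unit mode would simply treat A and B as if they were a single module.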

All of GHC in one file?

That reminded me of a situation I was in before, where I had to combine multiple Haskell modules into one: For my talk “Lock-step simulation is child’s play” I wrote a multi-player game, a simulation environment for it, and a presentation tool around it, all in the CodeWorld programming environment, which supports only a single module. So I hacked up a small tool, hs-all-in-one, that takes multiple Haskell modules and combines them into one, mangling the names to avoid name clashes.
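As a small illustration of what such a tool does (the modules are made up, and the actual mangling scheme of hs-all-in-one may differ), it turns

module Foo where
x :: Int
x = 1

module Bar where
import Foo
y :: Int
y = x + 1

into a single module along the lines of

module Main where

foo_x :: Int
foo_x = 1

bar_y :: Int
bar_y = foo_x + 1

where every top-level name is prefixed with its original module, so the definitions can coexist.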

This made me wonder: Can I turn all of GHC into one module, and compile that?

At this point I had probably left the direct path towards bootstrapping, but I kinda got hooked.

  1. Using GHC’s hadrian/ghci tool, I got it to produce the necessary generated files (e.g. from happy grammars) and spit out the list of modules that make up GHC, which I could feed to hs-all-in-one.

  2. It uses haskell-src-exts for parsing, and it was almost able to parse all of that. It has a different opinion about how MultiWayIf should be indented and whether EmptyCase needs {}, and it has issues pretty-printing some promoted values, but otherwise the round-tripping worked fine, and I was able to generate a large file (680,000 loc, 41 MB) that passes GHC’s parser.

  3. It also uses haskell-names to resolve names.

    This library is less up-to-date with various Haskell features, so I added support for renaming in some pragmas (ANN, SPECIALIZE), pattern signatures etc.

    For my previous use-case I could just combine all the imports, knowing that I would not introduce conflicts. For GHC, this is far from true: Some modules import Data.Map.Strict, others Data.Map.Lazy, and yet others introduce names that clash with stuff imported from the prelude… so I had to change the tool to fully qualify all imported values (see the example after this list). This isn’t so bad; I can do that using haskell-names, if I somehow know what all the modules in base, containers, transformers and array export.

    The haskell-names library itself comes with a hard-coded database of base exports, but it is incomplete and doesn’t help me with, say, containers.

    I then wrote a little parser for the .txt files that haddock produces for the benefit of hoogle, and that are conveniently installed along the packages (at least on nix). This would have been great, if these files didn’t simply omit all reexported entities! I added some manual hacks (Data.Map has the same exports as Data.IntMap; Prelude exports all entities as known by haskell-names, but for those that are also exported from Data.List, use the symbol from there…)

    I played this game of whack-a-mole for a while, solving many of the problems that GHC’s renamer reports, but eventually stopped to write this blog post. I am fairly confident that this could be pulled through, though.
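To make the qualification step concrete, here is a hypothetical snippet (not actual GHC code): a module containing

import Data.Map.Strict (insert)

f m = insert 1 "a" m

gets rewritten to

import qualified Data.Map.Strict

f m = Data.Map.Strict.insert 1 "a" m

so that this insert can no longer clash with an insert imported from Data.Map.Lazy or the prelude in some other module.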

Back to bootstrapping

So what if we could pull this through? We’d have a very large code file that GHC may be able to compile to produce a ghc binary without exhausting my RAM. But that doesn’t help with bootstrapping yet.

If lack of support for recursive modules were all that Hugs is missing, we’d be done indeed. But quite the contrary: it is probably the least of our worries, given that contemporary GHC uses many, many other features not supported by Hugs.

Some of them are syntactic and can easily be rewritten to more normal Haskell in a preprocessing step (e.g. MultiWayIf).
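For example (a made-up function, just to illustrate the kind of rewrite such a preprocessing step would perform), this use of MultiWayIf:

{-# LANGUAGE MultiWayIf #-}
sign :: Int -> String
sign x = if | x < 0     -> "negative"
            | x == 0    -> "zero"
            | otherwise -> "positive"

can be mechanically turned into plain guards, which even very old Haskell implementations understand:

sign :: Int -> String
sign x
  | x < 0     = "negative"
  | x == 0    = "zero"
  | otherwise = "positive"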

Others are deep and hard (GADTs, pattern synonyms, type families), and prohibit attempting to compile a current version of GHC (even if it’s all one module) with Hugs. So one would certainly have to go back in time and find a version of GHC that is not yet using all these features. For example, the first use of GADTs was introduced by Simon Marlow in 2011, so this suggests going back to at least GHC 7.0.4, maybe earlier.

Still, being able to mangle the source code before passing it to Hugs is probably a useful thing. This poses the question of whether Hugs can compile such a tool; in particular, whether it is capable of compiling haskell-src-exts, which I am not too optimistic about either. Did someone check this already?

So one plan of attack could be:

  1. Identify an old version of GHC that

    • One can use to bootstrap subsequent versions until today.
    • Is old enough to use as few features not supported by Hugs as possible.
    • Is still new enough so that one can obtain a compatible toolchain.
  2. Wrangle the build system to tell you which files to compile, with which preprocessor flags etc.

  3. Bootstrap all pre-processing tools used by GHC (cpphs or plain cpp, happy, alex).

  4. For every language feature not supported by Hugs, either

    • Implement it in Hugs,
    • Manually edit the source code to avoid compiling the problematic code, if it is optional (e.g. in an optimization pass),
    • Rewrite the problematic code, or
    • Write a pre-processing tool (like the one above) that compiles the feature away.
  5. Similarly, since Hugs probably ships a base that is different from what GHC, or the libraries used by GHC, expect, either adjust Hugs’ base, or modify the GHC code that uses it.

My actual plan, though, for now, is to throw these thoughts out there, maybe make some noise on Discourse, Mastodon, Twitter and lobste.rs, and then let it sit and hope someone else will pick it up.

rclone, WebDav, Mailbox.org

Published 2023-04-19 in sections English, Digital World.

Just a small googleable note for those trying to set this up too:

If you try to access your “My Files” mailbox.org WebDav storage using rclone, then the URL is not just https://dav.mailbox.org/servlet/webdav.infostore/, as written in the mailbox.org documentation, but actually https://dav.mailbox.org/servlet/webdav.infostore/Userstore/Your Name.

(You can start with https://dav.mailbox.org/servlet/webdav.infostore/ and then use rclone ls to find out the full path, but then it may be more convenient to change it to point directly to the “My Files” space where you can actually create files.)
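For reference, the corresponding entry in rclone.conf could look roughly like this (the remote name and user are placeholders, and I am going from memory regarding the exact keys):

[mailbox]
type = webdav
url = https://dav.mailbox.org/servlet/webdav.infostore/Userstore/Your Name
vendor = other
user = you@mailbox.org
pass = <output of rclone obscure>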

Farewell quimby and fry, welcome richard

Published 2023-04-17 in sections English, Digital World.

For a little more than two decades, I have been running one or two dedicated servers for a fair number of services. At some time or other, they were hosting

  • A web server for my personal website
  • A web server for various private and professional webpages for friends and family
  • An email server with IMAP, SMTP, Spam filtering, for me and family
  • A mailing list server for various free software projects
  • A DNS server
  • A Half-Life and Counter Strike server, with a statistics web page
  • The web page for my Counter Strike clan, running on a custom Perl-and-Mysql based CMS
  • The web page for my high school class, running on the same system (this was before everyone used, or had used, Facebook; it even supported tagging people on images…)
  • A Docbook-and-SVN-based documentation management system that my brother and I built and even sold to a few customers.
  • A custom SVN-, Perl and Template-Toolkit based static site generating CMS, before that was cool, also with one or two actual customers.
  • An SVN- and later Git-based site for collaborative editing of math and computer science lecture notes; back then there was no Overleaf
  • A Jabber server
  • The backend for an online adaptation of the board game “Sim Serim”, which got the author to gift me the real thing
  • An SVN server
  • A darcs server
  • A git server
  • A tool to track darcs patches that were submitted via mailing lists
  • A blog aggregator (a “planet”) for friends, and one for Ghana’s Free Software community
  • An Owncloud instance for family
  • Virtual machines maintained by friends, to share resources more cheaply
  • An OpenVPN- and later tinc-based VPN for my machines
  • Jobs that feed RSS feeds into IMAP (using feed2imap and later feed2imap-go)
  • Jobs that send email greetings to a friend of mine, which I collected from his wedding guests and which are now delivered at a decreasing rate over the next decades.
  • Online questionnaires to gather data for a genealogy project
  • Jobs that send an email with a daily summary of family events from that date.
  • A Django app to organize a larger family gathering
  • Multiple Mediawiki instances
  • A freenet node and a tor node
  • Code that demonstrated the Cross-site authentication attack
  • … and probably more stuff that I don’t remember anymore

It’s not you, it’s me

Running these kinds of services on my own was great fun and a huge learning experience, and in some cases, at the time, the only way to have a certain feature. But in recent years my interests shifted a bit, more towards programming language theory (and practice) than DevOps, and I was no longer paying as much attention as these services require. This gave me a bad conscience, especially in terms of security updates.

Running your own email server especially, while still possible, isn’t fire-and-forget: One has to stay on top of spam protection measures, both on the receiving end (spamassassin etc.) and when sending email (DKIM, configuring mailing lists to rewrite the sender, etc.)

Also some family members were commercially relying on these servers and services working, which was no longer tenable.

Weaning off

Therefore, more than a year ago, I decided to wind this down. It turns out that getting rid of responsibilities takes at least as long as taking them on, especially if your “customers” are content and a move to something else is but an annoyance. But last weekend I was finally able to turn the machines, called quimby and fry, off.

Many of the services above were already no longer active when I started the weaning off (Jabber, Freenet, Tor). So what happened to the rest?

  • For emails, we all moved to https://mailbox.org. Happy to pay for such a crucial service.
  • For the mailing lists, especially for the Tip-Toi-Hacking project, https://jpberlin.de/ has decent enough rates that I don’t feel bad paying for it.
  • Managing DNS is made super easy by dnscontrol; I’m using Hetzner’s DNS servers, but thanks to that tool it does not matter a lot.
  • For those websites that are static enough, GitHub Pages is nice. This includes the lecture notes site.
  • For those websites that need a little bit of server-side logic, e.g. for complex redirects and normalization procedures, or access control, but no state, I found that netlify introduced their Edge functions feature just in time. This was crucial for my main website.
  • I got rid of the backend of the two-player game Sum Serum completely, by using WebRTC. In terms of hosting, it is now just a boring static website, which is the best kind of website.
  • I converted all SVN and Darcs repositories to git, and pushed them to GitHub.
  • Although I was mildly proud to have kept the websites of my high school class and Counter Strike clan live and functional for many years after anyone stopped caring about them, I decided it is silly to keep doing that. I briefly thought about entombing that running system in a locked down virtual machine or so, but in the end simply used wget to create a static mirror of them, which is now hosted on netlify (using Edge Functions to restrict public access to a few pages).

In the end, I was not able to get rid of all services, so there is still a server running for me (smaller now, and virtual):

  • My photo album, https://bilder.joachim-breitner.de/, which is a bit too big for something like netlify.
  • Some dumb static webspace used by a family member’s business for internal reasons, which likewise is a bit too large for something like netlify or github actions, and not as critical as other sites.
  • The feed2imap jobs
  • For one Mediawiki instance used for a genealogy project of a relative of mine, I could not find a suitable hosted alternative. However, it is HTTP-AUTH-password-protected, so I am a bit less worried about staying on top of security updates for this PHP-driven site.
  • Also, I am still running the daily email job from the genealogy project, but now through nullmailer and Amazon SNS, to worry a bit less about the intricacies of modern email.

Debian → Nix

I took this opportunity to set up a new server for the residual services. I have been using Debian since 2001 and was a Debian Developer from 2003 to 2022, and it is a great operating system and a great project.

But after learning about Nix at DFINITY, and using NixOS on my laptop for almost two years, managing a system in a non-declarative way simply feels … wrong. Similar to programming in a non-functional programming language.

Hence I have shut down my Debian-based systems (two virtual machines called quimby and fry, and the surrounding Dom0 freddy – my machines are named after Simpsons sidekicks and (sometimes) Futurama characters), and am using the NixOS-based aarch64 host richard (because of the “Nix” in the name) instead.

git-multisect

Published 2023-02-27 in sections English, Digital World.

Because it seemed useful, and did not seem to exist yet, I created a small tool called git-multisect. You point it at a git repo, give it a version range and a command to run, and it will list those commits that affect the output of the command.

Here is an example (with a slightly complex invocation):

$ git-multisect.py \
	-C ~/build/nixpkgs \
	-f $(nix flake metadata ~/nixos --json|jq -r ".locks.nodes[.locks.nodes[.locks.root].inputs.nixpkgs].locked.rev") \
	-t release-22.11 \
	-c 'nix path-info --derivation '\
	   '~/nixos#nixosConfigurations.richard.config.system.build.toplevel '\
	   '--no-update-lock-file '\
	   '--override-input nixpkgs ~/"build/nixpkgs?rev=$REV"' \
	--hide-stderr --log-option=--pretty=medium
Found 39 commits
[39 total,  0 relevant,  0 irrelevant,  0 skipped, 39 unknown] inspecting 2fb7d74 ...
[39 total,  0 relevant,  0 irrelevant,  0 skipped, 39 unknown] inspecting 569163e ...
[39 total,  0 relevant,  0 irrelevant,  0 skipped, 39 unknown] inspecting 5642ce8 ...
[39 total,  0 relevant,  0 irrelevant,  0 skipped, 39 unknown] inspecting e0c8cf8 ...
[39 total,  0 relevant,  1 irrelevant,  8 skipped, 30 unknown] inspecting 89d0361 ...
[39 total,  0 relevant,  1 irrelevant,  8 skipped, 30 unknown] inspecting d1c7e63 ...
[39 total,  0 relevant,  2 irrelevant,  9 skipped, 28 unknown] inspecting e6d5772 ...
[39 total,  0 relevant,  3 irrelevant,  9 skipped, 27 unknown] inspecting a099526 ...
[39 total,  1 relevant,  4 irrelevant,  9 skipped, 25 unknown] inspecting 854312a ...
[39 total,  1 relevant,  5 irrelevant, 10 skipped, 23 unknown] inspecting 95043dc ...
[39 total,  1 relevant,  6 irrelevant, 10 skipped, 22 unknown] inspecting 0cf4274 ...
[39 total,  2 relevant,  8 irrelevant, 29 skipped,  0 unknown] done

commit a0995268af8ba0336a81344a3bf6a50d6d6481b2
Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Date:   Sat Feb 18 10:45:11 2023 -0800

    linux_{5_15,6_1}: revert patch to fix Equinix Metal bonded networking with `ice` driver (#216955)
…
commit 0cf4274b5d06325bd16dbf879a30981bc283e58a
Merge: 95043dc713d 532f3aa6052
Author: Pierre Bourdon <delroth@gmail.com>
Date:   Sun Feb 19 23:37:48 2023 +0900

    Merge pull request #217121 from NixOS/backport-216463-to-release-22.11

    [Backport release-22.11] sudo: 1.9.12p2 -> 1.9.13

As you can see, it tries to be clever and not even look at commits that come between two commits where the output is equal (like git-bisect).
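For the curious, here is a rough sketch of that pruning strategy, written in Haskell for illustration (the actual tool is a Python script, and all names here are made up). Like the real tool, it assumes that if the command’s output agrees at both ends of a commit range, no commit in between is relevant:

-- Given a way to run the command at a commit (ideally cached),
-- return those commits that change the command’s output.
multisect :: Eq o => (c -> IO o) -> [c] -> IO [c]
multisect run cs
  | length cs < 2 = pure []
  | otherwise = do
      oLo <- run (head cs)
      oHi <- run (last cs)
      go oLo (last cs, oHi) (init (tail cs))
  where
    -- mids are the commits strictly between the two endpoints
    go oLo (hi, oHi) mids
      | oLo == oHi = pure []    -- equal output at both ends: skip everything in between
      | null mids  = pure [hi]  -- adjacent commits, different output: hi is relevant
      | otherwise  = do
          -- otherwise bisect: run the command on a middle commit, recurse on both halves
          let (l, m : r) = splitAt (length mids `div` 2) mids
          oM <- run m
          (++) <$> go oLo (m, oM) l <*> go oM (hi, oHi) r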

Over at https://github.com/nomeata/git-multisect you’ll find more explanations and examples. I hope you’ll remember this when you find you need it, and that it will prove to be useful.

Pro-charity consulting

Published 2023-01-08 in sections English.

A few times last year I was asked to help out with some software project – debugging a particularly gnarly nix issue, helping with GHC internals or reviewing Canister code of applications running on the Internet Computer. Usually, these are relatively small jobs, at least reasonably interesting, and it was often friends that approached me.

So I do want to help.

But I did not want to help for free. Incentives matter, and I wanted to incentivize careful use of my time. And of course I like to see my work valued. So normally, one would just charge an hourly fee.

But having to go through all the necessary legalese (setting up a contract, maybe an NDA, thinking about taxes and liabilities, possibly getting permission from the current employer) tilts the scale and would not make it worth it for me anymore.

But instead of just turning such requests down, I have now used the following model a few times: pro-charity consulting, where you still pay for what you get, but you don’t pay me; instead, you pay a charity.

The rules of pro-charity consulting

The rough rules are as follows:

  • There is no contract. I trust you and you trust me, else we wouldn’t even be doing this.
  • There is no liability. You probably have a rough idea of what I can deliver and you trust me, so you don’t get to sue me if I screw up.
  • There is (usually) no NDA. Of course I’ll keep your secrets, but again you don’t get to sue me if I screw up.
  • I tell you my usual hourly rate, as if this were a commercial agreement. (This may depend a lot on the field I’m helping you in.)
  • Afterwards, I’ll also let you know how much time I spent helping you.
  • You’ll make a donation to a charity of our choosing. We’ll pick something that we both find worth supporting.
  • You can donate whatever the work was worth to you, but the expectation is that it shouldn’t be cheaper for you than if you had hired me commercially. (Unless I failed to fulfill your expectations.)
  • Legally, the donation is by you (e.g. you get to keep any tax benefits).
  • Morally, it is by me. For example if donations can come with a message (like on https://donate.openstreetmap.org/) it would be appreciated to mention me.
  • In the spirit of the German saying “Tue Gutes und rede darüber!” (“Do good and talk about it!”), we both get to brag about this donation (target charity, amount, client, kind of work), e.g. on blogs, Mastodon or Twitter, unless we agree otherwise. See below for some examples.

I’m writing the rules out here so that I can just point to this blog post when I next have to explain this scheme.

Pro-charity gigs so far

Serverless WebRTC instead of websocket

Published 2022-12-23 in sections English, Digital World.

For over two decades I have maintained a personal dedicated server, and hosted a fair number of websites, such as my homepage, various projects, sites of friends and family, and during its heyday even servers for DNS, email, mailing lists, Jabber, cloud storage…

But it turns out that just because one’s interests shift a bit, such running services don’t simply disappear; they keep running, just less well maintained, and then become a burdensome responsibility. So over the last year I have been moving them, one by one, off my server and elsewhere™.

Most websites are static and can easily be hosted on something convenient like GitHub Pages or netlify. Some are “mostly static” and require a bit more work (I might blog about that some other time), but are still doable. But there was one website that I thought would be hard to host elsewhere:

A finished game of Sum Serum

On https://sumserum.nomeata.de/ you can play a small abstract two-player game. It implements the mechanics of “Sim Serim” by Heinrich Glumpler, and I built it on a whim 9 years ago, after gifting the board game away and then wanting to play it. I later did get my own copy, when the author discovered the online version and sent me the real thing.

You can play Sum Serum locally; then it is simply a static webpage with a bit of JavaScript code. But you can also play it with a friend remotely. In that case it connects via WebSockets to a small NodeJS service, which establishes the connection and relays messages. It works nicely, but “running a WebSocket service” is usually not part of these simple static-website-from-git offerings, so this was getting in the way of (eventually) decommissioning my server.

But web technology has advanced, and WebRTC, which allows browsers to talk to each other directly, has become widely available. Indeed, I found the trystero JS library, which provides a very simple API to connect with someone else over the internet, and using it instead of my little WebSocket service was a relatively small change. Now the Sum Serum code is a purely static webpage that I can very easily host on netlify! Neat!

Granted, there are still servers involved to make the connection, as the library uses the BitTorrent network to find the peers. But as far as I am concerned, that is nothing I need to worry about. Furthermore, trystero is quite easy to use, so maybe I can create more simple fun multi-user interactive thingies, now that I know I don’t have to worry about hosting a backend.

rec-def: Minesweeper case study

Published 2022-10-10 in sections English, Haskell.

I’m on the train back from MuniHac, where I gave a talk about the rec-def library that I have excessively blogged about recently (here, here, here and here). I got quite flattering comments about that talk, so if you want to see if they were sincere, I suggest you watch the recording of “Getting recursive definitions off their bottoms” (but it’s not necessary for the following).

After the talk, Franz Thoma approached me and told me a story of how he was once implementing the game Minesweeper in Haskell, and in particular the part of the logic where, after the user has uncovered a field, the game would automatically uncover all fields that are next to a “neutral” field, i.e. one with zero adjacent bombs. He was using a comonadic data structure, which makes a “context-dependent parallel computation” such as uncovering one field quite natural, and was hoping that using a suitable fix-point operator, he could elegantly obtain not just the next step, but directly the result of recursively uncovering all these fields. But, much to his disappointment, that did not work out: Due to the recursion inherent in that definition, a knot-tying fixed-point operator will lead to a cyclic definition.

Microsoft Minesweeper

He was wondering if the rec-def library could have helped him, and we sat down to find out, and this is the tale of this blog post. I will avoid the comonadic abstractions and program it more naively, though, to not lose too many readers along the way. Maybe read Chris Penner’s blog post and Finch’s functional pearl “Getting a Quick Fix on Comonads” if you are curious about that angle.

Minesweeper setup

Let’s start with defining a suitable data type for the grid of the minesweeper board. I’ll use the Array data type; its Ix-based indexing is quite useful for grids:

import Data.Array

type C = (Int,Int)
type Grid a = Array C a

The library lacks a function to generate an array from a generating function, but it is easy to add:

genArray :: Ix i => (i,i) -> (i -> e) -> Array i e
genArray r f = listArray r $ map f $ range r

Let’s also fix the size of the board, as a pair of lower and upper bounds (this is the format that the Ix type class needs):

size :: (C,C)
size = ((0,0), (3,3))

Now a board is simply a grid of boolean values, with True indicating that a bomb is there:

type Board = Grid Bool

board1 :: Board
board1 = listArray size
  [ False, False, False, False
  , True,  False, False, False
  , True,  False, False, False
  , False, False, False, False
  ]

It would be nice to be able to see these boards in a nicer way. So let us write a function that prints a grid, including a frame, given a function that prints something for each coordinate. Together with a function that prints a bomb (as *), we can print the board:

pGrid :: (C -> String) -> String
pGrid p = unlines
    [ concat [ p' (y,x) | x <- [lx-1 .. ux+1] ]
    | y  <- [ly-1 .. uy+1] ]
  where
    ((lx,ly),(ux,uy)) = size

    p' c | inRange size c = p c
    p'  _ = "#"

pBombs :: Board -> String
pBombs b = pGrid $ \c -> if b ! c then "*" else " "

The expression b ! c looks up the coordinate in the array, and is True when there is a bomb at that coordinate.

So here is our board, with two bombs:

ghci> putStrLn $ pBombs board1
######
#    #
#*   #
#*   #
#    #
######

But that’s not what we want to show to the user: Every field should have a number that indicates the number of bombs in the surrounding fields. To that end, we first define a function that takes a coordinate and returns all adjacent coordinates. This also takes care of the border, using inRange:

neighbors :: C -> [C]
neighbors (x,y) =
    [ c
    | (dx, dy) <- range ((-1,-1), (1,1))
    , (dx, dy) /= (0,0)
    , let c = (x + dx, y + dy)
    , inRange size c
    ]

With that, we can calculate what to display in each cell – a bomb, or a number:

data H = Bomb | Hint Int deriving Eq

hint :: Board -> C -> H
hint b c | b ! c = Bomb
hint b c = Hint $ sum [ 1 | c' <- neighbors c, b ! c' ]

With a suitable printing function, we can now see the full board:

pCell :: Board -> C -> String
pCell b c = case hint b c of
    Bomb -> "*"
    Hint 0 -> " "
    Hint n -> show n

pBoard :: Board -> String
pBoard b = pGrid (pCell b)

And here it is:

ghci> putStrLn $ pBoard board1
######
#11  #
#*2  #
#*2  #
#11  #
######

Next we have to add masks: We need to keep track of which fields the user already sees. We again use a grid of booleans, and define a function to print a board with the masked fields hidden behind ?:

type Mask = Grid Bool

mask1 :: Mask
mask1 = listArray size
  [ True,  True,  True,  False
  , False, False, False, False
  , False, False, False, False
  , False, False, False, False
  ]

pMasked :: Board -> Mask -> String
pMasked b m = pGrid $ \c -> if m ! c then pCell b c else "?"

So this is what the user would see:

ghci> putStrLn $ pMasked board1 mask1
######
#11 ?#
#????#
#????#
#????#
######

Uncovering some fields

With that setup in place, we now implement the piece of logic we care about: Uncovering all fields that are next to a neutral field. Here is the first attempt:

solve0 :: Board -> Mask -> Mask
solve0 b m0 = m1
  where
    m1 :: Mask
    m1 = genArray size $ \c ->
      m0 ! c || or [ m0 ! c' | c' <- neighbors c, hint b c' == Hint 0 ]

The idea is that we calculate the new mask m1 from the old one m0 by the following logic: A field is visible if it was visible before (m0 ! c), or if any of its neighboring, neutral fields are visible.

This works so far: It uncovered the three fields next to the one neutral visible field:

ghci> putStrLn $ pMasked board1 $ solve0 board1 mask1
######
#11  #
#?2  #
#????#
#????#
######

But that’s not quite what we want: We want to keep doing that to uncover all fields.

Uncovering all fields

So what happens if we change the logic to: A field is visible if it was visible before (m0 ! c), or if any of its neighboring, neutral fields will be visible?

In the code, this is just a single character change: Instead of looking at m0 to see if a neighbor is visible, we look at m1:

solve1 :: Board -> Mask -> Mask
solve1 b m0 = m1
  where
    m1 :: Mask
    m1 = genArray size $ \c ->
      m0 ! c || or [ m1 ! c' | c' <- neighbors c, hint b c' == Hint 0 ]

(This is roughly what happened when Franz started to use the kfix comonadic fixed-point operator in his code, I believe.)

Does it work? It seems so:

ghci> putStrLn $ pMasked board1 $ solve1 board1 mask1
######
#11  #
#?2  #
#?2  #
#?1  #
######

Amazing, isn’t it!

Unfortunately, it seems to work by accident. If I start with a different mask:

mask2 :: Mask
mask2 = listArray size
  [ True,  True,  False, False
  , False, False, False, False
  , False, False, False, False
  , False, False, False, True
  ]

which looks as follows:

ghci> putStrLn $ pMasked board1 mask2
######
#11??#
#????#
#????#
#??? #
######

Then our solve1 function does not work, and just sits there:

ghci> putStrLn $ pMasked board1 $ solve1 board1 mask2
######
#11^CInterrupted.

Why did it work before, but not now?

It fails to work because, as the code tries to figure out whether one field will be uncovered, it needs to know whether the next field will be uncovered. But to figure that out, it needs to know whether the present field will be uncovered. With the normal boolean connectives (|| and or), this does not make progress.

It worked with mask1 more or less by accident: None of the fields in the first column have neutral neighbors, so nothing happens there. And for all the fields in the third and fourth columns, the code will know for sure that they will be uncovered based on their upper neighbors, which come first in the neighbors list; due to the short-circuiting properties of ||, it doesn’t have to look at the later cells, and the vicious cycle is avoided.
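(As a reminder of that short-circuiting behavior of (||):

ghci> True || undefined
True
ghci> undefined || True
*** Exception: Prelude.undefined

The left argument can settle the result without the right one ever being looked at, but not the other way around.)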

rec-def to the rescue

This is where rec-def comes in: By using the RBool type in m1 instead of plain Bool, the recursive self-reference is not a problem, and it simply works:

import qualified Data.Recursive.Bool as RB

solve2 :: Board -> Mask -> Mask
solve2 b m0 = fmap RB.get m1
  where
    m1 :: Grid RB.RBool
    m1 = genArray size $ \c ->
      RB.mk (m0 ! c) RB.|| RB.or [ m1 ! c' | c' <- neighbors c, hint b c' == Hint 0 ]

Note that I did not change the algorithm or the self-reference through m1; I just replaced Bool with RBool, || with RB.|| and or with RB.or, and used RB.get at the end to get a normal boolean out. And 🥁, here we go:

ghci> putStrLn $ pMasked board1 $ solve2 board1 mask2
######
#11  #
#?2  #
#?2  #
#?1  #
######

That’s the end of this installment of “let’s look at a tying-the-knot problem and see how rec-def helps”, which always ends up a bit anti-climactic because it “just works”, at least in these cases. Hope you enjoyed it nevertheless.

rec-def: Dominators case study

Published 2022-09-15 in sections English, Haskell.

More ICFP-inspired experiments using the rec-def library: In Norman Ramsey’s very nice talk about his Functional Pearl “Beyond Relooper: Recursive Translation of Unstructured Control Flow to Structured Control Flow”, he had the following slide showing the equation for the dominators of a node in a graph:

Norman Ramsey shows a formula

He said “it’s ICFP and I wanted to say the dominance relation has a beautiful set of equations … you can read all these algorithms how to compute this, but the concept is simple”.

This made me wonder: If the concept is simple and this formula is beautiful – shouldn’t this be sufficient for the Haskell programmer to obtain the dominator relation, without reading all those algorithms?

Before we start, we have to clarify the formula a bit: If a node is an entry node (no predecessors) then the big intersection is over the empty set, and that is not a well-defined concept. For these nodes, we need that big intersection to return the empty set, as entry nodes are not dominated by any other node. (Let’s assume that the entry nodes are exactly those with no predecessors.)
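(Written out, the equation on the slide is, up to notation,

dom(v) = {v} ∪ ⋂ { dom(p) | p ∈ preds(v) }

with the convention just discussed that the big intersection over the empty set is the empty set.)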

Let’s try, first using plain Haskell data structures. We begin by implementing this big intersection operator on Data.Set, and also a function to find the predecessors of a node in a graph:

import qualified Data.Set as S
import qualified Data.Map as M

intersections :: [S.Set Int] -> S.Set Int
intersections [] = S.empty
intersections xs = foldl1 S.intersection xs

preds :: [(Int,Int)] -> M.Map Int [Int]
preds edges = M.fromListWith (<>) $
    [ (v1, [])   | (v1, _) <- edges ] ++ -- to make the map total
    [ (v2, [v1]) | (v1, v2) <- edges ]

Now we can write down the formula that Norman gave, quite elegantly:

dominators1 :: [(Int,Int)] -> M.Map Int [Int]
dominators1 edges = fmap S.toList doms
  where
    doms :: M.Map Int (S.Set Int)
    doms = M.mapWithKey
        (\v vs -> S.insert v (intersections [ doms M.! v' | v' <- vs]))
        (preds edges)

Does this work? It seems it does:

ghci> dominators1 []
fromList []
ghci> dominators1 [(1,2)]
fromList [(1,[1]),(2,[1,2])]
ghci> dominators1 [(1,2),(1,3),(2,4),(3,4)]
fromList [(1,[1]),(2,[1,2]),(3,[1,3]),(4,[1,4])]

But – not surprising if you have read my previous blog posts – it falls over once we have recursion:

ghci> dominators1 [(1,2),(1,3),(2,4),(3,4),(4,3)]
fromList [(1,[1]),(2,[1,2]),(3,^CInterrupted.

So let us reimplement it with Data.Recursive.Set.

import qualified Data.Recursive.Set as RS

intersections :: [RS.RSet Int] -> RS.RSet Int
intersections [] = RS.empty
intersections xs = foldl1 RS.intersection xs

dominators2 :: [(Int,Int)] -> M.Map Int [Int]
dominators2 edges = fmap (S.toList . RS.get) doms
  where
    doms :: M.Map Int (RS.RSet Int)
    doms = M.mapWithKey
        (\v vs -> RS.insert v (intersections [ doms M.! v' | v' <- vs]))
        (preds edges)

The hope is that we can simply replace the operations, and that now it can suddenly handle cyclic graphs as well. Let’s see:

ghci> dominators2 [(1,2),(1,3),(2,4),(3,4),(4,3)]
fromList [(1,[1]),(2,[1,2]),(3,[3]),(4,[4])]

It does! Well, it does return a result… but it looks strange. Clearly nodes 3 and 4 are also dominated by 1, but the result does not reflect that.

But the result is a solution to Norman’s equation. Was the equation wrong? No, but we failed to notice that the desired solution is the largest, not the smallest. And Data.Recursive.Set calculates, as documented, the least fixed point.

What now? Until the library has code for RDualSet a, we can work around this by using the dual formula to calculate the non-dominators. To do this, we

  • use union instead of intersection,
  • use delete instead of insert,
  • instead of S.empty, use the set of all nodes (which requires some extra plumbing), and
  • subtract the result from the set of all nodes to get the dominators

and thus the code turns into:

unions' :: S.Set Int -> [RS.RSet Int] -> RS.RSet Int
unions' univ [] = RS.mk univ
unions' _ xs = foldl1 RS.union xs

dominators3 :: [(Int,Int)] -> M.Map Int [Int]
dominators3 edges = fmap (S.toList . S.difference nodes . RS.get) nonDoms
  where
    nodes = S.fromList [v | (v1,v2) <- edges, v <- [v1,v2]]
    nonDoms :: M.Map Int (RS.RSet Int)
    nonDoms = M.mapWithKey
        (\v vs -> RS.delete v (unions' nodes [ nonDoms M.! v' | v' <- vs]))
        (preds edges)

And with this, now we do get the correct result:

ghci> dominators3 [(1,2),(1,3),(2,4),(3,4),(4,3)]
fromList [(1,[1]),(2,[1,2]),(3,[1,3]),(4,[1,4])]

We worked a little bit on how to express the “beautiful formula” in Haskell, but at no point did we have to think about how to solve it. To me, this is the essence of declarative programming.