Distributing Haskell programs in a multi-platform zip file

Published 2020-11-09 in sections English, Haskell.

My maybe most impactful piece of code is tttool and the surrounding project, which allows you to create your own content for the Ravensburger Tiptoi™ platform. The program itself is a command line tool, and in this blog post I want to show how I go about building that program for Linux (both normal and static builds), Windows (cross-compiled from Linux), OSX (only on CI), all combined into and released as a single zip file.

Maybe some of it is useful or inspiring to my readers, or can even serve as a template. This being a blob post, though, note that it may become obsolete or outdated.

Ingredients

I am building on the these components:

nix

Without the nix build system and package manger I probably woudn’t even attempt to pull of complex tasks that may, say, require a patched ghc. For many years I resisted learning about nix, but when I eventually had to, I didn’t want to go back.
haskell.nix

This project provides an alternative Haskell build infrastructure for nix. While this is not crucial for tttool, it helps that they tend to have some cross-compilation-related patches more than the official nixpkgs. I also like that it more closely follows the cabal build work-flow, where cabal calculates a build plan based on your projects dependencies. It even has decent documentation (which is a new thing compared to two years ago).
niv

Niv is a neat little tool to keep track of your dependencies. You can quickly update them with, say niv update nixpkgs. But what’s really great is to temporarily replace one of your dependencies with a local checkout, e.g. via
```
NIV_OVERRIDE_haskellNix=$HOME/build/haskell/haskell.nix nix-instantiate  -A osx-exe-bundle
```
There is a Github action that will keep your niv-managed dependencies up-to-date.
Cachix.org

This service (proprietary, but free to public stuff up to 10GB) gives your project its own nix cache. This means that build artifacts can be cached between CI builds or even build steps, and your contributors. A cache like this is a must if you want to use nix in more interesting ways where you may end up using, say, a changed GHC compiler. Comes with GitHub actions integration.
CI via Github actions

Until recently, I was using Travis, but Github actions are just a tad easier to set up and, maybe more important here, the job times are high enough that you can rebuild GHC if you have to, and even if your build gets canceled or times out, cleanup CI steps still happen, so that any new nix build products will still reach your nix cache.

The repository setup

All files discussed in the following are reflected at https://github.com/entropia/tip-toi-reveng/tree/7020cde7da103a5c33f1918f3bf59835cbc25b0c.

We are starting with a fairly normal Haskell project, with a single .cabal file (but multi-package projects should work just fine). To make things more interesting, I also have a cabal.project which configures one dependency to be fetched via git from a specific fork.

To start building the nix infrastructure, we can initialize niv and configure it to use the haskell.nix repo:

niv init
niv add input-output-hk/haskell.nix -n haskellNix

This creates nix/sources.json (which you can also edit by hand) and nix/sources.nix (which you can treat like a black box).

Now we can start writing the all-important default.nix file, which defines almost everything of interest here. I will just go through it line by line, and explain what I am doing here.

{ checkMaterialization ? false }:

This defines a flag that we can later set when using nix-build, by passing --arg checkMaterialization true, and which is off by default. I’ll get to that flag later.

let
  sources = import nix/sources.nix;
  haskellNix = import sources.haskellNix {};

This imports the sources as defined niv/sources.json, and loads the pinned revision of the haskell.nix repository.

  # windows crossbuilding with ghc-8.10 needs at least 20.09.
  # A peek at https://github.com/input-output-hk/haskell.nix/blob/master/ci.nix can help
  nixpkgsSrc = haskellNix.sources.nixpkgs-2009;
  nixpkgsArgs = haskellNix.nixpkgsArgs;

  pkgs = import nixpkgsSrc nixpkgsArgs;

Now we can define pkgs, which is “our” version of the nixpkgs package set, extended with the haskell.nix machinery. We rely on haskell.nix to pin of a suitable revision of the nixpkgs set (see how we are using their niv setup).

Here we could our own configuration, overlays, etc to nixpkgsArgs. In fact, we do in

  pkgs-osx = import nixpkgsSrc (nixpkgsArgs // { system = "x86_64-darwin"; });

to get the nixpkgs package set of an OSX machine.

  # a nicer filterSource
  sourceByRegex =
    src: regexes: builtins.filterSource (path: type:
      let relPath = pkgs.lib.removePrefix (toString src + "/") (toString path); in
      let match = builtins.match (pkgs.lib.strings.concatStringsSep "|" regexes); in
      ( type == "directory"  && match (relPath + "/") != null
      || match relPath != null)) src;

Next I define a little helper that I have been copying between projects, and which allows me to define the input to a nix derivation (i.e. a nix build job) with a set of regexes. I’ll use that soon.

  tttool-exe = pkgs: sha256:
    (pkgs.haskell-nix.cabalProject {

The cabalProject function takes a cabal project and turns it into a nix project, running cabal v2-configure under the hood to let cabal figure out a suitable build plan. Since we want to have multiple variants of the tttool, this is so far just a function of two arguments pkgs and sha256, which will be explained in a bit.

      src = sourceByRegex ./. [
          "cabal.project"
          "src/"
          "src/.*/"
          "src/.*.hs"
          ".*.cabal"
          "LICENSE"
        ];

The cabalProject function wants to know the source of the Haskell projects. There are different ways of specifying this; in this case I went for a simple whitelist approach. Note that cabal.project.freze (which exists in the directory) is not included.

      # Pinning the input to the constraint solver
      compiler-nix-name = "ghc8102";

The cabal solver doesn’t find out which version of ghc to use, that is still my choice. I am using GHC-8.10.2 here. It may require a bit of experimentation to see which version works for your project, especially when cross-compiling to odd targets.

      index-state = "2020-11-08T00:00:00Z";

I want the build to be deterministic, and not let cabal suddenly pick different package versions just because something got uploaded. Therefore I specify which snapshot of the Hackage package index it should consider.

      plan-sha256 = sha256;
      inherit checkMaterialization;

Here we use the second parameter, but I’ll defer the explanation for a bit.

      modules = [{
        # smaller files
        packages.tttool.dontStrip = false;
      }] ++

These “modules” are essentially configuration data that is merged in a structural way. Here we say that we want the tttool binary to be stripped (saves a few megabyte).

      pkgs.lib.optional pkgs.hostPlatform.isMusl {
        packages.tttool.configureFlags = [ "--ghc-option=-static" ];

Also, when we are building on the musl platform, that’s when we want to produce a static build, so let’s pass -static to GHC. This seems to be enough in terms of flags to produce static binaries. It helps that my project is using mostly pure Haskell libraries; if you link against C libraries you might have to jump through additional hoops to get static linking going. The haskell.nix documentation has a section on static building with some flags to cargo-cult.

        # terminfo is disabled on musl by haskell.nix, but still the flag
        # is set in the package plan, so override this
        packages.haskeline.flags.terminfo = false;
      };

This (again only used when the platform is musl) seems to be necessary to workaround what might be a big in haskell.nix.

    }).tttool.components.exes.tttool;

The cabalProject function returns a data structure with all Haskell packages of the project, and for each package the different components (libraries, tests, benchmarks and of course executables). We only care about the tttool executable, so let’s project that out.

  osx-bundler = pkgs: tttool:
   pkgs.stdenv.mkDerivation {
      name = "tttool-bundle";

      buildInputs = [ pkgs.macdylibbundler ];

      builder = pkgs.writeScript "tttool-osx-bundler.sh" ''
        source ${pkgs.stdenv}/setup

        mkdir -p $out/bin/osx
        cp ${tttool}/bin/tttool $out/bin/osx
        chmod u+w $out/bin/osx/tttool
        dylibbundler \
          -b \
          -x $out/bin/osx/tttool \
          -d $out/bin/osx \
          -p '@executable_path' \
          -i /usr/lib/system \
          -i ${pkgs.darwin.Libsystem}/lib
      '';
    };

This function, only to be used on OSX, takes a fully build tttool, finds all the system libraries it is linking against, and copies them next to the executable, using the nice macdylibbundler. This way we can get a self-contained executable.

A nix expert will notice that this probably should be written with pkgs.runCommandNoCC, but then dylibbundler fails because it lacks otool. This should work eventually, though.

in rec {
  linux-exe      = tttool-exe pkgs
     "0rnn4q0gx670nzb5zp7xpj7kmgqjmxcj2zjl9jqqz8czzlbgzmkh";
  windows-exe    = tttool-exe pkgs.pkgsCross.mingwW64
     "01js5rp6y29m7aif6bsb0qplkh2az0l15nkrrb6m3rz7jrrbcckh";
  static-exe     = tttool-exe pkgs.pkgsCross.musl64
     "0gbkyg8max4mhzzsm9yihsp8n73zw86m3pwvlw8170c75p3vbadv";
  osx-exe        = tttool-exe pkgs-osx
     "0rnn4q0gx670nzb5zp7xpj7kmgqjmxcj2zjl9jqqz8czzlbgzmkh";

Time to create the four versions of tttool. In each case we use the tttool-exe function from above, passing the package set (pkgs,…) and a SHA256 hash.

The package set is either the normal one, or it is one of those configured for cross compilation, building either for Windows or for Linux using musl, or it is the OSX package set that we instantiated earlier.

The SHA256 hash describes the result of the cabal plan calculation that happens as part of cabalProject. By noting down the expected result, nix can skip that calculation, or fetch it from the nix cache etc.

How do we know what number to put there, and when to change it? That’s when the --arg checkMaterialization true flag comes into play: When that is set, cabalProject will not blindly trust these hashes, but rather re-calculate them, and tell you when they need to be updated. We’ll make sure that CI checks them.

  osx-exe-bundle = osx-bundler pkgs-osx osx-exe;

For OSX, I then run the output through osx-bundler defined above, to make it independent of any library paths in /nix/store.

This is already good enough to build the tool for the various systems! The rest of the the file is related to packaging up the binaries, to tests, and various other things, but nothing too essentially. So if you got bored, you can more or less stop now.

  static-files = sourceByRegex ./. [
    "README.md"
    "Changelog.md"
    "oid-decoder.html"
    "example/.*"
    "Debug.yaml"
    "templates/"
    "templates/.*\.md"
    "templates/.*\.yaml"
    "Audio/"
    "Audio/digits/"
    "Audio/digits/.*\.ogg"
  ];

  contrib = ./contrib;

The final zip file that I want to serve to my users contains a bunch of files from throughout my repository; I collect them here.

  book = …;

The project comes with documentation in the form of a Sphinx project, which we build here. I’ll omit the details, because they are not relevant for this post (but of course you can peek if you are curious).

  os-switch = pkgs.writeScript "tttool-os-switch.sh" ''
    #!/usr/bin/env bash
    case "$OSTYPE" in
      linux*)   exec "$(dirname "''${BASH_SOURCE[0]}")/linux/tttool" "$@" ;;
      darwin*)  exec "$(dirname "''${BASH_SOURCE[0]}")/osx/tttool" "$@" ;;
      msys*)    exec "$(dirname "''${BASH_SOURCE[0]}")/tttool.exe" "$@" ;;
      cygwin*)  exec "$(dirname "''${BASH_SOURCE[0]}")/tttool.exe" "$@" ;;
      *)        echo "unsupported operating system $OSTYPE" ;;
    esac
  '';

The zipfile should provide a tttool command that works on all systems. To that end, I implement a simple platform switch using bash. I use pks.writeScript so that I can include that file directly in default.nix, but it would have been equally reasonable to just save it into nix/tttool-os-switch.sh and include it from there.

  release = pkgs.runCommandNoCC "tttool-release" {
    buildInputs = [ pkgs.perl ];
  } ''
    # check version
    version=$(${static-exe}/bin/tttool --help|perl -ne 'print $1 if /tttool-(.*) -- The swiss army knife/')
    doc_version=$(perl -ne "print \$1 if /VERSION: '(.*)'/" ${book}/book.html/_static/documentation_options.js)

    if [ "$version" != "$doc_version" ]
    then
      echo "Mismatch between tttool version \"$version\" and book version \"$doc_version\""
      exit 1
    fi

Now the derivation that builds the content of the release zip file. First I double check that the version number in the code and in the documentation matches. Note how ${static-exe} refers to a path with the built static Linux build, and ${book} the output of the book building process.

    mkdir -p $out/
    cp -vsr ${static-files}/* $out
    mkdir $out/linux
    cp -vs ${static-exe}/bin/tttool $out/linux
    cp -vs ${windows-exe}/bin/* $out/
    mkdir $out/osx
    cp -vsr ${osx-exe-bundle}/bin/osx/* $out/osx
    cp -vs ${os-switch} $out/tttool
    mkdir $out/contrib
    cp -vsr ${contrib}/* $out/contrib/
    cp -vsr ${book}/* $out
  '';

The rest of the release script just copies files from various build outputs that we have defined so far.

Note that this is using both static-exe (built on Linux) and osx-exe-bundle (built on Mac)! This means you can only build the release if you either have setup a remote osx builder (a pretty nifty feature of nix, which I unfortunately can’t use, since I don’t have access to a Mac), or the build product must be available in a nix cache (which it is in my case, as I will explain later).

The output of this derivation is a directory with all the files I want to put in the release.

  release-zip = pkgs.runCommandNoCC "tttool-release.zip" {
    buildInputs = with pkgs; [ perl zip ];
  } ''
    version=$(bash ${release}/tttool --help|perl -ne 'print $1 if /tttool-(.*) -- The swiss army knife/')
    base="tttool-$version"
    echo "Zipping tttool version $version"
    mkdir -p $out/$base
    cd $out
    cp -r ${release}/* $base/
    chmod u+w -R $base
    zip -r $base.zip $base
    rm -rf $base
  '';

And now these files are zipped up. Note that this automatically determines the right directory name and basename for the zipfile.

This concludes the step necessary for a release.

  gme-downloads = …;
  tests = …;

These two definitions in default.nix are related to some simple testing, and again not relevant for this post.

  cabal-freeze = pkgs.stdenv.mkDerivation {
    name = "cabal-freeze";
    src = linux-exe.src;
    buildInputs = [ pkgs.cabal-install linux-exe.env ];
    buildPhase = ''
      mkdir .cabal
      touch .cabal/config
      rm cabal.project # so that cabal new-freeze does not try to use HPDF via git
      HOME=$PWD cabal new-freeze --offline --enable-tests || true
    '';
    installPhase = ''
      mkdir -p $out
      echo "-- Run nix-shell -A check-cabal-freeze to update this file" > $out/cabal.project.freeze
      cat cabal.project.freeze >> $out/cabal.project.freeze
    '';
  };

Above I mentioned that I still would like to be able to just run cabal, and ideally it should take the same library versions that the nix-based build does. But pinning the version of ghc in cabal.project is not sufficient, I also need to pin the precise versions of the dependencies. This is best done with a cabal.project.freeze file.

The above derivation runs cabal new-freeze in the environment set up by haskell.nix and grabs the resulting cabal.project.freeze. With this I can run nix-build -A cabal-freeze and fetch the file from result/cabal.project.freeze and add it to the repository.

  check-cabal-freeze = pkgs.runCommandNoCC "check-cabal-freeze" {
      nativeBuildInputs = [ pkgs.diffutils ];
      expected = cabal-freeze + /cabal.project.freeze;
      actual = ./cabal.project.freeze;
      cmd = "nix-shell -A check-cabal-freeze";
      shellHook = ''
        dest=${toString ./cabal.project.freeze}
        rm -f $dest
        cp -v $expected $dest
        chmod u-w $dest
        exit 0
      '';
    } ''
      diff -r -U 3 $actual $expected ||
        { echo "To update, please run"; echo "nix-shell -A check-cabal-freeze"; exit 1; }
      touch $out
    '';

But generated files in repositories are bad, so if that cannot be avoided, at least I want a CI job that checks if they are up to date. This job does that. What’s more, it is set up so that if I run nix-shell -A check-cabal-freeze it will update the file in the repository automatically, which is much more convenient than manually copying.

Lately, I have been using this pattern regularly when adding generated files to a repository:

Create one nix derivation that creates the files
Create a second derivation that compares the output of that derivation against the file in the repo
Create a derivation that, when run in nix-shell, updates that file. Sometimes that derivation is its own file (so that I can just run nix-shell nix/generate.nix), or it is merged into one of the other two.

This concludes the tour of default.nix.

The CI setup

The next interesting bit is the file .github/workflows/build.yml, which tells Github Actions what to do:

name: "Build and package"
on:
  pull_request:
  push:

Standard prelude: Run the jobs in this file upon all pushes to the repository, and also on all pull requests. Annoying downside: If you open a PR within your repository, everything gets built twice. Oh well.

jobs:
  build:
    strategy:
      fail-fast: false
      matrix:
        include:
        - target: linux-exe
          os: ubuntu-latest
        - target: windows-exe
          os: ubuntu-latest
        - target: static-exe
          os: ubuntu-latest
        - target: osx-exe-bundle
          os: macos-latest
    runs-on: ${{ matrix.os }}

The “build” job is a matrix job, i.e. there are four variants, one for each of the different tttool builds, together with an indication of what kind of machine to run this on.

    - uses: actions/checkout@v2
    - uses: cachix/install-nix-action@v12

We begin by checking out the code and installing nix via the install-nix-action.

    - name: "Cachix: tttool"
      uses: cachix/cachix-action@v7
      with:
        name: tttool
        signingKey: '${{ secrets.CACHIX_SIGNING_KEY }}'

Then we configure our Cachix cache. This means that this job will use build products from the cache if possible, and it will also push new builds to the cache. This requires a secret key, which you get when setting up your Cachix cache. See the nix and Cachix tutorial for good instructions.

    - run: nix-build --arg checkMaterialization true -A ${{ matrix.target }}

Now we can actually run the build. We set checkMaterialization to true so that CI will tell us if we need to update these hashes.

    # work around https://github.com/actions/upload-artifact/issues/92
    - run: cp -RvL result upload
    - uses: actions/upload-artifact@v2
      with:
        name: tttool (${{ matrix.target }})
        path: upload/

For convenient access to build products, e.g. from pull requests, we store them as Github artifacts. They can then be downloaded from Github’s CI status page.

  test:
    runs-on: ubuntu-latest
    needs: build
    steps:
    - uses: actions/checkout@v2
    - uses: cachix/install-nix-action@v12
    - name: "Cachix: tttool"
      uses: cachix/cachix-action@v7
      with:
        name: tttool
        signingKey: '${{ secrets.CACHIX_SIGNING_KEY }}'
    - run: nix-build -A tests

The next job repeats the setup, but now runs the tests. Because of needs: build it will not start before the builds job has completed. This also means that it should get the actual tttool executable to test from our nix cache.

  check-cabal-freeze:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2
    - uses: cachix/install-nix-action@v12
    - name: "Cachix: tttool"
      uses: cachix/cachix-action@v7
      with:
        name: tttool
        signingKey: '${{ secrets.CACHIX_SIGNING_KEY }}'
    - run: nix-build -A check-cabal-freeze

The same, but now running the check-cabal-freeze test mentioned above. Quite annoying to repeat the setup instructions for each job…

  package:
    runs-on: ubuntu-latest
    needs: build
    steps:
    - uses: actions/checkout@v2
    - uses: cachix/install-nix-action@v12
    - name: "Cachix: tttool"
      uses: cachix/cachix-action@v7
      with:
        name: tttool
        signingKey: '${{ secrets.CACHIX_SIGNING_KEY }}'

    - run: nix-build -A release-zip

    - run: unzip -d upload ./result/*.zip
    - uses: actions/upload-artifact@v2
      with:
        name: Release zip file
        path: upload

Finally, with the same setup, but slightly different artifact upload, we build the release zip file. Again, we wait for build to finish so that the built programs are in the nix cache. This is especially important since this runs on linux, so it cannot build the OSX binary and has to rely on the cache.

Note that we don’t need to checkMaterialization again.

Annoyingly, the upload-artifact action insists on zipping the files you hand to it. A zip file that contains just a zipfile is kinda annoying, so I unpack the zipfile here before uploading the contents.

Conclusion

With this setup, when I do a release of tttool, I just bump the version numbers, wait for CI to finish building, run nix-build -A release-zip and upload result/tttool-n.m.zip. A single file that works on all target platforms. I have not yet automated making the actual release, but with one release per year this is fine.

Also, when trying out a new feature, I can easily create a branch or PR for that and grab the build products from Github’s CI, or ask people to try them out (e.g. to see if they fixed their bugs). Note, though, that you have to sign into Github before being able to download these artifacts.

One might think that this is a fairly hairy setup – finding the right combinations of various repertories so that cross-compilation works as intended. But thanks to nix’s value propositions, this does work! The setup presented here was a remake of a setup I did two years ago, with a much less mature haskell.nix. Back then, I committed a fair number of generated files to git, and juggled more complex files … but once it worked, it kept working for two years. I was indeed insulated from upstream changes. I expect that this setup will also continue to work reliably, until I choose to upgrade it again. Hopefully, then things are even more simple, and require less work-around or manual intervention.

Comments

Have something to say? You can post a comment by sending an e-Mail to me at <mail@joachim-breitner.de>, and I will include it here.

Go up