Joachim Breitner

Blog

GHC performance is rather stable

Published 2016-02-09 in sections English, Haskell.

Johannes Bechberger, while working on his Bachelor’s thesis supervised by my colleague Andreas Zwinkau, has developed a performance benchmark runner and results visualizer called “temci”, and used GHC as a guinea pig. You can read his elaborate analysis on his blog.

This is particularly interesting given recent discussions about GHC itself becoming slower and slower, as for example observed by Johannes Waldmann and Anthony Cowley.

Johannes Bechberger’s take-away is that, at least for the programs at hand (which were taken from the The Computer Language Benchmarks Game, there are hardly any changes worth mentioning, as most of the observed effects are less than a standard deviation and hence insignificant. He tries hard to distill some useful conclusions from the data; the one he finds are:

  • Compile time does not vary significantly.
  • The compiler flag -O2 indeed results in faster code than -O.
  • With -O (but not -O2), GHC 8.0.1 is better than GHC 7.0.1. Maybe some optimizations were promoted to -O?

If you are interested, please head over to Johannes’s post and look at the gory details of the analysis and give him feedback on that. Also, maybe his tool temci is something you want to try out?

Personally, I find it dissatisfying to learn so little from so much work, but as he writes: “It’s so easy to lie with statistics.”, and I might add “lie to yourself”, e.g. by ignoring good advise about standard deviations and significance. I’m sure my tool gipeda (which powers perf.haskell.org) is guilty of that sin.

Maybe a different selection of test programs would yield more insight; the benchmark’s games programs are too small and hand-optimized, the nofib programs are plain old and the fibon collection has bitrotted. I would love to see a curated, collection of real-world programs, bundled with all dependencies and frozen to allow meaningful comparisons, but updated to a new, clearly marked revision, on a maybe bi-yearly basis – maybe Haskell-SPEC-2016 if that were not a trademark infringement.

Protecting static content with mod_rewrite

Published 2016-02-08 in sections English, Digital World.

Since fourteen years, I have been photographing digitally and putting the pictures on my webpage. Back then, online privacy was not a big deal, but things have changed, and I had to at least mildly protect the innocent. In particular, I wanted to prevent search engines from accessing some of my pictures.

As I did not want my friends and family having to create an account and remember a password, I set up an OpenID based scheme five years ago. This way, they could use any of their OpenID enabled account, e.g. their Google Mail account, to log in, without disclosing any data to me. As my photo album consists of just static files, I created two copies on the server: the real one with everything, and a bunch of symbolic links representing the publicly visible parts. I then used mod_auth_openid to prevent access to the protected files, unless the users logged in. I never got around of actually limiting who could log in, so strangers were still able to see all photos, but at least search engine spiders were locked out.

But, very unfortunately, OpenID did never really catch on, Google even stopped being a provider, and other promising decentralized authentication schemes like Mozilla Persona are also phased out. So I needed an alternative.

A very simply scheme would be a single password that my friends and family can get from me and that unlocks the pictures. I could have done that using HTTP Auth, but that is not very user-friendly, and the login does not persist (at least not without the help of the browser). Instead, I wanted something that involves a simple HTTP form. But I also wanted to avoid server-side programming, for performance and security reasons. I love serving static files whenever it is feasible.

Then I found that mod_rewrite, Apache’s all-around-tool for URL rewriting and request mangling, supports reading and writing cookies! So I came up with a scheme that implements the whole login logic in the Apache server configuration. I’d like to describe this setup here, in case someone finds it inspiring.

I created a login.html with a simple HTML form:

<form method="GET" action="/bilder/login.html">
 <div style="text-align:center">
  <input name="password" placeholder="Password" />
  <button type="submit">Sign-In</button>
 </div>
</form>

It sends the user to the same page again, putting the password into the query string, hence the method="GET"mod_rewrite unfortunately cannot read the parameters of a POST request.

The Apache configuration is as follows:

RewriteMap public "dbm:/var/www/joachim-breitner.de/bilder/publicfiles.dbm"
<Directory /var/www/joachim-breitner.de/bilder>
 RewriteEngine On

 # This is a GET request, trying to set a password.
 RewriteCond %{QUERY_STRING} password=correcthorsebatterystaple
 RewriteRule ^login.html /bilder/loggedin.html [L,R,QSD,CO=bilderhp:correcthorsebatterystaple:www.joachim-breitner.de:2000000:/bilder]

 # This is a GET request, trying to set a wrong password.
 RewriteCond %{QUERY_STRING} password=
 RewriteRule ^login.html /bilder/notloggedin.html [L,R,QSD]

 # No point in loggin in if there is already the right password
 RewriteCond %{HTTP:Cookie} bilderhp=correcthorsebatterystaple
 RewriteRule ^login.html /bilder/loggedin.html [L,R]

 # If protected file is requested, check for cookie.
 # If no cookie present, redirect pictures to replacement picture
 RewriteCond %{HTTP:Cookie} !bilderhp=correcthorsebatterystaple
 RewriteCond ${public:$0|private} private
 RewriteRule ^.*\.(png|jpg)$ /bilder/pleaselogin.png [L]

 RewriteCond %{HTTP:Cookie} !bilderhp=correcthorsebatterystaple
 RewriteCond ${public:$0|private} private
 RewriteRule ^.+$ /bilder/login.html [L,R]
</Directory>

The publicfiles.dbm file is generated from a text file with lines like

login.html.en 1
login.html.de 1
pleaselogin.png 1
thumbs/20030920165701_thumb.jpg 1
thumbs/20080813225123_thumb.jpg 1
...

using

/usr/sbin/httxt2dbm -i publicfiles.txt -o publicfiles.dbm

and whitelists all files that are visible without login. Make sure it contains the login page, otherwise you’ll get a redirect circle.

The other directives in the above configure fulfill these tasks:

  • If the password (correcthorsebatterystaple) is in the query string, the server redirects the user to a logged-in-page that tells him that the login was successful and instructs him to reload the photo album. It also sets a cookie that will last very long -- after all, I want this to be convenient for my visitors. The query string parsing is not very strict (e.g. a password of correcthorsebatterystaplexkcdrules would also work), but that’s ok.
  • The next request detects an attempt to set a password. It must be wrong (otherwise the first rule would have matched), so we redirect the user to a variant of the login page that tells him so.
  • If the user tries to access the login page with a valid cookie, just log him in.
  • The next two rules implement the actual protection. If there no valid cookie and the accessed file is not whitelisted, then access is forbidden. For requests to images, we do an internal redirect to a placeholder image, while for everything else we redirect the user to the login page.

And that’s it! No resource-hogging web frameworks, not security-dubious scripting languages, and a dead-simple way to authenticate.

Oh, and if you believe you know me well enough to be allowed to see all photos: The real password is not correcthorsebatterystaple; just ask me what it is.

Dreaming of role playing

Published 2016-01-28 in sections English.

Recently, at a summer-school-like event, we were discussing pen-and-paper role playing. I’m not sure if this was after a session of role-playing, but I was making the point that you don’t need much or any at all of the rules, and scores, and dice, if you are one of the story-telling role players, and it can actually be more fun this way.

As an example, I said, it can make sense if one of the players (and the game master, I suppose) reads up a lot about one aspect of the fantasy world, e.g. one geographical area, one cult, one person, and then this knowledge is used to create an exciting puzzle, even without any opponents.

I’m not quite sure, but I think I fell asleep shortly after, and I dreamed of such a role playing session. It was going roughly like this:

I (a human), and my fellows (at least a dwarf, not sure about the rest) went to some castle. It was empty, but scary. We crossed its hall, and went into a room on the other side. It was locked towards the hall by a door that covered the door frame only partly, and suddenly we could see a large Ogre, together with other foul folk not worth mentioning, hammered at the door. My group (which was a bit larger in that moment) all prepared shooting arrows at him the moment it burst through the door. I had the time to appreciate the ingenuity that we all waited for him to burst through, so that none of the arrows would bounce of the door, but it did not help, and we ran from the castle, over a field, through a forest, at the other side of which we could see, below a sleep slope, a house, so we went there.

The path towards that was filled with tracks that looked surprisingly like car tracks. When we reached the spot there was no house any more, but rather a cold camp side. We saw digging tools, and helmets (strangely, baseball helmets) were arranged in a circle, as if it was a burial site.

We set up camp there and slept.

It occurred to me that I must have been the rightful owner of the castle, and it was taken by me from my brother and his wife, who denied my existence or something treacherously like that. When we woke up at the camp side, she were there, together with what must be my niece. My sister in law mocked us for fighting unsuccessfully at the castle, but my niece was surprised to see me, as I must have a very similar appearance to my brother. She said that her mother forbid it, but she nevertheless sneakily takes out something which looks like a Gameboy with a camera attachment and a CompactFlash card from her mothers purse, puts it in and take a photo of me. This is when I realize that I will get my castle back.

At that moment, I woke up. I somewhat liked the story (and it was a bit more coherent in my mind than what I then wrote down here), so I wanted to write it down. I quickly fetched my laptop. My friends at the summer school were a bit worried, and I promised not to mention their names and concrete places, and started writing. They distracted me, so I searched for a place of my own, lied down (why? no idea), and continued writing. I had to to touch writing on my belly, because my laptop was not actually there.

I also noticed that I am back at the camp side, and that I am still wearing my back protector that I must have been wearing while fighting in the castle, and which I did not take off while sleeping at the camp side. Funnily, it was not a proper medieval amour, but rather my snowboarding back protector.

At that moment, I woke up. I somewhat liked the story (and it was a bit more coherent in my mind than what I then wrote down here), so I wanted to write it down. I quickly got up, started my laptop, and wrote it down. And this is what you are reading right now.

Off to bed again, let’s see what happens next.

A multitude of early Christmas presents

Published 2015-12-18 in sections English, Tiptoi, Haskell.

Today was a nice day with a surprising number of early Christmas presents:

  • I submitted my Ph.D. thesis “Lazy Evaluation: From formal semantics to to a machine-checked compiler transformation” about an Isabelle formalisation of Launchbury’s natural semantics (also found in the Launchbury entry in the Archive of Formal Proofs), about the GHC optimization “Call Arity” that I developed (see the TFP 2014 paper) and a formal proof that it is indeed an optimization (Haskell Symposium 2015 paper, AFP entry). After 243 pages, it is finally done and I can now turn to new things.

    Speaking of “something new”: I’m hunting jobs right now. I am in particular interested in academic post-doc positions in my field (functional programming languages and/or interactive theorem proving), but I’d also like to hear about attractive non-academic positions that fit my profile.

  • Someone who I value and respect a lot asked me to co-author a paper with him. I am not sure whether further details at this stage belong to a public blog post, so I’ll keep them back. But nevertheless this is very nice.

  • A paper on information flow control by my colleagues Jürgen Graf, Martin Hecker and Martin Mohr, my advisor Gregor Snelting and (although only marginally) me was accepted at POST 2016, and the notification reached us today.

  • Together with Carsten Podszun, I have written an article about the Ravensburger Tip-toi pen and how to create your own books for that using the tttool program that I have created for the German magazine Make:, and it is included in issue 6/2015, which arrived today in my mailbox.

  • The Incredible Proof Machine attracts new contributors. Well, one. But a few days ago, I heard that a 11½ year old girl has great fun with it!

  • Tonight, my group is having their Christmas party and I can expect to be fed well by home-cooked Mexican something. Got to go...

HaL 10

Published 2015-12-04 in sections English, Haskell.

Today I had the honour to hold my first invited talk: I was the keynote speaker at the tenth Haskell in Leipzig workshop. I chose to present how I used the MonadFix type class to conveniently and by-construction-correctly calculate offsets while serialising to a binary data format in the implementation of tttool. Johannes Waldmann could not resist to point out that MonadFix was already a topic at the very first HaL meeting nine years ago, but nevertheless I think that my live-coded slide-less talk was quite suitable for the given audience of roughly 55 attendees.

I have uploaded a transcript of this talk.

Reproducible Builds World Summit

Published 2015-12-04 in sections English, Debian.

This week, I was attending the first Reproducible Builds World Summit in Athens. A while ago, I have fixed some reproducibility bugs in the Haskell compiler GHC (parts of #4012), and that got me a ticket to this fully sponsored and nicely organised event.

This was the first event I attended that had a dedicated and professional moderator, Gunner from Aspiration, who came up with a suitable on-the-fly-schedule, ensured the time was structured by shorter plenary rounds with all ~40 attendees as well as longer working sessions in groups of usually at most 10 people, motivated people to run (“facilitate”) these working groups as well as ensured that any results were noted and collected. This definitely makes a difference to the otherwise unstructured “let’s all sit in a big circle and endlessly discuss things“ or “let’s all just hack on what we want to hack on right now” structure of less professionally organised meetings.

But after 1½ days of talking, a certain desire to hack and get work done could be felt throughout the group, so most of us were happy to be allowed to make use of the laptop on Wednesday afternoon. At least as far as possible – the wireless was not expecting this crowd, it seems.

Besides the few patches above, my involvement in the project is rather rudimentary, so I tried to contribute as far as possible during the event itself. This was not always easy, as many of the working sessions were either self-reflecting or forward-facing: What problems do we see? What should we get done next? How do we talk to users/upstream/press/doners? Where possible, I tried to at least be helpful in the discussion.

During the hacking sessions, I found it most useful to contribute to diffoscope, a program to recursively and deeply show the difference between files of any format, so there are now patches to implement a multi-file HTML output, to more efficiently include large diffs, and – more importantly – I assisted Lunar in making diffoscope multi-threaded. In Python this is not a very great experience; I guess I am spoiled by Haskell (as it is often the case).

I enjoyed the group; a bit like DebConf but with many new faced from similarly-minded, but different projects. It was, however, again a Free Software event with an embarrassing low number of female participants. I liked discussing a proof of Gödel’s first incompleteness theorem during a skill exchange session. Unfortunately, I had to leave very timely after the event on Thursday evening, in order to attend the Haskell in Leipzig workshop.

Freierzone, jetzt erst recht

Published 2015-11-24 in sections Deutsch.

Vor drei Tagen habe ich ja hier einen Vorschlag für ein alternatives Fußgängerzonen-Schild für die Karlsruher Brunnenstraße gemacht. Und siehe da, es hat sich jemand (das Tiefbauamt? ein Street-Artist? die arbeitende Bevölkerung der Brunnenstraße?) meinem Vorschlag angenommen, und tatsächlich das Schild geändert:

Neue Beschilderung der Karlsruher Brunnenstraße

Neue Beschilderung der Karlsruher Brunnenstraße

Mehr Bilder auf meiner Bilder-Seite.

Ich bin gespannt wie lange es so hängen wird. Wenn Ihr es die nächsten Tage seht, lasst es mich wissen!

Update (24.11.2015): Jemand auf Instagram hat das Schild auch entdeckt.

Update (25.11.2015): Leider hing das Schild schon heute morgen nicht mehr.

Update (25.11.2015): Auch ka-news berichtet.

Freierzone

Published 2015-11-21 in sections Deutsch.

In Karlsruhe gibt es eine berüchtigte Straße, die Brunnenstraße, die schön zentral zwischen Studentenkneipe und Campus liegt. Diese ist nicht nur für die eine Art von Verkehr interessant, sondern auch für den anderen, normalen Verkehr: Es ist eine Fußgängerzone, die auch für Radfahrer freigegeben ist.

Beschilderung der Karlsruher Brunnenstraße

Beschilderung der Karlsruher Brunnenstraße

Soweit, so gut, wir sind ja in Karlsruhe. Inhaltlich finde ich diese Beschilderung durchaus sinnvoll. Aber die Gestaltung ist sehr unpassend – eine Frau mit Kind an der Hand ist doch in diesem Milieu äußerst fehl am Platz. Daher, im Folgenden mein Vorschlag zur passenden Verkehrs-Beschilderung: Das Verkehrszeichen 242 in der Rotlicht-Variante:

Freierzone

Freierzone

Ich hoffe das Karlsruher Tiefbauamt liest mit.

Update: Ich weiß nicht ob es das Tiefbauamt war, aber irgendwer hat mitgelesen

Mehr als Trinken mit Linken

Published 2015-10-16 in sections Deutsch, Politik.

An der Litfaßsäule Vor dem Info-Gebäube klebt zur Zeit dieses Plakat:

Trinken mit Linken

Trinken mit Linken

So eingängige Sprüche sollen wohl nicht nur einem politischem Sprektrum vorbehalten sein, denn kurze Zeit später hingen da noch ein paar mehr Plakate:

Fechten mit Rechten

Fechten mit Rechten

Sühnen mit Grünen

Sühnen mit Grünen

Die Litfaßsäule in der Totalen

Die Litfaßsäule in der Totalen

(Und ja, ich würde mir Wünschen dass mein Jolla Phone eine bessere Kamera hätte.)

Constructing a list in a monad revisited

Published 2015-10-14 in sections English, Haskell.

Two years ago, I discussed various ways of constructing a list in a Monad (or, more specifically, in IO) in Haskell, and compared their merits in terms of elegance, stack usage, number of traversals of the list and run-time efficiency.

Recently, two blog posts discussed the issue further and proposed new, more daring alternatives. Neil Mitchell breaks through the abstraction provided by IO, duplicates “the world” and traverses the list twice, and obtains a speed-up for long lists.

Twarn van Laarhoven went even further and wrote custom C-- code to destructively update the tail-pointer of the list cell to be able to create the list completely evaluated on the first start. This basically answers my question from two years ago:

I’m still wondering: What would be required from Haskell, GHC or the monads in question to have a fully satisfactory solution here, i.e. one that is as fast as the naive recursion, as stack efficient as the difference list solution, and allocates no thunks, only list cells?

He also has a variant with a slightly nicer interface around “holes”, i.e. explicit objects on the heap that can later be replaced by indirections. Obviously, both approaches are very unsafe.

I took this as an opportunity to redo my benchmark measurements, and include their variants (named escapeIO, hackIO and holeIO). The following table omits the variants with quadratic performance, as I ran it on longer lists now:

Variant 10^0 10^1 10^2 10^3 10^4 10^5 10^6
accumReverse 37ns 153ns 1134ns 12µs 208µs 8540µs 97ms
recursion 29ns 139ns 680ns 6790ns 160µs 6441µs 76ms
replicateM 26ns 126ns 677ns 6785ns 168µs 6314µs 78ms
accumDList 35ns 165ns 995ns 10µs 190µs 9706µs 100ms
streams 27ns 136ns 691ns 6788ns 173µs 5771µs 75ms
unsafeInterleave 60ns 329ns 2804ns 28µs 373µs 5605µs 57ms
listFix 51ns 412ns 4109ns 56µs 2761µs 42ms 445ms
escapeIO 41ns 187ns 1808ns 16µs 234µs 4409µs 45ms
hackIO 30ns 152ns 1199ns 11µs 140µs 3701µs 42ms
holeIO 40ns 222ns 1725ns 17µs 218µs 4446µs 53ms

The following graph shows that around 10000, the naive approaches become much slower and the fancy hacks pay of, with Twarn’s tail-pointer-updating code performing the best:

I would really like to see a package that provides a API like Twarn’s holes, either in this raw unsafe variant (but with the garbage collector related code checked), or with a safe API using type hackery similar to the ST monad that ensures that after “normal” code gets its hands on a term possibly involving holes, the holes may no longer be modified.

I have put the code and results on GitHub.