Joachim Breitner

Evaluation-State Assertions in Haskell

Published 2013-02-06 in sections English, Haskell.

I have just uploaded a new version of ghc-heap-view to Hackage that provides “Evaluation state assertions” in the module GHC.AssertNF.

Imagine you are writing a web application in Haskell that sports a global number-of-visitors counter in an IORef Int. For every request, you call modifyIORef with (+1). Eventually, you notice that your very popular web site hogs more and more memory. So you browse to the internal page that shows the counter, and you have to wait a long time until you eventually see the result (or get a stack overflow). The reason: the applications of (+1) were not performed until you looked at the number; instead, a long chain of such suspended computations first filled your heap and then your stack.

So you have learned the hard way that you want to avoid space leaks, want calculations to be done during the request that caused them, and want the IORef to always contain fully evaluated data. You stumble upon modifyIORef' in Data.IORef and indeed, this fixes your problem.
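To make the difference concrete, here is a minimal sketch of the two counters, using only base. The names countLazy and countStrict are made up for this illustration. Both counters end up with the same number; what differs is when the additions are actually performed.

```haskell
import Control.Monad (replicateM_)
import Data.IORef (IORef, modifyIORef, modifyIORef', newIORef, readIORef)

-- Lazy update: each call stores one more unevaluated (+1) application.
countLazy :: IORef Int -> IO ()
countLazy ref = modifyIORef ref (+1)

-- Strict update: the new value is evaluated to WHNF before it is stored.
-- For an Int, WHNF is already normal form.
countStrict :: IORef Int -> IO ()
countStrict ref = modifyIORef' ref (+1)

main :: IO ()
main = do
  lazyRef <- newIORef 0
  replicateM_ 100000 (countLazy lazyRef)      -- builds a chain of thunks
  strictRef <- newIORef 0
  replicateM_ 100000 (countStrict strictRef)  -- holds a plain Int throughout
  readIORef lazyRef >>= print                 -- the chain is evaluated only here
  readIORef strictRef >>= print
```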

Later, you notice that you want to count POST and GET requests separately. You change the type to IORef (Int, Int) and call modifyIORef' (first (+1)) or modifyIORef' (second (+1)). And suddenly, the space leak is back (which you only notice after the next push to the real site, because your local tests never caused enough requests to make it noticeable). So you not only want to fix it, you also want to ensure that it does not break again.
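Why the leak comes back can be shown without any memory profiling. The sketch below (base only; the helper firstComponentIsBottom is invented for this example) stores an error value in the first component. modifyIORef' evaluates the new pair only to weak head normal form, that is, up to the pair constructor, so the write succeeds and the error only surfaces when the component itself is forced later.

```haskell
import Control.Arrow (first)
import Control.Exception (SomeException, evaluate, try)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Reads the pair and reports whether forcing its first component throws.
firstComponentIsBottom :: IORef (Int, Int) -> IO Bool
firstComponentIsBottom ref = do
  (a, _) <- readIORef ref
  r <- try (evaluate a) :: IO (Either SomeException Int)
  return (either (const True) (const False) r)

main :: IO ()
main = do
  ref <- newIORef (0, 0)
  -- The pair constructor is forced, the component inside it is not,
  -- so this line does not raise the error.
  modifyIORef' ref (first (const (error "forced too late")))
  putStrLn "write succeeded"
  firstComponentIsBottom ref >>= print
```

The same thing happens with first (+1): the additions pile up inside the pair, out of reach of modifyIORef'.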

In other words, you want to enforce the policy that values stored in an IORef are always in normal form. You achieve this with the following alternative to modifyIORef':

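import Data.IORef (IORef, readIORef, writeIORef)
import GHC.AssertNF (assertNF)

modifyIORef'Assert :: IORef a -> (a -> a) -> IO ()
modifyIORef'Assert ref f = do
    x <- readIORef ref
    let x' = f x
    -- evaluate to weak head normal form, just like modifyIORef' does
    x' `seq` return ()
    -- warn on stderr if any thunk is left inside the value
    assertNF x'
    writeIORef ref x'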

Using this instead of modifyIORef' will print the following warning to standard error the very first time you call modifyIORef'Assert (first (+1)):

Parameter not in normal form: 2 thunks found:
let x1 = (S# 1,S# 1)
in _bh (_thunk x1 (_bco (S# 1)),_sel x1)

(Otherwise, the program runs as usual.) So obviously, you need to use a strict variant of first (or strict pairs):

first' :: (a -> b) -> (a, c) -> (b, c)
first' f (x,y) = let { x' = f x; r = (x', y) } in x' `seq` r `seq` r

With this, the warning goes away. Whenever you now change the type of the IORef or modify it in a way that is too lazy, you can be sure that you’ll be warned about it before the space leak itself becomes noticeable.
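The strict-pairs alternative mentioned above could look roughly like this. The type Counts and the functions countPost and countGet are hypothetical names for this sketch; only base is needed. Because both fields carry a strictness annotation, evaluating the constructor to weak head normal form (which modifyIORef' does) also evaluates the two Ints.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical counter type; the bangs make both fields strict.
data Counts = Counts !Int !Int
  deriving (Eq, Show)

-- Count a POST request (first field) or a GET request (second field).
countPost, countGet :: IORef Counts -> IO ()
countPost ref = modifyIORef' ref (\(Counts p g) -> Counts (p + 1) g)
countGet  ref = modifyIORef' ref (\(Counts p g) -> Counts p (g + 1))

main :: IO ()
main = do
  ref <- newIORef (Counts 0 0)
  countPost ref
  countGet ref
  countPost ref
  readIORef ref >>= print  -- prints: Counts 2 1
```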

In production code, you might want to disable the check. To do so, simply call disableAssertNF somewhere in your main function.
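A minimal sketch of how that might look, assuming the ghc-heap-view package is installed. The productionMode constant is a made-up stand-in for however your application decides whether it is running in production.

```haskell
import Control.Monad (when)
import GHC.AssertNF (assertNF, disableAssertNF)

-- Hypothetical switch; a real application would probably read this
-- from its configuration or from a command line flag.
productionMode :: Bool
productionMode = True

main :: IO ()
main = do
  -- Turn the checks off once, before any request is handled.
  when productionMode disableAssertNF
  let visitors = map (+1) [1, 2, 3 :: Int]
  -- With the checks disabled, this call does not print a warning.
  assertNF visitors
  print (sum visitors)
```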

Why is this better than just calling deepseq in modifyIORef'Assert? Because with deepseq, the code still creates unwanted thunks, which are then evaluated just before they are stored in the IORef, whereas with assertNF you are told about the thunks and can prevent them from being created in the first place. Also, assertNF does not add a type class constraint.
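For comparison, a deepseq-based variant might look like the following sketch. The name modifyIORefDeep is invented here, and the code assumes the deepseq package, which ships with GHC. Note the NFData constraint in the type, and that first (+1) still allocates its thunks before force walks the value and evaluates them.

```haskell
import Control.Arrow (first)
import Control.DeepSeq (NFData, force)
import Control.Exception (evaluate)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Fully evaluates the new value before storing it. This hides overly
-- lazy code instead of reporting it, and it requires an NFData instance.
modifyIORefDeep :: NFData a => IORef a -> (a -> a) -> IO ()
modifyIORefDeep ref f = do
  x <- readIORef ref
  x' <- evaluate (force (f x))  -- thunks are created by f, then forced here
  writeIORef ref x'

main :: IO ()
main = do
  ref <- newIORef (0 :: Int, 0 :: Int)
  modifyIORefDeep ref (first (+1))
  readIORef ref >>= print  -- prints: (1,0)
```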

This is just one example application for assertNF (and its variants assertNFNamed, which includes a name in the warning to better spot the cause, and $assertNFHere, which uses Template Haskell to include the current source code position in the warning), and I hope that there are more. If you happen to make use of it, I’d like to hear your story.
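The two variants could be used roughly as follows. This is a sketch that assumes the ghc-heap-view package is installed and that assertNFNamed takes the name as its first argument; the label "request counters" is just an example. The splice needs the TemplateHaskell extension.

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Control.Arrow (first)
import GHC.AssertNF (assertNFHere, assertNFNamed)

main :: IO ()
main = do
  -- A value that is deliberately too lazy: first (+1) leaves thunks behind.
  let counters = first (+1) (0 :: Int, 0 :: Int)
  -- The warning includes the given name.
  assertNFNamed "request counters" counters
  -- The warning includes the source position of this line.
  $assertNFHere counters
  print counters
```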

Comments

Joachim,

I think you forgot to push your private darcs branch to the public repo :-).

Cheers,
Erik
#1 Erik on 2013-02-10
Indeed, the version bump was missing. Thanks.
#2 Joachim Breitner on 2013-02-10
When I cabal install this, I notice that I get a message:

setup: This library cannot be built using profiling. Try invoking cabal with the --disable-library-profiling flag.

Does this mean I can't use this library and create a profiling-enabled version of my application?

Cheers,
Erik
#3 Erik on 2013-02-09
Yes, for the moment that is true. But you are right that this needs to be improved. At the very least, it should build with profiling and then be a no-op.

The problem is that I’m also building C-- code there, and I cannot provide different variants, but some symbols are only available in the non-profiling run time system. Or something – I’ll look into it.
#4 Joachim Breitner on 2013-02-09

Have something to say? You can post a comment by sending an e-mail to me at <mail@joachim-breitner.de>, and I will include it here.