Joachim Breitner

Faster Winter 6: Simpler Code

Published 2019-11-24 in sections English, Haskell.

(This is the sixth optimization presented in the “faster winter” series, please see that post for background information.)

The Wasm reference interpreter has a function step that performs one single one step of evaluation, by taking the state before (in a type called code), and returning the state after the step. The function eval calls step, checks if the result is “done” (no instructions left to do), and if it is not done, recursively calls eval again. This way, evaluation progresses step by step until it is done.

The Haskell port follows the same pattern of a single-step step and a driver eval, but it chose to write the code continuation passing style: Instead of returning, the function step takes a function that it passes the result to. So the code looks like this (slightly simplified):

type CEvalT m a = ReaderT Config (ExceptT EvalError m) a

step :: PrimMonad => Code -> (Code -> CEvalT m r) -> CEvalT m r
step c k =  … k new_c …

eval :: PrimMonad => Code -> CEvalT m (Stack Value)
eval c =
    if is_done c
    then stack c
    else step c eval

There must have been a good reason to prefer this style over the plain style, but I was curious if it was really helpful. So I changed it to the following, more straight-forward code:

type CEvalT m a = ReaderT Config (ExceptT EvalError m) a

step :: PrimMonad => Code -> CEvalT m Code
step c k =  … new_c …

eval :: PrimMonad => Code f m -> CEvalT f m (Stack Value)
eval c =
    if is_done c
    then stack c
    else step c >>= eval

And indeed, the simpler code worked better:

Improvement: Allocations: -9.6 Time: -16.91% (Commit eb8add3...e66f1e0)

Again, avoiding function values (as we construct them as the contination closure) might have helped here.

Or maybe the simpler code allowed GHC to notice that the Code value, which is simply a tuple with a different name, is constructed by step but immediatelly deconstructed by eval, and thus GHC could optimize that away (by “unboxing” the argument and/or result of step and simply passing the components of the tuple).

And finally, independent of all the performance questions, it also makes the code easier to understand.

Comments

Have something to say? You can post a comment by sending an e-Mail to me at <mail@joachim-breitner.de>, and I will include it here.