Joachim Breitner's Homepage
Johannes Bechberger, while working on his Bachelor’s thesis supervised by my colleague Andreas Zwinkau, has developed a performance benchmark runner and results visualizer called “temci”, and used GHC as a guinea pig. You can read his elaborate analysis on his blog.
Johannes Bechberger’s take-away is that, at least for the programs at hand (which were taken from the The Computer Language Benchmarks Game, there are hardly any changes worth mentioning, as most of the observed effects are less than a standard deviation and hence insignificant. He tries hard to distill some useful conclusions from the data; the one he finds are:
- Compile time does not vary significantly.
- The compiler flag
-O2indeed results in faster code than
-O2), GHC 8.0.1 is better than GHC 7.0.1. Maybe some optimizations were promoted to
If you are interested, please head over to Johannes’s post and look at the gory details of the analysis and give him feedback on that. Also, maybe his tool temci is something you want to try out?
Personally, I find it dissatisfying to learn so little from so much work, but as he writes: “It’s so easy to lie with statistics.”, and I might add “lie to yourself”, e.g. by ignoring good advise about standard deviations and significance. I’m sure my tool gipeda (which powers perf.haskell.org) is guilty of that sin.
Maybe a different selection of test programs would yield more insight; the benchmark’s games programs are too small and hand-optimized, the nofib programs are plain old and the fibon collection has bitrotted. I would love to see a curated, collection of real-world programs, bundled with all dependencies and frozen to allow meaningful comparisons, but updated to a new, clearly marked revision, on a maybe bi-yearly basis – maybe Haskell-SPEC-2016 if that were not a trademark infringement.