Swirly Mein Kopf

Saturday, October 5. 2013

Why PVP is better than no PVP

Haskell

This is a reply to Roman Cheplyaka’s post “Why PVP doesn't work”, where he argues that Haskell packages should not by default put upper bounds on their build dependencies (as mandated by the Package Versioning Policy), at least not without good reason.

I disagree, and I’d like to explain why. I assume that you are packaging your libraries and programs not only for your fellow developers, but also for users. Hackage has turned not only into a hub for developers sharing their libraries, but also into a place from which users conveniently download software. And when I say users, I include xmonad users (although they write a bit of Haskell for their configuration) and beginners (whom we certainly don’t expect to make sense of the error messages that occur if some obscure crypto-related dependency of their favourite web framework fails to build). For their benefit we should try hard to make sure “cabal install” either succeeds (great) or fails early with an error message that the user can understand as “I could not find a package configuration that the maintainers are reasonably confident will work. Please talk to them.”

I speak with some experience, because as a maintainer of the >666 Haskell packages in Debian, I am, in a sense, a very heavy user. In the years after the PVP became commonly accepted, we could noticeably reduce the number of build failures that we encounter on our machines and the Debian build farms, simply because the metadata (which we transfer to the Debian package level) prevented such builds from being attempted. We now even have a build compatibility prediction system: We maintain a text file with all Haskell packages in Debian, together with a directory of patches (where we sometimes indeed override the packages’ Cabal file), and a script that runs cabal-install’s dependency checker against this data. This way, when I want to upgrade a package, I can simply change the version in the file and see which packages would break. I then change their versions as well until I either reach a fixed point (and only then actually upload packages to the users) or find a problem, which I then report to the maintainers or start cooking up my own patches for.
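
The kind of check described above can be sketched roughly as follows. This is not the actual Debian tooling (which runs cabal-install’s dependency checker); it is a minimal Python illustration in which the package names, the version numbers and the helper names `parse`, `satisfies` and `broken_by` are all made up for the example. Bounds are modelled with an inclusive lower bound and an exclusive upper bound, as in `>= 0.4 && < 0.6`.

```python
def parse(version):
    """Turn "0.5.2" into (0, 5, 2) so versions compare component-wise."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, lower, upper):
    """True if lower <= version < upper (the upper bound is exclusive)."""
    return parse(lower) <= parse(version) < parse(upper)


def broken_by(plan, bounds):
    """Return the (package, dependency) pairs whose bounds the plan violates.

    plan   maps package name -> version we intend to ship
    bounds maps package name -> {dependency: (lower, upper)}
    """
    broken = []
    for package, deps in sorted(bounds.items()):
        for dep, (lower, upper) in sorted(deps.items()):
            if dep in plan and not satisfies(plan[dep], lower, upper):
                broken.append((package, dep))
    return broken


# Hypothetical data: foo and baz both depend on bar.
bounds = {
    "foo": {"bar": ("0.4", "0.6")},
    "baz": {"bar": ("0.5", "0.7")},
}
plan = {"foo": "1.0", "baz": "2.1", "bar": "0.5.2"}

print(broken_by(plan, bounds))   # nothing is broken with bar-0.5.2

plan["bar"] = "0.6"              # try upgrading bar
print(broken_by(plan, bounds))   # foo's bound "< 0.6" now excludes bar
```

Repeating the check after each change to `plan` until the list is empty corresponds to the fixed-point search mentioned above.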

This works really well, but only because most maintainers are following the PVP. That’s why I argue: Please continue to do so.

(What we probably need, rather than less use of the PVP, is more tools that help us follow it. I could imagine a build bot that finds packages on Hackage whose upper bound on a dependency excludes a newer version that exists, fetches those packages and automatically tries to build them against the new version. If that succeeds, it sends a message to the maintainer: “Your package foo also builds against version 0.42 of bar; please check for incompatibilities that are not expressed in the type system and upload a new version of foo allowing bar-0.42.” Isn’t that something the Stackage infrastructure could be used for?)
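
As a rough illustration of the first step such a bot would take, the following Python sketch lists the dependencies whose upper bound excludes the newest available version. Everything in it is an assumption made for the example: the package names, the versions and the helper `rebuild_candidates`. A real bot would read this data from Hackage and then actually attempt the build.

```python
def parse(version):
    """Turn "1.2.4" into (1, 2, 4) so versions compare component-wise."""
    return tuple(int(part) for part in version.split("."))


def rebuild_candidates(upper_bounds, latest):
    """Return (package, dependency, newest) where the bound excludes newest.

    upper_bounds maps package name -> {dependency: exclusive upper bound}
    latest       maps dependency name -> newest version available
    """
    candidates = []
    for package, deps in sorted(upper_bounds.items()):
        for dep, upper in sorted(deps.items()):
            newest = latest.get(dep)
            # An exclusive bound "< upper" rejects every version >= upper.
            if newest is not None and parse(newest) >= parse(upper):
                candidates.append((package, dep, newest))
    return candidates


# Hypothetical data: foo requires bar < 0.42 and text < 1.3.
upper_bounds = {"foo": {"bar": "0.42", "text": "1.3"}}
latest = {"bar": "0.42", "text": "1.2.4"}

for package, dep, newest in rebuild_candidates(upper_bounds, latest):
    print(f"try building {package} against {dep}-{newest}")
```

Only the bar dependency is reported here, because text-1.2.4 still falls within foo's bound.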

(What we could probably use as well (and what is probably a stepping stone towards what I just described) is a way to tell cabal-install to ignore a certain version bound on a build dependency. This way we still prevent normal users from trying a dependency version that the maintainer did not test, but make it as easy as possible for the advanced user to try it anyway and then, hopefully, report the result to the maintainer. Of course, others have had that idea before; someone just has to make it happen.)

Comments

*Thanks for the response.

Have you ever run into incompatible constraints (as described in my post)? How do you deal with them?
#1 Roman Cheplyaka (Homepage) on 2013-10-05 15:42 (Reply)
*They are only a problem if there is a package foo that depends on both package-a and package-b. But then, at some time in the past, there were versions of package-a and package-b that worked together. So when I attempted to upgrade package-b, I would have noticed that it breaks foo, and done something about it, e.g. notified the maintainer of package-a that he should check whether containers-0.5 works as well.

But in practice, this has not been much of a problem, no.
#1.1 Joachim Breitner (Homepage) on 2013-10-05 16:05 (Reply)
*I see. You don't experience these problems because the author of foo must have dealt with them by the time foo is ready for packaging.

So while I appreciate your perspective, it doesn't really address the problems that PVP brings.
#1.1.1 Roman Cheplyaka (Homepage) on 2013-10-05 16:31 (Reply)
*Note that I do not claim that the PVP is without problems. But I say that we are still better with it than without.

What do you think of support in cabal-install to overwrite constraints? Wouldn’t that address a few of the issues that you have with the PVP?
#1.1.1.1 Joachim Breitner (Homepage) on 2013-10-05 17:10 (Reply)
*> But I say that we are still better with it than without.

Well, in this post you mostly talk about your experience as a Debian developer, which is very uncommon. So I can believe that you in particular benefit from PVP. I still think PVP is a bad idea from a Haskell developer's point of view, for the reasons described in my post.

Regarding the end users — note that a Haskell app developer is free to pin down all the versions in his/her own cabal file. So users still can have the same nice error messages, if the app's developer cares enough.

> What do you think of support in cabal-install to overwrite constraints? Wouldn’t that address a few of the issues that you have with the PVP?

Yes, it would solve half of the problem. But library developers still shouldn't rely on upper bounds and depend on older versions of packages.
#1.1.1.1.1 Roman Cheplyaka (Homepage) on 2013-10-05 18:10 (Reply)
*We have ~1200 packages in the gentoo-haskell overlay. I have 1215 packages
installed simultaneously (I had to fix more than just .cabal files, I must admit).
We have ~300 not-yet-upstreamed patches and ~300 not-yet-upstreamed
amendments to cabal constraints.

The curious can look at each of those:

17 pages: https://github.com/gentoo-haskell/gentoo-haskell/search?p=1&q=cabal_chdeps&ref=cmdform
10 pages: https://github.com/gentoo-haskell/gentoo-haskell/search?q=PATCHES%3D&type=Code
11 pages: https://github.com/gentoo-haskell/gentoo-haskell/search?q=epatch+&type=Code

Sometimes the most curious users try to use GHC developer snapshots. It's such
fun to change the base dependency in most packages. But things usually Just Work!

You can't find bugs early in such a hostile world and have to offload bug discovery
to users (or autopatch .cabal files). It happens with every major ghc release. X.Y.1
releases are barely usable, both in terms of the installability of Hackage packages
and in terms of bugs in the code generated by ghc.

When a new ghc is out we have a hard time upstreaming tons of patches. Other developers
just can't test them! Too much is broken on Hackage or in their distro (because of
overly strict version constraints)! The dust starts to settle in a month or two,
and then we repeat. It's not one or two patches to upstream, it's hundreds of them.

The real problem is the culture among Haskell programmers of breaking APIs even when it can
be _easily_ avoided. And developers don't feel the pain of breakage, partly due to the PVP. Fun, isn't it?
It is handy for the developer but hurts users and _all_ _other_ _developers_ (unless they want to
use a buggy outdated package with an outdated ghc, and they do).

With constant API breakage it is a problem to write portable code.
The base-4 exceptions change is a great example of devastating breakage.

But you can't fix that mess with PVP! You can't mix packages with nonintersecting
base constraints. You can't easily make your code work across 5 library versions
that constantly break the API. You just pick one at random.

Or, if you managed to abstract that away, every user gets their own set of libraries thanks
to cabal-install's very clever technique of installing "compatible" versions.
You never know which QuickCheck version a user will end up with when they try to pass
--enable-tests on a large set of packages. And when it suddenly breaks due to an
outdated version of some random 'directory-1.1' package, reporting such a bug is a very
nontrivial endeavour.

An example of a uselessly broken API in recent Cabal:

https://github.com/haskell/cabal/commit/50af0d77b544fa2a9b68d3c1ca6f40a32e35ac33
Instead of adding a new function and leaving rawSystemStdInOut alone (possibly adding
a deprecation pragma), it broke a number of exported APIs.

Luckily only some packages had to be fixed in Setup.hs to unbreak it.

And it happens all the time for every core Haskell application. The constantly delayed
darcs releases that work with the latest ghc are a great example. People need to use darcs
to send fixes upstream, but they can't build it. It's not a problem for your binary
world, but think about it for a minute. The Haskell ecosystem can't deliver even basic
developer tools after a ghc release. There is even a special event for it: the Haskell Platform,
which adds another variable for newbie Haskell developers to target.

Now developers who care about users litter their code with #ifdefs or just keep an older
Cabal in their constraints if they can. Exercise for the reader: when will such packages break?

C land is a good counterexample, where people are afraid to break things
_that_ often (they get instant feedback) and you can build a working system based on it.

There is no need for the PVP to make it work. Just don't break the code too often (or too radically).

Sorry for a long post.

Thanks!
#2 Sergei Trofimovich on 2013-10-06 10:12 (Reply)
*Your long post is welcome. But I’m not sure what the conclusion is: Is it “PVP or no PVP is irrelevant, as there are other, deeper problems?”
#2.1 Joachim Breitner (Homepage) on 2013-10-06 13:00 (Reply)
*In short, yes. For me PVP is about two unrelated things:

1. naming scheme (good):
When a developer versions things, having a consistent scheme is brilliant.
That's the good part of the PVP. Sometimes I don't even look at
package diffs when bumping from A.B.C.x to A.B.C.y.

2. version constraints (pain point):
But the attempt to treat compatibility-breaking major releases
as the norm is disastrous. The clause
"This means specifying not only lower bounds, but also upper bounds on every dependency." gives a false assurance of being on the safe side when breaking interfaces.

Maybe the PVP page just needs more elaboration on why breaking interfaces
is so damaging for the whole ecosystem.
#2.1.1 Sergei Trofimovich on 2013-10-12 14:58 (Reply)
*I don't think that upper bounds solve the problem of reproducing builds in the future. An upper bound is almost always a guess about the future, and it's often wrong. We should publish known good build configurations instead.

My complete response:

http://txt.arboreus.com/2013/10/16/lies-damn-lies-and-upper-bounds-in-haskell-pvp.html
#3 Sergey (Homepage) on 2013-10-16 19:02 (Reply)
*"An upper bound is almost always a guess about the future"

With the PVP the upper bound is no longer a guess.
#3.1 Lemming on 2014-02-25 10:31 (Reply)
