Joachim Breitner

rec-def: Dominators case study

Published 2022-09-15 in sections English, Haskell.

More ICFP-inspired experiments using the rec-def library: In Norman Ramsey’s very nice talk about his Functional Pearl “Beyond Relooper: Recursive Translation of Unstructured Control Flow to Structured Control Flow”, he had the following slide showing the equation for the dominators of a node in a graph:

Norman Ramsey shows a formula

He said “it’s ICFP and I wanted to say the dominance relation has a beautiful set of equations … you can read all these algorithms how to compute this, but the concept is simple”.

This made me wonder: If the concept is simple and this formula is beautiful – shouldn’t this be sufficient for the Haskell programmer to obtain the dominator relation, without reading all those algorithms?

Before we start, we have to clarify the formula a bit: If a node is an entry node (no predecessors) then the big intersection is over the empty set, and that is not a well-defined concept. For these nodes, we need that big intersection to return the empty set, as entry nodes are not dominated by any other node. (Let’s assume that the entry nodes are exactly those with no predecessors.)

Let’s try, first using plain Haskell data structures. We begin by implementing this big intersection operator on Data.Set, and also a function to find the predecessors of a node in a graph:

import qualified Data.Set as S
import qualified Data.Map as M

intersections :: [S.Set Int] -> S.Set Int
intersections [] = S.empty
intersections xs = foldl1 S.intersection xs

preds :: [(Int,Int)] -> M.Map Int [Int]
preds edges = M.fromListWith (<>) $
    [ (v1, [])   | (v1, _) <- edges ] ++ -- to make the map total
    [ (v2, [v1]) | (v1, v2) <- edges ]

Now we can write down the formula that Norman gave, quite elegantly:

domintors1 :: [(Int,Int)] -> M.Map Int [Int]
domintors1 edges = fmap S.toList doms
  where
    doms :: M.Map Int (S.Set Int)
    doms = M.mapWithKey
        (\v vs -> S.insert v (intersections [ doms M.! v' | v' <- vs]))
        (preds edges)

Does this work? It seems it does:

ghci> domintors1 []
fromList []
ghci> domintors1 [(1,2)]
fromList [(1,[1]),(2,[1,2])]
ghci> domintors1 [(1,2),(1,3),(2,4),(3,4)]
fromList [(1,[1]),(2,[1,2]),(3,[1,3]),(4,[1,4])]

But – not surprising if you have read my previous blog posts – it falls over once we have recursion:

ghci> domintors1 [(1,2),(1,3),(2,4),(3,4),(4,3)]
fromList [(1,[1]),(2,[1,2]),(3,^CInterrupted.

So let us reimplement it with Data.Recursive.Set.

import qualified Data.Recursive.Set as RS

intersections :: [RS.RSet Int] -> RS.RSet Int
intersections [] = RS.empty
intersections xs = foldl1 RS.intersection xs

domintors2 :: [(Int,Int)] -> M.Map Int [Int]
domintors2 edges = fmap (S.toList . RS.get) doms
  where
    doms :: M.Map Int (RS.RSet Int)
    doms = M.mapWithKey
        (\v vs -> RS.insert v (intersections [ doms M.! v' | v' <- vs]))
        (preds edges)

The hope is that we can simply replace the operations, and that now it can suddenly handle cyclic graphs as well. Let’s see:

ghci> domintors2 [(1,2),(1,3),(2,4),(3,4),(4,3)]
fromList [(1,[1]),(2,[1,2]),(3,[3]),(4,[4])]

It does! Well, it does return a result… but it looks strange. Clearly node 3 and 4 are also dominated by 1, but the result does not reflect that.

But the result is a solution to Norman’s equation. Was the equation wrong? No, but we failed to notice that the desired solution is the largest, not the smallest. And Data.Recursive.Set calculates, as documented, the least fixed point.

What now? Until the library has code for RDualSet a, we can work around this by using the dual formula to calculate the non-dominators. To do this, we

  • use union instead of intersection
  • delete instead of insert,
  • S.empty, use the set of all nodes (which requires some extra plumbing)
  • subtract the result from the set of all nodes to get the dominators

and thus the code turns into:

unions' :: S.Set Int -> [RS.RSet Int] -> RS.RSet Int
unions' univ [] = mkR univ
unions' _ xs = foldl1 RS.union xs

domintors3 :: [(Int,Int)] -> M.Map Int [Int]
domintors3 edges = fmap (S.toList . S.difference nodes . RS.get) nonDoms
  where
    nodes = S.fromList [v | (v1,v2) <- edges, v <- [v1,v2]]
    nonDoms :: M.Map Int (RS.RSet Int)
    nonDoms = M.mapWithKey
        (\v vs -> RS.delete v (unions' nodes [ nonDoms M.! v' | v' <- vs]))
        (preds edges)

And with this, now we do get the correct result:

ghci> domintors3 [(1,2),(1,3),(2,4),(3,4),(4,3)]
fromList [(1,[1]),(2,[1,2]),(3,[1,3]),(4,[1,4])]

We worked a little bit on how to express the “beautiful formula” to Haskell, but at no point did we have to think about how to solve it. To me, this is the essence of declarative programming.

Comments

Have something to say? You can post a comment by sending an e-Mail to me at <mail@joachim-breitner.de>, and I will include it here.