With the Beta 3 release of Darcs 2.5 on Hackage (the Haskell library and program repository), the ipatch program I recently introduced could now be uploaded to hackage, too. If you use cabal-install, you can now install and use it with a simple run of "cabal install ipatch".
I also made the program now handle patches that add or remove files, extended the help texts a bit and added a test suite. This means that you can actually make use of ipatch as of now, to split patches into several small patches and to apply a patch interactively. Of course it needs some more testing, and you might have feature wishes – in either case, let me know.
As a Debian maintainer, I often work with patches (files listing changes to text files), for example when tracking the modification I make to some software before I upload the package to Debian. To manage these patches, quilt is a nice tool: It helps you maintain a stack of patches on top of the original code and encourages you to keep your variously modifications separate.
One use case is not supported by quilt at all: Splitting patches. One often has a large patch containing several independent changes. This might happen after you fix a few problems in the upstream code and then run dpkg-buildpackage, which will create one patch of your changes and put it in debian/patches. Before, I had to manually edit the patch and write the hunks, which are the building blocks of patches, into separate file.
There is no such problem when using a version control system, such as Darcs. Especially Darcs is rightly famous for its user-friendly interface and powerful hunk-selection features. You can even split a single hunk (which could be a change to one line) into two separate steps! Have a look at the HunkEditor page on the Darcs wiki to see how that works.
Well, it is not stealing if it is Free Software... Darcs has these nice capabilities and provides them in the context of version control systems, while we need them in the context of patch files. But Darcs is providing an API to its code, so shoudn’t it be possible to create a program that uses the Darcs code to split patch files? As a matter of fact, it is possible: You can see that program in action on this 3min Ogg Theora-Video or directly here if your browser supports HTML5:
The code is a working proof of concept. What you see works. You do not see how it handles patches that create or delete files, patches that do not apply cleanly or are already applied or any kind of error handling. That does not work yet. If you still want to try it, you can grab the code from the Darcs repository at http://darcs.nomeata.de/ipatch, but you need to build the latest development state of the Darcs library first.
I think ipatch could become a very useful and powerful tool with applications in areas where nobody would think of using Darcs. I definitely want some integration into quilt, replacing the splitted patch in the series by the replacing patches automatically. Maybe even a git plugin could be created? But I don’t think I can push this project far enough on my own. So this is an invitation to join me and make ipatch a great tool. This invitation goes especially to the Darcs developers: Please have a look how the code uses the Darcs API and help to improve the collaboration here. I think we can use the darcs-users mailing list until there is need for a dedicated mailing list.
Haskell provides type classes to support polymorphism. A type class defines a few methods, which can then be implemented for a concrete type in the type class instance. This is a powerful system, but it also has it drawbacks. Most notably, each type can have at most one implementation of the type class. But sometimes you need to use a different implementation.
If, for example, you used the Binary class to store data on disk. Now you changed your data type and the binary instance, and you can not read the old data any more. One solution is to re-name your type using “newtype” and implement another type instance for that. Often, this is enough. But still, instances are not first-class-citizens. You can not pass them around or modify them, as you can pass around and modify data and functions.
Under the hood of the compiler, things look different. The ghc puts the methods of the instance in a dictionary and passes that implicitly to any functions having a (Class a) constraint. (Other implementations exist though) If one could make that behavior explicit, one could easily modify the instance before passing it to the function. But this is unfortunately not possible.
But it is possible to pass an explicit dictionary along the data. I use the Monoid class as an example, and define a representation of the dictionary to-be-passed, as well as the dictionary of the default instance:
data MonoidDict a = MonoidDict
{ ed_mempty :: a
, ed_mappend :: a -> a -> a
}
monoidDict :: Monoid a => MonoidDict a
monoidDict = MonoidDict mempty mappend
(For conciseness, I ignore the mconcat method.) My first idea was to pass this instance along with data: (MonoidDict a, a). But this would not work because there are methods, such as mempty, who need the dictionary without getting passed a value to use. Therefore, I need to put the dictionary both in the covariant and the contravariant position:
newtype WithMonoidDict a = WithMonoidDict (MonoidDict a -> (MonoidDict a, a))
We need functions to clamp a dictionary to a value, and to extract it again:
wrapWithCustomMonoidDict :: MonoidDict a -> a -> WithMonoidDict a wrapWithCustomMonoidDict dict val = WithMonoidDict $ const (dict, val) extractFromCustomMonoidDict :: MonoidDict a -> WithMonoidDict a -> a extractFromCustomMonoidDict dict (WithMonoidDict f) = snd (f dict)
Note that both expect the dictionary, so that it can be fed into WithMonoidDict from “both sides”. For convenience, we can define variants that use the standard instance:
wrapWithMonoidDict :: Monoid a => a -> WithMonoidDict a wrapWithMonoidDict = wrapWithCustomMonoidDict monoidDict extractFromMonoidDict :: Monoid a => WithMonoidDict a -> a extractFromMonoidDict = extractFromCustomMonoidDict monoidDict
We want to be able to pass the wrapped values as any other value with a Monoid instance, so we need to declare that:
instance Monoid (WithMonoidDict a) where mempty = WithMonoidDict (\d -> (d, ed_mempty d)) mappend (WithMonoidDict f1) (WithMonoidDict f2) = WithMonoidDict $ \d -> let (d1,v1) = f1 d (d2,v2) = f2 d in (d1, ed_mappend d1 v1 v2)
Note that mappend has the choice between three dictionaries This is not a good sign, but let’s hope that they are all the same.
Does it work? Let’s see:
listInstance :: MonoidDict [a]
listInstance = monoidDict
reverseInstance :: MonoidDict [a]
reverseInstance = monoidDict { ed_mappend = \l1 l2 -> l2 ++ l1 }
examples = do
let l1 = [1,2,3]
let l2 = [4,5,6]
putStrLn $ "Example lists: " ++ show l1 ++ " " ++ show l2
putStrLn $ "l1 ++ l2: " ++ show (l1 ++ l2)
putStrLn $ "l1 `mappend` l2: " ++ show (l1 `mappend` l2)
putStrLn $ "Wrapped with default instance:"
putStrLn $ "l1 `mappend` l2: " ++ show (
extractFromMonoidDict $ wrapWithMonoidDict l1 `mappend` wrapWithMonoidDict l2)
putStrLn $ "Same with reversed monoid instance:"
putStrLn $ "l1 `mappend` l2: " ++ show (
extractFromCustomMonoidDict reverseInstance $
wrapWithCustomMonoidDict reverseInstance l1 `mappend`
wrapWithCustomMonoidDict reverseInstance l2)
Running examples gives this output:
Example lists: [1,2,3] [4,5,6] l1 ++ l2: [1,2,3,4,5,6] l1 `mappend` l2: [1,2,3,4,5,6] Wrapped with default instance: l1 `mappend` l2: [1,2,3,4,5,6] Same with reversed monoid instance: l1 `mappend` l2: [4,5,6,1,2,3]
Indeed it works.
Unfortunately, this approach is not sufficient for all cases. It is perfectly valid to have a function with signature (Monoid a => Maybe a -> Maybe a), whose behavior depends on the instance of a, even when being passed Nothing and returning Nothing. Such a function woul
I wonder if it would be possible to extend the Haskell language somehow to be able to properly pass an alternative dictionary to such functions. But given that not all compilers use dictionary passing, my hopes are low.
My automatic rule-based time tracker (arbtt), which is written in Haskell, collects every minute a data sample consisting mainly of the list of currently open windows (window title and program name). Naturally, this log grows rather large. Since October of last year, I collected 70,000 samples. I already went from a text-based file format to a binary format using Data.Binary, which gave a big performance boost.
But by now, I was afraid that this is not enough. My log file is now 30MB large. Looking at the memory graph of gnome-panel, it is taking up more than half of my memory. When running arbtt-stats, the Haskell run time system reports 569 MB total memory in use and the command finishes after 28.5 seconds.
Naturally, the log file is highly redundant: Compressing it with bzip2 shrinks it to 1.6MB. But as I would like to preserve the ability to just append samples at the end, without having to read the file, I chose not just to add bzip2 or gzip compression. Rather, I am now exploiting a very obvious redundancy: Two adjacent samples usually list exactly the same windows, and a focus change only changes a flag. So now, when storing a string that is part of a sample, it will check if this string was already present in the previous sample and, in this case, just store the number of that string (one byte). Only if the string was not present it will write a zero byte and then the string. When reading the sample, the process is reversed.
This greatly reduces the file size: It is down to 6.2MB. It also improves the memory consumption, due to Haskell’s abilities with regard to sharing: When a reference to a string in a previous sample is read, then only one instance of this string is in memory, even if it occurs several times in the log. This brings the memory consumption down to 264 MB and the runtime to 17 seconds.
I released the changes as version 0.4.5.1 to Hackage, Debian and as a Windows installer. The log file is not automatically converted, but new samples will be written in the compressed format. If you want to convert your whole file, you have to stop arbtt-capture, run arbtt-recover, and then move the hopefully noticeable smaller ~/.arbtt/capture.log.recovered to ~/.arbtt/capture.log.
The required code changes were not too big. I somewhat isolated the relevant code in the Data.Binary.StringRef module. Unfortunately, I have to use OverlappingInstances to be able to provide the special instance for String – is there a cleaner way (besides the trick used for the Show class)?
A friend of mine is interested in trying out the Automatic Rule Based Time-Tracker arbtt which I programmed. Unfortunately, he is using Windows and up to now, arbtt only worked on Linux. But as I wanted to check out Haskell’s cross-platform abilities for a while, this was a good opportunity to do so. I don’t have Windows installed myself (and did not plan to do so), so I did all this under WINE, the Windows compatibility layer, which works very well: It takes only a few minutes to install the Haskell Platform for Windows and then I was able to run wine ghc --make and cabal install.
I played around with some simple programs and was surprised by these timings:
$ rm *.o *.hi; ghc --make fourfours.hs ; time ./fourfours > /dev/null [1 of 1] Compiling Main ( fourfours.hs, fourfours.o ) Linking fourfours ... real 0m1.909s user 0m1.692s sys 0m0.208s $ rm *.o *.hi; wine ghc --make fourfours.hs ; time wine ./fourfours.exe > /dev/null [1 of 1] Compiling Main ( fourfours.hs, fourfours.o ) Linking fourfours.exe ... real 0m1.631s user 0m1.376s sys 0m0.092s
So it is faster to run a compiled Haskell program on top of a compatibility layer than directly on Linux! The world is in order again, though, if optimization is enabled:
$ rm *.o *.hi; ghc -O --make fourfours.hs ; time ./fourfours > /dev/null [1 of 1] Compiling Main ( fourfours.hs, fourfours.o ) Linking fourfours ... real 0m0.981s user 0m0.876s sys 0m0.108s $ rm *.o *.hi; wine ghc -O --make fourfours.hs ; time wine ./fourfours.exe > /dev/null [1 of 1] Compiling Main ( fourfours.hs, fourfours.o ) Linking fourfours.exe ... real 0m1.270s user 0m1.036s sys 0m0.072s
Funny. Anyways, I wanted to port arbtt. The only platform-dependent part is the capture module that gathers the list of open Windows. The Win32 package that comes with the Haskell Platform did not cover all the functions needed to do so, but creating additional function bindings is really easy with Haskell, as can be seen in the Graphics.Win32.Window.Extra module. I also replaced the locking code that prevents two instances of arbtt-capture to run at the same time by equivalent code using Windows mutexes (module System.Win32.Mutex). With these small changes and some CPP conditionals to make the code compile for either platform, the porting was done! Even accessing the files in ~/.arbtt works correctly on Windows, where it will look in the Application Data folder, without changing the code, thanks to System.Directory.getAppUserDataDirectory.
But Windows users won’t like compiling software on their own. They won’t even like installing software by copying various files to certain directories. Therefore, I also had to create a Windows Installer. I picked Inno Setup, because it’s Free Software and seems to be simpler than NSIS. The installer puts the compiled .exe files, the example categorize.cfg and the HTML documentation in the right spot, adds icons to the Start Menu (“Edit categorize.cfg”, which fires up wordpad, a link to the documentation and the uninstaller), puts arbtt-capture in the Autorun folder, puts the path to arbtt-stats in the PATH variable and starts arbtt-capture at the end (the last three points being optional). Of course it undoes all this when removing the program again. I integrated the call to the Inno Setup installer into the usual ./Setup build process of Haskell packages. Some more details of how to create the Windows installer are mentioned in the README file.
Now all this does not magically add a graphical user interface to arbtt, so users will still have to work with arbtt-stats on the command line – even on Windows. If this is not a problem for you then you can fetch the latest installer from the arbtt homepage. And if you happen to become a serious user of arbtt on Windows and want to help maintaining the Windows port, I’ll gladly share some responsibilities.
I’m very satisfied with the process and the result and I’m happy to
know that I can offer some of my programs also to Windows users in the
future.It is also a big plus for Haskell – with Python, shipping a program for Windows users is likely more difficult. The next step will be providing gtk-based graphical Haskell applications for Windows, including a nice installer that ideally includes all dependencies (gtk etc.).
The darcs hacking sprint is slowly nearing its end. As planned, I have worked on integrating DarcsWatch and bugs.darcs.net, and I am satisfied so far. From now on, if someone submits a Darcs patch to patches@darcs.net, the patch will also be tracked by DarcsWatch. DarcsWatch will display a link to the entry on bugs.darcs.net, and also add a comment to the bugtracker with a link to the patch on DarcsWatch. And eventually, if the patch is included in the darcs.net repository, DarcsWatch will change the state of the ticket to accepted, removing one step of work for the Darcs maintainers. Currently, it checks the state of the repository three times per hour, so expect a delay after you applied the patch to the repository before the state is updated.
Today, my alarm clock was set to 4:30, as I was going to Vienna, to attend the Darcs hacking sprint. I’ll be working on DarcsWatch, making it a bit more modular and hopefully integegrate it better into bug tracking systems (especially roundup, as that’s used by the Darcs team). On Monday, I’ll be a tourist until I leave in the evening. If any Debianers or Haskellers want to meet for keysigning or sightseeing, just drop me a mail!
Yesterday I did what Free Software authors supposedly don’t do. I wrote documentation. In fact, I had a relatively detailed README already, but I thought this would be a good opportunity to create a more elaborate documentation, using the ubiquitous DocBook. You can read the HTML documentation for arbtt online, where it’s automatically updated when I push to the darcs repository. You can see, I use the same CSS file that most Haskell-related DocBook documentation seems to use.
One motivation to use DocBook was that I can extract manpages from it, which should be present if I package arbtt for Debian. I was about to complain that references from the manpages to other part of the documentation can not work sensibly, but using the “profiling” feature of docbook-xsl, I can replace them by a textual reference to the user’s manual if the file is processed for manpage output. Have a look at my changes for that if you want to know how it works.
For some reason, the <refentry>-tags that make up the manpages are split to separate files when generating chunked HTML output, but they do not appear in the table of contents. You have to find a spot in the text where they are linked, as they are now in the section “Program references”, to find them. This is unfortunate, I expect that a few readers might miss this important part of the documentation. Am I doing something wrong?
For the configuration file, I put an EBNF-style grammar description in the documentation. There is a special DocBook module for that, but I’m not satisfied with it. On the right hand side of production rules, it has only special support for nonterminals, but no tags to semantically mark up choice, repetition etc. I followed the lead found at the YAML specification and put <quote>-Tags around literal parts of the grammar, but this is not valid according to the DocBook DTD. Speaking of the DTD: I guess since the EBNF-stuff is just a module, the tag <productionset> is not a valid content of <figure>. So, if I choose to use the EBNF module in a sensible way, I will not have a valid DocBook file.
All in all, I am satisified with the result. Especially that nobody can say any more “I like your program, but I can’t contribute, because I don’t know Haskell”. Just improve the docs! :-)
Three weeks ago, I announced the automatic rule-based time tracker here, and it seems that there are actually users out there :-). Since then, it has recorded more than 240 hours of my computer’s uptime in about 15000 samples. Until now, this data was stored in a simple text file, one line per entry, and relying on Haskell’s Show/Read instances to do the serialization. Although not quite unexpected, this turned out to be a severe bottleneck: Already, it takes more than 25 seconds to parse the log file on my computer.
Following an advice given to me on the #haskell IRC channel, I switched to a binary representation of the data, using the very nice Data.Binary module. The capture program will automatically detect if the log file is in the old format, move it away and convert the entries to binary data. And voilà, the statistics program evalutes my data in about two seconds! This should be fast enough for quite a while, I hope.
In my binary instances, which you can find in Data.hs, I prepended a version tag to each entry. This hopefully allows me to add more fields to the log file later, while still being able to parse the old data directly and without conversion. To still be able to manually inspect the recorded data, the program arbtt-dump was added to the package. The new version is uploaded to hackage as 0.3.0.
One thing still worries me: With the old format, I could easily throw away unparseable lines in the log file (e.g. from a partial write) and still read the rest of it. With the new code, there is no such robustness: If the file breaks somewhere, it will be unreadable in its whole, or at least to the broken point. I’m not sure what to do about that problem, but given the very low number of bytes written each time, I hope that it will just not happen.
Die achte GulaschProgrammierNacht des Karlsruher Entropia e.V. ist hat so langsam Halbzeit, und ich will ein wenig berichten:
Gestern Abend war der erste Vortrag meine Einführung in das Programmierspiel L-seed (mit in Kürze zusammengeschriebene Folien). Ich hatte ja eigentlich keine großen Hoffnungen, dass das Spiel großen Zuspruch erfährt, sind die Möglichkeiten darin ja eher beschränkt und die Ausgabe nicht besonders schick, aber da habe ich mich getäuscht: Seitdem kurz nach dem Vortrag das Netzwerk lief und der Server erreichbar war, wurden ununterbrochen und die Nacht durch Pflanzen programmiert! Etwa 30 Pflanzen laufen gerade, von vermutlich fast so vielen Spielern. Vor lauter Vorschlägen, Wünsche und Fragen kam ich kaum mehr zum weiterprogrammieren. Immerhin konnte ich die schlimmsten Fehler noch ausbügeln (etwa, dass man mit sehr kurzen Pflanzen sich beliebig schnell vermehren konnte, oder dass vertikale Pflanzen zu stark sind).
Inzwischen wird auch regelmäßig der Zustand des Spiels als Bilddatei rausgeschrieben (natürlich nur während der GPN erreichbar).
Die ersten kämpfen sich jetzt durch die Installation der Haskell-Toolchain, unter Debian momentan wegen Paketen in NEW stark erschwert . Vielleicht kommen bald auch noch Patches!
Wenn Ihr momentan nicht auf den Lseed-Server zugreifen könnt, dann kann das am „Routing-Loch“ liegen: Teile des Internets sind von hier aus nicht erreichbar, weil wohl manche Routingtabellen nicht aktualisiert wurden. Leider auch mein Server, so dass dies hier nur über Umwege schreiben kann...
In two weeks, the eighth „Gulasch-Programmier-Nacht“ will be held in Karlsruhe, a yearly geek event by the Entropia e.V, which is the local CCC club. It will, as usually, offer a lot of interesting talks and events. One of my personal highlights have always been the programming games: Games, where you write your own code to compete against others, while the playing field is projected in the hacking area. The last few years, dividuum has done a great job providing these (as regular readers of my blog might remember).
This year, I’m trying to follow in his footsteps and will provide the programming game, called „L-seed“. This blog post is an introduction (and a call for contribution, at the bottom of the post :-))
The participants will write code (the „genome“) that describes how plants (the biological type, not the industrial) will grow. The plants will grow simultaneously on the screen (the „garden“), will compete for light and will multiply. The players can not change the code of a growing plant, but they do have the chance to update their code for the next generation – when a plant drops a seed, it will run the newest code. All in all, the game aims to be slowly paced and relaxing, something to just watch for a while and something that does not need constant attention by the players. The score is based on the total amount of biomass produced, but I expect (and hope) that some players will aim for the most beautiful or weirdest shapes.
In contrast to the previous years, this year’s game will not allow player to use a full-fledged Turing-complete programming language, but a rather minimalistic rule based language to describe the plant’s growth. Especially, it will be hard to coordinate different branches of the same plant: Information mostly flows from the leaves to the root, and not the other direction.
The simplest plant is based on this code:
// This is the trivial plant, which just grows and grows
RULE "Very simple Rule"
GROW BY 1
You can see that each rule has a name (which is purely informational), and an action which tells the current branch to, well, grow by one. The syntax allows for Java-style comments, whitespace and newlines are insignificant and the reserved words are case-insensitive. The result will be a plant that just grows straight up, for ever and ever. A more complex rule might be this:
RULE "Start"
WHEN Length <= 0
GROW BY 1
SET TAG = "Root1"
RULE "Story 1"
WHEN TAG = "Root1"
// No Percentage means 100%
BRANCH ANGLE = 70°, LENGTH = 2, Tag = ""
ANGLE = -70°, LENGTH = 2, Tag = ""
ANGLE = 0°, LENGTH = 1, TAG = "Root2"
SET TAG = ""
RULE "Story 2"
WHEN TAG = "Root2"
BRANCH AT 100% ANGLE = 70°, LENGTH = 1.5, Tag = ""
ANGLE = -70°, LENGTH = 1.5, Tag = ""
ANGLE = 0°, LENGTH = 1, TAG = "Root3"
SET TAG = ""
RULE "Story 3"
WHEN TAG = "Root3"
BRANCH AT 100% ANGLE = 70°, LENGTH = 1, Tag = ""
ANGLE = -70°, LENGTH = 1, Tag = ""
ANGLE = 0°, LENGTH = 1, TAG = "Root4"
SET TAG = ""
RULE "Story 4"
WHEN TAG = "Root4"
BRANCH AT 100% ANGLE = 70°, LENGTH = 0.5, Tag = ""
ANGLE = -70°, LENGTH = 0.5, Tag = ""
ANGLE = 0°, LENGTH = 0.5, Tag = "Tip"
SET TAG = ""
RULE "Star"
WHEN TAG = "Tip"
Blossom
I added a picture with the resulting tree. The yellow blob at the top is a not-yet-polished rendering of a blossom. At the right, there is already the first offspring of the plant. One thing to keep in mind while writing a genome is that rules are applied to single branches, and not the whole plant. The program will, for each branch individually, check which rules apply and choose one. I’ll skip a detailed description of the syntax here, eventually you will find proper documentation on the entropia wiki page. You can find more examples in the source repository.
The players will register at a website providing the usual CRUD functionality for their code, with integrated syntax checking. They can have more than one code at the same time, but only one can be marked as „active.“ The program actually serving the projector will regularly fetch the active code and run a around (called „season“) of the game. Whenever a new seed grows, the program will get the possibly updated active code of that user and use that. A season will probably last for a fixed amount of time, and at the end the total biomass accumulated by each player is added up and written back to the database.
You can fetch the source code from my git repository and browse the haddock documentation. Unsurprisingly, it is written in Haskell. To compile it yourself, you will need the GHC Haskell compiler, parsec version 3 and for the visualization the gtk2hs package, all of which are packaged in Debian unstable. The main.hs is the interesting program. You pass it one or more plants as an argument, and it will start the simulation. If it’s too slow for test runs, then reduce the dayLength variable in Lseed/Constants.hs. If you have trouble getting it to run, just talk to me.
As you can see in the picture above, the graphical output is not very aesthetic. I am no artist, and I don’t pretend to be one. So, if you think you have the right touch, maybe know OpenGL and a bit of Haskell, I’d be very grateful if you make it look better. The UI interface is quite simple: You need to have a module that returns an Observer value, which contains a few callbacks for various situations. The code in Lseed/Renderer/Cairo.hs can of course be used as a guideline. I’m suggesting OpenGL because my code is not only ugly, it is also too slow very quickly. If you need any help, just contact me by mail or jabber.
I’m also interested in comments about the game balance, and the expressiveness of the programming language. If you play around with the code and discover that there are missing features in the language, or that your plants grow too fast or too slow, or when you discover bugs, please also tell me.
L-seed is based on an old idea of mine, advanced together with Cupe, Sven Hecht is programming the web interface, and Lay is testing the game and bugs me about it to keep the motivation going.
Update: I uploaded the package to hackage, to encourage contributions.
My contribution to the programming contest held by the German „FreiesMagazin“ got a third place out of 13 submissions. This is quite good, considering that I only wrote a small wrapper around the generic game-tree Haskell library by Colin Adams, and hardly gave any serious thought into the problem.
All entires are available for download. I have annotated the table containing the results with the line count as given by ohcount:
| W | D | L | Points | Language | Code lines | ||
| 1. | Kroschinsky | 904 | 242 | 54 | 2954 | Python | 589 |
| 2. | Schulz | 858 | 263 | 79 | 2837 | Python | 544 |
| 3. | Breitner | 837 | 281 | 82 | 2792 | Haskell | 264 |
| 4. | Jackermeier | 754 | 306 | 140 | 2568 | Perl | 183 |
| 5. | Roth | 574 | 338 | 288 | 2060 | C++ | 1731 |
| 6. | Eitel | 567 | 355 | 278 | 2056 | Ruby | 352 |
| 7. | Reichel | 342 | 328 | 530 | 1354 | Python | 266 |
| 8. | Zimmermann | 303 | 400 | 497 | 1309 | Java | 1070 |
| 9. | Apensiv | 190 | 353 | 657 | 923 | Perl | 410 |
| 10. | Maraun | 150 | 300 | 750 | 750 | C++ | 690 |
| 11. | Golemo | 131 | 319 | 750 | 712 | Python | 104 |
| 12. | Ziegelwanger | 120 | 337 | 743 | 697 | C++ | 868 |
| 13. | Fuest | 32 | 254 | 914 | 350 | Python | 645 |
Note that the line count for my haskell program includes the game-tree library, which I bundled in my submission. Without it, it’s 156 lines of code I had to write, which is second best in the code golf category.
If you look at the timing statistics, you will see that my program took the longest. When the contest was started, the timelimit was one minute per round – which I of course tried to use as much as possibly, by increasing the search tree depth. Later into the contest, the rules were changed to limit it to one minute for a whole game, and that long-running programs will get points deducted. I did some minor changes based on a profiling run, but did otherwise not care too much about performance. I would have tried to improve the runtime by using Haskell’s good ability for parallelization. But when I asked on what kind of machine the code will be run, but they would not tell me. They said that this is a hobby programmer’s contest where allowing for parallelization were not fair, so I did not work in that direction.
All in all it was a positive experience, showing of Haskell’s qualities as a language that you can quickly get good results with.
I just made my first upload to hackage (not counting uploads I did during my work in Dresden). Don Steward repeatedly told me to package and upload darcswatch, so I just did that. Thanks to Gwern Branwen to contribute the first cabal file.
There is little documentation on how to set up darcswatch yourself, so if you actually want to try it out, you most likely will have to get in touch with me. Note that you can just use the installation on http://darcswatch.nomeata.de/.
If you, for some reason, feel like hacking on darcswatch, I’d be interested in memory reduction. I currently process 916 mails containing 1867 patches and 47MB, as well as 13 repositories, some of which are rather large, and the numbers are increasing.
For a while I have been pondering over a problem that arises when your functionally written program has some state with cross references – for example a list of users, each of which uses a number of computers, and a list of computers, each having an owner.
For doing queries on such data, it would be convenient if every reference is just the referenced object itself. Although we would visualize this as a graph, semantically, it is more like an infinite tree. This is possible in Haskell, due to laziness, and if you create the data structure cleverly, it even uses constant memory, no matter how “deep” you enter this infinite tree (a recent post of mine talks about this). A possible data definition would be:
data User = User {
userName :: String,
uses :: [Computer]
}
data Computer = Computer {
computerName :: String,
owner :: User -- references the Users
}
data State = State [User] [Computer]
testState = let user = User "Conrad" [computer]
computer = Computer "Z3" user
in State [user] [computer]
But such a representation is very unsuitable for updates (at least I can’t think if a nice way of updating such a graph without breaking the internal cross-references) and serialization, which is a must for a HAppS based application. So what one would probably start with is this data structure:
data User = User {
userId :: Int,
userName :: String,
uses :: [Int] -- references the Computers
}
data Computer = Computer {
computerId :: Int,
computerName :: String,
owner :: Int -- references the Users
}
data State = State [User] [Computer]
testState = State
[User 0 "Conrad" [1]]
[Computer 1 "Z3" 0]
I think the semantics of this are clear. Note that the referencing is currently not type-safe, but this can be provided by phantom types. Maybe I’ll write more about that later.
Now imaging you want to display the information about the first computer with your web application. You extract the information with let State _ cs = testState in head cs and pass that to your templating engine. But what if your template wants to display the name of the owner? It only has access to his userId. You would either need to know what information the template will ever need, extract that from the state beforehand and pass it along, or give the template access to the whole state. In that case, though, there has to be lookup-logic in your output code, which is also not nice.
Woudln’t it be nice if you could, in your application logic, work with the explicit references, which are easy to modify and store, but somehow turn that into the implicit referencing?
One way would be to have two unrelated sets of data structures, ExplicitState, ExplicitUser, ExplicitComputer, which use explicit identifiers to reference each other, and ImplicitState,... which are defined as the first representation of our state. It is then mostly trivial to write a function that converts ExplicitState to ImplicitState.
The big disadvantage of this is that you have to maintain these two different hierarchies. It also means that every function on the state has to be defined twice, which often almost identical code. Clearly, this is not desirable.
It would be more elegant if the state is stored in one data type that, controlled by a type parameter, comes in the one or the other representation. To do that, we need two types: One that contains a value, and one that contains just a reference:
newtype Id v = Id v deriving (Show, Typeable, Data)
newtype Ref v = Ref Int deriving (Show, Typeable, Data)
Then we need to adjust our data definitions, to make use of these. (I’ll leave out the names, to keep the code smaller)
data User ref = User {
userId :: Int,
uses :: [ref (Computer ref)]
}
data Computer ref = Computer {
computerId :: Int,
owner :: ref (User ref)
}
data State ref = State [User ref] [Computer ref]
Here we introduce a type parameter “ref”, which will later be either Id or Ref. Note that now a reference also states the object it is a reference for, which greatly increases type safety. Functions on these data types that don’t work with the references will be polymorphic in the “ref” type parameter, so only need to be written once. A User Id is a complete user with all related data, while User Ref is a user with only references. And a Ref (User Ref) is reference to a user, which contains references...
Did you notice the lack of a deriving clause? Our data structures have the relatively peculiar kind ((* -> *) -> *), which makes it hard for the compiler to derive instances for things like Show. But we already know that we will only use Id or Ref for the type variable, so we can use ghc’s StandaloneDeriving language extension and have these instances created:
deriving instance Show (User Id)
deriving instance Show (User Ref)
deriving instance Show (Computer Id)
deriving instance Show (Computer Ref)
deriving instance Show (State Id)
deriving instance Show (State Ref)
The next step is to write the conversion function. It will have type
unrefState :: State Ref -> State Id
For that, and for later, we need lookup functions:
unrefUserRef :: State Id -> Ref (User Ref) -> Id (User Id)
unrefUserRef (State l _) (Ref i) = Id $ fromJust $
find (\u@(User i' _) -> i == i') l
unrefComputerRef :: State Id -> Ref (Computer Ref) -> Id (Computer Id)
unrefComputerRef (State _ l) (Ref i) = Id $ fromJust $
find (\u@(Computer i' _) -> i == i') l
These expect a State (with implicit referencing) and a reference, and look up this reference. The function unrefState then looks like this:
unrefState :: State Ref -> State Id
unrefState (State us cs) =
let unrefState = State (map (unrefUser unrefState) us)
(map (unrefComp unrefState) cs)
in unrefState
where unrefUser :: State Id -> User Ref -> User Id
unrefUser s (User i refs) = User i (map (unrefComputerRef s) refs)
unrefComp :: State Id -> Computer Ref -> Computer Id
unrefComp s (Computer i ref) = Computer i (unrefUserRef s ref)
Note how we “tie the knot” in the let expression. This is the trick that ensures constant memory consumption, because every reference points back to the same place in memory.
So what do we have? We have no duplication of data types, we can write general functions, and we can resolve the explicit referencing. We can also easily write functions like unrefUser :: State Ref -> User Ref -> User Id, which transform just a part of the state.
But writing unrefState is very tedious when the State becomes more complex. Each of the other unrefSomething functions are very similar, but need to be written anyways. This is unsatisfactory. What we want, is a generic function, something like
gunref :: State Ref -> a Ref -> a Id
which, given a state with explicit references, replaces all explicit references in the first argument (which could be State Ref or User Ref or anything like that) with implicit ones. This function can not exist, because we would not know anything about a and b. But maybe we can do this:
gunref :: (Data (a Id), Data (a Ref)) => State Ref -> a Ref -> a Id
Before being able to do so, we need to derive Data for our types. We can start with
deriving instance Data (User Id)
deriving instance Data (User Ref)
deriving instance Data (Computer Id)
deriving instance Data (Computer Ref)
deriving instance Data (State Id)
deriving instance Data (State Ref
but that will complain about missing Typeable instances. Unfortunately, ghc’s deriver for Typeable (even the stand-alone-one), does not handle our peculiar kind, so we need to do it by hand. With some help from quicksilver on #haskell, I got it to work:
instance Typeable1 ref => Typeable (User ref) where
typeOf _ = mkTyConApp (mkTyCon "User") [typeOf1 (undefined :: ref ())]
instance Typeable1 ref => Typeable (Computer ref) where
typeOf _ = mkTyConApp (mkTyCon "Computer") [typeOf1 (undefined :: ref ())]
instance Typeable1 ref => Typeable (State ref) where
typeOf _ = mkTyConApp (mkTyCon "State") [typeOf1 (undefined :: ref ())]
Turning to the documentation of Data.Generics, I notice with some disappointment that there is no function that is able to change a type – they all seem to replace a value by another value of the same type. But the functions gfoldl and gunfold sounded like they could be used for this.
Warning: What comes now is a very non-haskellish hack that subverts the type system, just to get the job done. Please read it with a healthy portion of distrust. If you know of a cleaner way of doing that, please tell me!
I want to do some heavy type hackery, so I need to disable haskell’s type system. There is Data.Dynamic, but not even that is enough, as we need to carry a type’s Data instance around as well. So let’s wrap that up:
data AData where AData :: Data a => a -> AData
instance Show AData where show (AData a) = "<" ++ show (typeOf a) ++ ">"
fromADataE :: forall b. Data b => AData -> b
fromADataE (AData d) = case cast d of
Just v -> v
Nothing -> error $ "Type error, trying to convert " ++
show (typeOf d) ++ " to " ++
show (typeOf (undefined :: b)) ++ "."
There is also a function that converts an AData back to a normal type, if possible. If it’s not possible, then there is a bug in our code, so we give an informative error message.
Similar to everywhere, we want to have transformers that combinable. They need to have the change to convert an arbitrary value, but also signal that they could not convert something (and this something has to be recursed into). I came up with this:
type ADataT = AData -> Maybe AData
extADT :: forall a b. (Data a, Data b) => ADataT -> (a -> b) -> ADataT
extADT at t a@(AData v) = case cast v of
Just x -> Just (AData (t x))
Nothing -> at a
doNothing :: ADataT
doNothing = const Nothing
ADataT is the type for such a transformer. doNothing will not transform anything, and extADT can be used to add any function to the list of tried transformers, in the spirit of extT.
To apply such a transformer, I want to use this function, which I’ll describe in the code comments:
everywhereADT :: forall a b. (Data a, Data b) => ADataT -> a -> b
-- first check if we can transform this value already
everywhereADT f v = case f (AData v) of
-- if so, coerce it to the users’ requested type, which hopefully goes well
Just r -> fromADataE r
-- if not, we need to recurse into the data structure
Nothing -> recursed
-- for that, we first need to figure out the arguments to the
-- constructor. We store them in the untyped list
where args :: [AData]
(Const args) = gfoldl k z v
-- gfoldl lets us have a look at each argument. We wrap it in AData
-- and append it to the list
k (Const args) arg = Const (AData arg : args)
z start = Const []
-- We need the data constructor of the input data type. If the user did not want
-- it to be transformed, it will be the same
c = toConstr v
-- To give better error messages, we make sure that the outmost type constructor
-- of the requested type actually has the data constructor we were given. Otherwise
-- gunfold will complain in a non-helpful way
input_type = dataTypeRep (constrType c)
output_type = dataTypeRep (dataTypeOf (undefined :: b))
recursed = if input_type /= output_type
then error $ "Can not convert <" ++ show input_type ++ ">"++
" to <" ++ show output_type ++ ">."
-- the types match, so we assemble the output type, using gunfold
else snd (gunfold k' z' c)
k' :: forall b r . Data b => ([AData], b -> r) -> ([AData],r)
-- we start by reversing the input list
z' start = (reverse args,start)
-- then we call us recursively on the argument and feed the result
-- to the output constructor
k' ((AData a) : args, append) = (args, append (everywhereADT f a))
-- Used for folding (we don’t need to retain the original constructor)
data Const a b = Const a
What a beast! But surprisingly, it works. Here are some examples. Note that I always have to specify the requested output type:
bool2Int :: Bool -> Int
bool2Int False = 0
bool2Int True = 1
*Main> everywhereADT (doNothing `extADT` bool2Int) True :: Int
1
*Main> everywhereADT (doNothing `extADT` bool2Int) True :: ()
*** Exception: Type error, trying to convert Int to ().
*Main> everywhereADT (doNothing `extADT` bool2Int) (True,False) :: (Int,Int)
(1,0)
*Main> everywhereADT (doNothing `extADT` bool2Int) ([True,False],True,()) :: ([Int],Int,())
([1,0],1,())
*Main> everywhereADT (doNothing `extADT` bool2Int) ([True,False],True,()) :: [()]
*** Exception: Can not convertto .
*Main> everywhereADT (doNothing `extADT` bool2Int) [True] :: [Bool]
[*** Exception: Type error, trying to convert Int to Bool.
I hope this code does not inflict too much pain on any Haskell-loving reader. I know I violated the language, but I didn’t know how to do it better (at least not without using Template Haskell). I also know that this is not very good performance wide: Every single value in the input will be deconstructed, type-compared several times and re-assembled. If that is an issue, then this function should only be used for prototyping.
To apply this to our state, we only need to glue it to our lookup functions from above:
gunref :: (Data (a Id), Data (a Ref)) => State Ref -> a Ref -> a Id
gunref s w = let unrefState = gunref' unrefState s
in gunref' unrefState w
gunref' :: (Data (a Id), Data (a Ref)) => State Id -> a Ref -> a Id
gunref' unrefState = everywhereADT unref'
where unref' = doNothing `extADT`
unrefUserRef unrefState `extADT`
unrefComputerRef unrefState
Now we have a generic unreferencer. I set the type a bit more specific than necessary, to make it safe to use (under the assumption that the list of lookup functions is complete and will not leave any Ref in the output).
*Main> testState
State [User {userId = 0, uses = [Ref 1]}] [Computer {computerId = 1, owner = Ref 0}]
*Main> gunref testState testState
State [User {userId = 0, uses = [Id (Computer {computerId = 1, owner = Id (User {userId = 0,
uses = [Id (Computer {computerId = 1, owner = Id (User {userId = 0, uses = ..
Oh, and by the way, if you want to test this code, you’ll need at least:
{-# LANGUAGE DeriveDataTypeable, FlexibleInstances, GADTs,
FlexibleContexts, StandaloneDeriving, ScopedTypeVariables #-}
A few years ago, I had some ideas about a real-world trading game. In short, a combination of the game play of “Settlers of Catan”, the cute pseudo-medieval world of “The Settlers” (the computer game), which you can play in your every day live along, without having to sit in front of the computer for a long time. I then started some code, lost motivation and let it sit there for a while.
Recently, I re-developed interest in the idea and started from scratch, using Haskell and HAppS. To avoid losing interest again, I’m now putting the code online and set up the server. I invite everyone to play around with it, maybe have a look at the code, send me patches or comments. As you can see, the web user interface is plain ugly HTML and could need some love. Some CSS is definitely needed, some AJAX would be nice. Also, the resource tree is very small at the moment – there are a lot of things to work on, even if you don’t want to touch Haskell!
You can register at http://mumer.net/. You will be either given a forest (a source for wood) or a source of stone. You can reap your source, and trade on the “Free Market”, which is where you can always trade online, at bad prices. The idea is that you find real trade partners, to get better prices.
For now, trading without physical contact is possible. You can create so-called issue ids, which represent stips of paper. You can then load resources on them, and give the paper (i.e. the number) to other players, who can then redeem them. Eventually, it is planned that these pieces of paper are provided centrally and (sufficiently) unforgeable, so that it is clear who owns a resource.
You can also bid on certain stuff, such as a sawmil, which allows you to turn wood into boards. It will regularily be re-leased to the highest bidder.
You can get the code (in a darcs repository) from http://darcs.nomeata.de/mumer2 and also browse the code.
If you happen to be at the CCC right now and would like to talk about it, please do so!
