Friday, November 17, 2006

HsInstaller

I've given this message and idea to a few people already, via email and IRC, so I thought maybe I should share it more widely and get feedback from people.

I am intending to start a project called HsInstaller, to produce Windows installers for:

Hat, Yhc, WinHugs and any Cabal program (definitely Hoogle, hopefully Happy, Alex, Haddock etc.), maybe GHC and Gtk2Hs if that's possible. This project does NOT hope to create installers for libraries, just standalone executable programs. For libraries, a wrapper that just calls the underlying Cabal infrastructure can be produced much more easily.

At the moment the Haskell community has several different installers: the C-based one I wrote for WinHugs that generates a .exe, the GHC one that generates a .msi, and the Gtk2Hs one that uses InnoSetup. There are advantages and disadvantages to all approaches, but since the world is moving steadily towards .msi files, that might be the best format to standardise on. Of course, it might not be the best format; discussion is welcome!

Unfortunately the installer code used by GHC is not publicly available. One consequence of this is that GHC does not have snapshot installers for Windows, and in fact a GHC release is not prepackaged for Windows - this is something done later.

So, my general thoughts are to try and see if a copy of the GHC installer can be obtained (this is apparently likely in the future), collect all the installers that everyone uses, then sit down with everyone who wants an installer and figure out what we have and what we want. Once that's done we can start implementing something that everyone can reuse easily.

Anyone have any thoughts? I have about 4 projects that need installers (Hat, Yhc, WinHugs, Hoogle), but it would be nice if everyone could be satisfied by the results.

Monday, November 13, 2006

Dr Haskell progress

I've been working on Dr Haskell for a little bit, and it's now quite a bit cleverer. For those who don't know, Dr Haskell tries to make helpful suggestions about your code. For example, if you use "concat (map f x)", it will suggest you use "concatMap f x". It can also do other clever things, for example:

not (a == b) ===> use /=

if x `elem` xs then True else x == 1 ===> use an or

if bracket then "(" else "" ===> ['(' | bracket]
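To make these three suggestions concrete, here is a small sketch that writes each original expression next to the suggested form and checks that they agree on a few inputs. The function names are made up for illustration, and "use an or" is read here as the (||) operator; none of this is Dr Haskell's own code.

```haskell
-- Each pair is the original expression and the form Dr Haskell suggests.
-- All names here are hypothetical, chosen only for this illustration.
notEqOriginal, notEqSuggested :: Int -> Int -> Bool
notEqOriginal  a b = not (a == b)
notEqSuggested a b = a /= b

elemOriginal, elemSuggested :: Int -> [Int] -> Bool
elemOriginal  x xs = if x `elem` xs then True else x == 1
elemSuggested x xs = x `elem` xs || x == 1

bracketOriginal, bracketSuggested :: Bool -> String
bracketOriginal  bracket = if bracket then "(" else ""
bracketSuggested bracket = ['(' | bracket]

-- Compare each pair over a handful of small inputs.
main :: IO ()
main = print $ and
  [ and [notEqOriginal a b == notEqSuggested a b | a <- [0..3], b <- [0..3]]
  , and [elemOriginal x xs == elemSuggested x xs | x <- [0..3], xs <- [[], [1], [2,3]]]
  , and [bracketOriginal b == bracketSuggested b | b <- [False, True]]
  ]
```

The last pair relies on a list comprehension with only a guard: it yields the single element when the guard is True and the empty list otherwise.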

The idea is to introduce beginners to functions or patterns of Haskell that they might not have been aware of, without requiring much thought or learning up front: the suggestion only appears once such an instance has turned up in their own code. I think of this as a nice complement to Hoogle, since Hoogle caters for the case when beginners have already identified where a common pattern may lie.

Of course, the biggest weakness in Dr Haskell has been that it matches at the expression level only. A beginner might write:

takeheads [] = []
takeheads (x:xs) = head x : takeheads xs

(And indeed they do, I've seen that 4 times already today!) An experienced programmer would recognise that as a map, and now, so will Dr Haskell! It will report "specialised version of map used as takeheads, line number n".
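For reference, the rewrite an experienced programmer would make looks roughly like this. The name takeheads' is only for this sketch; Dr Haskell reports the pattern, it does not produce this code.

```haskell
-- The explicitly recursive definition from above.
takeheads :: [[a]] -> [a]
takeheads []     = []
takeheads (x:xs) = head x : takeheads xs

-- The same function written as a map (takeheads' is a hypothetical name).
takeheads' :: [[a]] -> [a]
takeheads' = map head

-- Both versions should agree on any list of non-empty lists.
main :: IO ()
main = print (takeheads ["abc", "de", "f"] == takeheads' ["abc", "de", "f"])
```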

Some other things that Dr Haskell can now spot:

sum :: [Int] -> Int
sum [] = 0
sum (x:xs) = x + sum xs
(it's a foldr, as written above - although I realise foldl' is a better choice for sum)

reverse xs = rev [] xs
    where
        rev acc [] = acc
        rev acc (x:xs) = rev (x:acc) xs
(rev is a foldl)
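Written out with the folds themselves, the two definitions above come out roughly as below. The names sumR, sumL and reverseL are invented for this sketch to avoid clashing with the Prelude.

```haskell
import Data.List (foldl')

-- sum as the foldr it is written as above.
sumR :: [Int] -> Int
sumR = foldr (+) 0

-- The strict left fold mentioned as the better choice for sum.
sumL :: [Int] -> Int
sumL = foldl' (+) 0

-- rev acc (x:xs) = rev (x:acc) xs is foldl with the step \acc x -> x : acc.
reverseL :: [a] -> [a]
reverseL = foldl (flip (:)) []

main :: IO ()
main = print (sumR [1..10], sumL [1..10], reverseL "abc")
```

Running it prints (55,55,"cba").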

Dr Haskell can now spot all of these, and has the power to spot lots more patterns using recursion as well as expression matching. I think this could be really helpful for beginners. If anyone has any interesting ideas about this, please let me know. I hope to package up a release in the next few days.

Sunday, November 12, 2006

Library idea: the Safe library

Often Haskell programmers end up with "error, head []" - a totally useless error message!

Because Catch has grown to tens of thousands of lines, tracking down an error like that would be a real pain. To combat this I have:

headNote :: String -> [a] -> a
headNote err [] = error $ "headNote, " ++ err
headNote err (x:xs) = x

I also have fromJustNote, lookupJust (fromJust . lookup), assertNote etc - loads of useful safe functions. I also have headMaybe :: [a] -> Maybe a. Perhaps this could be useful as a library - the safe library?

I would imagine lots of functions from the various Prelude/List libraries, some with Note versions taking a parameter to give a better error message, some with Maybe versions to return a Maybe on failure.
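As a rough sketch of what the Maybe and Note variants might look like: the type signatures of fromJustNote and lookupJust below are my guesses, since only headNote and headMaybe have their types given above.

```haskell
-- Maybe version: return Nothing instead of crashing on the empty list.
headMaybe :: [a] -> Maybe a
headMaybe []    = Nothing
headMaybe (x:_) = Just x

-- Note version: crash, but with a message saying where the failure came from.
-- (Signature assumed by analogy with headNote.)
fromJustNote :: String -> Maybe a -> a
fromJustNote note Nothing  = error $ "fromJustNote, " ++ note
fromJustNote _    (Just x) = x

-- lookup followed by fromJust, with a more helpful failure message.
-- (Signature assumed.)
lookupJust :: Eq k => k -> [(k, v)] -> v
lookupJust key = fromJustNote "lookupJust, key not found" . lookup key

main :: IO ()
main = do
  print (headMaybe ([] :: [Int]))
  print (headMaybe "abc")
  print (lookupJust 2 [(1, "one"), (2, "two")])
```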

Anyone think this would be useful? It's technically trivial to implement, but I don't have enough time to manage this, so perhaps someone else would like to take this one on?

Friday, November 10, 2006

System.FilePath, automated testing

I just released System.FilePath, a library for manipulating FilePaths on both Windows and Posix. The problem with a library like this is that there are lots of corner cases, lots of weird situations and everything needs to be tested twice with different semantics. Obviously this requires a test infrastructure different from most other libraries!

I started off with a separate file for writing properties, but quickly found that the properties were the best type of documentation for a function. It was also a pain to keep two different places which identify what the logic of the code is - i.e. the code and the tests. The obvious idea then is to combine the code, documentation and testing into one. Because I am using Haddock that turned out to be quite easy to do - any line beginning with "-- > " is a test. Haddock sees this as monospace formatting, Haskell as comment, and my test generator can find the tests pretty easily.
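The extraction step can be sketched in a few lines. This is not GenTests.hs itself, just an illustration of the idea; joinDir and the sample source are invented for the example and are not part of System.FilePath.

```haskell
import Data.List (isPrefixOf)

-- The marker that makes a comment line a test.
testMarker :: String
testMarker = "-- > "

-- Keep every line that begins with the marker, with the marker stripped.
extractTests :: String -> [String]
extractTests src =
  [ drop (length testMarker) line
  | line <- lines src
  , testMarker `isPrefixOf` line
  ]

-- A made-up source fragment: Haddock renders the "-- > " line as
-- monospace documentation, and the extractor picks it up as a test.
sample :: String
sample = unlines
  [ "-- | Join a directory and a file name (hypothetical function)."
  , "--"
  , "-- > joinDir \"a\" \"b\" == \"a/b\""
  , "joinDir :: FilePath -> FilePath -> FilePath"
  , "joinDir dir file = dir ++ \"/\" ++ file"
  ]

main :: IO ()
main = mapM_ putStrLn (extractTests sample)
```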

Within the System.FilePath repo I have a DOS Batch file driver (test.bat) which uses a separate Haskell program (GenTests.hs) to create a test script and run it.

Within the test listing there are some tests which I refer to as constant, and some as properties. The tests with no free variables are constant - GenTests recognises them and outputs them directly, so in one execution they either pass or fail. The properties are just standard QuickCheck properties, with the restriction that every multi-letter keyword not in a certain known set is a function in the library, and every variable x..z is a FilePath (hence using a custom FilePath generator).

The main complication in testing System.FilePath is the fact that every property corresponds to two different tests - one on the Posix implementation, one on the Windows implementation. The translator automatically does this duplication, unless either Posix: or Windows: is given at the start of the test, in which case the test is only executed on the appropriate version.
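The classification described above might be sketched as follows. This is a guess at the shape of the logic rather than the real translator: the Platform type, the function names and the tiny tokeniser are all invented here, and the check against the "known set" of keywords is left out.

```haskell
import Data.Char (isAlpha, isAlphaNum)
import Data.List (nub, stripPrefix)

-- Hypothetical type naming the two implementations under test.
data Platform = Posix | Windows deriving (Show, Eq)

-- A test runs on both platforms unless it starts with Posix: or Windows:.
targets :: String -> ([Platform], String)
targets test
  | Just rest <- stripPrefix "Posix:" test   = ([Posix], trim rest)
  | Just rest <- stripPrefix "Windows:" test = ([Windows], trim rest)
  | otherwise                                = ([Posix, Windows], test)
  where trim = dropWhile (== ' ')

-- The variables x..z that occur in a test, ignoring string literals.
freeVars :: String -> [String]
freeVars = nub . filter (`elem` ["x", "y", "z"]) . idents

-- Identifiers in a test, skipping over anything inside double quotes.
idents :: String -> [String]
idents [] = []
idents ('"':cs) = idents (skipString cs)
idents (c:cs)
  | isAlpha c = let (name, rest) = span isIdentChar (c:cs) in name : idents rest
  | otherwise = idents cs
  where isIdentChar ch = isAlphaNum ch || ch `elem` "_'"

-- Drop the remainder of a string literal, honouring backslash escapes.
skipString :: String -> String
skipString ('\\':_:cs) = skipString cs
skipString ('"':cs)    = cs
skipString (_:cs)      = skipString cs
skipString []          = []

-- A test with no free variables is constant; otherwise it is a property.
isConstant :: String -> Bool
isConstant = null . freeVars

main :: IO ()
main = do
  print (targets "Posix: f x == x")
  print (isConstant "f \"x\" == \"y\"")
  print (freeVars "f x == \"x\" ++ y")
```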

For QuickCheck testing I defined a FilePath to be a 25 character string, from the following set of characters "?|./:\\abcd 123;_". The idea of this set of characters is to include every character that any aspect of the library treats differently, along with a small selection of "normal" letters/numbers.
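With the combinators in current versions of QuickCheck, a generator like this could be written roughly as below. This is a sketch, not the generator from the repo: QFilePath, pathAlphabet and prop_shape are names invented here, and the API used (vectorOf, elements) is that of modern QuickCheck.

```haskell
import Test.QuickCheck

-- The character set quoted above.
pathAlphabet :: String
pathAlphabet = "?|./:\\abcd 123;_"

-- Hypothetical wrapper so that FilePath gets its own Arbitrary instance.
newtype QFilePath = QFilePath FilePath deriving Show

-- Exactly 25 characters, each drawn from the alphabet.
instance Arbitrary QFilePath where
  arbitrary = QFilePath <$> vectorOf 25 (elements pathAlphabet)

-- Sanity check on the generator itself.
prop_shape :: QFilePath -> Bool
prop_shape (QFilePath p) = length p == 25 && all (`elem` pathAlphabet) p

main :: IO ()
main = quickCheck prop_shape
```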

There was one modification I had to make to QuickCheck. By default quickCheck returns normally whether or not the property holds, and only reports success or failure on the console. Unfortunately, if an automatic program is executing over 200 tests, these messages can get obscured in the general noise - this happened more than once. To combat this I defined a new QuickCheck wrapper which calls error on failure. Ideally the signature of quickCheck should be changed to :: .. -> IO Bool to detect these situations and allow the driver script to fail more obviously.
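A wrapper along these lines might look like the sketch below. It is not the wrapper from the repo: it relies on quickCheckResult and the Result type from current versions of QuickCheck, and quickCheckOrDie and failureMessage are names invented for the example.

```haskell
import Test.QuickCheck (Testable, Result(..), quickCheckResult)

-- Message used when a property does not pass (hypothetical helper).
failureMessage :: String -> String
failureMessage name = "QuickCheck property failed: " ++ name

-- Run a property and call error unless the run ends in Success,
-- so a driver script cannot miss the failure in the console noise.
quickCheckOrDie :: Testable prop => String -> prop -> IO ()
quickCheckOrDie name prop = do
  result <- quickCheckResult prop
  case result of
    Success{} -> return ()
    _         -> error (failureMessage name)

main :: IO ()
main = quickCheckOrDie "reverse twice"
         (\xs -> reverse (reverse xs) == (xs :: [Int]))
```

Anything other than Success (a failure, giving up, or insufficient coverage) is treated as an error here, which is the strict reading of "calls error on failure".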

Without QuickCheck I don't think it would have been possible to write the FilePath library - it caught too many obscure bugs that manual testing would never have found. In addition, QuickCheck forced me to think about the properties of the library more closely - I changed some design decisions after it turned out that the properties disagreed with me. The one thing QuickCheck helped with more than anything though was refactoring - despite a massive number of the functions all depending on each other, QuickCheck allows me to change the behaviour of one function in some obscure case and check that no other function was relying on that.

The only criticism that can be levelled at my use of QuickCheck is that failing examples are not minimal, in fact they are exactly 25 characters long. I hope that at some point soon I can make use of SmallCheck (once it has a darcs repo and a .cabal file) to do testing alongside QuickCheck to get a greater depth of coverage.

All the test scripts I have written are available in the darcs repo, under the BSD3 licence. If anyone can make use of them, I'd be happy to have someone take them forward!