Using Parsec in simple Haskell programs

I read two very useful articles about Haskell recently. The first is Haskell IO for Imperative Programmers, which explains how to do simple IO. (It’s less obvious than you’d think, especially if you’re coming from one of the more popular languages like Java or Ruby.) The second is JSON Parser in Haskell, a brief yet very informative introduction to the exceptionally useful Parsec library that ships with many Haskell implementations.

Each of these articles is incredibly useful. You have to know how to get data in and out of your program, and there aren’t a lot of tasks that don’t require parsing data. The two techniques together, as you might expect, are crazy useful.

To summarize perhaps too quickly, you can do simple stdin/stdout IO using the “interact” function. Here’s a Haskell implementation of the popular “rev” utility, which reverses the characters of each line:

main = interact $ unlines . (map reverse) . lines

The “lines” function turns one large string (stdin) into a list of strings, “unlines” turns a list of strings back into one large string (stdout), “interact” ties the whole process together, and “(map reverse)” is where you would put your code.
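
Because the argument to “interact” is just a String -> String function, it can be pulled out, named, and tried in GHCi without touching stdin. Here is a minimal sketch (the names reverseEachLine and reverseLineOrder are made up for illustration) contrasting “map reverse”, which reverses the characters within each line, with plain “reverse”, which reverses the order of the lines:

```haskell
-- Hypothetical names, for illustration only.

-- Reverse the characters within each line; line order is unchanged.
reverseEachLine :: String -> String
reverseEachLine = unlines . map reverse . lines

-- Reverse the order of the lines; each line is left intact.
reverseLineOrder :: String -> String
reverseLineOrder = unlines . reverse . lines
```

Passing either function to “interact” gives a complete filter, for example main = interact reverseEachLine.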

In place of “lines” you can drop in a Parsec parser, so that you work directly on data structures instead of fussing with lines of text. (Haskell isn’t Perl, after all.) Here’s a quick example that reads pairs of integers, one pair per line, and prints each pair followed by the longest 3n+1 cycle length found in that range:

module Main where
import Text.Printf
import Text.ParserCombinators.Parsec

atoi :: String -> Int
atoi = read

intsList :: Parser [(Int,Int)]
intsList = do
  intsList' <|> (eof >> return [])
      where intsList' = do
              a <- many1 digit
              many1 space
              b <- many1 digit
              newline
              r <- ( many space >> (intsList' <|> return []))
              return ((atoi a, atoi b) : r)

-- Parse all of stdin; a parse error is silently treated as empty input.
parseInput :: String -> [(Int,Int)]
parseInput s = case parse intsList "stdin" s of
                 Left _   -> []
                 Right cs -> cs

rpt :: (Int,Int) -> String
rpt (a,b) = let c = maximum $ map ( length . cyc ) [a..b]
            in printf "%d %d %d" a b c
    where cyc n | 1 == n       = [1]
                | 1 == mod n 2 = n : cyc (1 + 3 * n)
                | otherwise    = n : cyc  (div n 2)

main = interact $ unlines . (map rpt) . parseInput
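
The hand-written recursion in intsList can probably be shortened with Parsec’s own combinators. The following is only a sketch of one way to do it, not a drop-in replacement: it is stricter than the parser above (single spaces or runs of spaces between the two numbers, exactly one newline after each pair, no blank lines), and the names pair, pairs, and parsePairs are invented here. It also returns the ParseError instead of discarding it, so the caller can decide what to do with bad input.

```haskell
import Text.ParserCombinators.Parsec

-- One "a b" pair. many1 digit guarantees that read cannot fail.
pair :: Parser (Int, Int)
pair = do
  a <- many1 digit
  _ <- many1 (char ' ')
  b <- many1 digit
  return (read a, read b)

-- Zero or more pairs, each terminated by a newline, then end of input.
pairs :: Parser [(Int, Int)]
pairs = do
  ps <- pair `endBy` newline
  eof
  return ps

-- Keep the error around rather than mapping it to an empty list.
parsePairs :: String -> Either ParseError [(Int, Int)]
parsePairs = parse pairs "stdin"
```

To keep the behavior of parseInput above, the result can be collapsed with either (const []) id.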

And just like that, input parsing moves into a parser, where it belongs, and the rest of your system doesn’t have to worry about it. It’s certainly a lot more work than the same thing in, for instance, Ruby:

STDIN.each { |l| a,b = l.split.map {|x| x.to_i} ; puts whatever(a, b) }

However, dropping Parsec in works the same way for arbitrarily complex data structures. This same code structure works for pairs of integers, HTTP, JSON, whatever. Needless to say, I find this very pleasing.
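
To illustrate the point, here is a rough sketch of the same parse / map / unlines shape applied to a different input format. Everything in it is hypothetical: the “key=value” line format, the Setting type, and the function names were chosen only for the example. As in parseInput above, a parse error is treated as empty input.

```haskell
import Text.ParserCombinators.Parsec

-- Hypothetical record for one "key=value" line.
data Setting = Setting String String

-- A key made of letters, digits, or underscores, then '=', then the rest of the line.
setting :: Parser Setting
setting = do
  k <- many1 (alphaNum <|> char '_')
  _ <- char '='
  v <- many (noneOf "\n")
  return (Setting k v)

-- Zero or more settings, each terminated by a newline, then end of input.
settings :: Parser [Setting]
settings = do
  ss <- setting `endBy` newline
  eof
  return ss

-- Same policy as parseInput: a parse error yields an empty list.
parseSettings :: String -> [Setting]
parseSettings s = either (const []) id (parse settings "stdin" s)

render :: Setting -> String
render (Setting k v) = k ++ " -> " ++ v

-- Same structure as before: parse, map over the results, join with unlines.
transform :: String -> String
transform = unlines . map render . parseSettings
```

Only the parser and the per-item function changed; main would still be interact transform.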


2 responses to “Using Parsec in simple Haskell programs”

  1. nathan
    April 9, 2008 at 11:27 am

    I believe that tac would be main = interact $ unlines . reverse . lines

    (map reverse) would reverse the characters within each line rather than reversing the order of the lines themselves. The unix tac utility has the latter behavior.

    Otherwise great post! :-) I found your blog while looking around intro material on parsec.
