Blogging with Pandoc, literate Haskell, and a bug

I use John MacFarlane’s pandoc for converting my blog entries into HTML. My workflow is roughly:

  1. Write my posts in vim using (pandoc’s enhanced) Markdown format.
  2. Store my posts in a private Git repository on GitHub.
  3. Convert posts to pastable HTML using pandoc and paste into Blogger.

pandoc is a great tool. It is a better Markdown than Markdown and also supports a bunch of other markup formats. The feature set is really extensive with plenty of useful options and sensible defaults. Kudos to John and the other contributors.

My previous post on Network.Curl was the first post where I included significant amounts of source code and was thus my first opportunity to exercise pandoc’s syntax high-lighting features. I was fortunate enough that version 1.2 of pandoc was released just days prior to my post with support for literate Haskell. The literate Haskell support meant that I could stop worrying about how to avoid my Haskell code being interpreted as Markdown block quotations (and if I need a block quotation I can always use <blockquote> tags).

By default pandoc will install without support for syntax high-lighting. To enable syntax high-lighting you must supply the highlighting flag when building, e.g. if you are using cabal:

cabal install pandoc --flags=highlighting

This triggers installation of John’s highlighting-kate library, which is used for the syntax high-lighting. A snag you might run into here is the dependence on pcre-light, which in turn relies on a PCRE dynamic library being available. In my case (on Mac OS X 10.4) I had to install PCRE using MacPorts and add /opt/local/lib to $LD_LIBRARY_PATH.
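
For reference, the environment tweak described above might look like the following in a POSIX shell. This is only a sketch of my setup: /opt/local/lib is the MacPorts library directory mentioned above, and the install step is shown as a comment because it needs root and a MacPorts installation.

```shell
# Install PCRE via MacPorts first (needs root, so not run here):
#   sudo port install pcre

# Prepend the MacPorts library directory to the search path. The
# ${VAR:+...} expansion avoids a dangling ':' when the variable is unset.
LD_LIBRARY_PATH="/opt/local/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH
```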

While editing a post I occasionally want to review it, so I run pandoc with the --standalone option to generate a complete HTML document I can view in my browser. When pasting into Blogger, however, I only want a fragment, so I omit that option. The problem is that the CSS style definitions used for the syntax high-lighting aren’t included in the fragment. My solution was to copy the <style> element from the --standalone output and paste it into the <head> of my Blogger template (immediately prior to </head>).
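
Copying the <style> element by hand works, but it could also be pulled out with a small shell filter. The sketch below is an illustration rather than what I actually did: extract_style is a hypothetical helper name, and it assumes the opening and closing style tags sit on separate lines of the --standalone output, which I have not verified for every pandoc version.

```shell
# Print every line from the first one containing "<style" through the
# next one containing "</style>", reading HTML on stdin.
# Assumption: the opening and closing tags are on different lines.
extract_style() {
  sed -n '/<style/,/<\/style>/p'
}

# Hypothetical usage (not run here, needs pandoc):
#   pandoc --standalone --to=html+lhs input.lhs | extract_style
```

The output can then be pasted into the Blogger template once; it only needs refreshing if the generated style definitions change.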

That’s pretty much it. The complete command I use to generate the fragments I paste into Blogger is:

pandoc --smart --to=html+lhs input.lhs

(I pipe the output into e.g. pbcopy or xclip to get it straight into the clipboard.)
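
Since the clipboard tool differs between my machines, a small helper can pick whichever one is installed. This is a minimal sketch: clipboard_cmd is a hypothetical name, the -selection clipboard argument to xclip is my own choice (plain xclip uses the primary selection), and the cat fallback simply prints to the terminal.

```shell
# Print the name of a clipboard command available on this machine.
clipboard_cmd() {
  if command -v pbcopy >/dev/null 2>&1; then
    echo "pbcopy"                      # Mac OS X
  elif command -v xclip >/dev/null 2>&1; then
    echo "xclip -selection clipboard"  # X11, targets the CLIPBOARD selection
  else
    echo "cat"                         # fallback: just print to stdout
  fi
}

# Hypothetical usage (not run here, needs pandoc):
#   pandoc --smart --to=html+lhs input.lhs | $(clipboard_cmd)
```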

The +lhs portion of the --to option tells pandoc to keep the bird-tracks (those ’>’s at the beginning of a line of code). The beauty of this is that readers can copy and paste the entire (as in “Select All”) blog post into a .lhs file and it is valid literate Haskell that e.g. ghc will happily compile. (Well, at least when copied from Firefox. Safari has the nasty habit of omitting the empty line following a code block which doesn’t make for valid literate Haskell… I’ll be investigating work-arounds.)
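
One possible work-around on the reader’s side is to restore the missing blank line before compiling. The filter below is a rough sketch under that assumption, not something I have tested against real Safari output: fix_bird_tracks is a hypothetical name, and it only handles the case described above, where a line of text directly follows a bird-track line.

```shell
# Insert a blank line wherever a non-empty, non-code line directly
# follows a bird-track ('>') line. Reads stdin, writes stdout.
fix_bird_tracks() {
  awk '
    # prev_code is 1 when the previous line was a bird-track line.
    prev_code && $0 != "" && $0 !~ /^>/ { print "" }
    { print; prev_code = ($0 ~ /^>/) }
  '
}

# Hypothetical usage (not run here):
#   fix_bird_tracks < pasted.txt > post.lhs
```

Lines that are already separated by a blank line pass through unchanged, so running the filter on a correctly pasted post is harmless.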

When writing my previous post I did notice one discrepancy in the syntax high-lighting. The bird-track on any line following an end-of-line comment (--) wouldn’t be correctly high-lighted. Example:

> -- | Additional options to simulate submitting the login form.
> loginOptions user pass =
>     CurlPostFields [ "login=" ++ user, "password=" ++ pass ] : method_POST

Depending on your screen’s calibration it might not stick out like a sore thumb, but the middle bird-track there is black while the other two are blueish. This was due to a bug in the syntax high-lighting definition file for literate Haskell (literate-haskell.xml) in the highlighting-kate package currently on Hackage (version 0.2.3). This is no fault of John’s, as that file came straight from the KDE subversion repository. A couple of hours ago I submitted a patch to John as well as the KDE developers (who have already applied it to their repo).

If you can’t wait for John to apply the patch to the highlighting-kate repo you can download it from darcswatch (it’s the one labeled “20090314201157”), apply it to your local copy of John’s repo, cabal install, and finally cabal install pandoc --flags=highlighting --reinstall.

With the patch applied the bird-tracks will render properly:

> -- | Additional options to simulate submitting the login form.
> loginOptions user pass =
>     CurlPostFields [ "login=" ++ user, "password=" ++ pass ] : method_POST


Update 2009-03-16: John uploaded highlighting-kate 0.2.4 (with patches applied) to Hackage today. Reinstall with cabal (don’t forget to cabal update first).


  1. While I was trying to deal with the issue of style output from Pandoc, I discussed the option of putting CSS directly into the style attributes of tags on the Pandoc mailing list. I would like styles to show up in both the web page and the feed, and adding <style> to Blogger's <head> only solves half the problem. I briefly looked for a way to add style to a feed, but didn't find anything.

  2. That's a good point. I noted that code wasn't syntax high-lighted in Google Reader but didn't reflect upon it further. It isn't all that important to me but if you find a solution I'd be interested to know. Perhaps a case could be made to have the pandoc styles added to planet.haskell.org even if Blogger feeds are uncooperative?

  3. Oh wow! One of my projects that's been for a while in the "think hard about this" stage is a reST parser for Haskell, so I'm delighted to hear about pandoc. You seem to have forgotten actually to link to it!



  4. Thanks Tim. While my informal policy is to not link to stuff that is trivially googlable I probably should link to the main subject of my post regardless. (I can also see how this policy may not be considered good form but I don't want to spend all my time copying and pasting links...)

  5. Using the recently announced Google Command Line toolkit, you can post to blogger directly from the command line. See http://btbytes.blogspot.com/2010/06/how-to-make-quick-blogpost-to-blogger_6658.html