2009-03-16

Space News: U.S. Air Force To Widen Access To Detailed Space Surveillance Data

Revisiting my posts on the Iridium 33/Cosmos 2251 collision and data

The front-page article of last week’s Space News (Volume 20, Issue 10, March 9th, 2009) was headlined “U.S. Air Force To Widen Access To Detailed Space Surveillance Data” (article quoted in full here). The short story is that:

The U.S. Air Force has agreed to provide wider access to its high-accuracy catalog showing the whereabouts of orbital debris and operational satellites as part of an effort to enable commercial and non-U.S. government satellite operators to better avoid in-orbit collisions, according to U.S. Air Force officials.

The “high-accuracy catalog” would be what I referred to as Special Perturbations (SP) data. The article reiterates the limitations of the Two-Line Elements currently available to the public:

…U.S. Air Force Space Surveillance Network data is published, but only in a form that satellite operators have long said is not useful for space traffic management. This data, called Two-Line Elements (TLEs), has too great a margin of error to permit operators to act.

And says that:

The U.S. Air Force statement suggests that it will furnish more information to the public to enable operators to make a more highly informed decision.

Apparently a policy is forthcoming. The details of the policy are unknown, but it should be announced before June.

All this is of course good news and cause for cautious optimism. Why do I say cautious? Here are two quotations from a March 5 roundtable discussion titled Challenges for Space Policy in 2009:

We’re horrible at implementing policy, absolutely horrible.

The problem is not that we have insufficient space policy or a bad space policy, but as we now understand, policy is not self-actualizing. We need to have some other mechanisms that take very sound policy and turn them into action and results.

Let’s hope those mechanisms are in place by June.

While on the topic of the Iridium 33/Cosmos 2251 collision: via Jeff Foust’s spacepolitics.com (good blog, love the tagline) I read that “some have tried to portray last month’s Iridium-Cosmos satellite collision as either a deliberate act by the US or a deliberate act by Russia.” I call bullshit on both of these. The Washington Times editorial is particularly embarrassing in its tin-foil-hattery.

2009-03-14

Blogging with Pandoc, literate Haskell, and a bug

I use John MacFarlane’s pandoc for converting my blog entries into HTML. My workflow is roughly:

  1. Write my posts in vim using (pandoc’s enhanced) Markdown format.
  2. Store my posts in a private Git repository on GitHub.
  3. Convert posts to pastable HTML using pandoc and paste into Blogger.

pandoc is a great tool. It is a better Markdown than Markdown and also supports a bunch of other markup formats. The feature set is really extensive with plenty of useful options and sensible defaults. Kudos to John and the other contributors.

My previous post on Network.Curl was the first where I included significant amounts of source code, and thus my first opportunity to exercise pandoc’s syntax highlighting features. I was fortunate that version 1.2 of pandoc, which added support for literate Haskell, was released just days before my post. The literate Haskell support meant that I could stop worrying about how to keep my Haskell code from being interpreted as Markdown block quotations (and if I need a block quotation I can always use <blockquote> tags).
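
The ambiguity exists because bird-tracks and Markdown block quotations share the same leading character. With plain Markdown input pandoc renders a line like the one below as a blockquote, but with literate Haskell input (e.g. --from=markdown+lhs) it is treated as code:

> main = putStrLn "I am code, not a quotation"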

By default pandoc installs without support for syntax highlighting. To enable syntax highlighting you must supply the highlighting flag when building, e.g. if you are using cabal:

cabal install pandoc --flags=highlighting

This triggers installation of John’s highlighting-kate library, which is used for the syntax highlighting. A snag you might run into here is the dependency on pcre-light, which in turn relies on a PCRE dynamic library being available. In my case (on Mac OS X 10.4) I had to install PCRE using MacPorts and add /opt/local/lib to $LD_LIBRARY_PATH.

When editing a post I occasionally want to review it, so I run pandoc with the --standalone option to generate a complete HTML document I can view in my browser. When pasting into Blogger, however, I only want a fragment, so I omit that option. The problem with this is that the CSS style definitions used for the syntax highlighting aren’t included in the fragment. My solution was to copy the <style> element from the --standalone output and paste it into the <head> of my Blogger template (immediately before </head>).

That’s pretty much it. The complete command I use to generate the fragments I paste into Blogger is:

pandoc --smart --to=html+lhs input.lhs

(I pipe the output into e.g. pbcopy or xclip to get it straight into the clipboard.)

The +lhs portion of the --to option tells pandoc to keep the bird-tracks (those ’>’s at the beginning of a line of code). The beauty of this is that readers can copy and paste the entire (as in “Select All”) blog post into a .lhs file and it is valid literate Haskell that e.g. ghc will happily compile. (Well, at least when copied from Firefox. Safari has the nasty habit of omitting the empty line following a code block, which doesn’t make for valid literate Haskell… I’ll be investigating work-arounds.)
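
That empty line is significant: literate Haskell requires program lines to be separated from surrounding text by blank lines, so ghc rejects input like the following, where a line of text directly abuts the code:

    > main = putStrLn "hello"
    This text line touches the code line above, which makes the file invalid.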

When writing my previous post I did notice one discrepancy in the syntax highlighting. The bird-track on any line following an end-of-line comment (--) wouldn’t be correctly highlighted. Example:

> -- | Additional options to simulate submitting the login form.
> loginOptions user pass =
>     CurlPostFields [ "login=" ++ user, "password=" ++ pass ] : method_POST

Depending on your screen’s calibration it might not stick out like a sore thumb, but the middle bird-track there is black while the other two are blueish. This was due to a bug in the syntax highlighting definition file for literate Haskell (literate-haskell.xml) in the highlighting-kate package currently on Hackage (version 0.2.3). This is no fault of John’s as that file came straight from the KDE subversion repository. A couple of hours ago I submitted a patch to John as well as to the KDE developers (who have already applied it to their repo).

If you can’t wait for John to apply the patch to the highlighting-kate repo you can download it from darcswatch (it’s the one labeled “20090314201157”), apply it to your local copy of John’s repo, cabal install, and finally cabal install pandoc --flags=highlighting --reinstall.

With the patch applied the bird-tracks will render properly:

> -- | Additional options to simulate submitting the login form.
> loginOptions user pass =
>     CurlPostFields [ "login=" ++ user, "password=" ++ pass ] : method_POST

[Image: Bird Track]

Update 2009-03-16: John uploaded highlighting-kate 0.2.4 (with patches applied) to Hackage today. Reinstall with cabal (don’t forget to cabal update first).

2009-03-04

Extended sessions with the Haskell Curl bindings

I recently needed to automate retrieving protected data from a secure web site. I had to:

  1. Log into the website with a POST request.
  2. Download the protected data with a GET request.

All this had to be done using SSL, and I suspected I’d need to handle cookies too.

I had read that libcurl had support for sessions and cookies spanning multiple requests, and knew that it could handle SSL. I was aware there is a Haskell binding to libcurl (aptly named “curl” but hereafter referred to as Network.Curl to avoid confusion) on Hackage so I had my tools cut out for me. While I had used the command line curl utility quite a bit I had never programmed against libcurl before and had some learning to do.

It wasn’t entirely clear to me from the haddocks how to use Network.Curl. This may not be a problem if you are already familiar with libcurl (I couldn’t tell), but for me it was quite a hurdle. Googling the topic I found some blogged examples that got me started, but I was unable to find an example demonstrating a multi-request session. However, with the basics from the blogs I was able to return to Network.Curl and figure things out by inspecting its source code. I’ll share an example here for the benefit of others who find themselves in the same situation. I’m using version 1.3.4 of Network.Curl.

As a contrived example let’s assume we want to write a small program that, given a user name and password, fetches the user’s API token from GitHub. Here is the code (literate Haskell, just copy and paste into a .lhs file):

> import Network.Curl
> import System (getArgs)
> import Text.Regex.Posix
>
> -- | Standard options used for all requests. Uncomment the @CurlVerbose@
> -- option for lots of info on STDOUT.
> opts = [ CurlCookieJar "cookies" {- , CurlVerbose True -} ]
>
> -- | Additional options to simulate submitting the login form.
> loginOptions user pass =
>     CurlPostFields [ "login=" ++ user, "password=" ++ pass ] : method_POST
>
> main = withCurlDo $ do
>   -- Get username and password from command line arguments (will cause
>   -- pattern match failure if incorrect number of args provided).
>   [user, pass] <- getArgs
>   -- Initialize curl instance.
>   curl <- initialize
>   setopts curl opts
>   -- POST request to login.
>   r <- do_curl_ curl "https://github.com/session" (loginOptions user pass)
>          :: IO CurlResponse
>   if respCurlCode r /= CurlOK || respStatus r /= 302
>     then error $ "Failed to log in: "
>              ++ show (respCurlCode r) ++ " -- " ++ respStatusLine r
>     else do
>       -- GET request to fetch account page.
>       r <- do_curl_ curl "https://github.com/account" method_GET
>              :: IO CurlResponse
>       if respCurlCode r /= CurlOK || respStatus r /= 200
>         then error $ "Failed to retrieve account page: "
>                  ++ show (respCurlCode r) ++ " -- " ++ respStatusLine r
>         else putStrLn $ extractToken $ respBody r

The first thing to note is that we use do_curl_ rather than e.g. curlPost and curlGet. The latter two don’t actually give you access to the response body but instead print it on stdout! The general process is:

  1. Initialize a curl instance.
  2. Set options.
  3. Call do_curl_ with the URL and request-specific options.
  4. Inspect the CurlResponse.
  5. Repeat from 3 until done.

Note that all use of libcurl should be wrapped in withCurlDo. In the example I wrapped the entire body of main. Also note that the type of do_curl_ must be specified explicitly unless it can be inferred from later use. The CurlResponse type specified above uses vanilla Strings for everything.
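
Boiled down, the pattern looks like this (a minimal sketch of my own, separate from the example program above; the URL is just a placeholder):

> -- | A minimal sketch of the general process (placeholder URL).
> sketch :: IO ()
> sketch = withCurlDo $ do
>   -- 1. Initialize a curl instance.
>   curl <- initialize
>   -- 2. Set options that should apply to the whole session.
>   setopts curl [ CurlCookieJar "cookies" ]
>   -- 3. Perform a request; the annotation fixes the response type.
>   r <- do_curl_ curl "https://example.com/" method_GET :: IO CurlResponse
>   -- 4. Inspect the response.
>   print (respCurlCode r, respStatus r)
>   -- 5. Further requests on the same handle reuse the session and cookies.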

For the POST request I added some CurlPostFields to the method_POST options predefined in Network.Curl. For the GET request the predefined method_GET was sufficient.

A GitHub-specific peculiarity here is the error checking after the POST request. GitHub returns a 302 (“Moved Temporarily”) on successful login and a 200 (“OK”) when the credentials are bad. Stuff like this needs to be figured out on a site-by-site basis.

For completeness here is the function that extracts the token from the response body using a regular expression:

> -- | Extracts the token from GitHub account HTML page.
> extractToken body = head' "GitHub token not found" xs
>   where
>     head' msg l = if null l then error msg else head l
>     (_,_,_,xs) = body =~ "github\\.token (.+)"
>                    :: (String, String, String, [String])
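
As a quick sanity check of the regular expression, here is a made-up input (not GitHub’s actual page markup) in ghci:

    ghci> extractToken "github.token deadbeef"
    "deadbeef"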

If you load this code in ghci and type :main username password, the Octocat will deliver your token.

[Image: Octocat, GitHub’s mascot]