2009-04-02

ANNOUNCE: fad 1.0 — Forward Automatic Differentiation for Haskell

I’m pleased to announce the initial release of the Haskell fad library, developed by Barak A. Pearlmutter and Jeffrey Mark Siskind. Fad provides Forward Automatic Differentiation (AD) for functions polymorphic over instances of Num. There have been many Haskell implementations of forward AD, with varying levels of completeness, published in papers and blog posts, but alarmingly few of these have made it into hackage — to date Conal Elliott’s vector-spaces package is the only one I am aware of.

Fad is an attempt to make as comprehensive and usable a forward AD package as is possible in Haskell. However, correctness is given priority over ease of use, and this is in my opinion the defining quality of fad. Specifically, Fad leverages Haskell’s expressive type system to tackle the problem of perturbation confusion, brought to light in Pearlmutter and Siskind’s 2005 paper Perturbation Confusion and Referential Transparency. Fad prevents perturbation confusion by employing type-level “branding” as proposed by myself in a 2007 post to haskell-cafe. To the best of our knowledge all other forward AD implementations in Haskell are susceptible to perturbation confusion.

As this library has been in the works for quite some time it is worth noting that it hasn’t benefited from Conal’s ground-breaking work in the area. Once we wrap our heads around his beautiful constructs perhaps we’ll be able to borrow some tricks from him.

As mentioned already, fad was developed primarily by Barak A. Pearlmutter and Jeffrey Mark Siskind. My own contribution has been providing Haskell infrastructure support and wrapping up loose ends in order to get the library into a releasable state. Many thanks to Barak and Jeffrey for permitting me to release fad under the BSD license.

Fad resides on GitHub and hackage and is only a cabal install fad away! What follows is Fad’s README, refer to the haddocks for detailed documentation.


   Copyright  : 2008-2009, Barak A. Pearlmutter and Jeffrey Mark Siskind
   License    : BSD3

   Maintainer : bjorn.buckwalter@gmail.com
   Stability  : experimental
   Portability: GHC only?

Forward Automatic Differentiation via overloading to perform
nonstandard interpretation that replaces original numeric type with
corresponding generalized dual number type.

Each invocation of the differentiation function introduces a
distinct perturbation, which requires a distinct dual number type.
In order to prevent these from being confused, tagging, called
branding in the Haskell community, is used.  This seems to prevent
perturbation confusion, although it would be nice to have an actual
proof of this.  The technique does require adding invocations of
lift at appropriate places when nesting is present.

For more information on perturbation confusion and the solution
employed in this library see:
<http://www.bcl.hamilton.ie/~barak/papers/ifl2005.pdf>
<http://thread.gmane.org/gmane.comp.lang.haskell.cafe/22308/>


Installation
============
To install:
    cabal install

Or:
    runhaskell Setup.lhs configure
    runhaskell Setup.lhs build
    runhaskell Setup.lhs install


Examples
========
Define an example function 'f':

> import Numeric.FAD
> f x = 6 - 5 * x + x ^ 2  -- Our example function

Basic usage of the differentiation operator:

> y   = f 2              -- f(2) = 0
> y'  = diff f 2         -- First derivative f'(2) = -1
> y'' = diff (diff f) 2  -- Second derivative f''(2) = 2

List of derivatives:

> ys = take 3 $ diffs f 2  -- [0, -1, 2]

Example optimization method; find a zero using Newton's method:

> y_newton1 = zeroNewton f 0   -- converges to first zero at 2.0.
> y_newton2 = zeroNewton f 10  -- converges to second zero at 3.0.


Credits
=======
Authors: Copyright 2008,
Barak A. Pearlmutter <barak@cs.nuim.ie> &
Jeffrey Mark Siskind <qobi@purdue.edu>

Work started as stripped-down version of higher-order tower code
published by Jerzy Karczmarczuk <jerzy.karczmarczuk@info.unicaen.fr>
which used a non-standard standard prelude.

Initial perturbation-confusing code is a modified version of
<http://cdsmith.wordpress.com/2007/11/29/some-playing-with-derivatives/>

Tag trick, called "branding" in the Haskell community, from
Bjorn Buckwalter <bjorn.buckwalter@gmail.com>
<http://thread.gmane.org/gmane.comp.lang.haskell.cafe/22308/>

13 comments:

  1. I've wondered if perturbation confusion is an artificial problem, resulting from punning derivative values with function (range) values, and if treating derivatives as linear maps side-steps the problem altogether. Linear maps also handle functions from non-scalar domains (R^n etc), without having to add a notion of "partial derivatives".

    ReplyDelete
  2. I think these are orthogonal issues.

    It is just a coincidence (pun) that perturbations of Reals are isomorphic to Reals. Other continuous primal structures, like rotations, do not admit to this particular pun. (In general, only flat manifolds allow this pun.) But you can still construct examples of perturbation confusion in a domain like that, at least if you don't have some mechanism to prevent it. After all, perturbations necessarily live in a linear domain: they are tangent vectors, meaning elements of a tangent space.

    ReplyDelete
  3. Hmm. I'm tired. How can I get a a list of all the derivatives of a function (each element applying "diff" to the previous element). I can't make it type check...

    ReplyDelete
  4. @Anonymous: the examples should type check (the README is valid literate Haskell). Depending on your use you might have to specify a type signature for your function (e.g. "f :: Num a => a -> a") to prevent GHC from defaulting it to e.g. "Integer -> Integer".

    ReplyDelete
  5. Ha ha. Yes that does not type check when done in the straightforward fashion. However we provide a tower of derivatives just for you!

    take 10 $ diffs (exp . (2*)) 0

    ReplyDelete
  6. Bjorn, I Mean I want something like:

    diffs' :: (Num a) => (a->a) -> [ a->a ]

    Though obviously with some "foralls" and stuff in there....

    ReplyDelete
  7. I believe this definition should do the trick (with -XRank2Types):

    > diffs' :: Num a => (forall tag . Dual tag a -> Dual tag a) -> [a -> a]
    > diffs' f = map (\i x -> diffs0 f x !! i) [1..]

    Although I'm not sure when you would need it... Could you elaborate on the use case? If warranted we could add something along these lines to the library.

    ReplyDelete
  8. Bjorn,
    It's mostly a nice abstraction so you can do something like

    fs = diffs' f

    and then later on you can do fs!!4 for the fourth derivative etc. which makes it easy to work in a functional way (i.e. passing around functions).

    ReplyDelete
  9. Sorry for the "ha ha". I was laughing at the Haskell type system.

    If we only had 1st-order dual numbers, then this is what you'd need:

    diffs' f = iterate diffUU f

    This seems reasonable. But the Haskell type system cannot stomach it, no matter how many foralls are sprinkled around. Hence the infinite lazy towers.

    ReplyDelete
  10. @Anonymous: I would expect the most of the time you'll either want to evaluate several derivatives at the same "point", in which case diffs/diffs0 is suitable, or you will be specifically interested in e.g. the fourth derivative in which case "(!!4) . diffs0 f" will work without fancy foralls or type extensions. But I'm perhaps being too narrow-minded... ;)

    ReplyDelete
  11. Bjorn, well I'm thinking more of cases when you're actually passing around and storing functions (which happens a lot in FP), so you do need to actually find the function rather than just the value at some point. It may be the case that you're usually specifically interested in some particular derivative though, but it would be nice to have some convenient way of accessing it, or a subset of them...

    ReplyDelete
  12. I think what anonymous wants to know is:

    Can I build a sequence of functions
    by repeated application of a
    differential operator?

    The answer is: not in general, no.

    We do provide a special mechanism to calculate this when the first-order differential operator to be iterated is diff.

    (Try the implementation in POPL-2007:155-60, or VLAD/Stalingrad.)

    ReplyDelete
  13. This comment has been removed by a blog administrator.

    ReplyDelete