2016-09-08

Haskell and Docker: Down the rabbit hole and back

For the past couple of years I've been learning Haskell, and while I enjoy reading new material in the form of books, papers, blog posts and even tweets, I quickly came to realize that the best way to learn is to build things.

That's not to say I just jumped right in. I spent a long time just playing around with the language, exploring various language features and libraries, and getting to know the ecosystem in general. I wasn't doing anything serious. 1

At the time that was a bit more painful than it is now, because nowadays there are many more resources and organized materials around. The community has really stepped up! 2

Anyhow... Building things. I've been using Docker since it was released and have grown from resident Docker fanboy to resident expert over time. I've also been involved with the Python API wrapper since the early days, so it made sense to try to write a Haskell API wrapper. At the time that was as close to web development as I felt comfortable getting (the Docker daemon listens on an HTTP API).

The first attempt was awful. But it worked! I was able to launch containers and everything. Of course, the API of the library was horrendous. There were very few type guarantees for anything and the whole thing was just one giant IO blob.

Having realized this, I went on a crusade to learn all the fancy type machinery and make use of every trick and extension under the sun. That attempt went on and off for a couple of months until I realized that I had created a monster and that I didn't need half the stuff I was using. So, having gone from one extreme to the other, I decided to delete everything and start fresh. I had some guidance from a couple of friends (thank you!) and I was getting pretty close to an API I was (more) happy with and that should be usable by most people. The library being usable was one criterion; the other was that it should be more or less straightforward to contribute to. The reason is that the Docker Engine API is kind of huge at this point, and I'm certainly going to either miss something or implement it wrong, so it should be reasonably easy for anyone to go in and contribute a bugfix or a feature.

Naturally, this dragged on, since I was doing this for fun and in my spare time. Thankfully, I got help somewhere along the way when James Parker jumped in. He was instrumental in getting the library into better shape so that we could finally release that major refactored version.

As you might have guessed this rewrite can't even compare to the previous release. Not only did the API change, but even the namespace under which the library lives has changed.

This version is far from stable but I think it's a step in the right direction. Even though the next major release will bring more API changes, we've decided to release this version anyway, so that the work gets out there and people can start using it. Feedback (and help) is always welcome (we're working on a contribution guide). The other reason for releasing, even though there's still stuff to do, is for me to get over the "it-must-be-perfect-right-away" mindset and make small iterative changes. I looked at the early releases of some very popular libraries in the Haskell ecosystem and they all started small, so why shouldn't I?!

While this is not my first project with Haskell, it's certainly the one that was most challenging, simply because it's a library. Not a project/product, not an executable that I (or others) would use, but a library. It turns out it's much harder to get a library right and make it easy to use in other people's code.

So let's see how the library progressed as I learned things: what it looked like at the beginning and what it looks like now.

How it all started

The initial version of the library, while it worked (I could create containers with it), was pretty simplistic. For instance, this is what createContainer looked like:

createContainer :: DockerClientOpts -> CreateContainerOpts -> IO (Maybe Text)

We had to pass the client configuration (API URL and the like) to each and every function. I figured I'd make it the first argument, so that if people got tired of passing it in all the time they could partially apply the functions they use in their code to the DockerClientOpts of their choosing. I had Text and String all over the place... For instance, I was constructing my URLs like this:

printf "/containers/%s/start" containerId
printf "%s%s%s" url apiVersion endpoint
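To make the partial-application idea concrete, here is a minimal sketch. It is not the library's original code: the record fields baseUrl and apiVersion, the helper names, and the example address and version string are all assumptions, and the functions are kept pure so the snippet runs without a Docker daemon.

```haskell
-- Hypothetical stand-in for the early client options; field names are assumed.
data DockerClientOpts = DockerClientOpts
    { baseUrl    :: String
    , apiVersion :: String
    }

-- Build the full URL for an endpoint, mirroring the second printf call above.
fullUrl :: DockerClientOpts -> String -> String
fullUrl opts endpoint = baseUrl opts ++ apiVersion opts ++ endpoint

-- The "start container" path for a given container ID.
startEndpoint :: String -> String
startEndpoint containerId = "/containers/" ++ containerId ++ "/start"

-- Client options chosen once (placeholder values)...
localOpts :: DockerClientOpts
localOpts = DockerClientOpts "http://localhost:2375" "/v1.24"

-- ...and partially applied, so callers no longer pass them around.
localUrl :: String -> String
localUrl = fullUrl localOpts
```

Because the options come first, localUrl is just fullUrl with its first argument fixed. The same trick works for any function that takes DockerClientOpts as its first parameter.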

I was using lenses and TemplateHaskell even though I didn't really need them, or understand any of it for that matter. 3 And of course there was a lot of non-idiomatic Haskell in there, syntax-wise and otherwise.

After the initial version I thought to myself: hey, I can make this more type safe, this is Haskell, dammit. So I deleted everything and went down the rabbit hole. Naturally, the first thing that happened was that the number of GHC extensions grew. These are some of the extensions that I suddenly discovered I needed:

{-# LANGUAGE DeriveFunctor     #-}
{-# LANGUAGE FlexibleContexts  #-}
{-# LANGUAGE DataKinds         #-}
{-# LANGUAGE TypeFamilies      #-}
{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE GADTs             #-}
{-# LANGUAGE RankNTypes        #-}
{-# LANGUAGE RecordWildCards   #-}

And that's not all of them... So what was going on?

Among other things, I was using Free monads to construct an interpreter for HTTP requests, and I was using singleton types to make sure that I couldn't accidentally return a list of image IDs from an endpoint that's supposed to return container IDs. This was making the library much more complex, and while it provided type safety for the library author, it didn't provide much benefit for the library user. In fact, the library was harder to use, and much harder to contribute to.

The beauty of Haskell, to me, is that I can come back to a codebase after a few months, instantly know what's what, and continue working as if I hadn't paused at all. This, to me, is the single most beneficial feature of Haskell. It was especially important for this project, since I was basically just playing around and would often take breaks like that (a couple of months) before coming back to it. So the turning point was when I lost this ability...

Now, while this was a wonderful learning exercise, it was complete overkill for what I was trying to do. Some of the ideas were sound, like using Reader and not having to pass the DockerClientOpts to every single function, but others were not. On the one hand I was practically dabbling with dependent typing, and on the other I was still concatenating strings for URLs (printf) and returning things like Maybe Text. This just wouldn't do.
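As an illustration of how much of that safety is available without singletons or Free monads, here is a small sketch using only newtypes and a plain sum type. The type and constructor names are assumptions made for this example and are not necessarily what the released library uses; the paths follow the Docker Engine API.

```haskell
-- Distinct wrappers, so a container ID can't be passed where an image ID
-- is expected, even though both are strings underneath.
newtype ContainerID = ContainerID String deriving (Eq, Show)
newtype ImageID     = ImageID String     deriving (Eq, Show)

-- One constructor per endpoint instead of a format string per call site.
data Endpoint
    = ListContainers
    | StartContainer ContainerID
    | StopContainer ContainerID
    | DeleteImage ImageID

-- The single place where endpoints are turned into URL paths.
endpointPath :: Endpoint -> String
endpointPath ListContainers                   = "/containers/json"
endpointPath (StartContainer (ContainerID c)) = "/containers/" ++ c ++ "/start"
endpointPath (StopContainer (ContainerID c))  = "/containers/" ++ c ++ "/stop"
endpointPath (DeleteImage (ImageID i))        = "/images/" ++ i
```

With this shape, StartContainer (ImageID "x") is a compile-time error, and no extensions are required.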

At that point I talked to a friend of mine who's an experienced Haskell hacker and he said the same thing: get rid of all the extensions and start small. So I did this for the nth time:

commit 2d064140910b69c7b1337f4f1f508dd6c9f3109a
Author: Deni Bertovic
Date:   Wed Mar 2 15:28:59 2016 +0100

    Delete everything. Start from scratch.

Current version

After many more trials and errors I came to an API that I like and now you can do this:

import Docker.Client

runNginxContainer :: IO ContainerID
runNginxContainer = runDockerT (defaultClientOpts, defaultHttpHandler) $ do
    let pb = PortBinding 80 TCP [HostPort "0.0.0.0" 8000]
    let myCreateOpts = addPortBinding pb $ defaultCreateOpts "nginx:latest"
    cid <- createContainer myCreateOpts (Just "myNginxContainer")
    case cid of
        Left err -> error $ show err
        Right i -> do
            _ <- startContainer defaultStartOpts i
            return i

I'm using ReaderT to pass in the defaultClientOpts used for configuring the Client. DockerT is just a wrapper around ReaderT:

newtype DockerT m a = DockerT {
    unDockerT :: Monad m => ReaderT (DockerClientOpts, HttpHandler m) m a
}

And I'm also passing in a default HTTP handler. For most people the defaultHttpHandler will be sufficient, but this leaves room for advanced users to provide their own. The handler's type is:

type HttpHandler m = Request -> m (Either DockerError Response)

And now the type of the createContainer function looks like this:

createContainer :: forall m. Monad m => CreateOpts -> Maybe ContainerName -> DockerT m (Either DockerError ContainerID)

It's polymorphic in the monad that's used, which is determined by the HTTP handler that's passed in. Running a DockerT action simply unwraps the ReaderT and supplies the options and the handler:

runDockerT :: Monad m => (DockerClientOpts, HttpHandler m) -> DockerT m a -> m a
runDockerT (opts, h) r = runReaderT (unDockerT r) (opts, h)
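To show what "polymorphic in the monad" buys you, here is a self-contained sketch that runs a DockerT-style action in Identity, with no I/O at all. Everything in it is a simplified stand-in: the request and response types, the record fields, the placeholder address, and the use of a type synonym instead of the newtype above are assumptions made to keep the example short, and the body of createContainer only imitates the real function.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
import Data.Functor.Identity (Identity, runIdentity)

-- Simplified stand-ins; the real library's types are richer.
newtype DockerClientOpts = DockerClientOpts { baseUrl :: String }
data Request = Request { reqUrl :: String, reqBody :: String } deriving (Eq, Show)
newtype Response    = Response String    deriving (Eq, Show)
newtype DockerError = DockerError String deriving (Eq, Show)
newtype ContainerID = ContainerID String deriving (Eq, Show)

type HttpHandler m = Request -> m (Either DockerError Response)

-- A type synonym instead of a newtype, purely for brevity in this sketch.
type DockerT m = ReaderT (DockerClientOpts, HttpHandler m) m

runDockerT :: (DockerClientOpts, HttpHandler m) -> DockerT m a -> m a
runDockerT env action = runReaderT action env

-- Reads the options and the handler from the environment, then delegates
-- the request to whichever handler was supplied.
createContainer :: Monad m => String -> DockerT m (Either DockerError ContainerID)
createContainer image = do
    (opts, handler) <- ask
    let req = Request (baseUrl opts ++ "/containers/create") image
    result <- lift (handler req)
    pure (fmap (\(Response body) -> ContainerID body) result)

-- A canned handler living in Identity: it always "creates" the same container.
cannedHandler :: HttpHandler Identity
cannedHandler _ = pure (Right (Response "abc123"))

demo :: Either DockerError ContainerID
demo = runIdentity $
    runDockerT (DockerClientOpts "http://localhost:2375", cannedHandler)
               (createContainer "nginx:latest")
```

Swapping cannedHandler for a handler in IO changes the monad of the whole computation without touching createContainer.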

There were some other considerations, like how to handle exceptions. I talked to Michael Snoyman about this, and I agree that if you're ultimately using IO you will have exceptions, so there's no reason to hide them and introduce multiple layers of error handling. 4 That being said, I wanted to make it so that most people don't even have to know that they are talking to an HTTP API. That's why we have the Either in the response and don't just pass the HttpException along to the user. 5 People will be trying to use Docker, and they should only have to think in Docker semantics and not worry whether the communication is over HTTP or something else. There's still some work to be done on that front, but that's the general idea.

The fact that we can pass in the HTTP handler should be enough for most people, and I think it should come in handy for writing some interesting tests.
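As a rough idea of what such a test could look like, the sketch below uses a handler in the State monad that records every request it receives instead of talking to a daemon. The Request fields, the error message, and this simplified startContainer (which takes the handler as a plain argument rather than going through DockerT) are assumptions for the sake of a short, runnable example.

```haskell
import Control.Monad.Trans.State (State, modify, runState)

-- Simplified stand-ins for the library's request, response and error types.
data Request = Request { reqMethod :: String, reqPath :: String }
    deriving (Eq, Show)
newtype Response    = Response String    deriving (Eq, Show)
newtype DockerError = DockerError String deriving (Eq, Show)

type HttpHandler m = Request -> m (Either DockerError Response)

-- A handler that performs no I/O: it appends each request to a log and
-- answers successfully only for the one path it knows about.
recordingHandler :: HttpHandler (State [Request])
recordingHandler req = do
    modify (++ [req])
    pure $ case reqPath req of
        "/containers/abc123/start" -> Right (Response "")
        _ -> Left (DockerError ("no such endpoint: " ++ reqPath req))

-- Code under test, written against any handler in any monad.
startContainer :: Monad m => HttpHandler m -> String -> m (Either DockerError ())
startContainer handler cid = do
    result <- handler (Request "POST" ("/containers/" ++ cid ++ "/start"))
    pure (() <$ result)

-- Running in State yields both the result and the list of requests made.
testRun :: (Either DockerError (), [Request])
testRun = runState (startContainer recordingHandler "abc123") []
```

A test can then assert on both halves of testRun: that the call succeeded, and that exactly one POST to the expected path was issued.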

I intentionally left the entire git history, with all of these various attempts, in the repo so that I (and anyone else) could refer to them later. It paints a good picture of what it's like to explore Haskell and try different approaches.

I found this design to be somewhere in between simplicity and power. As I already said above, any feedback is appreciated and I'd be very much interested to hear about any advice, tip or critique.

What's next:

The biggest challenge that's up next is described in this GitHub issue. It has to do with streaming. After we figure that out, the API should become more stable.

Fin

That's all from me for now. I'm sharing this in the hope that my story will help someone else who's just starting out with Haskell, by showing that it's okay to start small and make a lot of mistakes along the way.

If you've liked what you've read, please share it and comment down below. If you want to learn more about how to manipulate Docker containers with Haskell, please see the docs here.

P.S. If you want to look at some other Haskell code that I wrote check out these projects:

I had to move, and was looking for new apartments to rent, but the good ones go so fast so I needed to know when new ads were published as soon as possible. The result is this scraper, for the local ads website, that would notify me about new listings (based on some criteria) via email. I enjoyed using optparse-applicative for the CLI parts.

I wanted to use hledger for my personal accounting but didn't want to manually enter each statement into the journal file.

I wanted to learn a few frontend frameworks, and new languages like Elm and PureScript, but wanted to do so by building a real(ish) project and not just a Todo app. Something like a subset of Trello seemed to have the right amount of complexity, so I decided to build a dummy API that I could use for testing. I built it using the excellent Servant library and learned a lot in the process.


  1. I didn't attempt to build a web app right away. 

  2. If you're just starting out, I recommend looking at Haskell Programming from First Principles.

  3. The next release might re-introduce lenses, because at least CreateOpts has a lot of nested fields that need setting when creating a container.

  4. Relevant article is here

  5. However, since advanced users can create their own HttpHandler, they can implement it in a way that doesn't mask the underlying HTTP exceptions.

docker haskell
