Siege, a DBMS written in Haskell

tikhonj · on Dec 25, 2011

It's always cool to see a nontrivial Haskell project.

This is also a great answer to one of the more prevalent complaints I hear about Haskell, namely that it is only good at the things it was designed for. I think it's safe to say that writing a database was not one of the things it was "designed for"; I'm also assuming you were relatively happy with the language for this project.

One niggle: have you considered reorganizing your directory structure to look vaguely like this[1]? This should make packaging your project and adding tests easier. Using hierarchical modules (e.g. Database.Siege.DBList instead of just DBList) is probably also a good idea.

[1]: http://www.haskell.org/haskellwiki/Structure_of_a_Haskell_pr...

dons · on Dec 25, 2011

There are around four thousand packages on Hackage now (>2M loc); the last company I worked at had more than 1M lines of .hs; my current employer has maybe 800k lines of .hs.

When does the no-non-trivial-code meme go away?

---

As an aside, I'm looking to hire a Haskell developer in NYC.

Strong computer science background, serious Haskell development experience (ideally be able to point at some Hackage libraries or apps you've written), math or financial modelling a plus.

Please contact me if you're interested.

tikhonj · on Dec 26, 2011

Regarding your aside: I don't suppose you're also looking for an intern for the summer?

dons · on Dec 26, 2011

Good idea. I'll look into it!

DanWaterworth · on Dec 25, 2011

I'm the project's creator.

Haskell turned out to be a fantastic language to write this in; I especially liked how I was able to use the type system to separate out the levels of abstraction in the code base. I'm no stranger to implementing database-like things, and of the languages I've used in the past, I wouldn't favour any of them over Haskell in this setting.

I'll get right on to reorganization; it was on my todo list, but it's always more fun to add features instead. Thanks for taking the time to look at Siege.

tikhonj · on Dec 25, 2011

Heh, glad to hear you liked using Haskell. It's a cool project; if I knew more about databases (I'm probably going to take a class on them soonish, but for now I'm completely ignorant) I'd look into it more deeply.

I completely understand not cleaning your code up--for some of my old projects in various languages, that's still on the todo list :) I just wanted to make sure you knew about that article--had I known it when starting some of my earlier projects, they would have been less of a mess initially. (Admittedly my last project was for a hackathon and was an unholy amalgamation of Haskell, Scala, JavaScript and a bit of Scheme; nothing could make that neat. :()

DanWaterworth · on Dec 25, 2011

I know how you feel; it's so easy to postpone proper organization indefinitely. What's funny is that I've spent a great deal of effort making sure that the actual code is logical and well thought through; it's literally just the directory structure that's disorganized.

zohebv · on Dec 25, 2011

I have tried writing non-trivial Haskell programs in the past, but the IOMonad/lack of parameterized modules has stifled me.

Essentially, I start with basic code and possibly use some constants in the beginning. Eventually, I need the constants to be loaded from a config file or the command line, and now a huge swath of my code, which was perfectly pure, needs to be dragged into the IO Monad. The refactoring effort involved is huge, as no function that even indirectly touches the so-called constant is pure any more. This ends up being extremely painful. I have asked around and received 2 suggestions: dump all my refactored functions inside a giant let clause, or wait for parameterized modules. How did you deal with this issue?

jrockway · on Dec 25, 2011

You pass the data around to the functions that need it, just like you would in any programming language.

The simplest case is:

   do_work :: Config -> Result
   do_work config = ...

   read_config :: IO Config

   main = do
       config <- read_config
       print . do_work $ config
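Filled in so that it compiles, the snippet above might look like the following. This is only a sketch: the `Config` fields, the `Result` type, and the hard-coded values in `read_config` are made up for illustration, and a real `read_config` would parse a file or command-line flags.

```haskell
-- Hypothetical configuration; the field names are illustrative only.
data Config = Config
  { greeting :: String
  , repeats  :: Int
  }

type Result = String

-- Pure: the configuration is just another argument.
do_work :: Config -> Result
do_work config = unwords (replicate (repeats config) (greeting config))

-- The only IO lives at the edge, where the configuration is obtained.
-- Hard-coded here; a real program would read a file or flags instead.
read_config :: IO Config
read_config = return (Config { greeting = "hello", repeats = 3 })

main :: IO ()
main = do
  config <- read_config
  putStrLn (do_work config)
```

Note that `do_work` keeps a pure type; only `read_config` and `main` are in IO.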

But you can also put the configuration in a Reader, and avoid the step of manually passing the config to each function that needs it. Instead of:

    my_program :: Config -> Result
    step1 :: Config -> Arg -> OneResult
    step2 :: Config -> Arg -> SecondResult
    my_program config = step2 config . step1 config

You'll write:

    my_program = runReader config $ do
        x <- step1 42
        return . step2 $ x

So let clauses and parameterized modules are not even in the running. Whenever you have a problem in Haskell, the best way to solve it is to ask yourself how you'd solve it in some other language. Then do that.

(Nobody writes software in any language where every function is responsible for reading a config file; that functionality is delegated to some common instance that is passed around as needed. So do that in Haskell, too.)

jrockway · on Dec 27, 2011

Reading this a day later, there were a couple brainos in there. First, the second argument to step2 should be of type OneResult rather than Arg. Secondly, there is no need for return at the end of my_program, since step1 and step2 are both "in Reader":

    my_program = do
        x <- step1 42
        step2 x

or my_program = step2 =<< step1 42

(I use =<< instead of >>= so that monadic composition reads like normal composition (.).)

Anyway, then do:

    runReader my_program config
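Putting the corrected pieces together, a self-contained sketch could look like this. It assumes the mtl package for `Control.Monad.Reader`; the `Config` fields and the concrete `OneResult` and `SecondResult` types are invented for the example.

```haskell
import Control.Monad.Reader (Reader, asks, runReader)

-- Hypothetical configuration; the field names are illustrative only.
data Config = Config
  { scale  :: Int
  , offset :: Int
  }

type OneResult    = Int
type SecondResult = String

-- Each step reads what it needs from the shared Config via `asks`,
-- so no step takes the Config as an explicit argument.
step1 :: Int -> Reader Config OneResult
step1 x = do
  k <- asks scale
  return (k * x)

step2 :: OneResult -> Reader Config SecondResult
step2 y = do
  b <- asks offset
  return ("result: " ++ show (y + b))

-- Monadic composition, written right to left like (.).
my_program :: Reader Config SecondResult
my_program = step2 =<< step1 42

main :: IO ()
main = putStrLn (runReader my_program (Config { scale = 2, offset = 10 }))
```

The config is supplied exactly once, at the `runReader` call; everything above `main` stays pure.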

dons · on Dec 25, 2011

If your configs become parameters to your app at initialisation time, you're in the (pure) Reader monad. See xmonad for an example.
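A rough sketch of that shape, assuming the mtl package: the arguments are read once at startup, turned into a `Config`, and everything after that is pure. This is not xmonad's code; the `verbose` field and the `-v` flag are hypothetical.

```haskell
import Control.Monad.Reader (Reader, asks, runReader)
import System.Environment (getArgs)

-- Hypothetical configuration; the single field is illustrative only.
newtype Config = Config { verbose :: Bool }

-- Pure parsing of the argument list; "-v" is a made-up flag.
parseArgs :: [String] -> Config
parseArgs args = Config { verbose = "-v" `elem` args }

-- Pure application logic that consults the configuration.
describe :: Int -> Reader Config String
describe n = do
  v <- asks verbose
  return (if v then "value is " ++ show n else show n)

main :: IO ()
main = do
  config <- parseArgs <$> getArgs   -- the only IO: done once at startup
  putStrLn (runReader (describe 42) config)
```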

gregwebs · on Dec 25, 2011

I actually think this is a very poor answer. This project is in its infancy and thus very easy to dismiss as vaporware, or if you actually used it you might come away with a bad impression (as you might with any alpha software). The question itself is ridiculous and should prompt some socratic questions (what do you think Haskell is designed for?). Like most programming languages, it is not designed for any particular domain, just for high-level programming, and it can excel at most domains once good libraries are created.

tikhonj · on Dec 25, 2011

I'm not saying it's a valid complaint; however, I don't think it's uncommon. I think the person in question was assuming that Haskell was designed for mathy/programming languagey type stuff (I'm being vague and mangling English because I'm tired :)).

This is a great counter-example: it's a project in a different field and, most importantly, the programmer enjoyed using Haskell for it. As long as it at least mostly works, the actual quality and polish of the project is immaterial--you cannot really expect a spare-time project, even in a perfect language, to be perfect; its existence alone signals that Haskell is a viable multipurpose language that isn't just a silly academic curiosity.

gregwebs · on Dec 25, 2011

I strongly disagree that unpolished programs (done under time constraints or otherwise) signal that a language is more than something satisfying (academic) curiosity.

I am a contributor to the Yesod web framework (so I know this example is a good one - there are certainly others). Yesod is robust software used in production right now. Feel free to point people to www.yesodweb.com as an example. We are doing traditional IO-heavy, database-oriented web development, but leveraging Haskell to make things better in ways not possible in other languages.

jasonlotito · on Dec 25, 2011

The name confused me. siege is actually a tool for web server stress testing. Not sure if this matters, but I read this as someone using siege to test the results from a DB written in Haskell.

derwiki · on Dec 25, 2011

When I was working on the database engine of Big Corporation, I kept thinking to myself that C was a poor choice. Sure, it's close to the metal -- but those gains are often lost by the incredible amount of complexity that is added. I always said a Python database with CPython modules would be almost as fast and 100x easier to maintain; glad to see someone took this further and did it in Haskell. Great work, Dan!

DanWaterworth · on Dec 25, 2011

Actually, that's almost exactly the progression I took; C -> Python -> Haskell.

aangjie · on Dec 25, 2011

Interesting.. I am looking to start with some hands-on Haskell.. And DBMS is one topic I am fluent with... Will try to contribute once I get it running on my Ubuntu box.. Is the TODO the right place to start??

DanWaterworth · on Dec 25, 2011

Hi, I'm the project's creator (and I also use ubuntu).

It's great to hear that you'd like to contribute. One of the things I'm trying to do is to expand the subset of redis commands that it can respond to.

It should be simple to implement `strlen` ( http://redis.io/commands/strlen ), it just involves adding another clause to the readCommand function in Commands.hs and you can use `get` as a reference, but it should give you a taste of the codebase.
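To give a feel for the idea (one extra clause next to `get`), here is a toy dispatcher. It is not Siege's code: `Store`, `Reply`, and `runCommand` are hypothetical stand-ins, and the real `readCommand` in Commands.hs may look quite different. The one fact taken from redis is that `strlen` returns 0 for a missing key.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.Map as M

-- Hypothetical stand-ins; Siege's real types and its readCommand
-- function in Commands.hs are not shown in this thread.
type Store = M.Map B.ByteString B.ByteString

data Reply
  = Bulk (Maybe B.ByteString)  -- a value, or nil for a missing key
  | IntReply Int
  | ErrorReply String
  deriving (Eq, Show)

-- One guard per command; STRLEN mirrors GET but returns a length.
-- If no guard matches, evaluation falls through to the last clause.
runCommand :: Store -> [B.ByteString] -> Reply
runCommand store [cmd, key]
  | cmd == B.pack "GET"    = Bulk (M.lookup key store)
  | cmd == B.pack "STRLEN" = IntReply (maybe 0 B.length (M.lookup key store))
runCommand _ _ = ErrorReply "unknown command"

main :: IO ()
main = do
  let store = M.fromList [(B.pack "greeting", B.pack "hello")]
  print (runCommand store [B.pack "STRLEN", B.pack "greeting"])
  print (runCommand store [B.pack "STRLEN", B.pack "missing"])
```

This version ignores the case where the key holds a non-string value, for which redis returns an error.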

To anyone else reading this who'd also like to contribute, forks and pull requests are always welcome and there are plenty of other redis commands that I haven't got around to :)

zht · on Dec 25, 2011