A DSL in 5 Languages: Ruby, Python, PHP, C#, Java

judofyr · on Aug 3, 2010

A quick tip when creating block DSLs in Ruby:

    def search(&blk)
      if blk.arity == 1
        blk.call(self)
      else
        self.instance_eval(&blk)
      end
    end
    
    # Then the user can decide which style they want to use:
    Foo.search { bar }
    Foo.search { |s| s.bar }

troels · on Aug 3, 2010

So, the first invocation is similar to calling `Foo.search(&:bar)`, right?

judofyr · on Aug 3, 2010

Yeah, but the whole point is really repeating local variables vs capturing self:

    # This wouldn't work because instance_eval rebinds self to Foo, so @bar
    # would be looked up inside Foo instead of the caller's context:
    Foo.search { bar == @bar }
    
    # This works fine though:
    Foo.search { |s| s.bar == @bar }

Besides, if you're calling a setter, it can get a little weird:

    # Explicit and nice:
    Foo.search { |s| s.foo = 123 }
    
    # This actually sets a local variable:
    Foo.search { foo = 123 }
    
    # In order to fix it, we're back to the explicit one:
    Foo.search { self.foo = 123 }

There is no silver bullet here, and I believe the best way is to provide both techniques and let the user decide.

Vitaly · on Aug 5, 2010

actually the second. &:bar is equivalent to {|x| x.bar}

Nycto · on Aug 3, 2010

For the PHP version: Why is the search method static? You are forcing whatever class you are using to be tightly coupled. Dependency injection can be done in PHP, too.

Better:

    $bt = Braintree_Transaction_Search::someFactory();
    $collection = $bt->search(array(
        $bt->orderId()->startsWith('a2d'),
        $bt->customerWebsite()->endsWith('.com'),
        $bt->billingFirstName()->is('John'),
        $bt->status()->in(array(
            Braintree_Transaction::AUTHORIZED,
            Braintree_Transaction::SETTLED
        )),
        $bt->amount()->between('10.00', '20.00')
    ));

If you're really up to it, there is no reason you couldn't use an even more fluent interface in PHP...

    $collection = $bt->where()
        ->orderId()->startsWith('a2d')
        ->and()->customerWebsite()->endsWith('.com')
        ->and()->billingFirstName()->is('John')
        ->search();

Also, please make sure the search method returns an iterator, not an array. Just because PHP allows you to shoot yourself in the foot, doesn't mean you should.

TeHCrAzY · on Aug 3, 2010

Gosh, every example (even in languages I don't know) is much easier to grok than the PHP one :/

_19qg · on Aug 3, 2010

In Lisp I would write a macro that could be used like this:

    (for-each (transaction)
          :where
          ((order              :starts-with "a2d")
           (customer-website   :ends-with   ".com")
           (billing-first-name :equals      "John")
           (status             :one-of      '(:authorized :settled))
           (amount             :between     '("10.00" "20.00")))
       :do
       (print (id transaction)))

papaf · on Aug 4, 2010

Just a question of personal taste but I'd have the conditions as functions and have search return a list if I was using clojure...

    (braintree/search
     (starts-with :order-id "a2d")
     (ends-with   :customer-website ".com")
     (equals      :billing-first-name "John")
     (one-of      :status  [:authorised :settled])
     (between     :amount 10 20))

The cool thing being that both styles are possible in a lisp.

mkramlich · on Aug 3, 2010

God that's elegant. S-expressions for the win.

No, no, remember: all other languages need a chance to incrementally catch back up to Lisp. :)

postfuturist · on Aug 3, 2010

Python tip: instead of making the user construct a list literal, just use (star)args in the function definition. Python will automatically pack up any number of positional arguments into a tuple for you!

    def search(*args) : ... # args is a sequence

    search(this, that, the_other_thing)

Edit: found the HN FAQ entry on formatting comments, finally.

CodeMage · on Aug 3, 2010

Just to clarify for anyone who isn't familiar with Python: it's asterisk followed by "args". The problem here is that HN comments system interprets asterisk as italic formatting ;)

Edit: Parent comment fixed by postfuturist.

koenigdavidmj · on Aug 4, 2010

Python 3 lets you have keyword-only arguments too. Anything after the star-args is keyword-only, and you can use a bare star if you do not want the variable argument list:

    def search(foo, bar, *, baz): ...

Then foo and bar would be ordinary arguments that can be passed positionally, and baz would need to be passed by keyword.
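
To make the bare-star behaviour concrete, here is a minimal sketch. The body of `search` and the `is_rejected` helper are made up for illustration; only the signature comes from the comment above.

```python
def search(foo, bar, *, baz):
    # foo and bar may be passed positionally or by keyword;
    # baz follows the bare * and can only be passed by keyword.
    return (foo, bar, baz)


def is_rejected():
    # Hypothetical helper: passing baz positionally raises TypeError.
    try:
        search(1, 2, 3)
    except TypeError:
        return True
    return False


result = search(1, 2, baz=3)
```

Calling `search(1, 2, 3)` fails because there is no third positional slot, which is the whole point of the bare star.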

troels · on Aug 3, 2010

> Edit: found the HN FAQ entry on formatting comments, finally.

Really? Where?

justinl · on Aug 3, 2010

http://news.ycombinator.com/formatdoc

troels · on Aug 3, 2010

Swell! Would you happen to know how to format quotations?

Scriptor · on Aug 4, 2010

As in the "Ask HN" type posts? You can simply enter something into the text field when making a submission. It will then ignore the URL field.

drewolson · on Aug 4, 2010

This is a great point and is something we're planning on adding in an upcoming release of our Python library.

d0mine · on Aug 3, 2010

  def search(*args): ...

andybak · on Aug 3, 2010

and you can be extra nice and accept either a list or positional arguments...
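
A rough sketch of what accepting both call styles might look like. This `search` is a hypothetical stand-in that just returns the normalized conditions; it is not the Braintree library's function.

```python
def search(*conditions):
    """Accept search(a, b, c) or search([a, b, c]) (hypothetical helper)."""
    # A single list or tuple argument is treated as the whole condition list.
    if len(conditions) == 1 and isinstance(conditions[0], (list, tuple)):
        conditions = tuple(conditions[0])
    return list(conditions)


positional = search("a", "b", "c")
as_list = search(["a", "b", "c"])
```

The trade-off is ambiguity: if a single condition could itself be a list or tuple, this unwrapping would misread it, so it only works when conditions are some other type.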

kg · on Aug 3, 2010

Not using 'var' (or LINQ, for that matter) in C# seems an odd choice. 'Separate steps for creating the request and performing the search' seems more like an advantage (for Java and C#) to me than a disadvantage, since you can easily eliminate the overhead with a helper function and the separation of a request and a search means that you can reuse a single request instance if you see the need.

I also consider statements like 'search.amount.between "10.00", "20.00"' or 'Amount.Between(10.00M, 20.00M)' to be vastly inferior to the native alternatives: for example, in C#, you could write that predicate as '(amount) => (amount > 10.00M && amount < 20.00M)' and in Python, it would become the even shorter 'lambda amount : 10 < amount < 20'. In each case the native way of expressing the logic requires no special knowledge of the 'fluent' API.

drewolson · on Aug 3, 2010

kevingadd -

We tend to use var when the return type is obvious, but we wanted to be explicit in the examples. Unfortunately, LINQ is not an option for us as we support .NET 2.0.

Your C# and Python predicate examples are quite readable, but in this case they don't work for us. The return type of each of these statements would be true or false, whereas we are building XML requests to be sent to a server to perform the searches. The call to Amount.Between(...) builds an XML node behind the scenes representing the search data.
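
To illustrate the difference, here is a minimal sketch of a field whose `between` builds an XML node rather than evaluating to a bool. The `RangeField` class and the `min`/`max` element names are assumptions made up for this example; they are not Braintree's actual classes or wire format.

```python
import xml.etree.ElementTree as ET


class RangeField:
    """Hypothetical field whose operators build XML instead of returning bool."""

    def __init__(self, name):
        self.name = name

    def between(self, low, high):
        # Build <name><min>low</min><max>high</max></name> (assumed layout).
        node = ET.Element(self.name)
        ET.SubElement(node, "min").text = str(low)
        ET.SubElement(node, "max").text = str(high)
        return node


amount = RangeField("amount")
payload = ET.tostring(amount.between("10.00", "20.00"), encoding="unicode")
# payload == '<amount><min>10.00</min><max>20.00</max></amount>'
```

A plain lambda predicate can only be called on a value the client already has, whereas a node like this can be serialized and evaluated on the server.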

Drew Olson (Braintree Dev)

ecoffey · on Aug 4, 2010

You mean you support C# 2.0, since var, LINQ, et al run on .Net 2.0 :-P

Doing a full blown Linq Provider would be pretty cool, but would also be very very very time consuming to get right. Expression trees are pretty cool, but you're still dealing with the AST so it gets crazy. There needs to be a Linq Provider framework / library to work at a slightly higher abstraction level.

confuzatron · on Aug 3, 2010

Ignoring the .NET version problem, not using LINQ probably makes plenty of sense, given that a LINQ solution would be much, much more complicated to implement than what you've done. Breaking apart expressions to get to your XML requests would be hard. I also think the proposal above, where each criterion is expressed as an individual lambda, is the worst of both worlds in terms of ease of implementation and readability.

jeswin · on Aug 3, 2010

Fluent interfaces are cool, but they throw away the semantics of the programming language like meaningful return values.

The DSL in the article is a good example of this problem:

  OrderId.StartsWith("a2d")......

So the StartsWith() returns a TransactionSearch, instead of a bool as you would expect. Once you do this often enough, you really get objects full of state-information, rather than methods with return values.

An alternative to doing this is using Expression Trees, though that is somewhat harder (and not without faults). It would look like this:

  search.Where(s => s.Order.Id.StartsWith("a2d") &&
    s.Customer.Website.EndsWith(".com") && ...)

Two benefits come to mind: 1. StartsWith() can return a bool as you would expect. 2. You can use StartsWith() on strings, like OrderId --- I am not sure how you pulled this off in the DSL example (is OrderId a custom type, since you can't use Extension methods there?)

The biggest drawback of course is: 1. Harder - You will need to parse the Expression Tree.

mquander · on Aug 3, 2010

I replied to say as much but then I tossed my reply, because the guy says he's on .NET 2.0.

i2 · on Aug 3, 2010

You can make it without any 'weaknesses' in Python:

  collection = Transaction.search(
      order_id__starts_with='a2d',
      customer_website__ends_with='.com',
      billing__first_name__exact='John',
      status__in=[
        Transaction.Status.Authorized,
        Transaction.Status.Settled
      ],
      amount__between=("10.00", "20.00")
  )

the implementation could look like this:

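  def search(self, **kwargs):
      for arg, value in kwargs.items():
          # 'amount__between' -> ['amount', 'between']
          action = arg.split('__')
          attr = getattr(self, action[0])
          if len(action) == 2:
              method = getattr(attr, action[1])
              method(value)
          # etc.: longer paths such as billing__first_name__exact
          # would need to walk the intermediate attributes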

EDIT: removed unneeded quotes, thanks for correction, postfuturist
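
For anyone who wants something runnable, here is one way the keyword parsing could be sketched. The `parse_criteria` function and the `OPERATORS` set are hypothetical names, and splitting on the last double underscore is an assumption about how nested fields such as `billing__first_name` should be handled.

```python
OPERATORS = {"starts_with", "ends_with", "exact", "in", "between"}


def parse_criteria(**kwargs):
    """Turn field__operator=value keywords into (field, operator, value) triples."""
    criteria = []
    for key, value in kwargs.items():
        # Split on the last double underscore so field paths such as
        # billing__first_name keep their own separators.
        field, sep, operator = key.rpartition("__")
        if not sep or operator not in OPERATORS:
            raise ValueError("unknown search keyword: %r" % key)
        criteria.append((field, operator, value))
    return criteria


criteria = parse_criteria(
    order_id__starts_with="a2d",
    billing__first_name__exact="John",
    amount__between=("10.00", "20.00"),
)
```

Note that a misspelled keyword is only caught when `parse_criteria` runs, not by the editor or the interpreter's syntax check.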

drewolson · on Aug 4, 2010

When we began building our Python library, we looked at both django[1] and sqlalchemy[2] for inspiration in designing our search DSL. Your example seems to closely match the django style. We preferred the sqlalchemy style, but both are solid choices.

[1]http://docs.djangoproject.com/en/1.2/topics/db/queries/#chai... [2]http://www.sqlalchemy.org/docs/ormtutorial.html#common-filte...
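
For contrast with the keyword style, SQLAlchemy's filter expressions overload comparison operators so that a comparison yields a condition object instead of a bool. A minimal sketch of that general idea follows; the `Field` and `Condition` classes and the operator labels are made up for illustration and are not taken from SQLAlchemy or the Braintree library.

```python
class Condition:
    """Plain record describing one search criterion (hypothetical)."""

    def __init__(self, field, operator, value):
        self.field = field
        self.operator = operator
        self.value = value


class Field:
    """Hypothetical field whose comparison operators return Condition objects."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Deliberately returns a Condition, not a bool.
        return Condition(self.name, "is", other)

    def __ge__(self, other):
        return Condition(self.name, "min", other)

    def __le__(self, other):
        return Condition(self.name, "max", other)


first_name = Field("billing_first_name")
amount = Field("amount")
conditions = [first_name == "John", amount >= 10, amount <= 20]
```

A chained comparison like `10 <= amount <= 20` cannot be captured this way, because Python expands it with `and`, which cannot be overloaded; the two bounds have to be written as separate conditions.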

postfuturist · on Aug 3, 2010

Good suggestion. Also, the quotes aren't necessary on the keys of the keyword arguments:

  collection = Transaction.search(
      order_id__starts_with='a2d',
      customer_website__ends_with='.com',
      billing__first_name__exact='John',
      status__in=[
        Transaction.Status.Authorized,
        Transaction.Status.Settled
      ],
      amount__between=("10.00", "20.00")
  )

koenigdavidmj · on Aug 4, 2010

Is that really idiomatic other than for people used to Django's ORM?

extension · on Aug 3, 2010

In Java, I would do this:

  new Search().equal(Search.NAME,"Joe")
              .between(Search.AGE,18,25)
              .go();

Just as concise with less voodoo. And it's a single statement.

With the existing method, you can still ditch the empty parens by using public final fields. You just have to pre-populate them all with respective grammar objects.

pgr0ss · on Aug 3, 2010

Your syntax is definitely simpler. One reason we did it the way we did was to get compiler checking for operators. We've tried to follow idioms in different client libraries, and for the statically compiled languages, we've tried to make it so if it compiles, it will work.

Since the field comes first (orderId().startsWith()), if you try to call an operator on a field that does not support it, it will not compile.

supersillyus · on Aug 3, 2010

You should make it clearer in the commentary that you're holding the static languages to a higher standard.

d0m · on Aug 3, 2010

I really enjoy reading these articles comparing different implementations in different languages. I wish we had more of that on HN.

Concerning the article, call me grumpy if you want, but I like it better when there is no mass overloading, even though it's more verbose. From my maintenance experience working on various projects, I always cry when, after 3 hours of searching, I find that "+" is overloaded and that's where the bug was hidden.

Also, for this particular implementation, we could simply use a builder.

  SearchBuilder sb = new SearchBuilder();
  sb.equal(Search.NAME, "..");
  sb.equal(Search.AGE, ..);
  Search s = new Search(sb);

Even though it's more verbose, we clearly see that we configure the builder as we want and then create the search object. A good side effect is that we get an immutable Search object. Also, we could use the "fluent" interface on the builder.

Another verbose approach might be using lots of objects:

  Search s = new Search();
  s.add_constraint(new SearchEqual(Search.NAME, "bob"));

This way, it is easier to add constraints without modifying the Search class.
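
The builder idea could be sketched in Python roughly like this. The class names follow the comment above, but the method names, the `build` step, and the tuple layout of each constraint are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Search:
    """Immutable search: constraints cannot be reassigned after construction."""

    constraints: Tuple[Tuple[str, str, object], ...]


class SearchBuilder:
    """Mutable builder that collects constraints, then produces a Search."""

    def __init__(self):
        self._constraints = []

    def equal(self, field, value):
        self._constraints.append((field, "equal", value))
        return self  # returning self allows fluent chaining

    def between(self, field, low, high):
        self._constraints.append((field, "between", (low, high)))
        return self

    def build(self):
        # Copy into a tuple so later builder changes do not affect the Search.
        return Search(tuple(self._constraints))


s = SearchBuilder().equal("name", "bob").between("age", 18, 25).build()
```

All the mutable state lives in the builder, so the resulting `Search` can be shared or reused without worrying about someone adding constraints to it later.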

jacquesm · on Aug 3, 2010

Overloading basic operators is the road to hell. Yes, you can do wonderful and elegant things, but when it goes wrong your whole mechanism for analysing what you are looking at is screwed up, because the more experience you have in a language the more you tend to be on autopilot for stuff like that.

'test your assumptions' suddenly expands to include checking if basic language operators still do what you expect them to do.

and

  v2 = vec_add(v1,v3);

really isn't that much worse than

  v2 = v1 + v3;

I know this is not a popular opinion, but that's how I've always felt about it, and those who complain about how difficult languages like C++ are have usually painted themselves into a corner by excessive use of such features.

psadauskas · on Aug 3, 2010

Let's clean up the Ruby example a bit, shall we?

  collection = Braintree::Transaction.search do
    order_id.begins_with?       "a2d"
    customer_website.ends_with? ".com"
    billing_first_name ==       "John"
    status.in?                  Status::Authorized, Status::Settled
    amount.between?             "10.00", "20.00"
  end

  collection.each do |transaction|
    puts transaction.id
  end

cschep · on Aug 3, 2010

are the tabs breaking it into two "columns" idiomatic ruby? I prefer them all flowing "naturally" together.

dan_manges · on Aug 3, 2010

I used to vertically align things into columns in Ruby. It's easy to maintain when you're editing individual lines, but it can easily be messed up with a find-and-replace in file. At that point, it becomes harder to read with some lines off, and a pain to maintain the alignment.

Calamitous · on Aug 4, 2010

If you use Vim, the Align plugin handles this quite well: http://vim.wikia.com/wiki/Align_text_plugin

scott_s · on Aug 4, 2010

I've grown to dislike that habit in all languages. It's too cute and it doesn't scale with code changes. It's easier for me to read code that's formatted in a standard way, even if that means conceptual columns don't line up.

rue · on Aug 4, 2010

    amount.within?             "10.00".."20.00"

Or somesuch. Ranges are always overlooked.

roryokane · on Aug 3, 2010

In the Ruby code, you include

  search.status.in(
    Braintree::Transaction::Status::Authorized,
    Braintree::Transaction::Status::Settled
  )

, but wouldn’t it be better to allow just

  search.status.in(:authorized, :settled)

? (And I agree with judofyr’s comment that “search.” shouldn’t be necessary on every line.)

postfuturist · on Aug 3, 2010

You can also handle a variable number of positional arguments in PHP instead of making the client send in a literal array.

    function search() { $arg_list = func_get_args(); ... }

devcjohnson · on Aug 4, 2010

I'm confused in that none of the examples look like a DSL to me. They look like a library API - implemented in five different languages but I can hardly classify any of them as a DSL. Am I really that obtuse concerning this question?

kjbekkelund · on Aug 3, 2010

What about doing it like this in Ruby:

    collection = Braintree::Transaction.search do
      order_id /^a2d/
      customer_website /\.com$/
      billing_first_name "John"
      status :authorized, :settled
      amount 10..20
    end
    
    collection.each do
      puts id
    end

It might be more difficult not having the specific methods (in, starts_with, ends_with, and so on), but I think it's quite readable. If we don't want the regexps, we could go for a MySQL style '%.com' and 'a2d%' or something similar.

koenigdavidmj · on Aug 4, 2010

They said above that they were turning the conditions into XML in the request. That's one of the reasons why C# expression trees were out.

icode · on Aug 3, 2010

Why don't they just take a string as the argument to search() and parse it? They could use a very SQL-like syntax:

  $collection = Braintree_Transaction::search("
   orderId LIKE 'a2d%'
   AND customerWebsite LIKE '%.com'
   AND billingFirstName='John'
   AND status IN ('AUTHORIZED','SETTLED')
   AND amount>=10
   AND amount<=20.00
  ");

rndmcnlly0 · on Aug 3, 2010

One of the big reasons to build the DSL in the first place is to let the host language's syntax help you spot problems in your query. Syntax errors can be spotted from editor-provided syntax highlighting directly or slightly later when the code is read by the interpreter -- much sooner than when your runtime library finally decides to look at it (at search time).

st0p · on Aug 3, 2010

Yeah, I've never done any research into DSLs, but I always thought they would have some sort of custom syntax. Since DSLs seem quite the hype ATM, I'm a bit disappointed by these examples. Sure, nice coding and all that (even though I would have used LINQ myself, but then again I'm not restricted to C# 2.0), but I fail to see where the L of DSL comes into play.

If this is the latest hype that I didn't have time to research, I'm not missing out.

Lewisham · on Aug 4, 2010

There are two types of DSLs, internal and external. External ones are the custom syntax you talk of, internal are DSLs that use features or quirks of a host language to appear different.

XML or SQL are external DSLs, much of the cool Ruby stuff like Rake and Rails are internal DSLs. The examples presented are all internal (presumably for a good reason). There are advantages to both approaches. Read Martin Fowler's discussion at http://martinfowler.com/dslwip/

JadeNB · on Aug 4, 2010

Dan Bernstein argues that the most bug-prone part of a user-facing program is usually its parsing: see (5) of http://cr.yp.to/qmail/guarantee.html. chromatic summarises this as “Don't parse that string!” (http://www.modernperlbooks.com/mt/2010/07/dont-parse-that-st...).

bsdemon · on Aug 3, 2010

Why no Scala?