I used RPython on my own interpreter project a few years ago (I stopped working on it around 2013). It's a very interesting approach to writing interpreters/JIT compilers, and it produces very fast code, but developing RPython code was very painful for a few reasons (back then, at least; maybe they have been fixed in the meantime):
1) Huge compilation times, and compilation is non-incremental. Making even a small change to the source code causes it to be fully recompiled, which on our project took 15-20 minutes (I can only imagine how painful this is on PyPy itself, which took me around 2 hours to build the one time I tried it). I think the root cause is the static analysis and type inference, which has to run again over the entire source code, and proved to be really slow on a large code base. This was painful for development: so much time wasted waiting on the compiler.
2) My experience with error messages was not as positive as the OP's. Sometimes I'd make a type error in the code, get a cryptic error message, and have to guess for myself what caused it. Perhaps things have improved since then (I see some details in the errors in the article that weren't there when I used RPython).
I worked with it in 2015, and 30 seconds seemed to be the minimum compilation time even for something trivial. That, combined with the error messages, made me give up after just a few days.
I would have much preferred type inference with robust error messages, or frankly, even Java-style verbosity to what I got out of RPython. It's a shame, because it's obviously a very impressive system.
If you mean running RPython code as regular Python code: that doesn't always catch typing problems. For example, returning None where an int is expected works fine in Python, but not in RPython.
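To illustrate the kind of code in question, here's a minimal sketch (the function and names are made up for illustration). It runs happily under plain CPython, but RPython's type inference would flag the mixed int/None return:

    # Plain Python happily mixes return types: this function sometimes
    # returns an int and sometimes returns None. CPython never complains;
    # RPython's annotator would reject it where a plain int is expected.
    def find_index(items, target):
        for i, item in enumerate(items):
            if item == target:
                return i
        return None

    print(find_index([10, 20, 30], 20))  # 1
    print(find_index([10, 20, 30], 99))  # None

Running this under CPython works every time, so simply "testing as regular Python" never surfaces the problem.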
RPython and PyPy are (IMO) the coolest Python projects out there. I made a little Brainfuck interpreter with it and it was super simple, but making small changes seemed to slow it down a lot. There is very little visibility into how the code is compiled, e.g. should I use a list or a tuple for some things? Can PyPy work out that a list is fixed-size? How should I structure classes to make them as efficient as possible (are classes with two or so fields passed as structs)?
Note that, as far as I know (the RPython/PyPy team will confirm or infirm this), RPython is not intended to be a general-purpose Python-like language. For that you want Cython or Nim. RPython is a toolkit for building language VMs. I'm guessing that's why relatively little work has gone into error reporting.
Yep. RPython is not a deliberately designed language; it's an arbitrary set of restrictions on Python bytecode and the standard library. These restrictions change as the developers implement new features and as time goes on. For instance, one restriction I hit when I was working with PyPy was "str.strip can only take one character", but this has been removed in subsequent PyPy versions.
I hadn't heard the word infirm before so I looked it up. It seems to me that it does not mean what you think it means. Thought you might want to know that. http://www.merriam-webster.com/dictionary/infirm
I always had the impression that the more "RPython-like" your Python code is, the better PyPy can optimize it. If that is true, then I see a lot of value in this post. However, I'm unsure how much truth there is to that belief.
I'd agree. The thing is: for stock Python, you want to minimize interpreter load. Use a lot of list comprehensions, that sort of thing. Whereas with PyPy you want to do the opposite.
For instance, I have the following two functions:

    import itertools

    def atLeast2(a, b, num):
        return sum(x == y for x, y in itertools.zip_longest(
            reversed(bin(a).partition('b')[-1]),
            reversed(bin(b).partition('b')[-1]),
            fillvalue='0')) >= num
    def atLeast4(a, b, num):
        count = 0
        while a > 0 or b > 0:
            x = a % 2
            y = b % 2
            if x == y:
                count += 1
                if count >= num:
                    return True
            a //= 2
            b //= 2
        return count >= num
In Python, atLeast2 is ~2.7x faster than atLeast4. In Pypy, atLeast2 is 1.9x slower than atLeast4.
(The ordering is roughly, using relative numbers (lower = faster), and checking for at least 96 bits in common out of 128 for random inputs:
    def atLeast(a, b, num):
        count = 0
        for x, y in zip(bin(a).partition('b')[-1], bin(b).partition('b')[-1]):
            if x == y:
                count += 1
                if count >= num:
                    return True
        return False
    def atLeast3(a, b, num):
        count = 0
        while a > 0 or b > 0:
            x = a % 2
            y = b % 2
            if x == y:
                count += 1
            a //= 2
            b //= 2
        return count >= num
)
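For anyone who wants to reproduce the comparison, here is a minimal timing harness along these lines (the 1000-pair sample size and repeat count are my own assumptions; absolute numbers will of course vary by machine and interpreter):

    import itertools
    import random
    import timeit

    def atLeast2(a, b, num):
        return sum(x == y for x, y in itertools.zip_longest(
            reversed(bin(a).partition('b')[-1]),
            reversed(bin(b).partition('b')[-1]),
            fillvalue='0')) >= num

    def atLeast4(a, b, num):
        count = 0
        while a > 0 or b > 0:
            if a % 2 == b % 2:
                count += 1
                if count >= num:
                    return True
            a //= 2
            b //= 2
        return count >= num

    # Random 128-bit inputs, checking for at least 96 bits in common.
    pairs = [(random.getrandbits(128), random.getrandbits(128))
             for _ in range(1000)]

    # Sanity check: both implementations agree on every input.
    assert all(atLeast2(a, b, 96) == atLeast4(a, b, 96) for a, b in pairs)

    for fn in (atLeast2, atLeast4):
        t = timeit.timeit(lambda: [fn(a, b, 96) for a, b in pairs], number=10)
        print(fn.__name__, round(t, 3))

Run the same script under both CPython and PyPy to see the relative ordering flip; the assert also confirms the "clever" and the plain-loop versions compute the same answer.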
I must say that RPython is not a very good name. I assumed it was a system for R and Python integration, like the RPython package: http://rpython.r-forge.r-project.org
I had the exact same thought. R is a pretty high-profile language; they could even couch a name change in positive terms. They'd probably get more publicity if they weren't competing with a popular package.