I used RPython on my own interpreter project a few years ago (I stopped working on it around 2013). It's a very interesting approach to writing interpreters/JIT compilers, and it produces very fast code, but developing RPython code was very painful for a few reasons (back then, at least; maybe they have been fixed in the meantime):
1) Huge compilation times, and compilation is non-incremental. Making even a small change to the source code causes it to be fully recompiled, which on our project took 15-20 minutes (I can only imagine how painful this is on PyPy itself, which took me around 2 hours to build the one time I tried it). I think the root cause is the static analysis and type inference, which has to run again over the entire source code, and proved to be really slow on a large code base. This was painful for development: so much time wasted waiting on the compiler.
2) My experience with error messages was not as positive as the OP's. Sometimes I'd make a type error in the code, get a cryptic error message, and have to guess for myself what caused it. Perhaps things have improved since then (I see some details in the errors in the article that weren't there when I used RPython).
I worked with it in 2015, and 30 seconds seemed to be the minimum compilation time even for something trivial. That, combined with the error messages, made me give up after just a few days.
I would have much preferred type inference with robust error messages, or frankly, even Java-style verbosity to what I got out of RPython. It's a shame, because it's obviously a very impressive system.
If you mean running RPython code as regular Python code: that doesn't always catch typing problems. For example, returning None where an int is expected works fine in Python, but not in RPython.
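To illustrate the kind of code in question, here's a minimal sketch (the function and names are made up for illustration). It runs happily under plain CPython, but RPython's type inference would flag the mixed int/None return:

    # Plain Python happily mixes return types: this function sometimes
    # returns an int and sometimes returns None. CPython never complains;
    # RPython's annotator would reject it where a plain int is expected.
    def find_index(items, target):
        for i, item in enumerate(items):
            if item == target:
                return i
        return None

    print(find_index([10, 20, 30], 20))  # 1
    print(find_index([10, 20, 30], 99))  # None

Running this under CPython works every time, so simply "testing as regular Python" never surfaces the problem.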
RPython and PyPy are (IMO) the coolest Python projects out there. I made a little Brainfuck interpreter with it and it was super simple, but making small changes seemed to slow it down a lot. There is very little visibility into how the code is compiled, e.g. should I use a list or a tuple for some things? Can PyPy work out that a list is fixed-size? How should I structure classes to make them as efficient as possible (are classes with two or so fields passed as structs)?
Note that, as far as I know (the RPython/PyPy team will confirm or infirm this), RPython is not intended to be a general-purpose Python-like language. For that you want Cython or Nim. RPython is a toolkit for building language VMs. I'm guessing that's why relatively little work has gone into error reporting.
Yep. RPython is not a deliberately designed language; it's an arbitrary set of restrictions on Python bytecode and the standard library. These restrictions change as the developers implement new features and as time goes on. For instance, one restriction I hit when I was working with PyPy was "str.strip can only take one character", but this has been removed in subsequent PyPy versions.
I hadn't heard the word infirm before so I looked it up. It seems to me that it does not mean what you think it means. Thought you might want to know that. http://www.merriam-webster.com/dictionary/infirm
I always had the impression that the more "RPython-like" your Python code is, the better PyPy can optimize it. If that is true, then I see a lot of value in this post. However, I'm unsure how much truth there is to that belief.
I'd agree. The thing is: for stock Python, you want to minimize interpreter load. Use a lot of list comprehensions, that sort of thing. Whereas with PyPy you want to do the opposite.
For instance, I have the following two functions:

    import itertools

    def atLeast2(a, b, num):
        return sum(x == y for x, y in itertools.zip_longest(
            reversed(bin(a).partition('b')[-1]),
            reversed(bin(b).partition('b')[-1]),
            fillvalue='0')) >= num
    def atLeast4(a, b, num):
        count = 0
        while a > 0 or b > 0:
            x = a % 2
            y = b % 2
            if x == y:
                count += 1
                if count >= num:
                    return True
            a //= 2
            b //= 2
        return count >= num
In Python, atLeast2 is ~2.7x faster than atLeast4. In Pypy, atLeast2 is 1.9x slower than atLeast4.
(The ordering is roughly, using relative numbers (lower = faster), and checking for at least 96 bits in common out of 128 for random inputs:
    def atLeast(a, b, num):
        count = 0
        for x, y in zip(bin(a).partition('b')[-1], bin(b).partition('b')[-1]):
            if x == y:
                count += 1
                if count >= num:
                    return True
        return False
    def atLeast3(a, b, num):
        count = 0
        while a > 0 or b > 0:
            x = a % 2
            y = b % 2
            if x == y:
                count += 1
            a //= 2
            b //= 2
        return count >= num
)
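For anyone who wants to reproduce the comparison, here is a minimal timing harness along these lines (the 1000-pair sample size and repeat count are my own assumptions; absolute numbers will of course vary by machine and interpreter):

    import itertools
    import random
    import timeit

    def atLeast2(a, b, num):
        return sum(x == y for x, y in itertools.zip_longest(
            reversed(bin(a).partition('b')[-1]),
            reversed(bin(b).partition('b')[-1]),
            fillvalue='0')) >= num

    def atLeast4(a, b, num):
        count = 0
        while a > 0 or b > 0:
            if a % 2 == b % 2:
                count += 1
                if count >= num:
                    return True
            a //= 2
            b //= 2
        return count >= num

    # Random 128-bit inputs, checking for at least 96 bits in common.
    pairs = [(random.getrandbits(128), random.getrandbits(128))
             for _ in range(1000)]

    # Sanity check: both implementations agree on every input.
    assert all(atLeast2(a, b, 96) == atLeast4(a, b, 96) for a, b in pairs)

    for fn in (atLeast2, atLeast4):
        t = timeit.timeit(lambda: [fn(a, b, 96) for a, b in pairs], number=10)
        print(fn.__name__, round(t, 3))

Run the same script under both CPython and PyPy to see the relative ordering flip; the assert also confirms the "clever" and the plain-loop versions compute the same answer.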
I must say that RPython is not a very good name. I assumed it was a system for R and Python integration, like the RPython package: http://rpython.r-forge.r-project.org
I had the exact same thought. R is a pretty high-profile language; they could even couch a name change in positive terms. They'd probably get more publicity if they weren't competing with a popular package.