Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> I don't follow you, what do computed gotos have to do with threaded code?

1. Build a jump table

2. Read a bytecode

3. Replace all the instruction codes in the bytecode stream with values from the jump table (i.e., build a threaded code from your bytecode).

4. In a loop, take the next value at PC and jump to it - it's already an address. No table lookups (just once, when you load your bytecode). So it's much faster than any switch can be.

EDIT: probably you did not notice that for most of the targets jumptbl_base = 0, and an indirection is removed.



Oh I think I get it, thanks (although I still don't understand what "threaded" means in this context).

The only downside I can see is that you'd significantly increase the size of the code. On a 64bit architecture you trade each byte for a 64bit address, effectively multiplying by eight the footprint in the data cache. A 256 entry LUT on the other hand will fit snugly in cache and the lookup shouldn't be very costly.

Also if I understood you correctly what you're proposing doesn't have much to do with the "computed gotos" extension.


> I still don't understand what "threaded" means in this context

https://en.wikipedia.org/wiki/Threaded_code#Indirect_threadi...

> The only downside I can see is that you'd significantly increase the size of the code

This is why adding the jumptbl_base on 64-bit platforms. Pointers are still 32-bit, just with an added offset (and a tiny overhead).

> doesn't have much to do with the "computed gotos" extension

You cannot implement indirect threaded code without a computed goto. You can implement a direct threaded code, of course (generating jump or call instructions directly), but this is a totally different thing.


Ah, thank you for the link, that was the missing piece of the puzzle! :)


Even without inlining? Since the compiler cannot know what instruction will be called, it cannot inline it in the loop.


Why do you want any inlining? It's a threaded code. You know an address of each instruction handler, so you can replace an opcode with this address and eliminate a switch and any table lookups altogether.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: