> I don't follow you, what do computed gotos have to do with threaded code? 1. B...

simias · on June 23, 2015

Oh I think I get it, thanks (although I still don't understand what "threaded" means in this context).

The only downside I can see is that you'd significantly increase the size of the code. On a 64-bit architecture you trade each one-byte opcode for a 64-bit address, which multiplies the footprint in the data cache by eight. A 256-entry LUT, on the other hand, will fit snugly in cache, and the lookup shouldn't be very costly.
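For reference, the 256-entry LUT variant mentioned above could look roughly like the sketch below. It relies on two GNU C extensions (labels as values and range designators), and the opcode set and the name `run_lut` are made up for illustration. The table costs 256 * sizeof(void *) bytes, i.e. 2 KiB with 8-byte pointers, while the program itself stays one byte per opcode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes, one byte each. */
enum { OP_INC, OP_DEC, OP_HALT };

/* Runs a byte-coded program; every dispatch goes through the lookup table. */
int run_lut(const uint8_t *pc)
{
    /* 256 entries so that any byte is a valid index; unknown opcodes halt.
       Later designators override the range default (GNU C extension). */
    static void *const lut[256] = {
        [0 ... 255] = &&do_halt,
        [OP_INC] = &&do_inc,
        [OP_DEC] = &&do_dec,
    };
    int acc = 0;

    assert(pc != NULL);

/* Fetch the next opcode byte, look up its handler, jump to it. */
#define DISPATCH() goto *lut[*pc++]
    DISPATCH();

do_inc:  acc++; DISPATCH();
do_dec:  acc--; DISPATCH();
do_halt: return acc;
}
```

Each instruction pays one extra load (the table lookup) in exchange for the compact program encoding.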

Also, if I understood you correctly, what you're proposing doesn't have much to do with the "computed gotos" extension.

sklogic · on June 23, 2015

> I still don't understand what "threaded" means in this context

https://en.wikipedia.org/wiki/Threaded_code#Indirect_threadi...

> The only downside I can see is that you'd significantly increase the size of the code

This is why you add the jumptbl_base on 64-bit platforms. The stored pointers are still 32-bit, just with an added base offset (and a tiny overhead).
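A minimal sketch of that idea, assuming GNU C labels as values: each cell of the translated program is a signed 32-bit offset from `jumptbl_base` rather than a full 64-bit pointer. Only the name `jumptbl_base` comes from the comment above; the opcodes, `run_offset_threaded`, and the choice of the first handler as the base are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy bytecode; PUSH is followed by a one-byte operand. */
enum { OP_PUSH, OP_ADD, OP_HALT };

/* Translates well-formed `bytecode` (n bytes) into `code` (n cells), then runs it.
   Label addresses are only visible inside this function, so both steps live here. */
int32_t run_offset_threaded(const uint8_t *bytecode, size_t n, int32_t *code)
{
    static void *const handlers[] = { &&do_push, &&do_add, &&do_halt };
    /* Assumption: all handlers lie within +/-2 GiB of this base address. */
    char *const jumptbl_base = (char *)&&do_push;
    int32_t stack[64], *sp = stack;

    /* Replace each opcode with its handler's offset from the base. */
    for (size_t i = 0; i < n; i++) {
        uint8_t op = bytecode[i];
        assert(op <= OP_HALT);
        code[i] = (int32_t)((char *)handlers[op] - jumptbl_base);
        if (op == OP_PUSH) { i++; code[i] = bytecode[i]; } /* copy operand */
    }

    const int32_t *ip = code;
/* The "tiny overhead": one add to rebuild the full address before jumping. */
#define NEXT() goto *(void *)(jumptbl_base + *ip++)
    NEXT();

do_push: *sp++ = *ip++; NEXT();
do_add:  sp--; sp[-1] += sp[0]; NEXT();
do_halt: return sp[-1];
}
```

With this layout the translated program is four times the size of the byte code instead of eight.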

> doesn't have much to do with the "computed gotos" extension

You cannot implement indirect threaded code without a computed goto. You can implement direct threaded code, of course (by generating jump or call instructions directly), but that is a totally different thing.

simias · on June 23, 2015

Ah, thank you for the link, that was the missing piece of the puzzle! :)

arcatek · on June 23, 2015

Even without inlining? Since the compiler cannot know what instruction will be called, it cannot inline it in the loop.

sklogic · on June 23, 2015

Why do you want any inlining? It's threaded code. You know the address of each instruction handler, so you can replace each opcode with that address and eliminate the switch and any table lookups altogether.
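To make the opcode replacement concrete, here is a small sketch using GNU C labels as values. The byte code is translated once into an array of handler addresses, after which dispatch is a single indirect jump with no switch and no table lookup. The opcode set, `cell_t`, and `run_threaded` are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy bytecode; PUSH is followed by a one-byte operand. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* One cell of threaded code: a handler address or an immediate operand. */
typedef union { void *handler; intptr_t imm; } cell_t;

/* Translates well-formed `bytecode` (n bytes) into `code` (n cells), then runs it.
   Label addresses are only visible inside this function, so both steps live here. */
intptr_t run_threaded(const uint8_t *bytecode, size_t n, cell_t *code)
{
    static void *const handlers[] = { &&do_push, &&do_add, &&do_mul, &&do_halt };
    intptr_t stack[64], *sp = stack;

    /* One-time pass: the table is consulted here, never again while running. */
    for (size_t i = 0; i < n; i++) {
        uint8_t op = bytecode[i];
        assert(op <= OP_HALT);
        code[i].handler = handlers[op];
        if (op == OP_PUSH) { i++; code[i].imm = bytecode[i]; } /* copy operand */
    }

    const cell_t *ip = code;
/* Dispatch: load the next handler address and jump straight to it. */
#define NEXT() goto *(ip++)->handler
    NEXT();

do_push: *sp++ = (ip++)->imm; NEXT();
do_add:  sp--; sp[-1] += sp[0]; NEXT();
do_mul:  sp--; sp[-1] *= sp[0]; NEXT();
do_halt: return sp[-1];
}
```

For example, the program `PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT` evaluates (2 + 3) * 4.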