
Why haven't Single Instruction-Set architectures taken off?

If mov is Turing Complete it seems like there'd be a big win here... You could parallelize this massively.

Edit: can someone please explain why this is being downvoted? This is a legit question.



They’re impractical and massively inefficient?


Okay?

I'm asking because I don't know; can you elaborate on why that is?

Aren't most modern GPUs similar in that they're designed to just shit out triangles as fast as possible, massively parallel?


Basically, you're going to try to emulate other instructions that you don't have with this one instruction, and that's not going to perform very well because now, instead of many optimized instructions, you have strings of this one instruction in its place. And I don't see any way to parallelize this: you're doing the same thing you always were, just with a bunch more code.
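To make that concrete, here is a minimal sketch using subleq (subtract and branch if less than or equal to zero), a well-known one-instruction design, rather than mov. The `run_subleq` name, the memory layout, and the "negative address halts" convention are assumptions made for illustration. What would be a single ADD on a conventional machine becomes three subleq instructions plus a scratch cell:

```python
def run_subleq(mem: list[int], pc: int = 0, max_steps: int = 10_000) -> int:
    """Run a subleq program in place and return how many instructions executed.

    Each instruction is three cells (a, b, c): mem[b] -= mem[a], then jump to c
    if the result is <= 0, otherwise fall through to the next instruction.
    A negative program counter halts (an assumed convention for this sketch).
    """
    steps = 0
    while pc >= 0:
        if steps >= max_steps:
            raise RuntimeError("step limit reached")
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3
        steps += 1
    return steps


# Hypothetical layout: cells 0-8 hold the code, 9 = A, 10 = B, 11 = Z (scratch, starts at 0).
A, B, Z = 9, 10, 11
mem = [
    A, Z, 3,    # Z = Z - A   (Z now holds -A)
    Z, B, 6,    # B = B - Z   (B now holds B + A)
    Z, Z, -1,   # Z = Z - Z   (reset scratch to 0, then halt)
    5, 7, 0,    # data: A = 5, B = 7, Z = 0
]
steps = run_subleq(mem)
print(mem[B], steps)  # B is 12 after 3 instructions
```

The three instructions also depend on each other through Z, so they have to run in order.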


> "can you elaborate on why that is?"

Might as well suggest we use a single letter instead of 26.


     .....   ..  ..
    ..   ..  .. ..
    ..   ..  ....
    ..   ..  .. ..
     .....   ..  ..
The first number is zero, right? Then the first letter of the alphabet is, well, it's hard to show because it doesn't print. It's just an empty set, nothing, a stop bit. We actually do use only 1 bit in digital CPUs, but a weird mix of analogue in broadband transmission. I wonder why CPUs don't use ternary or whatever. But I also wonder why asynchronous CPUs didn't take off, so don't mind me, just being bored.


.... --- .-- .- -... --- ..- - - .-- ---?


............... .............. .....

............. ......... ....... ........ ....................

....................... ............... ................. ...........


No, modern GPUs run many of the same instructions as CPUs. They have branches and everything. Their main limitation is that groups of threads are bundled together (called warps) and share a program counter, so lots of branching can result in a lot of wasted work if the threads disagree on which branch to take. That, and there's a huge penalty you pay for moving data across the bus to GPU RAM.
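A toy model of that wasted work, as a rough sketch only: it assumes that a warp whose threads disagree on a branch runs both sides one after the other, with the non-participating lanes sitting idle. The function name and the cycle costs are made up for illustration, and real hardware scheduling is more involved.

```python
def simulate_warp_branch(
    conditions: list[bool], then_cost: int, else_cost: int
) -> tuple[int, int]:
    """Return (cycles, idle_lane_cycles) for one if/else executed by a warp.

    conditions[i] is the branch condition seen by lane i. Because the lanes
    share a program counter, each side of the branch that any lane needs is
    executed by the whole warp, with the other lanes masked off.
    """
    taken = sum(conditions)
    not_taken = len(conditions) - taken
    cycles = 0
    idle = 0
    if taken:
        cycles += then_cost
        idle += then_cost * not_taken   # lanes waiting out the "then" side
    if not_taken:
        cycles += else_cost
        idle += else_cost * taken       # lanes waiting out the "else" side
    return cycles, idle


uniform = simulate_warp_branch([True] * 32, then_cost=10, else_cost=10)
divergent = simulate_warp_branch([True, False] * 16, then_cost=10, else_cost=10)
print(uniform)    # (10, 0): everyone agrees, nothing is wasted
print(divergent)  # (20, 320): both sides run, half the lanes idle each time
```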


Few problems are easily parallelizable. That said, that's not even the issue here. Specialized instructions may be emulated by movs, but the speed loss could never be recouped, even by massive parallelization.


The problem is probably the address space that movs use, instead of specialized registers with optimized pipelining. But internally, many instructions might actually come down to conditional moves. I guess that happens after the microcode is decoded; or, if I guessed wrong about that, register-transfer logic still pretty much sounds like it was based on, well, transfers.


Why would it make sense to use a single instruction?

(Also no, GPUs are not the same at all)


You can perform multiplication by repeated addition, but that is a very inefficient way to multiply. It's the same thing here, where you can replace other instructions with MOV, but the replacement is much slower than the original.
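A quick sketch of the analogy (the function name is made up, and it only handles non-negative multipliers): the repeated-addition version does work proportional to the multiplier, where a native multiply is a single operation.

```python
def multiply_by_repeated_addition(a: int, b: int) -> tuple[int, int]:
    """Return (a * b, number_of_additions) for a non-negative multiplier b."""
    if b < 0:
        raise ValueError("this sketch only handles non-negative b")
    total = 0
    additions = 0
    for _ in range(b):
        total += a          # one addition per unit of b
        additions += 1
    return total, additions


product, additions = multiply_by_repeated_addition(7, 1000)
print(product, additions)  # 7000 1000: same answer as 7 * 1000, but 1000 additions
```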


What makes you think this would be easier to parallelize than a traditional application? Just because there is only one kind of instruction used doesn't mean they don't still have to come in the right order!


> What makes you think this would be easier to parallelize than a traditional application?

Exactly such an idea was proposed for parallel signal processing quite a while back, actually. Look up One Instruction Set Computer.


I don't see any material online indicating that programs written for one-instruction-set-computers are more parallelizable than programs written for traditional computers. In fact, here is someone claiming the opposite:

> The disadvantage of an MISC is that instructions tend to have more sequential dependencies, reducing overall instruction-level parallelism.

https://ipfs.io/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1m...


Also note that the idea didn't take off.


> If mov is Turing Complete it seems like there'd be a big win here... You could parallelize this massively.

Actually, One Instruction Set Computers were proposed for this very application! (Also mentioned elsewhere, but is applicable in a serious way here.)

https://en.wikipedia.org/wiki/One_instruction_set_computer



