
> They definitely could have made avx512 instructions trigger a switch to p-cores

Not really, no. OS-level schedulers are complicated enough as it is with only P vs E cores to worry about, let alone having to dynamically move tasks because they used a CPU feature (and then move them back once they no longer need it).

> and honestly probably could have supported them completely by splitting the same way AMD does on Zen4 and Zen5 C cores.

The issue with AVX512 is not (just) that you need a very wide vector unit, but mostly that you need an incredibly large register file: you go up from 16 * 256 bit = 4096 bits (AVX2) to 32 * 512 bit = 16384 bits (AVX512), and on top of that you need to add a whole bunch of extra registers for renaming purposes.
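To make the arithmetic above concrete, here is a minimal Python sketch. It only counts architectural vector registers (the 16 and 32 from the comment); the extra physical registers needed for renaming are not modeled, and the helper name `arch_register_bits` is made up for illustration.

```python
def arch_register_bits(count, width_bits):
    """Total architectural vector register state in bits."""
    return count * width_bits

# Register counts and widths as quoted in the comment above.
avx2_bits = arch_register_bits(16, 256)
avx512_bits = arch_register_bits(32, 512)

# Prints: 4096 16384 4
print(avx2_bits, avx512_bits, avx512_bits // avx2_bits)
```

So the architectural state alone grows 4x, from 512 bytes to 2 KiB per thread, before any rename registers are added on top.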



> The issue with AVX512 is not (just) that you need a very wide vector unit, but mostly that you need an incredibly large register file

Not necessarily. You only need to behave as if you had that many registers. IMO it would have been way better if the E cores had supported AVX512 with half of the registers not physically existing and living in the L2 cache instead.


Also, Zen4c has AVX512 support while being only ~35% bigger than Gracemont (although the TSMC node advantage means you should possibly add another 10% or so). This isn't really a fair comparison because Zen4c is optimized very differently from Intel's E cores, but I do think it shows that AVX512 can be implemented with a reasonable footprint.

Or if Intel really didn't want to do that, they needed to get AVX10 ready for 2020 rather than going back and forth on it for ~8 years.


They could enable it on P cores with a separate enablement check and then leave it up to the developer to schedule their code on a P core. I imagine Linux has some API to do that scheduling (because macOS does), not sure about Windows.
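On Linux, the scheduling half of this can be done with the `sched_setaffinity` call, which Python exposes as `os.sched_setaffinity`. A rough sketch follows, under some assumptions: the sysfs path `/sys/devices/cpu_core/cpus` is where recent kernels list the P cores on hybrid Intel parts as far as I know, so treat it as an assumption, and the helper names (`parse_cpu_list`, `pin_to_cpus`, `pin_to_p_cores`) are hypothetical. It does not cover the "separate enablement check" part at all.

```python
import os

# Assumption: on hybrid Intel systems, recent Linux kernels list P cores here.
P_CORE_LIST = "/sys/devices/cpu_core/cpus"

def parse_cpu_list(text):
    """Parse a Linux cpulist string such as "0-3,8" into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def pin_to_cpus(cpus):
    """Restrict the calling thread to the given CPUs (Linux only)."""
    os.sched_setaffinity(0, cpus)  # pid 0 means "the caller"
    return os.sched_getaffinity(0)

def pin_to_p_cores(path=P_CORE_LIST):
    """Pin to P cores if the kernel exposes them; return None otherwise."""
    try:
        with open(path) as f:
            p_cores = parse_cpu_list(f.read())
    except OSError:
        return None  # not a hybrid system, or the file is not exposed
    allowed = p_cores & os.sched_getaffinity(0)
    if not allowed:
        return None
    return pin_to_cpus(allowed)
```

A program would call `pin_to_p_cores()` before entering its AVX512 code path and fall back to a non-AVX512 path when it returns `None`. Affinity is a hard constraint rather than a hint, so the thread stays on those cores until the mask is changed again.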


So introduce performance and efficiency profiles for threads at the OS level. Why should CPUs have to be heterogeneous with regard to the ISA and other details?


You don't need to switch entire cores. You could have the E cores borrow just the AVX512 circuitry from the P cores.



