
Too bad it lacks even the streaming-mode SVE2 found in M4 cores. If only Apple would provide a full SVE2 implementation, it might pressure ARM into making SVE2 non-optional, so AArch64 isn't effectively restricted to NEON for SIMD.


This is for AI, which is going to benefit more from Metal and the NPU than from SIMD.


Sure, but larger models that fit in that 512 GB of memory are going to take a long time to tokenize/detokenize without hardware-accelerated BLAS.


Why would you need BLAS for tokenization/detokenization? Pretty much everyone still uses BBPE (byte-level BPE), which amounts to iteratively applying merges.

(Maybe I'm missing something here.)
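
For anyone unfamiliar, here is roughly what "iteratively applying merges" looks like. This is only a minimal sketch, not any particular tokenizer's implementation: the function names and the tiny merge table are made up for illustration, and real tokenizers add pre-tokenization, caching, and a vocabulary lookup on top. Note there is no linear algebra anywhere in it.

```python
def bpe_encode(text, merges):
    """Greedy byte-level BPE: repeatedly merge the best-ranked adjacent pair."""
    # Start from the raw UTF-8 bytes, one token per byte.
    tokens = [bytes([b]) for b in text.encode("utf-8")]
    # Rank = position in the (hypothetical) merge list; earlier merges win.
    ranks = {pair: i for i, pair in enumerate(merges)}
    while len(tokens) > 1:
        pairs = [(tokens[i], tokens[i + 1]) for i in range(len(tokens) - 1)]
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no adjacent pair is mergeable any more
        merged = []
        i = 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == best:
                merged.append(tokens[i] + tokens[i + 1])
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens


def bpe_decode(tokens):
    """Detokenization is just byte concatenation followed by UTF-8 decoding."""
    return b"".join(tokens).decode("utf-8")


# Toy merge table, invented for this example.
toy_merges = [(b"l", b"l"), (b"h", b"e"), (b"he", b"ll")]
pieces = bpe_encode("hello", toy_merges)
```

With the toy table above, "hello" ends up as the two pieces `b"hell"` and `b"o"`, and `bpe_decode` gives back the original string. It is all dictionary lookups and list rebuilding, so the cost is string processing on the CPU, not matrix math.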


Tokenization/detokenization does not use BLAS.


Hell, I’m just sitting here hoping the future M5 adopts SVE. Not even SVE2.



