
Luckily BF16 is just a truncated FP32. That means the hardware can do BF16; you just don't get any performance benefit compared to FP32 (and depending on the hardware design, you might also have to space the data 4 bytes apart rather than 2, so you lose the memory bandwidth and RAM usage benefits too).


At that point it’d be better to do everything in fp32. The hardware can’t do bf16 in the way you’re saying; the conversions would consume all your time.


Compute in F32, but then round and pack a pair of BF16 into 4 bytes.
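For illustration, that round-and-pack step might look like the following sketch in C. The helper names are made up, the rounding uses the standard add-half-with-tie-to-even bit trick, and real code would also need to special-case NaN inputs:

```c
#include <stdint.h>
#include <string.h>

/* Illustrative helper: convert an F32 to BF16 with round-to-nearest-even.
 * NaN inputs would need a separate check in production code. */
static uint16_t f32_to_bf16_rne(float f) {
    uint32_t x;
    memcpy(&x, &f, sizeof x);                    /* bit-cast without aliasing UB */
    uint32_t bias = 0x7FFFu + ((x >> 16) & 1u);  /* round to nearest, ties to even */
    return (uint16_t)((x + bias) >> 16);
}

/* Pack a pair of rounded BF16 values into one 32-bit word. */
static uint32_t pack_bf16_pair(float a, float b) {
    return (uint32_t)f32_to_bf16_rne(a) | ((uint32_t)f32_to_bf16_rne(b) << 16);
}
```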


The conversions are just a mask and shift? Super cheap
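A minimal sketch of that cheap conversion in C, assuming truncation (the helper name is illustrative; here the "mask" falls out of the narrowing cast):

```c
#include <stdint.h>
#include <string.h>

/* Illustrative helper: truncate an IEEE-754 single to BF16 by keeping
 * its top 16 bits -- just a shift plus a narrowing cast. */
static uint16_t f32_to_bf16_trunc(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits); /* bit-cast without aliasing UB */
    return (uint16_t)(bits >> 16);
}
```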


You still get a perf benefit from half the memory traffic and keeping twice as much data in caches, since you can do the expansion to f32 when loading into registers.
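The expansion on load is equally cheap. A sketch in C (name is illustrative): widen the stored 16 bits by zero-filling the low half of the mantissa:

```c
#include <stdint.h>
#include <string.h>

/* Illustrative helper: expand a stored BF16 back to F32 by placing it in
 * the top 16 bits and zero-filling the 16 low mantissa bits. */
static float bf16_to_f32(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f); /* bit-cast without aliasing UB */
    return f;
}
```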


Conversions from IEEE-32 to BF16 don't round?


I don't believe any standard defines it. Fast software implementations often truncate (i.e. round toward zero), though hardware conversion instructions (e.g. x86's AVX-512 VCVTNEPS2BF16, the "NE" meaning nearest-even) round to nearest.

Remember, BF16 was invented specifically to be backwards compatible with existing silicon - and pulling 2 bytes out of 4 is a far cheaper operation than any rounding.


Just to elaborate, as I was confused about this and had to look it up: BF16 is indeed designed to be just a truncated F32. You can grab the top 16 bits of an F32 value and it'll still "make sense": the sign bit is in the same place in both (unsurprisingly), and the exponent fields of BF16 and F32 are both 8 bits. For the mantissa, you end up grabbing the top 7 bits of the F32's 23-bit mantissa, so it all works out - this "rounds" the value toward zero.
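A quick sketch in C showing the round-toward-zero behavior (the helper name is illustrative): zeroing the low 16 mantissa bits can only shrink the magnitude, never grow it:

```c
#include <stdint.h>
#include <string.h>

/* Illustrative helper: truncate an F32 to BF16 precision in place by
 * clearing the 16 low mantissa bits, keeping sign + 8-bit exponent +
 * top 7 mantissa bits. The result's magnitude is <= the input's. */
static float bf16_truncate(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    bits &= 0xFFFF0000u;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Values already representable in BF16 (like 1.0) pass through unchanged; everything else moves toward zero, for both positive and negative inputs.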


There's no standardized definition of BF16.



