
Despite the other answers, I will tell you the grim truth: Your mileage might vary.

It's an empirical question that depends on the nature of your problem and data. You should try all three (fp32, fp16, and bf16) as part of your model selection / hyperparameter tuning.
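As a rough sketch of treating precision like any other hyperparameter: the helper below runs one trial per format and keeps the best validation score. Both `select_precision` and the `train_and_score` callback are hypothetical names, not from any library; the sketch assumes you supply a callback that trains or loads the model at the given precision and returns a score where higher is better.

```python
from typing import Callable, Dict, Iterable, Tuple


def select_precision(
    train_and_score: Callable[[str], float],
    candidates: Iterable[str] = ("fp32", "fp16", "bf16"),
) -> Tuple[str, Dict[str, float]]:
    """Run one trial per precision and return (best_name, all_scores).

    `train_and_score` is a hypothetical, user-supplied callback: it takes a
    precision name and returns a validation score (higher is better).
    """
    scores = {name: train_and_score(name) for name in candidates}
    best = max(scores, key=lambda name: scores[name])
    return best, scores


if __name__ == "__main__":
    # Placeholder scores purely to show the call shape; not real results.
    placeholder = {"fp32": 0.0, "fp16": 0.0, "bf16": 0.0}
    print(select_precision(lambda name: placeholder[name]))
```

Keeping all the scores, not just the winner, lets you see whether the gap between formats is large enough to justify the extra memory and compute of fp32.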

For example, in audio generative models (where typical output is 16-bit), I've sometimes found that fp16 and bf16 weights just don't produce output as good as fp32 weights do.
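One intuition that may be relevant here (an assumption on my part, not a diagnosis of the results above): fp32 has a 24-bit significand, fp16 has 11 bits, and bf16 has 8 bits, so the two half-width formats cannot represent every 16-bit sample value exactly. The sketch below uses only the standard library to round each 16-bit PCM sample, scaled to [-1, 1), through each format and reports the worst round-trip error. It measures sample values rather than model weights, bf16 is emulated by keeping the top 16 bits of the fp32 encoding, and all function names are made up for illustration.

```python
import struct


def round_to_fp32(x: float) -> float:
    # Round-trip through IEEE 754 single precision.
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_to_fp16(x: float) -> float:
    # Round-trip through IEEE 754 half precision (struct format "e").
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_to_bf16(x: float) -> float:
    # Emulated bf16: keep the top 16 bits of the fp32 bit pattern, rounding
    # to nearest even. Only intended for small finite values such as [-1, 1).
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def max_error_lsb(round_fn) -> float:
    # Worst absolute round-trip error over all 16-bit PCM samples,
    # expressed in units of one 16-bit step (1/32768).
    worst = 0.0
    for sample in range(-32768, 32768):
        x = sample / 32768.0
        worst = max(worst, abs(round_fn(x) - x) * 32768.0)
    return worst


if __name__ == "__main__":
    for name, fn in [("fp32", round_to_fp32),
                     ("fp16", round_to_fp16),
                     ("bf16", round_to_bf16)]:
        print(name, max_error_lsb(fn))
```

Under these assumptions, fp32 round-trips every 16-bit sample exactly, while fp16 and bf16 do not, with bf16 being the coarser of the two. Whether that matters for a given model is still something you have to measure.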


