Hacker News

If these are FP4 like the other Ollama models, then I'm not very interested. If I'm using an API anyway, I'd rather use the full weights.


OpenAI has only provided MXFP4 weights. These are the same weights used by other cloud providers.
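For context, MXFP4 is the OCP Microscaling format: blocks of 32 FP4 (E2M1) elements sharing one power-of-two (E8M0) scale. A minimal dequantization sketch, assuming the standard E2M1 value table and block layout from the MX spec (the function and array names here are illustrative, not from any real inference library):

```python
import numpy as np

# The 16 representable FP4 E2M1 values, indexed by the 4-bit code
# (sign bit high): 0, 0.5, 1, 1.5, 2, 3, 4, 6 and their negatives.
FP4_E2M1 = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0,
                     -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0])

def dequant_block(codes, scale_exp):
    """Dequantize one MXFP4 block.

    codes: array of 4-bit codes (0..15), nominally 32 per block.
    scale_exp: the block's shared E8M0 exponent (scale = 2**scale_exp).
    """
    return FP4_E2M1[codes] * (2.0 ** scale_exp)

# Example: codes [1, 4] with a shared exponent of 1 decode to
# [0.5, 2.0] * 2 = [1.0, 4.0].
print(dequant_block(np.array([1, 4]), 1))
```

The shared per-block scale is what lets such a coarse 4-bit element format still cover a wide dynamic range across the tensor.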


Oh, I didn't know that. Weird!


It was natively trained in FP4, probably both to reduce VRAM usage at inference time (it fits on a single H100) and to allow better utilization of B200s (which are especially fast at FP4).
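The "fits on a single H100" point is simple arithmetic. A back-of-the-envelope sketch, assuming a ~120B parameter count purely for illustration (not an official figure) and counting weight memory only (no KV cache or activations):

```python
# Rough weight-memory estimate at different precisions.
# PARAMS is an illustrative assumption, not an official model size.
PARAMS = 120e9
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "fp4": 0.5}

def weight_gb(precision):
    """Gigabytes needed to hold the weights alone at this precision."""
    return PARAMS * BYTES_PER_PARAM[precision] / 1e9

# At FP4 the weights come to ~60 GB, inside an 80 GB H100;
# at BF16 they would need ~240 GB, i.e. multiple GPUs.
for p in ("bf16", "fp8", "fp4"):
    print(p, weight_gb(p), "GB")
```

Real MXFP4 storage is slightly above 0.5 bytes/param because of the per-block scales, but the conclusion is unchanged.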


Interesting, thanks. I didn't know you could even train at FP4 on H100s.


It's impressive they got it to work; the lowest I'd heard of thus far was native FP8 training.



