Hacker News
irthomasthomas | 8 months ago | on: Ollama Turbo
If these are FP4 like the other Ollama models, then I'm not very interested. If I'm using an API anyway, I'd rather use the full weights.
mchiang | 8 months ago
OpenAI has only provided MXFP4 weights. These are the same weights used by other cloud providers.
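For readers unfamiliar with the format: a minimal sketch of what MXFP4 quantization means, assuming the OCP Microscaling (MX) convention of 32-element blocks sharing one power-of-two scale, with each element stored as a 4-bit E2M1 float. Real kernels pack the 4-bit codes into bytes; this sketch just returns the dequantized values for illustration.

```python
import math

# The eight magnitudes representable by an E2M1 element
# (1 sign bit, 2 exponent bits, 1 mantissa bit).
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_block(block):
    """Quantize one block (up to 32 floats) to a shared power-of-two
    scale plus per-element E2M1 values; returns (scale, dequantized)."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    # Pick a power-of-two scale so the largest magnitude fits in
    # E2M1's range (max representable value is 6.0).
    scale = 2.0 ** math.ceil(math.log2(amax / 6.0))

    def nearest(x):
        mag = abs(x) / scale
        q = min(E2M1_VALUES, key=lambda v: abs(v - mag))
        return math.copysign(q * scale, x)

    return scale, [nearest(x) for x in block]

scale, deq = quantize_block([6.0, 3.0, 0.4, -1.5])
print(scale, deq)  # 1.0 [6.0, 3.0, 0.5, -1.5]
```

The shared scale costs 8 bits per 32 elements, so the format works out to roughly 4.25 bits per parameter rather than a flat 4.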
irthomasthomas | 8 months ago
Oh, I didn't know that. Weird!
reissbaker | 8 months ago
It was natively trained in FP4. Probably both to reduce VRAM usage at inference time (fits on a single H100), and to allow better utilization of B200s (which are especially fast for FP4).
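The VRAM claim can be sanity-checked with back-of-envelope arithmetic: MXFP4 stores 4-bit elements plus one shared 8-bit scale per 32-element block, i.e. about 4.25 bits per parameter. The 120B parameter count below is a hypothetical example, not a figure from the thread:

```python
# Back-of-envelope VRAM estimate for MXFP4 weights.
# 4-bit element + one 8-bit scale per 32-element block.
bits_per_param = 4 + 8 / 32           # 4.25 bits per parameter
params = 120e9                        # hypothetical 120B-parameter model
weight_gb = params * bits_per_param / 8 / 1e9

print(f"{weight_gb:.2f} GB")          # 63.75 GB, under an H100's 80 GB
```

The same model at BF16 (16 bits per parameter) would need roughly 240 GB, which is why the full-precision version would not fit on a single H100.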
irthomasthomas | 8 months ago
Interesting, thanks. I didn't know you could even train at FP4 on H100s.
reissbaker | 8 months ago
It's impressive they got it to work; the lowest I'd heard of thus far was native FP8 training.