Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Tuned Qwen 3.5 27B beats Step 3.5 on almost all benchmarks, so the point about the size class is moot.


Benchmarks are not interesting in deciding the "size class". Bigger size means more knowledge. Also, the Qwen 3.5 27B is a dense 27B active parameter model. StepFun 3.5 Flash has 11B active parameters.


> Bigger size means more knowledge.

Qwen 3.5 27B beats StepFun 3.5 Flash on GPQA Diamond too, so probably no.


Benchmarks don't tell the whole story. For one-shot coding tasks, I found Step 3.5 Flash to be stronger even than Qwen 3.5 397B.


Benchmarks don't tell the whole story... for that you need anecdotes from random HN posters :)




Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: