
> By finding smarter ways to divide a large matrix multiplication operation into more manageable subproblems, it sped up this vital kernel in Gemini’s architecture by 23%, leading to a 1% reduction in Gemini's training time.

The message I replied to said "if I have some toy poorly optimized python example". I think it's safe to say that matmul kernel optimisation is well beyond a small Python example.
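
For anyone unfamiliar with what "divide a large matrix multiplication operation into more manageable subproblems" usually means: the general idea is tiling (blocking), where the output is computed one small tile at a time. Below is a rough pure-Python sketch of that idea only. It is not the heuristic from the quoted announcement and has nothing to do with Gemini's actual kernel; the name `blocked_matmul` and the `tile` parameter are made up for illustration.

```python
def blocked_matmul(a, b, tile=2):
    """Multiply a (n x k) by b (k x m), working through the output tile by tile.

    Illustrative only: `tile` is a hypothetical knob, not a tuned value.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    # Outer three loops pick a tile of the output and a slice of the shared dimension.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                # Inner loops do an ordinary multiply-accumulate within that tile.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c
```

The result is the same as a naive triple loop. On real hardware the hard part is picking tile shapes that fit the memory hierarchy of a specific accelerator, which is a different kind of problem from cleaning up a toy script.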


