
We will eventually increase the Phind Model's context window to 100K tokens -- the RoPE embeddings in Code Llama were designed for this.
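As context for the claim above: Code Llama's long-context fine-tuning raises the RoPE base period (theta) from 10,000 to 1,000,000, which stretches every rotation wavelength so that positions well beyond the original training length still map to slowly varying angles. A minimal sketch of that effect (my own illustration, not Meta's or Phind's code; the dimension count is arbitrary):

    import numpy as np

    # Wavelength (tokens per full rotation) of each RoPE dimension pair.
    # A larger base period stretches every wavelength, which is what keeps
    # the low-frequency dimensions well behaved at very long contexts.
    def rope_wavelengths(dim, base):
        inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
        return 2 * np.pi / inv_freq

    print(rope_wavelengths(128, base=10_000)[-1])     # ~5.4e4 tokens (standard base)
    print(rope_wavelengths(128, base=1_000_000)[-1])  # ~5.1e6 tokens (Code Llama's larger base)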


> the RoPE embeddings in Code Llama were designed for this.

The RoPE embeddings were not "designed" for that. The original RoPE was not designed with length extrapolation in mind, and the subsequent methods for extrapolating RoPE (e.g. position interpolation) are post-hoc tweaks (with optional fine-tuning) applied to an entirely vanilla RoPE implementation.
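For concreteness, here is what the "post-hoc tweak" amounts to in the position-interpolation case: positions are simply rescaled back into the trained range before the vanilla RoPE rotation is applied. This is my own sketch (numpy, toy sizes, hypothetical scale factor), not the Phind or Code Llama implementation:

    import numpy as np

    def apply_rope(x, positions, base=10_000.0, scale=1.0):
        # Vanilla RoPE: rotate each pair of dimensions by position-dependent angles.
        # Position interpolation is the single extra knob `scale`: dividing the
        # positions by `scale` squeezes a longer sequence back into the range
        # of positions seen during training.
        dim = x.shape[-1]
        inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
        angles = np.outer(positions / scale, inv_freq)   # (seq_len, dim/2)
        cos, sin = np.cos(angles), np.sin(angles)
        x1, x2 = x[..., 0::2], x[..., 1::2]
        out = np.empty_like(x)
        out[..., 0::2] = x1 * cos - x2 * sin
        out[..., 1::2] = x1 * sin + x2 * cos
        return out

    # Example: a model trained on 4k positions run at 16k by interpolating with scale=4.
    q = np.random.randn(16_384, 64)
    q_rotated = apply_rope(q, np.arange(16_384), scale=4.0)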


100k tokens and good IDE support would be great. Copy-pasting back and forth between the browser and the IDE is kind of annoying, and you always miss some context. I think the model is now good enough; what's missing is good developer experience, e.g. deciding what to load into the context window and how the model integrates with the IDE. But that's missing from Copilot and ChatGPT-4 as well.


Is it “100k” or really 100k? There are so many ways to do context. I remember seeing 100k claimed before, but it turned out to be some cheap trick to get there.


What about ALiBi and Sliding Window Attention?

Additionally, Apple researchers seem to be playing with "Attention Free" variants.
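Rough sketches of the two ideas in the question above, for anyone unfamiliar (my own illustration with toy sizes, not any particular model's implementation): ALiBi drops position embeddings and instead adds a per-head linear penalty proportional to query-key distance, while sliding-window attention simply masks each query down to the most recent W keys.

    import numpy as np

    def alibi_bias(seq_len, num_heads):
        # Per-head slopes follow the geometric sequence from the ALiBi paper
        # (for power-of-two head counts): 2^(-8/n), 2^(-16/n), ...
        slopes = 2.0 ** (-8.0 * np.arange(1, num_heads + 1) / num_heads)
        distance = np.arange(seq_len)[None, :] - np.arange(seq_len)[:, None]  # key - query
        # Negative penalty growing with distance to earlier tokens; future
        # positions are clipped to 0 here and removed by the causal mask anyway.
        return slopes[:, None, None] * np.minimum(distance, 0)   # (heads, seq, seq)

    def sliding_window_mask(seq_len, window):
        # True where query i may attend to key j: causal and within the last `window` tokens.
        i = np.arange(seq_len)[:, None]
        j = np.arange(seq_len)[None, :]
        return (j <= i) & (j > i - window)

    bias = alibi_bias(seq_len=8, num_heads=4)        # added to the attention logits
    mask = sliding_window_mask(seq_len=8, window=3)  # applied before the softmax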



