
  and the EOS is "<turn|>". "<|channel>thought\n" is also used for the thinking trace!
Can someone explain this to me? Why is this faux-XML important here?


That’s how the model is trained to signal the end of its generation and to mark where its thinking trace begins.


These are likely individual tokens in the model's vocabulary rather than text that gets parsed as XML. Special tokens like this are very common.
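
To make the "individual tokens" point concrete, here is a rough sketch in Python. It is a toy, not the model's real tokenizer: it assumes the strings quoted upthread ("<turn|>" and "<|channel>") are each one atomic token, and the helper names `split_special` and `truncate_at_eos` are made up for illustration.

```python
import re

# Marker strings quoted upthread. Treating each one as a single atomic
# token is an assumption; the real vocabulary may differ.
SPECIAL_TOKENS = ["<turn|>", "<|channel>"]
EOS = "<turn|>"

# Capturing group so re.split keeps the markers in its output.
_pattern = re.compile("(" + "|".join(re.escape(t) for t in SPECIAL_TOKENS) + ")")


def split_special(text):
    """Split text into (is_special, piece) pairs, keeping markers whole."""
    pieces = []
    for part in _pattern.split(text):
        if part:  # re.split yields empty strings around adjacent matches
            pieces.append((part in SPECIAL_TOKENS, part))
    return pieces


def truncate_at_eos(text):
    """Return everything generated before the first EOS marker."""
    return text.split(EOS, 1)[0]


sample = "<|channel>thought\nlet me think<turn|>ignored"
print(split_special(sample))
print(truncate_at_eos(sample))
```

The markers only need to be strings that are unlikely to show up in ordinary text, so the runtime can tell where the thinking starts and where to stop generating. The angle brackets are a convention and nothing here parses XML.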



