
Hasn't nailed the strawberry test yet


I found this surprising because that's such an old test that it must certainly be in the training data. I just tried to reproduce and I've been unable to get it (20B model, lowest "reasoning" budget) to fail that test (with a few different words).
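For anyone else trying to reproduce this, the ground truth is trivial to compute and compare against the model's answer. A minimal sketch, assuming the usual form of the test ("how many times does letter X appear in word Y?"); the helper name and the word list are hypothetical, not the words actually tried above:

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


# Hypothetical word/letter pairs chosen for illustration only.
cases = [("strawberry", "r"), ("blueberry", "b"), ("Mississippi", "s")]

for word, letter in cases:
    print(f"{word!r} contains {count_letter(word, letter)} x {letter!r}")
```

Comparing the model's reply against these counts over a handful of words gives a quick pass/fail check without relying on the one famous example.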


I am starting to get the impression the strawberry test is an OpenAI watermark, more than an actual problem.

It is a good way to detect whether another model was trained on your data, for example, or is a distillation, quantization, or ablation of your model.



