Hasn't nailed the strawberry test yet

pxc · 2025-08-06T05:04:37

I found this surprising, because that's such an old test that it must certainly be in the training data. I just tried to reproduce it, and I've been unable to get the model (20B, lowest "reasoning" budget) to fail that test, using a few different words.

quatonion · 2025-08-06T10:54:00

I am starting to get the impression the strawberry test is an OpenAI watermark, more than an actual problem.

It is a good way to detect whether another model was trained on your data, for example, or is a distillation/quant/ablation.