No, that simply controls whether CoT is enabled or not, and that does have an impact.
What Anthropic is doing is still generating the thinking tokens (because they improve answer quality) while hiding them from the user. I believe this may hint at a future where LLM vendors no longer show the internal reasoning the way they do right now.
I’m very much of the opinion that hiding them from the response because it “improves latency” is nonsense.