Hacker News

It's very interesting, and also quite frustrating, that no two AI experiences are the same. Scrolling through the threads here, they all seem contradictory.

I've been served the (presumably) Gemini 3.0 A/B test and come away unimpressed, usually on fairly novel questions. I've also gotten to the point where I often don't bother getting Gemini's opinion on something, because it's usually the worst of the bunch. I have Claude Pro and OpenAI Pro subscriptions and use Gemini 2.5 Pro via API key.

The most glaring difference is the very low quality of the web search Gemini performs. It's the fastest of the three by far, but it never goes deep. Claude and GPT-5 seemingly take a problem apart, perform queries as they walk through it, and then branch from those. Gemini feels very "last year" in this regard.

I do find it to be top notch at writing-oriented tasks and at sounding natural. I also find it fairly good about "keeping the plot" in creative writing. Claude is a great writer but makes a few too many assumptions or changes. OpenAI is just flat-out poor at creative writing currently due to its issues with "metaphorical language".

On speculative tasks -- e.g., "let's rank these polearms and swords in a tier list based on these 5 dimensions" -- Gemini does well.

On code work, Gemini is GOOD so long as recent APIs aren't involved. It tends to do poorly with APIs that have changed. For instance: "do XYZ in Stripe now that the API surface has changed; look up the docs for the most recent version". GPT-5 has consistently amazed me with its ability to do this, though it takes an eternity to research. Gemini has generally performed great with single-shot code questions (analyze this large amount of code and resolve X or fix Y).

On the agentic front, it's a nonstarter. Both the CLI toolset and every integration I've used, as recently as Monday, have been sub-par compared to Codex CLI and Claude Code.

On troubleshooting issues (PC/software, but not code), it tends to give me very generic, non-useful answers: "update your drivers, reset your PC". GPT-5 was willing to go more speculative and dive deeper, given the same prompt.

On factual questions, Gemini is top notch. "Why were medieval armies smaller than Roman-era armies?" and that sort of thing.

On product/purchase type questions, Gemini does great. These are questions like "help me find a 25" stone vanity countertop with sink that has great reviews and comes from a reputable company, price cap $1000, prefer quality where possible". Unfortunately, as with all of the other AI models, there's a non-zero chance that you'll walk through the links and find that the product is not as described, not in stock, or just plain wrong.

One last thing I'll note is that -- while I can't put my finger on it -- I feel like the quality of Gemini 2.5 Pro has declined over time while the model has also sped up dramatically. As a pay-per-token user, I do not like this. I'd rather pay more to get higher quality.

This is my subjective set of experiences as one person who uses AI every day as a developer and entrepreneur. You'll notice that I'm not asking math questions or typical homework-style questions. If you're using Gemini for college homework, perhaps it's the best model.




