
> Every time I mention this

I feel like there’s a bunch of factors for why it will never be the same for many folks, from the models and harnesses, to the domains and existing tests/tooling.

I feel bad for the people for whom it doesn’t work, but Claude Opus has written most of my code in 2026 so far. I had to build some tools around linting entire projects, and most of my tokens probably go to referencing existing code, parallel review iterations, and tests, but it’s pretty nice, and even seeing legacy code doesn’t make me want to move to a farm and grow potatoes.

It might be counterproductive to say "Oh, just do X!" when that works for the person suggesting it, then follow up with "But have you tried Y?" when it doesn't work for the other person, if it just becomes a never-ending string of what works for one person not working for another.



> I feel like there’s a bunch of factors for why it will never be the same for many folks

Yeah, and the problem arises simply because some people are unable to accept that fact. They insist that if LLM-assisted coding doesn't work for someone, it's because “you're holding it wrong”.


> I feel like there’s a bunch of factors for why it will never be the same for many folks, from the models and harnesses, to the domains and existing tests/tooling.

If the argument is “you have to use the right model, harness, test and tooling for it to work” then it’s not replacing software engineers any time soon.

The other thing is: where are all the web apps, mobile apps, games, and desktop apps from these 100x productivity multipliers? We’re 1-2 years into these tools being widely mainstream and available, and I’m not seeing applications that took years to ship before appearing at 100x the rate, or games being shipped by tiny teams, or new mobile app ideas coming out at 100x the rate. What we do see is vibe-coded slop, stability issues at massive companies (Windows, AWS, for example), and mass layoffs back to pre-COVID levels blamed on AI, when everyone knows it’s a regression to the mean after massive overhiring when money was cheap.

It’s like the emperor has no clothes on this topic to me.


I’m an indie developer and I see the explosion in apps in my niche (creative tools for photography/videography).

They wouldn’t have taken years to ship before, but easily a couple months.

Now the moment any app with any value gets popular, the App Store gets flooded with quick vibe coded copycat clones (very recognizable AI generated icon included).

The quality is low, but the impact this flood has on the market is real.


I wouldn't paint the picture in such black terms. LLMs can be good at finding bugs and potential issues. And if you like, they can be like IntelliSense on steroids. Even agentic workflows can be good, e.g. for an initial assessment of a large new codebase, and for potentially millions of other small tasks like writing one-off helper scripts.


So which apps are seeing 10x the bug fixes and improvements in stability and quality? From my side, I see one-shot CRUD apps, and platforms like AWS and Windows actively deteriorating, to the point of causing massive outages and needing to have development processes changed [0]. Who is actually shipping 10x more stuff, or fixing 10x more bugs?

[0] https://arstechnica.com/ai/2026/03/after-outages-amazon-to-m...


I "pair" with claude-code and still write 30% by hand, with additional review with gpt-5.4, but I definitely write fewer bugs than before. I'd estimate my speedup to be 2x.


The automation bias issue is something that has been raised by many people, myself included, but mostly ignored. The better models get, the worse that problem will get, but IMHO the implications of the claims are not on the code-generation side.

The sandwich story in the model card is the bigger issue.

LLMs have always been good at finding a needle in a haystack, if not a specific needle; it sounds like they are claiming a dramatic increase in that ability.

This will dramatically change how we write and deliver software, which has traditionally been based on the idea of well-behaved, non-malfeasant software with a fix-as-you-go security model.

While I personally find value in the tools as tools, they specifically find a needle and fundamentally cannot find all of the needles that are relevant.

We will either have to move to some form of zero trust model or dramatically reduce connectivity and move to much stronger forms of isolation.

As someone who was trying to document and share a way of improving container isolation that was compatible with current practices I think I need to readdress that.

VMs are probably a minimum requirement for my use case now, and if verified this new model will dramatically impact developer productivity due to increased constraints.

Due to competing use cases and design choice constraints, none of the namespace based solutions will be safe if even trusted partners start to use this model.

How this lands in the long run is unclear, perhaps we only allow smaller models with less impact on velocity and with less essential complexity etc…

But the ITS model of sockets etc.. will probably be dead for production instances.

I hope this is marketing or aspirational to be honest. It isn’t AGI but will still be disruptive if even close to reality.


It depends on the use. I'm not fixated on "productivity" measured by LoC but on code quality, so when using LLMs to challenge my code I'm less productive, but the quality of my code increases.


It actually seems like people are shipping 10x more bugs, not fixing 10x more bugs.


Where are all the apps? It's mostly visible in AI tooling itself. Harnesses, vibe-coding tools, and stuff with "claw" in the name saw a Cambrian explosion.

And maybe using AI to use AI better is just masturbatory. But coders want interesting problems to solve. Pros also need software ideas they can monetize. And what problem is attracting more investment in money, time and neurons than the problem of making AI productive? (I am referring only to problems that can be solved in software....)

So the thing with AI is that right now it is both a tool AND a potentially very valuable problem to solve; that's why most of the AI "productivity" gains go into AI itself. At some point this self-referential phase will have to end, and people are going to see if these new AI tools, harnesses, and claw-things are actually applicable to things people are willing to pay the real prices for (not the subsidized ones).


Wasn’t there a news story about App Store reviews being delayed because of an increase in app influx?


That doesn't tell us much about the subjective quality of the apps in said influx.


And thus the goalpost was shifted. The first question was "where are all the AI coded apps?" And once this was answered, the subject is immediately switched to quality.


No? The post they responded to said:

> I’m not seeing applications that took years to ship before

> What we do see is vibe coded slop


I absolutely feel like there has been an explosion of software since the release of AI tools. This is a subjective assessment anyway…

My company for example has gotten 500% better at creating productivity tools.


Even Copilot writes most of my code in April 2026.

Further, I don't trust code anymore that hasn't been reviewed 3x or more by Copilot.

If you had asked me 6 months ago, I wouldn't have expected this change so soon.


> I had to build some tools around linting entire projects

OK, everybody is doing that, and everybody is doing their best to make LLMs more reliable on non-trivial tasks. Yet it looks like nobody has come up with a universal solution so far, particularly for non-trivial projects.
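For what it's worth, the "tools around linting entire projects" people keep rebuilding tend to look like the sketch below: run a project-wide linter, hand the findings to whatever agent harness you use, repeat until clean or out of budget. The `run_agent` hook and the choice of ruff are my assumptions, not any specific commenter's setup:

```python
import subprocess

def lint_project(path: str) -> list[str]:
    """Run a project-wide linter and return its findings, one per line.

    ruff is just an example; any linter that prints one diagnostic
    per line works the same way.
    """
    result = subprocess.run(
        ["ruff", "check", path],
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]

def fix_loop(lint, run_agent, max_rounds: int = 3) -> list[str]:
    """Hand lint findings to an agent until the project is clean.

    `lint` is a zero-arg callable returning diagnostics (e.g.
    `lambda: lint_project("src/")`); `run_agent` is a placeholder
    for whatever LLM harness edits the files. Returns the leftover
    diagnostics, empty on success.
    """
    for _ in range(max_rounds):
        diagnostics = lint()
        if not diagnostics:
            return []  # project is clean
        run_agent(diagnostics)  # the model attempts fixes here
    return lint()  # whatever is left after max_rounds
```

The loop itself is trivial; the hard, non-universal part is everything hidden inside `run_agent`, which is presumably why no one-size-fits-all solution has appeared.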


It’s because the model's response is conditioned on the prompt. They are only as intelligent as the person using them.

In some sense it’s a lot like a Google search. There’s this big box of knowledge and you are choosing tokens to pluck out of it. The quality of the tokens depends on how intelligent you are.


Don’t forget, it also depends on the complexity of the work and the experiences of the operator.

The less complex the work and the less experienced the operator, the more perceived “wow” factor :)

There’s definitely an aspect of how you use it though. In my work it’s mostly been chaining to reduce non-determinism.
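"Chaining to reduce non-determinism" usually means gating each model call behind a strict validator and retrying until the output conforms. A minimal sketch, where `step` stands in for an LLM call (the helper names are mine, not the commenter's):

```python
import json

def chain(step, validate, retries: int = 3):
    """Call a non-deterministic step until its output validates.

    `step` is a zero-arg placeholder for an LLM call; `validate`
    returns the parsed result or raises ValueError. Retrying against
    a strict validator is one common way to squeeze deterministic
    behavior out of a stochastic model.
    """
    last_err = None
    for _ in range(retries):
        raw = step()
        try:
            return validate(raw)
        except ValueError as err:
            last_err = err  # remember why this attempt failed
    raise RuntimeError(f"no valid output after {retries} tries: {last_err}")

def must_be_json_list(raw: str) -> list:
    """Example validator: the output must parse as a JSON list."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as err:
        raise ValueError(str(err))
    if not isinstance(parsed, list):
        raise ValueError("expected a JSON list")
    return parsed
```

Each link in a longer chain gets its own validator, so a bad intermediate output is retried locally instead of poisoning every step downstream.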


The irony here is that even if one is extracting legitimate value from LLMs because they are that much smarter than their peers, the process of using LLMs to perform all of their skilled labor makes them less intelligent.



