
This is an excellent point. One of the main things I see people say about ChatGPT is "just wait until it gets better." But as you point out, it's already trained on the entire internet. There are many features they could add, and there are endless special cases for handling particular prompts. But the core of the product, the LLM-generated answers, can't get much better without an order-of-magnitude increase in training data.

In terms of petabytes of training data, it will be a long time before ChatGPT's own responses make up a significant portion of the training set. And even then, at least for a while, that should just shift responses closer to a sort of average human response.



> But the core of the product, the LLM generated answers, can't get much better without an order of magnitude increase in training data.

There's no reason to believe this. The model architecture and training methods aren't perfect, nor is the way it's queried.

E.g.: https://www.reddit.com/r/MachineLearning/comments/zr2en7/r_n...
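To put some numbers behind this: the Chinchilla scaling work (Hoffmann et al., 2022) models loss as a function of both parameter count and token count, so data volume is only one of the levers. Here's a rough sketch of that loss formula, with the paper's fitted constants plugged in purely for illustration (the exact values are approximate):

```python
# Sketch of the Chinchilla scaling-law loss estimate:
#   L(N, D) = E + A / N^alpha + B / D^beta
# where N is parameter count and D is training tokens.
# Constants below are the approximate fitted values from the paper.

def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    E, A, B = 1.69, 406.4, 410.7
    alpha, beta = 0.34, 0.28
    return E + A / n_params**alpha + B / n_tokens**beta

# Holding data fixed at 300B tokens, a larger model still lowers
# the predicted loss -- data isn't the only axis of improvement.
small = chinchilla_loss(70e9, 300e9)
large = chinchilla_loss(500e9, 300e9)
```

The point is just that predicted loss keeps falling along the parameter axis even with the dataset held constant, never mind architecture and training-method improvements that the formula doesn't capture at all.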



