
This overly reductive description of LLMs misses the forest for the trees. LLMs are circuit builders: the converged parameters pick out specific paths through the network that define programs. In other words, LLMs are differentiable computers[1]. Just as a CPU is configured by its program state to execute arbitrary programs, the parameters of a converged LLM configure the high-level matmul sequences to realize a wide range of information dynamics.
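
To make the CPU analogy concrete, here is a toy sketch, not anything resembling a real LLM. It assumes a fixed two-layer ReLU network (the names `forward`, `truth_table`, `XOR`, `AND`, and `OR` are made up for illustration) and shows the same matmul sequence computing three different logic functions depending only on which parameters are loaded. The weights are hand-picked rather than trained, so this only illustrates "parameters as program"; it says nothing about what gradient descent actually converges to.

```python
def relu(v):
    # Elementwise nonlinearity applied to the hidden layer
    return [max(0.0, x) for x in v]

def matvec(W, x, b):
    # Affine map W @ x + b written out in plain Python
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def forward(params, x):
    # The "hardware": one fixed sequence of matmul -> ReLU -> matmul
    W1, b1, W2, b2 = params
    h = relu(matvec(W1, x, b1))
    return matvec(W2, h, b2)[0]

# The "programs": hand-picked (not trained) parameter sets for the same network
XOR = ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], [[1.0, -2.0]], [0.0])
AND = ([[1.0, 1.0], [0.0, 0.0]], [-1.0, 0.0], [[1.0, 0.0]], [0.0])
OR = ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], [[1.0, -1.0]], [0.0])

INPUTS = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]

def truth_table(params):
    # Evaluate the network on every pair of binary inputs
    return [int(forward(params, x)) for x in INPUTS]

if __name__ == "__main__":
    for name, params in [("XOR", XOR), ("AND", AND), ("OR", OR)]:
        print(name, truth_table(params))
```

Nothing in `forward` changes between the three runs; only the numbers fed into it do, which is the sense in which the parameters act like a loaded program.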

Statistics has little relevance to LLM operation. The statistics of the training corpus impose constraints on the converged circuit dynamics, but otherwise have no representation inside the LLM.

[1] https://x.com/karpathy/status/1582807367988654081



> LLMs are circuit builders

I think they are circuit "approximators". In other words, the result of a glorified linear regression.


I called it a “big wad of linear algebra,” above. That’s all it is.



