Exactly. The way LLMs get "debunked" in simplified writing on the Web and in the media is to suggest that they are just Markov chains like the famous "Mark V. Shaney"[1] bot from the 1980s, when LLMs are far more powerful than that. Yes, we need to debunk claims that LLMs have achieved sentience and similar nonsense, but let's not ignore just how amazing they are.
I'm plenty familiar with flash attention, but you didn't understand what you quoted.
GPT is still (partially) probabilistic, and the "it's just autocompleting" refrain stems from this idea that being probabilistic without "higher order intent" means a system is just a bullshit generator.
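To make "partially probabilistic" concrete, here is a minimal sketch of temperature-based sampling from a next-token distribution. The vocabulary and logits are invented for illustration, but this is the basic shape of how sampling-based decoding works:

```python
import math
import random

def sample_next_token(logits, vocab, temperature=0.8):
    """Sample one token from a softmax over logits.
    temperature < 1 sharpens the distribution; near 0 it approaches
    deterministic argmax ("pure autocomplete")."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(vocab, weights=probs, k=1)[0]

# Hypothetical next-token scores after "We went to the park and it was"
vocab  = ["fun", "raining", "crowded", "terrifying"]
logits = [3.1, 1.4, 1.2, -0.5]
print(sample_next_token(logits, vocab))  # usually "fun", occasionally not
```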
-
The section of my comment you quoted is not comparing LLMs to Markov chains; it's questioning that notion: obviously, we humans don't consciously evaluate [every single word in our language * each word in the sentence].
So the pool of words we can consciously speak in a sentence is defined before we apply higher order intent.
If lacking higher order intent is what makes it "just autocomplete", then we're all just interfaces for autocomplete.
-
Complete this sentence with the scariest thing that comes to mind: "We went to the park and it was fun, but there was a scary..."
The specific sequence "mass hippo attack" probably didn't come to mind, even though it would likely be deadlier than whatever you thought of.
But that's a pointless observation: after all, what are the odds of even one hippo attack at the park, let alone several? A "mass hippo attack" is so unlikely that you may have already rejected my claim, since whatever scary thing you pictured is far more probable.
The point is that you didn't consciously compare "hippo attacks" to whatever you thought of until it was brought up.
And that's because we don't often mention hippo attacks in our recollections of going to the park... so the bullshit generator wouldn't surface that for our higher level mind to consider.
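To put rough numbers on that intuition: top-k (or top-p) filtering in LLM decoding makes the "never surfaced" effect literal. The probabilities below are made up for illustration, but the mechanism is real: anything outside the pool is unreachable no matter what the downstream chooser wants.

```python
def top_k_pool(probs, k=3):
    """Return the k most probable continuations: the 'pool' that any
    later selection step is restricted to."""
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]

# Hypothetical continuations of "...but there was a scary ___"
probs = {"dog": 0.41, "clown": 0.22, "stranger": 0.18,
         "storm": 0.12, "mass hippo attack": 1e-7}
print(top_k_pool(probs))
# [('dog', 0.41), ('clown', 0.22), ('stranger', 0.18)]
# "mass hippo attack" never even makes the pool, so nothing downstream
# ever gets the chance to consider it.
```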
-
It turns out that just having a probabilistic model of our language aligns with higher level thought very often. So often that I challenge the notion that higher level intent drives things: I consider the lower level bullshit generator to be running the show, and the higher level self to be more like a director who can ask to reshoot a scene, but who can't walk up and act out every role on stage as they please.
We all have bullshit generators that don't care that our higher order self isn't racist/misogynistic/etc., and they will gladly fill in blanks with hallucinations.
What matters is that our higher order self chooses to reflect on and evaluate the pool that gets surfaced rather than blurting out the first thing that comes up. To me, using GPT is no different.
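That generate-then-veto loop is easy to sketch as best-of-n style resampling, which is roughly what the director metaphor describes. Everything here (the candidate lines, the crude approval rule) is invented purely for illustration:

```python
import random

def bullshit_generator(prompt):
    """Stand-in for the low-level proposer: fast, fluent, uncritical."""
    return random.choice([
        "there was a scary dog",
        "there was a scary clown",
        "there was a scary rumor I just made up",
    ])

def director_approves(candidate):
    """Stand-in for higher-order reflection: it can only accept or
    reject what was surfaced; it can't conjure its own candidate."""
    return "made up" not in candidate  # crude hallucination filter

def speak(prompt, max_reshoots=5):
    for _ in range(max_reshoots):
        line = bullshit_generator(prompt)
        if director_approves(line):
            return line
    return "..."  # the director gives up; nothing surfaced was acceptable

print(speak("We went to the park and it was fun, but"))
```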
That's how Markov chain chatbots (a very old technology) work; see the links and the sketch below:
https://stackoverflow.com/questions/5306729/how-do-markov-ch...
https://www.baeldung.com/cs/markov-chain-chatbots
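For concreteness, here is a bare-bones order-1 Markov chain generator along the lines of those links (the training corpus is made up for the demo). The defining limitation: the next word depends only on the current word, with no view of the rest of the sentence.

```python
import random
from collections import defaultdict

def train(text):
    """Map each word to the words observed immediately after it."""
    chain = defaultdict(list)
    words = text.split()
    for cur, nxt in zip(words, words[1:]):
        chain[cur].append(nxt)
    return chain

def generate(chain, word, length=8):
    out = [word]
    for _ in range(length):
        followers = chain.get(word)
        if not followers:  # dead end: this word was never seen mid-sentence
            break
        word = random.choice(followers)  # depends ONLY on the current word
        out.append(word)
    return " ".join(out)

corpus = "we went to the park and it was fun and it was sunny"
chain = train(corpus)
print(generate(chain, "we"))
```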
That's not how ChatGPT works, though: the attention mechanism lets every token's prediction draw on the entire preceding context, not just the last word or two.
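Here is a minimal NumPy sketch of single-head scaled dot-product attention, the core of what the Markov chains in those links lack. This is the textbook formula, not ChatGPT's actual production code:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d)) V.
    Each output row is a weighted mixture of ALL value rows, so every
    token's representation is conditioned on the entire context at once."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # (seq, seq) pairwise relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

seq_len, d = 5, 8  # 5 tokens, 8-dim embeddings
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(seq_len, d)) for _ in range(3))
print(attention(Q, K, V).shape)  # (5, 8): one context-aware vector per token
```

Contrast this with the Markov sketch above, where the only input to each prediction was a single previous word.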