At the lowest level of abstraction, LLMs are just matrix multiplications: deterministic functions of their inputs. Of course, we can argue about the details and specifics of how the peculiarities of inference in practice lead to non-deterministic behaviour, but then our model is being complicated by vague aspects of reality.
One convenient way of sidestepping these is to model them as random functions, sure. But I wouldn't go as far as to say they are "inherently stochastic creatures". Maybe that's the case, but you haven't really given substantial evidence to justify that claim.
At a higher level of abstraction, one possible model of LLMs is again as deterministic functions of their inputs, but now as functions of token streams or higher abstractions like sentences, rather than of the underlying matrix multiplications. In this case we also expect LLMs to produce roughly consistent outputs given the same prompt, and again we can apply deterministic theorems.
I guess my central claim is that no salient argument has been made as to why the randomness here is relevant for consensus. Maybe the models exhibit some variability in their output, but in practice does this substantially change how they approach consensus? Can we model this as an artefact of how they are initialised rather than some inherent stochasticity? Why not? It feels like randomness is being introduced as a sort of magic "get out of jail free" card here.
LLMs sample from categorical distributions defined by the logits that those matrix multiplies compute, and many different sampling strategies are employed. This is one of the core mechanisms of token generation.
There's no peculiarity to discuss; that's how they work. That's how they are trained (the loss is defined by probability computations over those distributions), that's how inference works, and so on.
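To make that mechanism concrete, here is a minimal sketch (plain Python, with made-up logits for a tiny vocabulary, not any real model's code) of how a sampling strategy sits on top of the deterministic logit computation:

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw logits into a categorical distribution over tokens."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature=1.0, rng=None):
    """Greedy (deterministic) decoding at temperature 0; stochastic sampling otherwise."""
    if temperature == 0.0:
        # Argmax: a pure function of the logits, no randomness involved.
        return max(range(len(logits)), key=lambda i: logits[i])
    rng = rng or random.Random()
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.1]                       # hypothetical logits for 3 tokens
greedy = sample_token(logits, temperature=0.0)  # always index 0 (the largest logit)
sampled = sample_token(logits, temperature=1.0, rng=random.Random(42))
```

Note that the randomness lives entirely in the sampling step at the end; everything up to the logits is deterministic, which is what both sides of this thread seem to agree on.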
> I guess my central claim is that no salient argument has been made as to why the randomness here is relevant for consensus. Maybe the models exhibit some variability in their output, but in practice does this substantially change how they approach consensus? Can we model this as an artefact of how they are initialised rather than some inherent stochasticity? Why not? It feels like randomness is being introduced as a sort of magic "get out of jail free" card here.
I'm really surprised to hear this given the content of the post. The claims in the post are quite strong, yet here I need to give a counterargument as to why a claim that consensus results apply to pseudorandom processes is relevant?
I don't think it's necessary to furnish a counterexample when pointing out that a formal claim is overreaching. It's not clear what the results are in this case, so it feels premature to claim that they cover a wider array of things than has been shown.
For instance, this is a strong claim:
> it means that in any multi-agentic system, irrespective of how smart the agents are, they will never be able to guarantee that they are able to do both at the same time:
>
> Be Safe - i.e. produce well-formed software satisfying the user's specification.
> Be Live - i.e. always reach consensus on the final software module.
I'm confused as to the stance: either we're hand-waving or we're not, so which is it?
I just came away from the read thinking that this post was pointing to something very strong, and I was a bit irked to find that the state of the results is more subtle than the post conveys.
If you're pushing me, let's say we're not hand-waving then. LLMs, abstractions removed, are deterministic matrix-multiplication computations, f(x) -> y. If you want, we can make them pseudo-random, but that is still a deterministic process. FLP then holds. I'm not sure what your confusion is.
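The pseudo-random point can be sketched concretely (a toy stand-in, with hypothetical names, not any real LLM API): once the seed is fixed, a "random" sampler is just a deterministic function of its inputs, which is exactly the setting where theorems about deterministic processes apply.

```python
import random

def pseudo_random_llm(prompt: str, seed: int) -> str:
    """Toy stand-in for seeded sampling: the output depends only on (prompt, seed)."""
    # A seeded PRNG is pseudo-random but fully deterministic across runs.
    rng = random.Random(f"{prompt}|{seed}")
    vocab = ["yes", "no", "abstain"]
    return " ".join(rng.choice(vocab) for _ in range(5))

# Same inputs give the same output: f(prompt, seed) -> y is a deterministic map,
# so results about deterministic processes (such as FLP) would cover it.
a = pseudo_random_llm("shall we commit?", seed=7)
b = pseudo_random_llm("shall we commit?", seed=7)
assert a == b
```

The contested question in the thread is then whether this seeded model, or the genuinely randomized one (fresh entropy per step, which randomized-consensus results treat differently), is the right abstraction.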
Just my two cents, I suppose.