
> When it's already faster than I can absorb the response

Streaming a response from a chatbot is only one use-case of LLMs.

I would argue the most interesting applications do not fall into this category.



On the number of different use cases (categories) I'd agree; I'm not so sure about usage volume…

…not yet anyway. Fast moving area, lots of blue water outside the chat interface.


Name one use case where there's a practical difference between generating at 200 t/s (Fireworks' Mixtral) and 500 t/s (Groq's Mixtral)? Not batch throughput and not time to first token, but per-request latency.

Groq's model shines at latency, not at the other two.
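To put rough numbers on that question: a minimal sketch of end-to-end response time, assuming a fixed time-to-first-token and a steady decode rate (the ~300-token reply length and the 0.2 s TTFT are illustrative assumptions, not published figures).

```python
# Hypothetical latency comparison: total time until the last token arrives,
# modeled as time-to-first-token plus tokens divided by decode rate.
def response_time(n_tokens: int, tokens_per_sec: float, ttft: float = 0.2) -> float:
    """Seconds from request until the final token is generated."""
    return ttft + n_tokens / tokens_per_sec

# A typical ~300-token chat reply (assumed length):
fireworks = response_time(300, 200)  # 0.2 + 1.5 = 1.7 s
groq = response_time(300, 500)       # 0.2 + 0.6 = 0.8 s
print(f"200 t/s: {fireworks:.2f}s, 500 t/s: {groq:.2f}s, delta: {fireworks - groq:.2f}s")
```

Under these assumptions the gap is under a second per response, which is why it only matters for uses that chain many generations or sit inside a tight interactive loop.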



