
What's the SOTA for memory-centric computing? I feel like maybe we need a new paradigm or something to bring the price of AI memory down.

Maybe they can take some of those hundreds of billions and invest in new approaches.

Because racks of H100s are not sustainable. But it's clear that increasing the amount of memory available is key to getting more intelligence or capabilities.

Maybe there is a way to connect DRAM with photonic interconnects that doesn't require strict data ordering, if the neural network software model changes somewhat to allow it.

Is there something that has the same capabilities of a transformer but doesn't operate on sequences?

If I were a little smarter and had any math ability, I feel like I could contribute.

But I am smart enough to know that just building bigger and bigger data centers is not the ideal path forward.



The AI hardware race is still going strong, but with so many rapid changes to the fundamental architectures, it doesn't make sense to bet everything on specialized hardware just yet. It's happening, but it's expensive and slow.

There's just not enough fab capacity to build memory fast enough right now. Everyone needs the biggest and fastest modules they can get, since memory capacity and bandwidth directly impact the performance of the models.
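A back-of-envelope calculation shows why bandwidth matters so much: at batch size 1, decoding has to stream essentially all the weights from HBM for every generated token, so bandwidth sets a hard ceiling on tokens/sec. Rough numbers below are my assumptions, not benchmarks (and 140 GB of weights wouldn't even fit on one 80 GB H100, so real deployments shard across devices):

    # Back-of-envelope: batch-1 autoregressive decoding is roughly
    # memory-bandwidth bound. All numbers are rough assumptions.
    params = 70e9            # assumed model size (parameters)
    bytes_per_param = 2      # fp16/bf16
    hbm_bandwidth = 3.35e12  # bytes/s, approx. H100 SXM HBM3 figure

    weight_bytes = params * bytes_per_param        # ~140 GB
    tokens_per_sec = hbm_bandwidth / weight_bytes  # upper bound per device
    print(f"~{tokens_per_sec:.0f} tokens/s, ignoring compute and KV cache")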

There's still a lot happening to improve memory, like the latest Titans paper: https://arxiv.org/abs/2501.00663

So I think until a breakthrough happens or the fabs catch up, it'll be this painful race to build more datacenters.


I'm not sure how SOTA it is, but the sentence about connecting DRAM differently reminded me of Cerebras' scalable MemoryX and the "weight streaming" architecture it uses to feed their custom ASIC. You may find it interesting; a rough sketch of the idea is below the links.

[1]: https://cerebras.ai/press-release/cerebras-systems-announces...

[2]: https://cerebras.ai/chip/announcing-the-cerebras-architectur...
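Very roughly (and this is my loose paraphrase, not their actual API), the idea is to keep activations resident on the chip and stream one layer's weights at a time from the external MemoryX units, so model size is bounded by external memory rather than on-chip SRAM. A toy sketch with hypothetical names:

    # Toy sketch of layer-wise weight streaming (hypothetical names,
    # not Cerebras' real API): weights live off-chip, activations stay on-chip.
    import numpy as np

    def load_layer_weights(layer_idx):
        # stand-in for fetching one layer from external memory (MemoryX-style)
        rng = np.random.default_rng(layer_idx)
        return rng.standard_normal((512, 512)).astype(np.float32)

    def forward(x, num_layers=4):
        for i in range(num_layers):
            w = load_layer_weights(i)   # stream weights in, layer by layer
            x = np.maximum(x @ w, 0.0)  # compute with resident activations
            del w                       # weights are discarded, not cached
        return x

    out = forward(np.ones((1, 512), dtype=np.float32))
    print(out.shape)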


Yeah, Cerebras seems to be the SOTA. I suspect we need something more radically different for truly memory-centric computing that will be significantly more efficient.


> Because racks of H100s are not sustainable.

Huh? Racks of H100s are the most sustainable thing we can have for LLMs for now.


Right, they are. But they still use massive amounts of energy compared to brains.

So it seems that we need a new paradigm of some sort.
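Back of the envelope (rough numbers, my assumptions): a brain runs on roughly 20 W, while a single H100 SXM is rated around 700 W, so one rack's worth of GPUs is already more than a thousand brains' worth of power:

    # Rough energy comparison; all numbers are approximate assumptions.
    brain_watts = 20      # often-cited estimate for a human brain
    h100_watts = 700      # H100 SXM TDP; excludes CPUs, cooling, networking
    gpus_per_rack = 32    # e.g. 4 x 8-GPU servers, varies by deployment

    rack_watts = h100_watts * gpus_per_rack
    print(f"rack ~{rack_watts/1000:.1f} kW vs brain ~{brain_watts} W "
          f"(~{rack_watts/brain_watts:.0f}x)")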

So much investment is being announced for data centers. I assumed there would be more investment in fundamental or applied research, such as scaling memristors or something.



