The full deepseek R1 model needs more memory than 512GB. The model is 720GB alon... | Hacker News


NightlyDev on March 5, 2025 | on: Apple M3 Ultra

The full DeepSeek R1 model needs more than 512GB of memory. The model alone is 720GB. You can run a quantized version on it, but not the full model.

summarity on March 5, 2025

You can chain multiple Mac Studios using exo for inference; you'd "only" need two of these. There's a bottleneck in the RMA speed over TB5 (Thunderbolt 5), but this may not matter as much for a MoE model.
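
For what it's worth, the "only two" figure is just a ceiling division of the numbers quoted in this thread (720GB of weights, 512GB per machine). A rough sketch follows; the function name and the `usable_fraction` parameter are made up for illustration, and the 0.75 value is an arbitrary assumption, not a measured limit. It ignores KV cache, activations, and OS overhead.

```python
import math


def nodes_needed(model_gb: float, node_memory_gb: float,
                 usable_fraction: float = 1.0) -> int:
    """Minimum number of machines whose combined memory can hold the weights.

    usable_fraction is a hypothetical knob for the share of each machine's
    memory that is actually available to the model.
    """
    usable_gb = node_memory_gb * usable_fraction
    return math.ceil(model_gb / usable_gb)


MODEL_GB = 720  # full-model size quoted in the thread
NODE_GB = 512   # maximum memory of one machine, per the thread

# One machine is not enough: 720 > 512.
print(MODEL_GB <= NODE_GB)              # False

# Two machines give 1024GB combined, which covers 720GB.
print(nodes_needed(MODEL_GB, NODE_GB))  # 2

# Even if only 75% of each machine were usable (assumed figure),
# 720 / 384 = 1.875, so two would still suffice on paper.
print(nodes_needed(MODEL_GB, NODE_GB, usable_fraction=0.75))  # 2
```

This only says the weights fit across two machines; it says nothing about throughput over the interconnect.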
