
That is only 24GB in 4-bit.

People are running models 2-4 times that size on local GPUs.

What's more, this will run on a MacBook CPU just fine, and at an extremely high speed.



Yeah, 70B is much larger and still fits on a 24GB GPU, admittedly with very lossy quantization.

This is just about right for 24GB. I bet that is intentional on their part.
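
For anyone who wants to check the arithmetic in this thread, here is a rough back-of-the-envelope sketch. It counts weight storage only (no KV cache, activations, or runtime overhead), and the 48B parameter count is my assumption, back-derived from "24GB in 4-bit" rather than taken from any model card. The function names are made up for illustration.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    Ignores KV cache, activations, and per-block quantization metadata,
    so real memory use will be somewhat higher.
    """
    return params_billion * bits_per_weight / 8


def max_bits_per_weight(params_billion: float, budget_gb: float) -> float:
    """Largest average bits per weight that fits the weights in budget_gb."""
    return budget_gb * 8 / params_billion


# Assumed ~48B parameters, inferred from "24GB in 4-bit".
print(weight_gb(48, 4))                        # 24.0

# A 70B model at 4-bit does not fit in 24GB of weights alone.
print(weight_gb(70, 4))                        # 35.0

# To squeeze 70B into 24GB you need well under 3 bits per weight.
print(round(max_bits_per_weight(70, 24), 2))   # 2.74
```

So a 70B model on a 24GB GPU means averaging under roughly 2.7 bits per weight before any overhead, which is why the quantization there is so lossy, while a model around 48B sits right at the 4-bit mark.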



