
Question from a noob: how good would it be to run those on a computer with AMD APU (for example Ryzen 9 7940HS) with 128GB RAM and setting aside 64GB for iGPU?


Another noob here. If I had to guess, it's because current models are mostly memory-bound. The GPUs used for AI training (A100, H100, etc.) are not the top TFLOPS performers, but they have the most VRAM and memory bandwidth. Researchers seem to have found a sweet spot: neural network architectures that perform well on similar configurations, i.e. near real time (reading speed, for LLMs). Once you bring those models to a CPU, they might become compute-bound again. llama.cpp illustrates this a bit: for bigger models you tend to wait a long time for the answer. I suspect the story would be similar with iGPUs.
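The memory-bound argument above can be made concrete with a back-of-envelope calculation: during autoregressive generation, each token requires streaming roughly all model weights through memory once, so bandwidth caps tokens/sec. A rough sketch (the bandwidth and model-size figures below are illustrative assumptions, not measured values):

```python
# Rough upper bound for a memory-bandwidth-bound LLM:
#   tokens/sec ~= memory bandwidth / model size in bytes
# because generating each token streams (roughly) all weights once.

def est_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Bandwidth-limited ceiling on generation speed, tokens per second."""
    return bandwidth_gb_s / model_size_gb

# Assumption: a 7B-parameter model quantized to 4 bits is ~4 GB of weights.
model_gb = 4.0

# Assumption: dual-channel DDR5 system RAM (shared by an APU's iGPU), ~90 GB/s.
print(est_tokens_per_sec(90, model_gb))    # ~22 tok/s ceiling

# Assumption: H100-class HBM, ~3350 GB/s.
print(est_tokens_per_sec(3350, model_gb))  # ~840 tok/s ceiling
```

This is why assigning more system RAM to the iGPU lets bigger models fit, but doesn't make them faster: the shared DDR5 bandwidth, not the compute, sets the ceiling.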


So possibly some basic iGPU (maybe even Intel) with lots of VRAM assigned could be enough?


You can try out the demo and benchmark it yourself.


As long as there is a Vulkan SDK for your AMD APU (there likely is), MLC-LLM can use TVM Unity to generate code for it.




