Many of the GPU rental companies charge less for shared GPU workloads. So it's a cost/compute tradeoff. It's usually not about the workload itself needing the full GPU unless you really need all the RAM on a single instance.
I don't think Vast.ai does "shared GPUs"; you can only rent full rigs. At least, there is no indication that the hardware is shared between multiple users at the same time.
But I think services like Runpod and similar let you rent "1/6 of a GPU per hour", for example, which would basically be "shared hosting", as there would be multiple users using the same hardware at the same time.
My (limited) understanding was that the industry previously knew that it was unsafe to share GPUs between tenants, which is why the major cloud providers only sell dedicated GPUs.
NVIDIA GPUs can run in MIG (Multi-Instance GPU) mode, allowing you to pack more jobs on than you have GPUs. Very common in HPC, but I don't know about in the cloud.
I thought about splitting the GPU between workloads, as well as terminal server/virtualized desktop situations.
I'd expect all code to be tightly controlled in the former, and reasonably secured in the latter, with software/driver-level mitigations possible. Besides, corrupting somebody else's desktop with Rowhammer doesn't seem like a good investment.
As another person mentioned (and maybe this is more widely used than I thought), cloud GPU compute running custom code seems to be the only useful target. But I'm having a hard time coming up with a useful scenario. Maybe corrupting a SIEM's analysis and alerting during an ongoing attack?
Update: I thought for a second I had one: Jupyter notebook services with GPUs. But looking at Google Colab*, even there it's a dedicated GPU for that session.
* Random aside: how is it legal for Colab compute credits to have a 90-day expiration? I thought California outlawed expiring company currency (a la gift cards).
Colab credits likely aren't a currency equivalent but a service equivalent, which is still legal to expire afaik.
Basically, Google Colab credits are like buying a seasonal bus pass with X trips or a monthly parking pass with X hours, rather than getting store cash that can be used for anything.
Until the GPU is accessible by the browser and any website can execute code on it. Or the attack can come from a different piece of software on your machine.
Rowhammer itself is a write-only attack vector. It can, however, potentially be chained to change the write address to an incorrect region. Haven't dived into details.
Rowhammer allows you to corrupt/alter memory physically adjacent to memory you have access to. It doesn't let you read the memory you're attacking.
There are PoCs that corrupt memory _that the kernel uses to decide what that process can access_, but the process can't read that memory. It only knows that the kernel says yes where it used to say no (assuming it doesn't crash the whole machine first).
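To make the "corrupt but can't read" point concrete, here's a toy model in Python. It is only a sketch and not a real exploit: `ToyDram`, `FLIP_THRESHOLD`, `kernel_allows` and the row layout are all made up for illustration, and real DRAM also refreshes rows and flips bits far less predictably. The attacker only ever touches rows 0 and 2, never looks at row 1, and just keeps asking the "kernel" whether access is allowed.

```python
from dataclasses import dataclass, field

FLIP_THRESHOLD = 1000   # hypothetical: neighbour activations needed to flip a bit
PERMISSION_ROW = 1      # hypothetical: row holding the kernel's permission bit


@dataclass
class ToyDram:
    rows: list                                   # each row is an int bitfield
    disturbance: dict = field(default_factory=dict)

    def activate(self, row):
        """Open a row; each physically adjacent row accumulates disturbance."""
        for victim in (row - 1, row + 1):
            if 0 <= victim < len(self.rows):
                count = self.disturbance.get(victim, 0) + 1
                if count >= FLIP_THRESHOLD:
                    self.rows[victim] ^= 1       # flip the lowest bit of the victim
                    count = 0
                self.disturbance[victim] = count


def kernel_allows(dram):
    """The only signal the attacker gets: does the kernel say yes or no?"""
    return bool(dram.rows[PERMISSION_ROW] & 1)


def hammer_until_allowed(dram, aggressors, max_rounds):
    """Activate the attacker's own rows until the kernel's answer changes."""
    for rounds in range(1, max_rounds + 1):
        for row in aggressors:
            dram.activate(row)
        if kernel_allows(dram):
            return rounds
    return None


if __name__ == "__main__":
    dram = ToyDram(rows=[0, 0, 0, 0])
    print("allowed before:", kernel_allows(dram))
    print("rounds needed:", hammer_until_allowed(dram, (0, 2), 10_000))
    print("allowed after:", kernel_allows(dram))
```

Note the attacker code never reads `rows[1]`; it only sees the yes/no answer change, which is the whole point of the PoCs described above.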
Suppose you have access to certain memory. If you repeatedly read from that memory, can't you still corrupt/alter the physically adjacent memory you don't have access to? Does it really need to be a write operation you repeatedly perform?
> Does it really need to be a write operation you repeatedly perform?
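No, as far as I know. The core of Rowhammer is repeatedly activating (opening) the same DRAM rows, and reads are enough for that as long as they actually reach DRAM instead of being served from cache. The repeated activations electrically disturb the physically adjacent rows, leaking charge from their cells until bits flip before the next refresh. It isn't a magnetic effect, and it doesn't depend on changing the stored values.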
Anybody have sizable examples? Everything I can think of results in dedicated GPUs.