I think the reasons for slow adoption are probably a mix of:
1. Lack of developer awareness.
2. Security implications (or perceived implications) of exposing memory directly to a network without passing through CPU or application-level access control mechanisms.
The second point may be prohibitive for a lot of general purpose database systems which are intended to be run on shared infrastructure on virtualized instances.
Another reason may be that a lot of production systems are CPU-bound, not memory-bound. RDMA seems ideal for systems that require a lot of memory. I'm thinking that with recent advancements in AI/LLMs it could be an interesting technology, as these do require a huge amount of memory relative to CPU.
There are several good ideas in distributed databases that are effectively not deployable in cloud environments because the requisite hardware features don’t exist. Since cloud deployability is a pervasive prerequisite for commercial viability, database designs tend to overfit for the limitations of cloud environments even though we know how to do much better.
Basically, we are stuck in a local minimum of making databases work well in cloud environments that are not designed to enable efficient distributed databases. It is wasteful, and it also provides an arbitrage opportunity for cloud companies.
Nothing in Spanner is proprietary, but I believe no other vendor has atomic clocks similar to what Google has. Hence they are not able to implement the Paxos-based global transaction ordering, which requires that the clocks of all servers participating in the transaction be tightly synchronized. This is what one of the comments above refers to, I think.
So while the Spanner paper is open for any vendor to implement, they don't have the proprietary advantage that Google has: the atomic clocks. That's why Yugabyte and CockroachDB don't rely on atomic clocks. I tried to get to the ground-level basics of this, but I haven't understood this matter completely yet.
I guess using worse clocks would mean a (slightly?) slower Spanner, but I'm not sure what the impact is. In any case, if a big vendor (e.g., Amazon, IBM, Oracle, Dell...) wanted something on par with Google's clocks, they could probably achieve it (though I don't know much about these clocks).
The problem is that the worse the clock, the more the window for error grows, and the harder it is to recover. It also imposes a lower bound on latency: you cannot be faster than your clock error without risk.
Note that even Spanner has had multiple outages due to clock and/or network failures. In those cases, all consistency guarantees are lost. This makes it really dangerous.
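The "you cannot be faster than your clock error" point can be sketched with a toy commit-wait loop in the style of Spanner's TrueTime: before a transaction's commit becomes visible, the node waits until the clock-uncertainty bound around the commit timestamp has definitely passed. This is only an illustration, not Spanner's actual implementation; the `EPSILON` value and the polling loop are assumptions.

```python
import time

# Assumed clock-uncertainty bound, in seconds. Google reports TrueTime
# uncertainty in the low single-digit milliseconds; commodity NTP-synced
# clocks can be off by tens of milliseconds, widening this wait.
EPSILON = 0.01

def commit_wait(commit_timestamp: float, epsilon: float = EPSILON) -> None:
    """Block until commit_timestamp + epsilon has definitely passed,
    so no participant with a clock up to `epsilon` ahead or behind can
    observe the commit "before" its timestamp. This wait is the latency
    floor imposed by clock error."""
    while time.time() < commit_timestamp + epsilon:
        time.sleep(epsilon / 10)  # poll; a real system would sleep once

# The commit cannot complete in less than EPSILON seconds:
start = time.time()
commit_wait(commit_timestamp=start)
elapsed = time.time() - start
```

With a worse clock you must pick a larger `epsilon`, and every read-write transaction pays that wait — which is why clock quality translates directly into commit latency.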
Do you have any indication of why they are CPU-bound? I have never seen systems other than, for example, linear algebra or specialized algorithms max out a modern CPU. Almost all workloads today are in one way or another memory-bound.
Distributed transactions are invariably latency (usually storage I/O) bound, rather than CPU or memory bound. I think your #2 is a big part of the challenge with RDMA, as well as the "trickiness" of the programming paradigm.
A bulk purchase of ~60 FDR IB cards, cabling, and network switches to support them sounds pretty expensive.
That being said, IB FDR gear is "older tech" now, so the cards and switches can commonly be found reasonably cheaply on eBay. The switches tend to be bloody loud though, so they're not something you'd want nearby if you can help it.