I, too, can vouch for ZFS+VersityGW being a great solution. We were able to scale it vertically on a single-node deployment to pretty high throughput, for both reads and writes.
Most notably, the PutObject operation (which had always been a pain in the ass on HDD with MinIO) is performing well now, even with many small objects.
There is a natural synergy between the gateway storing its metadata in xattrs and a ZFS special vdev with dnodesize=auto, which together keep the entire ZFS+S3 metadata on fast media.
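For anyone wanting to try this, here is a rough sketch of the ZFS side, written as a small Python helper that only builds the commands and does not run them. The pool name, dataset name, and device paths are made up for illustration. Setting xattr=sa is my assumption (the usual companion to dnodesize=auto, so that xattrs are stored in the dnode itself); check it against your own setup before applying anything.

```python
import shlex


def zfs_metadata_commands(pool, dataset, special_devs):
    """Build (but do not execute) commands that put ZFS metadata on fast media.

    All names passed in are placeholders chosen by the caller.
    """
    return [
        # Add a mirrored special vdev; ZFS allocates metadata on it.
        ["zpool", "add", pool, "special", "mirror", *special_devs],
        # Assumption: store xattrs as system attributes inside the dnode.
        ["zfs", "set", "xattr=sa", dataset],
        # Let dnodes grow so that larger xattrs still fit inline.
        ["zfs", "set", "dnodesize=auto", dataset],
    ]


if __name__ == "__main__":
    # Hypothetical pool, dataset, and NVMe devices.
    cmds = zfs_metadata_commands(
        "tank", "tank/s3", ["/dev/nvme0n1", "/dev/nvme1n1"]
    )
    for argv in cmds:
        print(shlex.join(argv))
```

Note that these properties only affect newly written files, so they are best set before loading any objects into the dataset.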
The latency overhead of the gateway itself can be reduced further by running multiple gateway instances pinned to CPU cores, all behind an HAProxy load balancer that talks to them over Unix domain sockets (UDS).
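To give an idea of the HAProxy side, here is a minimal sketch that renders a backend section with one server per gateway instance, each reached over a Unix socket. The backend name, socket directory, socket file names, and the choice of leastconn balancing are all assumptions for illustration, not our exact config. How each gateway instance is made to listen on its socket and how it is pinned to a core (taskset, systemd CPUAffinity, etc.) is left out here.

```python
def render_haproxy_backend(instances, socket_dir="/run/versitygw"):
    """Render an HAProxy backend with one server line per gateway instance.

    socket_dir and the gwN.sock naming are hypothetical placeholders.
    """
    lines = [
        "backend s3_gateways",
        "    mode http",
        "    balance leastconn",
    ]
    for i in range(instances):
        # The "unix@" prefix marks the server address as a Unix socket path.
        lines.append(f"    server gw{i} unix@{socket_dir}/gw{i}.sock")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: four gateway instances, e.g. one per pinned core.
    print(render_haproxy_backend(4), end="")
```

Generating the section this way keeps the number of server lines in sync with however many instances you decide to run.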