It is worth mentioning that the Linux kernel has a new asynchronous I/O interface, io_uring, that changes the whole argument around using libos designs. With io_uring (available since Linux kernel 5.1), peak throughput is about 1.7M IOPS per core, which is close to SPDK performance or even beats it. Later updates to the patches improve the throughput even more.
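
To put the 1.7M IOPS per core figure in perspective, here is a rough back-of-the-envelope sketch of what it means per request. The 4 KiB block size is my assumption for illustration (the figure above does not state one), and the helper names are made up for this example.

```python
def per_io_budget_ns(iops_per_core: float) -> float:
    """Average CPU time available per I/O, in nanoseconds, on one saturated core."""
    return 1e9 / iops_per_core


def throughput_gib_s(iops: float, block_size_bytes: int) -> float:
    """Data rate in GiB/s implied by an IOPS figure and a block size."""
    return iops * block_size_bytes / 2**30


IOPS_PER_CORE = 1.7e6  # figure quoted above
BLOCK_SIZE = 4096      # ASSUMPTION: 4 KiB requests, not stated in the source

budget = per_io_budget_ns(IOPS_PER_CORE)
bandwidth = throughput_gib_s(IOPS_PER_CORE, BLOCK_SIZE)

print(f"{budget:.0f} ns of CPU time per I/O")
print(f"about {bandwidth:.1f} GiB/s per core at {BLOCK_SIZE} B requests")
```

So a single core has a bit under 600 ns to spend on each request, which is why the per-I/O overhead of the submission path matters so much in this comparison.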
Jens Axboe (the author) has done a great writeup