In this video from the 2019 OpenFabrics Workshop in Austin, Christoph Lameter from Jump Trading presents: Faster Fabrics Running Against Limits of the Operating System, the Processor, and the I/O Bus.
“In 2017 we got 100G fabrics, in 2018 200G fabrics, and in 2019 it looks like 400G technology may be seeing a considerable amount of adoption. These bandwidths compete with, and sometimes exceed, the internal bus speeds of the servers that are connected using these fabrics. A worry is that this trend is continuing with terabit-speed fabrics in 2022. One wonders what the implications of this are for high-speed fabrics. Numerous companies have started to work on projects that remedy the situation, for example by having active NICs that can do partial processing on their own, by establishing paths via other buses to devices so that full performance is possible, by sharing a NIC between multiple servers, and so on. This means that there is the danger of a blooming field of new proprietary technologies and extensions to RDMA technology developing in the coming years. I think we need to consider these developments and work on improving fabrics and the associated APIs so that these features become accessible through vendor-neutral APIs. It needs to be possible to code in a portable way and not in a vendor-specific one.”
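To make the abstract's comparison between fabric bandwidth and internal bus speed concrete, here is a rough back-of-the-envelope sketch. It is not taken from the talk: it assumes the NIC sits in a 16-lane PCIe slot, uses the published per-lane signalling rates and 128b/130b line encoding of PCIe 3.0 and 4.0, and ignores packet-level protocol overhead, so real throughput would be somewhat lower. The function and variable names are illustrative only.

```python
# Per-lane signalling rate (GT/s) and line-encoding efficiency.
# PCIe 3.0 and 4.0 both use 128b/130b encoding.
PCIE_GENERATIONS = {
    "PCIe 3.0": (8.0, 128 / 130),
    "PCIe 4.0": (16.0, 128 / 130),
}


def pcie_gbps(gt_per_s, efficiency, lanes=16):
    """Approximate one-direction slot bandwidth in Gbit/s, before protocol overhead."""
    return gt_per_s * efficiency * lanes


def compare(fabric_gbps, lanes=16):
    """Map each PCIe generation to True if the slot can carry the fabric rate."""
    return {
        name: pcie_gbps(rate, eff, lanes) >= fabric_gbps
        for name, (rate, eff) in PCIE_GENERATIONS.items()
    }


if __name__ == "__main__":
    for fabric in (100, 200, 400):
        print(f"{fabric}G fabric:", compare(fabric))
```

Under these assumptions, a PCIe 3.0 x16 slot tops out at roughly 126 Gbit/s and a PCIe 4.0 x16 slot at roughly 252 Gbit/s, so a single slot of either generation cannot carry a 400G link at full rate.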
Christoph Lameter works as a lead in research and development for Jump Trading LLC (an algorithmic trading company) in Chicago and maintains the slab allocators and the per-CPU subsystems in the Linux kernel. He has contributed to a number of Linux projects since the initial kernel releases in the early 90s. As a kernel developer at SGI he helped pioneer the use of Linux for supercomputing and developed the necessary kernel capabilities for HPC applications. Currently he is working on improving Linux through the use of new, faster APIs to a variety of high-performance devices and is evaluating new technologies that allow faster processing.