Those who gave up for dead Intel’s Omni-Path fabric, which Intel began working on in 2012 and stopped supporting seven years later, may want to rethink that. Cornelis Networks, the company breathing life into Omni-Path since 2020, has won an $18 million R&D contract from the U.S. National Nuclear Security Administration (NNSA).
The award is part of DOE’s Next-Generation High Performance Computing Network (NG-HPCN) project in support of NNSA’s Advanced Simulation and Computing (ASC) program. The project will be led by Lawrence Livermore National Laboratory for upcoming supercomputers at the NNSA Tri-Labs: LLNL, Los Alamos and Sandia National Laboratories.
The networking technologies will be built for HPC, AI, machine learning and HPDA (high-performance data analytics) to support such applications as stockpile stewardship, fusion research, advanced manufacturing, climate research and other open science on future ASC HPC systems, NNSA said.
“As we move into exascale supercomputing and beyond, and increasingly rely on emerging technologies to achieve our mission objectives, we will need creative high-performance solutions to meet future challenges,” said NNSA ASC program director Thuc Hoang.
Cornelis positions itself as “the only domestic, OEM-independent, CPU/GPU agnostic high performance” fabric provider. The company has gained traction in the federal market: last November, Cornelis announced a DOE/NNSA contract to provide fabric networking for the NNSA’s Tri-Laboratory Commodity Technology System 2 (CTS-2) (see our SC21 video interview with Cornelis CEO Phil Murphy).
The company said its first deliverable under the new DOE award will be Omni-Path Express (OPX), a new host stack based on the OpenFabrics Alliance libfabric, to be launched at the ISC 2022 conference in Germany later this month. OPX will have a host architecture based on OpenFabrics Interfaces (OFI) and offer “significant application performance gains resulting from accelerated fabric performance…” and “broad support for application-critical technologies including all popular MPIs, alternative programming models like SHMEM and Chapel, AI frameworks, object storage file systems like DAOS, and all popular GPUs.” The company added it will be the “foundational software for Cornelis’ next generation Omni-Path fabric architecture.”
In 2023, Cornelis will release the Omni-Path CN5000 Series with 400G native OFI adapters for the OFA libfabric software ecosystem, 400G top-of-rack and director-class switches with features that include dynamic adaptive routing and congestion management, a choice of deployment topologies and plans to include intelligent NICs / DPUs.
“The Cornelis team is thrilled to partner with the NNSA in co-development of our future fabric products through their Next-Generation High Performance Computing Network project,” noted Gunnar K. Gunnarsson, VP of solutions delivery and support at Cornelis. “We look forward to accelerating scientific advancements across a variety of applications by delivering industry-leading performance in support of the NNSA’s mission-critical HPC and AI workloads.”
The contract is part of the ASC post-Exascale-Computing-Initiative (ECI) investment portfolio, whose goal is to sustain R&D in partnership with industry that DOE ECI had initiated via its PathForward program. “It also will help foster a more robust domestic HPC ecosystem by increasing U.S. industry competitiveness in next-generation interconnect technologies,” NNSA said.