Lenovo Claims an HPC First for Liquid-Cooled Nvidia A100 GPU Servers

Today, Lenovo introduced the ThinkSystem SD650-N V2 server, which the company said is the first direct-to-node (DTN) liquid-cooled server for Nvidia A100 Tensor Core GPUs. Each 1U system includes four board-mounted A100 GPUs, and a single rack of the servers delivers up to 3 PFLOPS of compute performance. The server uses Lenovo Neptune liquid cooling, which the company said reduces energy consumption by up to 40 percent.

The company said a single rack of the servers provides up to 2.8 PFLOPS of HPC performance, or 45 PFLOPS of peak AI performance, on less than an eight-foot footprint. The A100s are interconnected through NVLink for HPC, AI training and inference workloads. Lenovo said Nvidia InfiniBand networking enables the server to scale to thousands of GPUs, while Nvidia Multi-Instance GPU (MIG) technology allows each A100 to be partitioned into as many as seven GPU instances for smaller workloads.

Lenovo also launched the ThinkSystem SR670 V2, a modular system using Neptune liquid-to-air heat exchangers that supports up to eight Nvidia A100 Tensor Core GPUs or Nvidia T4 GPUs in a single 3U frame, delivering up to 160 TFLOPS of compute performance.

Built on two 3rd Gen Intel Xeon Scalable processors, the system offers six front shuttle options, with configurations that include up to eight double-width GPUs with NVLink Bridge and up to eight single-width GPUs with eight drives, Lenovo said.

Lenovo also released the names of organizations adopting the new hardware. Among them is Karlsruhe Institute of Technology (KIT), a German research university, which plans to implement a new 17 PFLOPS system featuring warm-water-cooled A100 GPUs.

Lenovo SD650-N V2 Server

Dr. Jennifer Buchmüller, head of the Department for Scientific Computing and Simulation, said, “We are looking forward to collaborating with Lenovo for our state-of-the-art HoreKa supercomputer, utilizing their award-winning Neptune Direct Water Cooling (DWC) technology. The eco-credentials of this solution and the performance optimization of the overall system is ideally matched with our objective of developing increasingly efficient and sustainable scientific software. This in turn enables multi-scale simulations, significantly larger than we’ve ever done before, for research in the fields of energy and mobility in engineering, material sciences, earth system sciences, life sciences, and particle & astro-particle physics.”

Comments

  1. Hi,
    I read this interesting news with attention, but Lenovo’s claims are inexact.

    I am the Corporate Portfolio Manager for HPC, AI & Quantum at ATOS. We manufacture some of the largest supercomputers in the world and have been building Direct Liquid Cooled machines since 2009. Our flagship supercomputer, the BullSequana XH2000, uses our 4th-generation Direct Liquid Cooling system.

    We launched our BullSequana X2415 DLC blade, based on the NVIDIA Ampere A100-40, on May 14th, 2020, well before Lenovo, and we have installed the biggest production system worldwide using this blade. It is #1 in the TOP500 in Europe and #3 in the Green500 ranking.

    You can take a look at the following press releases announcing the X2415 and the installation at Jülich in Germany.

    I am at your disposal if you wish to get more information on our products in this space.

    Regards.

    Eric Eppe

    https://atos.net/en/2020/press-release/general-press-releases_2020_05_14/atos-launches-first-supercomputer-equipped-with-nvidia-a100-gpu
    https://atos.net/en/2020/press-release_2020_11_17/atos-powers-europes-fastest-supercomputer-at-julich-in-germany-the-most-energy-efficient-system-in-the-top100