San Jose, Aug. 5 – In the MLPerf AI benchmark results released last week, the Inspur NF5488A5 server set a new AI performance record in the ResNet-50 training task, topping the list for single-server performance. MLPerf is the most influential industry benchmarking organization in the field of AI worldwide. Established […]
Groq AI Chip Benchmarks Leading Performance on ResNet-50 Inference
Today AI chip startup Groq announced that its new Tensor processor has achieved 21,700 inferences per second (IPS) for ResNet-50 v2 inference. Groq’s level of inference performance exceeds that of other commercially available neural network architectures, with throughput more than double the ResNet-50 score of the incumbent GPU-based architecture. ResNet-50 is an inference benchmark for image classification and is often used as a standard for measuring the performance of machine learning accelerators.
Optimizing in a Heterogeneous World is (Algorithms x Devices)
In this guest article, our friends at Intel discuss how CPUs prove better for some important deep learning workloads. Here’s why, and keep your GPUs handy! Heterogeneous computing ushers in a world where we must consider permutations of algorithms and devices to find the best platform solution. No single device will win all the time, so we need to constantly reassess our choices and assumptions.
Intel Xeon Scalable Processors Set Deep Learning Performance Record on ResNet-50
Today Intel announced a deep learning performance record on image classification workloads. “Today, we have achieved leadership performance of 7878 images per second on ResNet-50 with our latest generation of Intel Xeon Scalable processors, outperforming 7844 images per second on Nvidia Tesla V100, the best GPU performance as published by Nvidia on its website including T4.”