Today Codeplay announced the world’s first fully conformant SYCL 1.2.1 solution. “As a non-proprietary alternative to the incumbent CUDA, SYCL is an open standard developed by the Khronos Group that enables developers to write code for heterogeneous systems using standard C++. Developers are looking at how they can accelerate their applications without having to write optimized processor-specific code. SYCL is the industry standard for C++ acceleration, giving developers a platform to write high-performance code in standard C++, unlocking the performance of accelerators and specialized processors from companies such as AMD, Intel, Renesas and Arm.”
Visualizing and Simulating Atomic Structures with CUDA
In this video, John Stone from the University of Illinois, Urbana-Champaign discusses the role of CUDA and GPUs in processing large datasets to visualize and simulate high-resolution atomic structures. CUDA does this by allowing researchers to describe hundreds of thousands to millions of independent, data-parallel work units and write software that executes on those work units, all while achieving peak hardware performance.
The ABCI Supercomputer: World’s First Open AI Computing Infrastructure
Shinichiro Takizawa from AIST gave this talk at the MVAPICH User Group. “ABCI is the world’s first large-scale Open AI Computing Infrastructure, constructed and operated by AIST, Japan. It delivers 19.9 petaflops of HPL performance and the world’s fastest training time of 1.17 minutes in ResNet-50 training on ImageNet datasets as of July 2019. In this talk, we focus on ABCI’s network architecture and the communication libraries available on ABCI, and show their performance and recent research achievements.”
Exploring the Universe with the SKA Radio Telescope and CUDA
In this video, Wes Armour from the Oxford eResearch Centre discusses the role of GPUs in processing large amounts of astronomical data collected by the Square Kilometre Array and how CUDA is the best suited option for their signal processing software. “The massive computational power of modern day GPUs allows code to perform algorithms such as de-dispersion, single pulse searching and Fourier Domain Acceleration Searching in real-time on very large data-sets which are comparable to those which will be produced by next generation radio-telescopes such as the SKA.”
Video: Arm HPC Update from ISC 2019
In this video, Brent Gorda provides an update on the progress on Arm HPC from the ISC 2019 conference in Frankfurt. “From the perspective of Arm in HPC, it was an excellent event with several high-profile announcements that caught everyone’s attention. The Arm ecosystem was well represented with our partners visible on the show floor and around town.”
CUDA-X HPC: Libraries and Tools for your Next Scientific Breakthrough
Today NVIDIA announced CUDA-X HPC, a collection of libraries, tools, compilers and APIs that helps developers solve the world’s most challenging problems. “CUDA-X HPC includes highly tuned kernels essential for high-performance computing. GPU-accelerated libraries for linear algebra, parallel algorithms, signal and image processing lay the foundation for compute-intensive applications in areas such as computational physics, chemistry, molecular dynamics, and seismic exploration.”
NVIDIA Brings CUDA to Arm for HPC
Today NVIDIA announced its support for Arm CPUs, providing the high performance computing industry a new path to build extremely energy-efficient, AI-enabled exascale supercomputers. “NVIDIA is making available to the Arm ecosystem its full stack of AI and HPC software — which accelerates more than 600 HPC applications and all AI frameworks — by year’s end. The stack includes all NVIDIA CUDA-X AI and HPC libraries, GPU-accelerated AI frameworks and software development tools such as PGI compilers with OpenACC support and profilers.”
Video: NVIDIA Rolls out TensorRT Hyperscale Platform and New T4 GPU for AI Datacenters
This morning at GTC Japan, NVIDIA CEO Jensen Huang announced a set of new products centered around AI and accelerated computing. Targeting hyperscale datacenters looking to run AI workloads, NVIDIA continues to innovate in machine learning technologies at an unprecedented pace. “There is no question that deep learning-powered AI is being deployed around the world, and we’re seeing incredible growth here,” Huang told an audience of more than 4,000 press, partners, academics and technologists gathered on the latest stop in a GTC world tour.
The Simulation of the Behavior of the Human Brain using CUDA
Pedro Valero-Lara from BSC gave this talk at the GPU Technology Conference. “The attendees can learn about how the behavior of the human brain is simulated using current computers, and the different challenges the implementation has to deal with. We cover the main steps of the simulation and the methodologies behind it. In particular, we highlight and focus on the transformations and optimizations carried out to achieve good performance on NVIDIA GPUs.”
Inside the Volta GPU Architecture and CUDA 9
“This presentation will give an overview of the new NVIDIA Volta GPU architecture and the latest CUDA 9 release. The NVIDIA Volta architecture powers the world’s most advanced data center GPU for AI, HPC, and Graphics. Volta features a new Streaming Multiprocessor (SM) architecture and includes enhanced features like NVLINK2 and the Multi-Process Service (MPS) that deliver major improvements in performance, energy efficiency, and ease of programmability. You’ll learn about new programming model enhancements and performance improvements in the latest CUDA 9 release.”