UK Startup VyperCore Says Its RISC-V Chip’s Memory Management Innovation Delivers 10X Performance Boost
A UK chip startup, VyperCore, says it has come up with a memory management scheme that bypasses the software layer and delivers as much as a 10x throughput improvement for high-performance, general-purpose workloads without code modification. The company’s core insight, as described in a recent EE Times article: move “away from the processor’s […]
Katana Graph and Intel Collaborate on Graph Analytics Python Library
AUSTIN, TX – Katana Graph, an AI-powered graph intelligence company, has announced the release of a high-performance graph analytics Python library in collaboration with Intel. Katana Graph has designed an easy-to-use library for data scientists and the growing open-core community. The library can also take advantage of the Anaconda Metagraph orchestration layer that […]
Video: Profiling Python Workloads with Intel VTune Amplifier
Paulius Velesko from Intel gave this talk at the ALCF Many-Core Developer Sessions. “This talk covers efficient profiling techniques that can help to dramatically improve the performance of code by identifying CPU and memory bottlenecks. We will demonstrate how to profile a Python application using Intel VTune Amplifier, a full-featured profiling tool.”
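The bottleneck hunting the talk describes can be sketched with Python’s built-in cProfile, used here as a portable stand-in for VTune Amplifier (which targets the same workflow with far richer CPU and memory detail); the script and its hot function are hypothetical examples, not from the talk:

```python
import cProfile
import pstats

def slow_sum(n):
    # Deliberately scalar loop -- the kind of hotspot a profiler will flag.
    total = 0
    for i in range(n):
        total += i * i
    return total

def workload():
    return sum(slow_sum(20_000) for _ in range(50))

profiler = cProfile.Profile()
profiler.enable()
result = workload()
profiler.disable()

# Inspect where time was spent; slow_sum should dominate.
stats = pstats.Stats(profiler)
print("profiled functions:", len(stats.stats))

# With VTune installed, a similar hotspot collection can be run from the
# shell (exact flags vary by VTune version -- an assumption, check the docs):
#   vtune -collect hotspots -- python3 script.py
```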
CUDA-Python and RAPIDS for blazing fast scientific computing
Abe Stern from NVIDIA gave this talk at the ECSS Symposium. “We will introduce Numba and RAPIDS for GPU programming in Python. Numba allows us to write just-in-time compiled CUDA code in Python, giving us easy access to the power of GPUs from a powerful high-level language. RAPIDS is a suite of tools with a Python interface for machine learning and dataframe operations. Together, Numba and RAPIDS represent a potent set of tools for rapid prototyping, development, and analysis for scientific computing. We will cover the basics of each library and go over simple examples to get users started.”
Joe Landman on How the Cloud is Changing HPC
In this special guest feature, Joe Landman from Scalability.org writes that the move to cloud-based HPC is having some unexpected effects on the industry. “When you purchase a cloud HPC product, you can achieve productivity in time scales measurable in hours to days, where previously weeks to months was common. It cannot be overstated how important this is.”
Podcast: When a Different OS Gets Different Results
In this podcast, the Radio Free HPC team looks at problems in the scientific software world. “There’s a bug in Python scripts that caused different results in identical routines run on different operating systems. As the guys discuss, it’s not a Python thing but a problem with the order in which files got read according to the operating system’s protocols. This impacts the sort order and thus the end results. The gang speculates on other causes of these types of problems and the fixes that should be employed.”
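The failure mode the panel describes, and its standard fix, can be sketched in a few lines: `os.listdir` returns entries in an arbitrary, OS- and filesystem-dependent order, so any computation sensitive to file order can silently differ between operating systems unless the listing is sorted explicitly (the file names below are illustrative):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    # Create files whose on-disk listing order is not guaranteed.
    for name in ("b.dat", "a.dat", "c.dat"):
        open(os.path.join(d, name), "w").close()

    unordered = os.listdir(d)              # order varies by platform -- the bug
    deterministic = sorted(os.listdir(d))  # explicit sort -- the fix

print(deterministic)  # → ['a.dat', 'b.dat', 'c.dat']
```

Pipelines that feed such listings into downstream sorts or aggregations inherit the platform-dependent order, which is how identical scripts produced different results on different operating systems.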
Parallel Computing in Python: Current State and Recent Advances
Pierre Glaser from INRIA gave this talk at EuroPython 2019. “Modern hardware is multi-core. It is crucial for Python to provide high-performance parallelism. This talk will expose to both data-scientists and library developers the current state of affairs and the recent advances for parallel computing with Python. The goal is to help practitioners and developers to make better decisions on this matter.”
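One standard-library building block such a talk typically covers is `concurrent.futures`, which fans work out across a pool in a few lines. A minimal sketch (the work function is hypothetical); note the caveat in the comments about threads versus processes:

```python
from concurrent.futures import ThreadPoolExecutor
import math

def task(n):
    # Pure-Python CPU work. Threads share the GIL, so for real CPU-bound
    # speedups swap in ProcessPoolExecutor; threads pay off for I/O-bound
    # work or for libraries (e.g. NumPy) that release the GIL internally.
    return sum(math.sqrt(i) for i in range(n))

inputs = [100_000, 200_000, 300_000, 400_000]
with ThreadPoolExecutor(max_workers=4) as pool:
    # map distributes the inputs across the pool's workers.
    results = list(pool.map(task, inputs))

print(len(results))  # → 4
```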
Video: High-Performance Computing with Python – Reducing Bottlenecks
This course addresses scientists with a working knowledge of NumPy who wish to explore the productivity gains made possible by Python for HPC. “We will show how Python can be used on parallel architectures and how to optimize critical parts of the kernel using various tools. The following topics will be covered:
– Interactive parallel programming with IPython
– Profiling and optimization
– High-performance NumPy
– Just-in-time compilation with Numba
– Distributed-memory parallel programming with Python and MPI
– Bindings to other programming languages and HPC libraries
– Interfaces to GPUs.”
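The “high-performance NumPy” topic largely comes down to replacing Python-level loops with whole-array operations. A minimal sketch of the idea (assuming NumPy is installed; the function is an illustrative example, not course material):

```python
import numpy as np

def loop_version(x):
    # Scalar Python loop: one interpreter round trip per element.
    out = np.empty_like(x)
    for i in range(len(x)):
        out[i] = x[i] ** 2 + 1.0
    return out

def vectorized_version(x):
    # The same computation as whole-array operations,
    # executed in NumPy's compiled inner loops.
    return x ** 2 + 1.0

x = np.linspace(0.0, 1.0, 10_000)
same = np.allclose(loop_version(x), vectorized_version(x))
print(same)  # → True
```

The vectorized form is typically one to two orders of magnitude faster on large arrays, which is why profiling (to find the scalar loops) and vectorization go hand in hand in this kind of course.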
NVIDIA DGX-2 Delivers Record Performance on STAC-A3 Benchmark
Today NVIDIA announced record performance on STAC-A3, the financial services industry benchmark suite for backtesting trading algorithms to determine how strategies would have performed on historical data. “Using an NVIDIA DGX-2 system running accelerated Python libraries, NVIDIA shattered several previous STAC-A3 benchmark results, in one case running 20 million simulations on a basket of 50 instruments in the prescribed 60-minute test period versus the previous record of 3,200 simulations.”