insideHPC Special Report: HPC and AI for the Era of Genomics – Part 3

This special report, sponsored by Dell Technologies, takes a deep dive into HPC and AI for life sciences in the era of genomics. 2020 will be remembered for the outbreak of the novel coronavirus disease, COVID-19. While infection rates are growing exponentially, the race is on to find a treatment, vaccine, or cure. Governments and private organizations are teaming up to understand the basic biology of the virus and its genetic code in order to find what can stop it.

Significant amounts of computing power are aimed at this problem, including the most powerful high performance computing (HPC) systems in the world today. Finding a cure for COVID-19 or eliminating it will not only benefit the worldwide population, but will also lay the foundation for tackling the next pandemic, which some scientists say will happen in the not-too-distant future.

This technology guide, insideHPC Special Report: HPC and AI for the Era of Genomics, highlights a lineup of Ready Solutions created by Dell Technologies, which are highly optimized and tuned hardware and software stacks for a variety of industries. The Ready Solutions for HPC Life Sciences have been designed to speed time to production, improve performance with purpose-built solutions, and scale more easily with modular building blocks for capacity and performance.

Challenges

A number of challenges can impede both the wider adoption of technologies for personalized medicine workflows and the implementation of such systems.

Perception

The perception exists that genomic analysis can only be done on hundreds of nodes or on expensive supercomputers. However, optimized systems that include the right hardware and software, architected by experts from leading vendors such as Dell Technologies, can bring genomic analysis to a broad base of researchers and users. In many cases, IT departments look for an immediate ROI or quickly scrutinize the utilization of the compute/storage cluster. However, it is possible to start small and grow as needs grow. Careful planning for this expansion allows servers and storage to be added incrementally.

ROI for small use cases

Aside from the challenge of starting with a small system and growing as needs grow, small organizations may not have the staff to investigate a number of alternatives or implement a piece-by-piece purchase path. They may be resigned to using their older technology rather than upgrading, due to a fear of the IT unknown. However, if a turnkey solution requiring minimal IT expertise were available, departments or smaller companies would be able to take advantage of current and future technologies.

FDA compliance

FDA approval (compliance) is required for devices used to diagnose and treat patient diseases, and clinical uses must undergo a safety period. In addition, the FDA has in place a number of regulations and safety assurances that must be followed when working with patient health, such as certifications for lab instruments, appliances, and technology used to facilitate patient health. These safety checks can be daunting, hence the need to work with experienced vendor services teams.

Security

Patient data is very valuable and must be kept secure, and the field of genomics is no exception. In fact, securing patient health record data is even more important. Special tools, processes and products must be used to protect patient data, and they must comply with federal requirements.

Clinician practices

Clinicians using electronic medical records and imaging archives must abide by procedures and protocols in order to comply with the Health Insurance Portability and Accountability Act (HIPAA) and best practices. These practices include various consultations with specialists and experts who may or may not be a part of the existing IT infrastructure. Because these records can assist in treatment or diagnosis, they must be accurate and quickly available to those involved in the treatment of patients.

Data management

Large volumes of data must be managed in a genomics solution. The data for a single genome is approximately 200 GB to 300 GB. Even though the data consists of just four letters (T, G, A and C) as its building blocks, there are approximately three billion of these nucleotide bases in a single person's genome. The output from the sequencer is a very large data file that must be accessed, analyzed, stored and acted upon. Analyzing the genome magnifies the need for nearby storage, scratch storage, archival storage and network bandwidth.
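The gap between "four letters, three billion bases" and hundreds of gigabytes can be illustrated with some rough arithmetic. The sketch below is only a back-of-the-envelope estimate, not a description of any particular sequencer or file format. The three billion figure comes from the text above. The 30x read coverage and the two bytes stored per sequenced base (one letter plus one quality character) are assumptions chosen for illustration, and the function names are hypothetical.

```python
# Back-of-the-envelope genome storage estimate (illustrative only).

BASES_PER_GENOME = 3_000_000_000  # from the text: about three billion bases
GB = 10**9                        # decimal gigabyte


def packed_size_bytes(bases: int, bits_per_base: int = 2) -> int:
    """Smallest possible size: four letters fit in 2 bits per base."""
    return bases * bits_per_base // 8


def raw_read_size_bytes(bases: int, coverage: int = 30,
                        bytes_per_base: int = 2) -> int:
    """Rough size of uncompressed sequencer output.

    ASSUMPTIONS (not from the report): each base is read `coverage`
    times, and each read base is stored as one letter plus one
    quality character (`bytes_per_base` = 2).
    """
    return bases * coverage * bytes_per_base


if __name__ == "__main__":
    packed = packed_size_bytes(BASES_PER_GENOME)
    raw = raw_read_size_bytes(BASES_PER_GENOME)
    print(f"Packed sequence only: {packed / GB:.2f} GB")
    print(f"Raw reads (assumed 30x, 2 bytes/base): {raw / GB:.0f} GB")
```

Under these assumptions the finished sequence alone would fit in under 1 GB, while the raw reads land in the same order of magnitude as the per-genome figure quoted above. This is one plausible reason the sequencer output, rather than the genome itself, drives storage and bandwidth planning.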

UK National Health Service

In the UK, Cardiff University and the UK National Health Service (NHS) are working together to capitalize on advances in HPC systems to transform public health and personalized medicine. One of the important aspects of collaborations between universities and organizations is that many more researchers and scientists can get access to HPC resources. Teams from various disciplines, with experts from genomics, bioinformatics and HPC, can work together in an effort to fight infectious diseases and enable personalized healthcare.

In this case, over 1,000 users have access to the software, tools and hardware to carry out their research. This has resulted in almost 10,000 genomes being decoded, leading to possible treatments for a variety of diseases, such as tuberculosis. Teams are also working on how AI can be used to provide accurate and reliable prognoses for patients with cardiovascular disease, on par with human interaction. This allows for faster and more accurate determination of a disease and its outcomes than humans alone can provide.

The MRC CLIMB project (Medical Research Council Cloud Infrastructure for Microbial Bioinformatics) offers leading-edge cyber-infrastructure for microbial bioinformatics, including cloud-based compute, storage and analysis tools, for academic microbiologists across the UK. As part of the COVID-19 work (Spring 2020), Cardiff University and Public Health Wales (PHW) are working together to track the spread of COVID-19 in the UK by sequencing genomes from patients.

Oregon State University

Oregon State University (OSU) is an international public research university that draws people from all 50 states and more than 100 countries. The OSU Center for Genome Research and Biocomputing (CGRB) researchers are on a perpetual quest for scientific discovery.

The CGRB facilitates genome-enabled and data-driven research in the life and environmental sciences at OSU and across the state.

The CGRB places significant demand on its compute infrastructure and has Dell EMC PowerEdge servers with AMD EPYC processors. The CGRB uses its AMD-based PowerEdge systems for a wide range of research applications, from quantum mechanical simulations and workloads that involve gene expression and metagenomics to studies of species diversity in tropical forests.

“We test every piece of hardware, and believe it or not, Dell EMC is the only server that can hold up to the type of work that we are pounding on these boxes,” says Christopher Sullivan, Assistant Director for Biocomputing, Center for Genome Research and Biocomputing, OSU.

San Diego Supercomputer Center

The San Diego Supercomputer Center (SDSC) is a multi-disciplinary computing facility that serves the needs of a wide range of researchers. As computing demands continually increase, the SDSC must constantly upgrade its systems. Petaflops must be delivered reliably on a sustained basis to the researchers who require such computing power. The latest system at SDSC, named “Expanse,” consists of 93,000 compute cores and is designed to serve over 50,000 academic and industry-based researchers from across the United States.

“SDSC and Dell EMC have a very good partnership with Comet, starting with the co-design of the system,”  says Shawn Strande, deputy director of SDSC. “For Expanse, we worked with the same team we worked with  for Comet—an expert group of engineers and application specialists who really understood the workloads  and who share our design philosophy.”

Dell EMC Ready Solutions for Life Sciences

The technologies that are being developed for HPC and AI are moving quickly in order to serve the many disciplines that require such computing performance. Dell Technologies has been a leader in providing state-of-the-art solutions to the healthcare and life science industry. Besides supplying advanced PowerEdge servers based on AMD® EPYC™ processors, Dell Technologies has experts standing by to work with you to architect and implement some of the most advanced workflows in the industry today.

Dell Technologies has created a lineup of Ready Solutions, which are highly optimized and tuned hardware and software stacks for a variety of industries. The Ready Solutions for HPC Life Sciences have been designed to speed time to production, improve performance with purpose-built solutions, and scale more easily with modular building blocks for capacity and performance. They also speed system design and deployment, allowing organizations to become productive more quickly.

With Dell EMC Ready Solutions for Life Sciences, organizations can start small and grow to meet the increasing demands of researchers and scientists. Since the overall infrastructure is designed to grow, it is easy to add more compute or storage capacity. With a system from Dell Technologies, the clusters are supported from day one. Dell Technologies experts can assist you with scalable solutions that will meet your ongoing requirements.

Dell Technologies uniquely provides an extensive portfolio of technologies to deliver the advanced computing solutions that underpin successful data analytics and AI implementations.

With an extensive portfolio, years of experience and an ecosystem of curated technology and service partners, Dell Technologies provides innovative solutions, workstations, servers, networking, storage and services that reduce complexity and enable you to capitalize on the promise of data analytics, HPC and AI.

The Dell EMC Ready Solutions for HPC Life Sciences can start with PowerEdge servers with AMD EPYC 7000 series processors. Server accelerators can easily be added to significantly increase the performance of certain applications. High performance networking is included, as well as parallel file systems for increased I/O performance.

In addition, the Dell Technologies HPC & AI Innovation Lab team stays on the leading edge, testing new technologies, and tuning algorithms and applications to help customers keep pace with the constantly evolving landscape. This team of industry and technology experts can help customers achieve faster time to results by shortening both design cycle and configuration time. And, you can remotely take a test drive with a proof of concept in one of the Dell Technologies worldwide Customer Solution Centers.

Summary

Research from IDC shows that Dell Technologies is the world leader in the supply of servers by both revenue and number of units sold. Dell Technologies is a trusted partner for the most demanding IT needs of those involved in all aspects of the life sciences industry. From genomics to computational chemistry, Dell Technologies has the experience and experts to help leading-edge organizations implement and scale their high performance computing systems. You can start small and grow as you need. Dell Technologies has created optimized solutions to address these needs and is always available to assist with your toughest HPC and AI challenges.

Over the past few weeks, this series has explored topics surrounding HPC and AI for life sciences in the era of genomics.

Download the complete insideHPC Special Report: HPC and AI for the Era of Genomics, courtesy of Dell Technologies.