Intel Labs has released a case study involving a professor at the University of California-San Diego, who for the past five years has led research efforts into what’s called hyperdimensional computing aimed at solving memory and data storage challenges.
Hyperdimensional (HD) computing is a type of machine learning inspired by observations of how humans and other animals use their senses to gather data from their surrounding environment. For example, when a person takes in visual information through the eyes, the data is represented by dense input sensory signals from firing neurons. This dense depiction is then exploded in the brain as a high dimensional sparse representation where hundreds or even thousands of neurons are engaged, but only a few neurons fire. By moving from an initial compact depiction to a seemingly less efficient high dimensional sparse representation, the brain can more easily separate relevant information.
Prof. Tajana Simunic Rosing’s HD work includes COVID-19 wastewater surveillance and personalized recommendation systems. She conducted the work at UCSD’s JUMP (DARPA’s Joint University Microelectronics Program) CRISP (Center for Research on Intelligent Storage and Processing-in-memory) Center, in collaboration with Intel Labs and other industry partners.
UCSD is one of the universities participating in the JUMP CRISP Center led by the University of Virginia. Steered by Rosing, who is professor of Computer Science and Engineering and Electrical and Computer Engineering at UCSD, the team demonstrated that HD computing can perform complex calculations in memory and storage while requiring less computing power.
The JUMP program was co-sponsored by the Semiconductor Research Corporation (SRC), the Defense Advanced Research Projects Agency (DARPA), and academic and industry collaborators, including Intel Labs. Starting in 2018, SRC led a public-private partnership in cooperation with DARPA that invested approximately $250 million over five years in JUMP to create new architectures and system designs that lower power consumption in information and communications technologies (ICT).
In January 2023, SRC and its collaborators in cooperation with DARPA launched JUMP 2.0 to continue the efforts from JUMP with an additional $330 million investment over five years. Now spanning seven U.S. lead university research centers and a total of 141 faculty at more than 42 universities nationwide, the expanded and revised program is focused on significantly improving performance, efficiency, and capabilities across a range of electronics systems. Continuing in the footsteps of the JUMP CRISP Center, intelligent memory and storage research is being led by the PRISM Center at UCSD with Rosing as its newly appointed center director.
“Professor Rosing played a huge role in the success of the CRISP Center’s research into energy efficient solutions using HD computing,” said Roman Caudillo, Intel-SRC assignee and JUMP 2.0 director. “Now she’ll be steering the PRISM Center forward in finding more innovative ways to improve intelligent memory and storage over the next five years.”
HD computing mimics pattern-based computation in human memory. For example, by taking a dense initial depiction of data, such as a real number, and representing it as a hypervector of thousands of bits, the HD model can more easily discover which data is similar to other data. Because HD computing relies on an algebra with a well-defined set of operations, it parallelizes well, which is important from a hardware perspective.
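To make the idea of comparing hypervectors concrete, here is a minimal Python sketch. It is not the CRISP team's code: the 10,000-element bipolar vectors, the level-based encoding, and the names `level_hypervectors`, `encode`, and `similarity` are all assumptions chosen for illustration. It shows one common way to turn a real number into a hypervector so that nearby numbers stay similar.

```python
import random

DIM = 10_000  # hypervector length; an illustrative choice, not from the article


def level_hypervectors(num_levels, dim=DIM, seed=0):
    """Build one bipolar (+1/-1) hypervector per quantization level.

    Each level flips a fresh slice of positions relative to the previous
    level, so numbers that are close together share most of their elements.
    """
    rng = random.Random(seed)
    current = [rng.choice((-1, 1)) for _ in range(dim)]
    order = list(range(dim))
    rng.shuffle(order)
    flips_per_step = dim // (2 * (num_levels - 1))
    levels = [current[:]]
    for step in range(num_levels - 1):
        for idx in order[step * flips_per_step:(step + 1) * flips_per_step]:
            current[idx] = -current[idx]
        levels.append(current[:])
    return levels


def encode(value, levels, lo=0.0, hi=1.0):
    """Map a real number in [lo, hi] to the hypervector of its nearest level."""
    frac = (min(max(value, lo), hi) - lo) / (hi - lo)
    return levels[round(frac * (len(levels) - 1))]


def similarity(a, b):
    """Normalized dot product: 1.0 for identical vectors, lower as they diverge."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


levels = level_hypervectors(num_levels=21)
near = similarity(encode(0.50, levels), encode(0.55, levels))
far = similarity(encode(0.50, levels), encode(1.00, levels))
print(f"0.50 vs 0.55: {near:.2f}")
print(f"0.50 vs 1.00: {far:.2f}")
```

With these settings the neighboring pair scores 0.95 and the distant pair 0.50. Every element is compared independently, which is why this kind of similarity search is straightforward to run in parallel.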
“This particular method is very easy to run in memory and in storage, which is really where we should be running most of our workloads these days,” said Rosing.
In addition, HD computing uses fast single-pass training, which allows real-time learning and reasoning. Using simple operations, the model can train and learn online, unlike deep neural networks that require complex systems for training. The use of hypervectors also makes HD computing robust to noise. Using hardware accelerators makes HD computing energy efficient for tasks such as classification of images and objects, and analyzing genomics.
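Single-pass training can be sketched in a few lines of Python. The toy data, the feature encoding, and the helper names below (`encode`, `train_single_pass`, `classify`) are hypothetical and not taken from the CRISP work. The sketch assumes a simple scheme: each class prototype is the element-wise sum of its training hypervectors, and a query is assigned to the prototype with the largest dot product.

```python
import random

DIM = 10_000  # hypervector length; an illustrative choice


def random_hv(rng):
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def encode(features, position_hvs, value_hvs):
    """Bind each feature position to its value, then bundle by summation."""
    total = [0] * DIM
    for pos, val in enumerate(features):
        p, v = position_hvs[pos], value_hvs[val]
        for i in range(DIM):
            total[i] += p[i] * v[i]
    return total


def train_single_pass(samples, position_hvs, value_hvs):
    """Visit each labeled sample exactly once, adding it to its class prototype."""
    prototypes = {}
    for features, label in samples:
        hv = encode(features, position_hvs, value_hvs)
        acc = prototypes.setdefault(label, [0] * DIM)
        for i in range(DIM):
            acc[i] += hv[i]
    return prototypes


def classify(features, prototypes, position_hvs, value_hvs):
    """Return the label whose prototype has the largest dot product with the query."""
    hv = encode(features, position_hvs, value_hvs)
    return max(prototypes,
               key=lambda label: sum(a * b for a, b in zip(prototypes[label], hv)))


def flip(pattern, index):
    """Return a copy of a binary pattern with one position corrupted."""
    noisy = pattern[:]
    noisy[index] ^= 1
    return noisy


rng = random.Random(42)
position_hvs = [random_hv(rng) for _ in range(8)]
value_hvs = [random_hv(rng) for _ in range(2)]
base = {"A": [1, 1, 1, 1, 0, 0, 0, 0], "B": [0, 0, 0, 0, 1, 1, 1, 1]}

training = [(flip(base[label], i), label) for label in base for i in range(3)]
prototypes = train_single_pass(training, position_hvs, value_hvs)

# Each query is corrupted in a position that was never corrupted during training.
prediction_a = classify(flip(base["A"], 5), prototypes, position_hvs, value_hvs)
prediction_b = classify(flip(base["B"], 6), prototypes, position_hvs, value_hvs)
print(prediction_a, prediction_b)
```

Training involves no gradient descent and no repeated epochs, only additions. New samples could be folded into a prototype as they arrive, which is the property that makes online learning practical.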
Genomic data is doubling every seven months, outpacing Moore’s Law. A human genome has approximately 3 billion letters, while the COVID-19 genome has roughly 30,000 RNA letters. The standard microbiome and COVID-19 analysis pipeline starts with RNA/DNA sequencing, passes through a preprocessing step (trimming and alignment), and ends with downstream analysis. The data generated daily can reach 10 TB, and the pipeline requires expensive operations such as alignment, which takes months for genomic data and multiple days for viral genomic analysis. HD computing, however, accelerated viral genomic sequence tracking from days to hours, which proved vital during the pandemic, according to Rosing.
CRISP Center Viral Analysis for COVID-19
When COVID-19 hit in 2020, many universities across the country were forced to shut down. However, UCSD continued operating due to the joint efforts of the CRISP Center and the UCSD School of Medicine to track virus genomic sequences in wastewater. Accelerating that analysis ultimately improved community prevalence estimates and the detection of emerging variants.
UCSD needed to test up to 30,000 people daily, but clinical testing at scale was infeasible. Instead, to directly compare wastewater genomic surveillance to clinical surveillance, the UCSD School of Medicine conducted a large-scale COVID-19 genome sequencing study from wastewater samples collected daily from 131 wastewater samplers across 360 campus buildings. The collection and analysis efforts were led by Rob Knight, the founding director of the Center for Microbiome Innovation and professor of Pediatrics, Bioengineering, and Computer Science and Engineering. The detection of viral activity in wastewater was possible three to five days earlier than an individual test, which allowed UCSD to quickly identify hot spots and notify campus building occupants. This helped UCSD prevent outbreaks by detecting 85 percent of cases early.
The CRISP Center focused on accelerating the genomic pipeline and handling the huge amount of data generated by bioinformatic workloads from large omics data sets. Omics is an emerging field in biomedical research that includes genomics, transcriptomics, proteomics, and metabolomics. Using high throughput methods to provide information at multiple levels has led to accelerated testing for diseases such as COVID-19.
For COVID-19 viral analysis, genomics focuses on the molecular mechanisms of infectious diseases, tracing the source of infection, decoding transmission routes, and detecting host susceptibility during the epidemic process, according to research published in Virulence. Information content is recorded in the DNA of the genome and expressed through viral transcription, so transcriptomics uses sequencing technology to analyze COVID-19 gene expression information. Proteomics studies the change and interaction of proteins in cells. Finally, when the body breaks down food, chemicals, or its own tissue, it generates metabolites. Metabolomics analyzes these small molecules and their relationship with physiological and pathological changes. Precision medicine development requires analyzing all four omics data sets for a full picture in understanding events such as the spread of COVID-19 on an urban college campus.
With the help of Niema Moshiri, an assistant teaching professor of Computer Science and Engineering who specializes in viral phylogenetics and epidemiology at UCSD, the CRISP team found that the four omics data sets were well matched to HD computing analysis tools. Genomics and transcriptomics both use single characters to represent single genomes, which translate well to string or character analysis. Proteomics and metabolomics use mass spectrometry spectral analysis tools, which create large bar graphs that are similar to hypervectors used in HD computing. FPGAs, which are becoming increasingly available either in local computer systems or in cloud data centers, are widely used to accelerate HD computing bioinformatics applications.
For DNA and RNA analysis at UCSD, sequences of 100 to 300 letters were relayed by the Qiita open-source microbial study management platform to GenieHD running on an FPGA for DNA pattern matching and to the RAPIDx processing-in-memory (PIM) architecture for DNA sequence alignment. GenieHD maps DNA sequences to hypervectors and accelerates pattern matching in a highly parallelized way. RAPIDx reduces internal data movement while maintaining the high degree of operational parallelism provided by PIM. The architecture is highly scalable, which facilitates precise alignment of lengthy sequences.
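The general idea of mapping DNA to hypervectors for pattern matching can be illustrated with a short Python sketch. This is not GenieHD's actual encoding, which the article does not describe. It assumes a generic HD scheme: one random hypervector per base, a cyclic shift to mark position, multiplication to bind a window, and summation to bundle all windows of a reference. The reference string, the window width of 8, and the 0.5 threshold are made up for the example.

```python
import random

DIM = 8192  # hypervector length; an illustrative choice
rng = random.Random(7)
BASE_HV = {base: [rng.choice((-1, 1)) for _ in range(DIM)] for base in "ACGT"}


def rotate(hv, shift):
    """Cyclic shift, used here to mark a base's position inside a window."""
    shift %= len(hv)
    return hv[-shift:] + hv[:-shift] if shift else hv[:]


def encode_window(window):
    """Bind the position-shifted base hypervectors by element-wise multiplication."""
    out = [1] * DIM
    for offset, base in enumerate(window):
        shifted = rotate(BASE_HV[base], offset)
        out = [a * b for a, b in zip(out, shifted)]
    return out


def encode_reference(sequence, width):
    """Bundle (sum) the hypervectors of every window of the given width."""
    total = [0] * DIM
    for start in range(len(sequence) - width + 1):
        hv = encode_window(sequence[start:start + width])
        total = [t + h for t, h in zip(total, hv)]
    return total


def contains(reference_hv, query, threshold=0.5):
    """Report a likely match when the normalized dot product clears the threshold."""
    score = sum(r * q for r, q in zip(reference_hv, encode_window(query))) / DIM
    return score > threshold


# Made-up reference sequence; real reads would be 100 to 300 letters long.
reference = "ATGGCGTACGTTAGCCTAGGATCCGATTACAGGCTTAACGGTCATGCA"
reference_hv = encode_reference(reference, width=8)
found = contains(reference_hv, reference[10:18])  # a window taken from the reference
missing = contains(reference_hv, "TTTTTTTT")      # a pattern the reference lacks
print(found, missing)
```

A single dot product against the bundled reference answers whether the query appears anywhere in it. The answer is probabilistic, so the threshold and the hypervector length trade accuracy against memory.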
“When we used HD computing to represent the genes for either humans or the virus, we were able to speed up DNA sequencing by at least 200 times because we can compare these patterns in parallel,” said Rosing, who noted that Micron Technology collaborated with the CRISP team in developing optimized tools.
Clustering the transmission history of COVID-19 helped campus health officials make critical decisions to limit the spread. Through pairwise comparison of viral sequences, the team reconstructed the transmission events using FANTAIL, a high performance and energy efficient FPGA-based accelerator for pairwise distance calculation for viral transmission clustering. FANTAIL provided a 56X speedup and 168X energy reduction compared to the state-of-the-art multi-threaded CPU baseline running on an 8-core Intel Core i7 CPU.
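The pairwise comparison step can be sketched in plain Python. The article does not say which distance measure FANTAIL computes, so this example assumes the simplest one, the fraction of mismatched positions between aligned sequences, and links any two samples whose distance falls under a threshold. The sample names, sequences, and the 0.12 threshold are invented for illustration.

```python
from itertools import combinations


def hamming_fraction(a, b):
    """Fraction of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster(sequences, threshold):
    """Group samples connected by any chain of pairs within the threshold."""
    parent = {name: name for name in sequences}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(sequences, 2):
        if hamming_fraction(sequences[a], sequences[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in sequences:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Toy aligned sequences; real viral genomes are roughly 30,000 letters long.
samples = {
    "sample_1": "ACGTACGTACGTACGTACGT",
    "sample_2": "ACGTACGTACGTACGTACGA",
    "sample_3": "ACGTACGTACGTACGTACCA",
    "sample_4": "TTGTACGAACGTTCGTAGGT",
    "sample_5": "TTGTACGAACGTTCGTAGGA",
}
clusters = cluster(samples, threshold=0.12)
print(clusters)
```

The number of comparisons grows quadratically with the number of samples, and each pair is independent of every other pair. That independence is what makes the workload a natural candidate for a parallel accelerator such as an FPGA.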
For its efforts, UCSD received the 2021 American Council on Education/Fidelity Investments Award for Institutional Transformation. The campus was recognized for its science-based, real-time approach to maintaining a low infection rate during the pandemic through the Return to Learn program, which included risk mitigation, viral detection, and intervention. The program now serves as a best practice model for educational institutions nationwide.
Alternative Representations for Recommendation Systems
Through the CRISP Center, UCSD researchers applied hyperdimensional computing to recommendation systems and developed a HyperRec implementation. Inspired by this work, Intel Labs research scientists Gopi Krishna Jha and Nilesh Jain worked with UCSD graduate student Anthony Thomas to further investigate how embeddings in large-scale deep learning recommendation models (DLRMs) can be efficiently represented to achieve reduced footprint and increased performance.
Personalized recommendation systems are at the heart of a wide range of applications, including online retail, content streaming, search engines, and social media. Training data used for recommendation systems commonly includes categorical features with millions of possible distinct values. These categorical tokens are typically assigned learned vector representations that are stored in large embedding tables, often more than 100 GB in size. Because these tables require so much memory, embedding table lookups and data operations are the main bottleneck during training and inference for these models. Intel Labs researchers proposed MEM-REC, a novel and efficient alternate representation method for embedding tables in DLRMs.
MEM-REC uses Bloom filters and hashing methods to encode categorical features using two cache-friendly embedding tables. The first table for token embedding contains raw embeddings such as learned hypervector representation. The smaller second table for weight embedding contains weights to scale these raw embeddings to maximize predictive performance and increase recommendation accuracy. The advantage of MEM-REC is that model size grows only logarithmically with the data set alphabet size, providing better scalability for commercial-scale recommendation embedding tables.
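The two-table idea can be illustrated with a loose Python sketch. This is not the MEM-REC implementation. It assumes that each token is hashed several times, Bloom-filter style, into a small shared token table, that the selected rows are summed, and that the result is scaled by a weight from a second, smaller table. The table sizes, the use of BLAKE2 hashing, the random stand-ins for learned values, and the names `hashed_index` and `embed` are all hypothetical.

```python
import hashlib
import random

EMBED_DIM = 16     # embedding width; illustrative
TOKEN_ROWS = 4096  # rows in the token table, far fewer than the vocabulary size
WEIGHT_ROWS = 512  # rows in the smaller weight table
NUM_HASHES = 4     # Bloom-filter-style probes per token

rng = random.Random(0)
# In a real model both tables would be learned; random values stand in here.
token_table = [[rng.gauss(0, 1) for _ in range(EMBED_DIM)] for _ in range(TOKEN_ROWS)]
weight_table = [rng.uniform(0.5, 1.5) for _ in range(WEIGHT_ROWS)]


def hashed_index(token, salt, modulus):
    """Deterministically map a (salt, token) pair to a table row."""
    digest = hashlib.blake2b(f"{salt}:{token}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % modulus


def embed(token):
    """Sum the probed token rows, then scale by the token's hashed weight."""
    rows = [hashed_index(token, f"tok{k}", TOKEN_ROWS) for k in range(NUM_HASHES)]
    scale = weight_table[hashed_index(token, "weight", WEIGHT_ROWS)]
    return [scale * sum(token_table[r][d] for r in rows) for d in range(EMBED_DIM)]


vec_a = embed("user_123456789")
vec_b = embed("user_987654321")
print(len(vec_a), vec_a == embed("user_123456789"), vec_a == vec_b)
```

In this sketch the storage cost is fixed by the two table sizes rather than by the number of distinct tokens, so the tables can be small enough to stay in cache. Distinct tokens still receive distinct embeddings with high probability because each one selects a different combination of rows.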
“When testing on a terabyte worth of data, we can get state-of-the-art accuracy. This is extremely exciting because now we can accelerate recommendation systems by moving embedding tables from DRAM to CPU caches,” said Rosing.
MEM-REC’s compact embedding representation provides a substantial reduction in model size and memory footprint, achieving a memory-efficient fast recommendation pipeline without compromising the quality of the generated recommendations. MEM-REC can compress the MLPerf Criteo 1TB benchmark DLRM model size by >2500X and performs up to >2.5X faster embeddings than the DLRM baseline without loss of accuracy.
Under JUMP 2.0, the PRISM Center will forge ahead with Rosing as the center director, along with center co-director Nam Sung Kim, professor of Electrical and Computer Engineering at the University of Illinois Urbana-Champaign. Over the next five years, PRISM will focus on solving fundamental intelligent memory and storage scaling challenges by creating novel computing architectures that seamlessly integrate with diverse memory, storage, compute, and software.
“We want to deal with the rapid data growth problem by exploding the traditional computing hierarchy and replacing it with a new system that is much more flexible. We’ll use novel accelerators in and near memory and storage that will allow us to optimize across the whole stack,” said Rosing.
Ultimately, these advances will be applied to challenges including personalized and secure drug discovery for cancer and antibiotic resistance, performing contextual video extraction with text or video queries, and contextual fusion for fast 3D point cloud construction.
source: Scott Bair, senior technical creative director, Intel Labs