By Jeff Whitaker, VP Product Strategy and Marketing, Panasas
The edge is growing fast. Deloitte predicts that the enterprise market for edge computing will see 22 percent growth in 2023, in comparison to 4 percent in spending on enterprise networking equipment and 6 percent on overall enterprise IT. This growth is not surprising given how data centers at the edge have the potential to improve productivity for emerging, bandwidth-intensive business applications, speed up real-time decision making and reduce IT costs.
However, edge IT deployments – particularly HPC-class edge computing and data storage – pose special requirements that must be taken into consideration.
A key element in growth at the edge is physics; the concept of data gravity has always been the biggest limitation to data movement. And in high-performance computing (HPC), where massive data volumes are the norm, this is a particularly significant challenge. The larger the dataset, the trickier it is to manage. The ability to process, move, and store data at the edge therefore becomes a highly valuable option.
Carrying out HPC workloads closer to the source of the data has clear benefits: faster processing, decreased latency for data transfers and improved cost efficiency and resource utilization. Pre-filtering AI datasets for example means that only the results are transferred from one location to the next resulting in significantly smaller data transfers. Genomics labs with benchtop sequencers can use HPC edge infrastructure to perform in-house analysis and store results on site. Advancements in storage media and software are making HPC edge deployments a viable option to store and manage the vertiginous data volumes involved in modern applications such as precision medicine, manufacturing simulations, genomics, and Cryo-EM research.
These many new applications and edge locations define modern HPC. Traditionally, HPC referred to complex modeling and simulation applications in research and academia running on large supercomputers or clustered HPC systems in central data centers. Today, modern HPC also encompasses AI, ML, high-performance data analytics (HPDA), and other emerging technologies, as well as expansion beyond the primary data center into the cloud and the edge.
To ensure your edge HPC applications are optimized for success, here are seven characteristics to prioritize for successful HPC storage at the edge:
- Manageability: It is not practical to have a dedicated HPC storage specialist stationed at every data location. As such, make sure you select an HPC system that is easy to use and does not require dedicated experts to manage. The solution should have centralized management and be able to seamlessly adapt to optimize the performance of multiple workloads.
- Reliability: A fully automated online failure recovery system and network-distributed erasure coding is essential to ensure data is always-on and available. Your business cannot afford data loss or downtime.
- Security: HPC historically was separate from the corporate LAN and the internet, which meant that security wasn’t a priority. Today, the situation is very different and data-rich HPC environments, highly attractive to cyber criminals, are fully integrated into enterprise systems. Edge deployments further broaden the attack surface. Take a multi-layered approach to the cybersecurity of your entire infrastructure and conduct regular tests.
- Cost-effectiveness: When considering your budget for your edge storage solution, make sure to consider the total cost of ownership (TCO). A low price tag on a product might result in expensive downtime or require management by a team of costly experts.
- Mobility: Talk with your HPC providers to ensure that your infrastructure facilitates easy movement of data between your core data centers, cloud environments and edge devices.
- Scalability: Find a system that scales as your datasets grow – infinitely. HPC data solutions today need to scale in both performance and capacity as organizations build their business, expand their research or grow their team.
- Visibility: Prioritize data visibility capabilities that offer a holistic view of your data across all platforms and locations, so you always know what you have and where you have it, and how and when you need to make changes.
Up until about 10 years ago, when you heard the term “HPC” you immediately pictured complex simulation or modeling applications in research and academic settings powered by centralized data centers. This is still the case today, but at the same time, you might picture a manufacturing firm creating virtual prototypes, or a genomics lab sequencing and analyzing data. Bringing HPC storage and workloads to the edge will further expand those possibilities.