Industrial HPC Solutions: Data Analytics

HPC Data

How data is collected and analysed is changing at exponential rates. In industry — and I know this from talking and consulting with hundreds of companies — so much data is being collected that many companies are either confused by, or overwhelmed by, all of the ways to leverage and analyse it.

Comparing what it meant to analyse data in the early days of high performance computing (HPC) to what it means today is like comparing apples and oranges – similar, but so different.

This article will provide a brief history of the evolution of HPC data analytics, then focus more substantively on recent use cases of applied/industrial analytics and go into some technical detail about the architectures and tools available to drive today’s data analytics.


Data analytics come in many different forms. Even the largest companies have traditionally analysed data manually in spreadsheets, or relied on relatively rudimentary visual analytics in infoviz formats.

Though AI has been around for decades, leveraging the power of AI (machine learning and deep learning), particularly in an HPC compute environment, is relatively new, especially in industry. Add opportunities in geospatial analysis and more sophisticated visualisation, in harmony with increased computing power, and analytics take on deeper, more sophisticated capabilities, providing greater business intelligence more efficiently as we head into the next industrial revolution.

Examples of Applied/Corporate HPC Data Analytics

From DataFloq (4 Ground Breaking Use Cases of Big Data and High Performance Computing, 26 July 2015) “One company that deals with these volumes is PayPal. On a daily basis, they deal with 10+ million logins, 13 million transactions and 300 variables that are calculated per event to find a potentially fraudulent transaction. Thanks to High-Performance Data Analytics, PayPal saved in their first year of production over $700 million in fraudulent transactions that they would not have detected previously.”
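The mechanics behind a system like PayPal's can be pictured as a per-event scoring pipeline: each transaction is reduced to a feature vector (reportedly ~300 variables per event) and scored by a model. The sketch below shows the idea with a toy logistic scorer; the feature names, weights and threshold are hypothetical illustrations, not anything from PayPal's actual system.

```python
import math

# Illustrative fraud-risk scorer. Each event is reduced to numeric features
# (three shown here for brevity, where a production system might compute ~300)
# and pushed through a logistic model. All names and weights are hypothetical.
WEIGHTS = {
    "amount_zscore": 1.8,   # how unusual the amount is for this account
    "new_device": 2.5,      # 1.0 if the login device was never seen before
    "geo_velocity": 3.1,    # impossible-travel indicator between logins
}
BIAS = -4.0

def fraud_probability(event: dict) -> float:
    """Logistic score in [0, 1]; higher means more likely fraudulent."""
    z = BIAS + sum(WEIGHTS[k] * event.get(k, 0.0) for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def flag(event: dict, threshold: float = 0.5) -> bool:
    """Flag the event for review when the score crosses the threshold."""
    return fraud_probability(event) >= threshold

routine = {"amount_zscore": 0.2, "new_device": 0.0, "geo_velocity": 0.0}
suspicious = {"amount_zscore": 2.0, "new_device": 1.0, "geo_velocity": 1.0}
```

The HPC part of the story is scale: scoring 13 million transactions a day means running this kind of computation millions of times, with model training happening continuously on the accumulated history.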

In the area of medical imaging, “Pfizer uses machine learning for immuno-oncology research about how the body’s immune system can fight cancer.” (Built-In, 1 February 2019, Ultra Modern Medicine: 5 Examples of Machine Learning in Healthcare) In terms of industrial impact, Pfizer collaborated with a Chinese tech startup “to develop an artificial intelligence-powered platform to model small-molecule drugs as part of its discovery and development efforts. The project will combine quantum mechanics and machine learning to help predict the pharmaceutical properties of a broad range of molecular compounds.”

Autonomous vehicles are evolving quickly, yet seemingly with so far still to go. Deep learning impacts many elements of autonomous driving, according to Forbes’ 20 August 2018 article, 10 Amazing Examples Of How Deep Learning AI Is Used In Practice: “There's not just one AI model at work as an autonomous vehicle drives down the street. Some deep-learning models specialise in street signs while others are trained to recognise pedestrians. As a car navigates down the road, it can be informed by up to millions of individual AI models that allow the car to act.”
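The "many specialised models" idea from that quote can be sketched as a perception stack that runs several independent detectors over each camera frame and fuses their outputs. The detectors below are stand-in stubs, not real neural networks; the frame representation and function names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Detection:
    label: str
    confidence: float

def sign_detector(frame: dict) -> list[Detection]:
    # A real system would run a CNN trained specifically on street signs.
    return [Detection("stop_sign", 0.97)] if frame.get("has_sign") else []

def pedestrian_detector(frame: dict) -> list[Detection]:
    # A separate model, trained only to recognise pedestrians.
    return [Detection("pedestrian", 0.91)] if frame.get("has_person") else []

# Each specialised model is registered independently; adding a new capability
# (cyclists, lane markings, traffic lights) means adding another detector.
DETECTORS: list[Callable[[dict], list[Detection]]] = [
    sign_detector,
    pedestrian_detector,
]

def perceive(frame: dict) -> list[Detection]:
    """Run every specialised model on the frame and concatenate detections."""
    return [d for detector in DETECTORS for d in detector(frame)]
```

The design point is that no single monolithic model drives the car; a planner downstream consumes the fused list of detections from many narrow specialists.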

There’s so much more, of course. From crop protection to drug discovery to insurance cost controls to smart manufacturing, AI now plays an important role in industrial advancements and successes.

Today’s Analytics

At least partially due to the data deluge, analytics today are driven more frequently by advanced computing resources. CPUs have traditionally run most analytics jobs, and even that landscape has seen significant change.

A 24 August 2016 insideHPC article, The Evolution of HPC, described the move away from huge CPU clusters: “Instead of a monolithic CPU that manages MPI or SHMEM communication a programmable co-design presents a new model that blurs the lines between discrete cluster components (i.e. the server, accelerators, and the network). A network co-design model allows data algorithms to be executed more efficiently using smart interface cards and switches.”

Meanwhile, GPUs are now very much in the analytics mix, depending on what is being analysed. Recent upgrades to our evergreen iForge industrial cluster have expanded its use of both Skylake CPUs and NVIDIA V100 GPUs. Benchmarking across domains, including a variety of data analytics workflows, has shown up to 60% performance uplift with GPUs.
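For readers comparing their own benchmark results, it is worth being precise about what a figure like "60% uplift" means: it is the ratio of baseline to accelerated wall-clock time, minus one. A minimal helper, with purely illustrative timings (not iForge measurements):

```python
def uplift(baseline_seconds: float, accelerated_seconds: float) -> float:
    """Fractional speed-up of the accelerated run over the baseline.

    uplift = baseline / accelerated - 1, so a CPU run of 1.6 s beaten by a
    GPU run of 1.0 s is a 0.6 (i.e. 60%) uplift.
    """
    if accelerated_seconds <= 0:
        raise ValueError("accelerated time must be positive")
    return baseline_seconds / accelerated_seconds - 1.0

# Hypothetical timings for one workflow: CPU-only vs GPU-accelerated.
cpu_time, gpu_time = 1.6, 1.0
```

Reporting uplift this way (rather than, say, the reduction in runtime, which for the same numbers would be 37.5%) keeps cross-domain comparisons consistent.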

AI, driven by HPC, has expanded its reach in today’s analytics thanks to the confluence of machine learning and deep learning with other domains such as modeling & simulation and bioinformatics. Oil & gas, healthcare, financial services, agriculture and other sectors take advantage of these advances in analytics capabilities to create enhanced services and generate more business.


HPC drives sophisticated data analytics. AI needs the power of HPC to collect and analyse data in this era of data deluge. For industry, the more intelligence derived from data, in as short a time as possible, the better the products and services and, ultimately, the ROI.

The next five years should bring more of the same as we approach exascale capacity, driving more data toward deeper, more sophisticated solutions. Think about what is happening now, then add significantly more compute power, more tools, more experts, and more companies driving our evolution through their increasing applied needs. Things are advancing exponentially. It’s going to be an incredible ride!

Brendan McGinty is Director of Industry for the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign.


