My summer usually starts with the ISC High Performance supercomputing show in Frankfurt, and this year was no exception. The previous week in London at the AI Summit it rained every day, but thankfully high-performance computing’s influence on weather forecasting clearly shone through, and in Frankfurt it was sunny and a warm 30°C.
ISC is an industry show for the HPC community: supercomputing centers, HPC hardware vendors and key HPC-centric enterprise customers.
The traditional HPC market is back in one of its hardware refresh cycles, creating a buzz about who was buying what. Almost always these upgrades include lots of compute accelerators – mostly Nvidia GPUs, but also some AMD devices and FPGAs. The Summit system at Oak Ridge is still the largest traditional HPC cluster, and over the last 12 months it has added several Nvidia DGX-2s for large-memory-model DNN training to augment its 9,000+ Power9 server CPUs and 27,000+ Nvidia GPUs.
I have a love/hate relationship with the Hyperion breakfast held on the second morning of the show. They share fascinating market data about HPC, AI and now quantum computing (and great food), but after a late business dinner the previous night the information arrives rather early! They shared a wealth of data, and here are some nuggets that caught my bleary morning eyes.
Steve Conway and Earl Joseph of Hyperion Research
~10% of HPC is now performed in the cloud, usually in the GPU-augmented parts of the hyperscale clouds and in HPC-specific clouds like Verne Global’s hpcDIRECT. Our experience indicates that this will continue to grow quickly as increasing numbers of AI start-ups transition from their prototype stage into production products. Hyperion also confirmed that AI is the largest driver of HPC growth. The increasing convergence of the two fields shows no signs of stopping…
Interestingly, different HPC application types run on differing compute platforms:
• Weather models – 100% HPC compute clusters
• Mechanical design – 100% desktop computers
• CAE design – 27% desktop/45% cluster/18% parallel hardware/9% other
This makes perfect sense. Many weather models still use legacy software, often Fortran, running on big CPU clusters. Meanwhile, the power of desktops and laptops, with their integrated GPUs, has increased enormously over the last few years, such that many mechanical design tools just work locally. The CAE segment is much broader than mechanical design and includes CFD, design validation and crash testing, all with very different compute requirements.
The sales animal in me always follows the HPC commercial market revenue drivers, which this year were:
4. Data analytics
I’m sure that the CAE segment benefitted from the considerable research investment in autonomous vehicles, which looks set to continue for some years to come.
Hyperion’s return on investment (ROI) calculations confused me if I am honest. They claim that governments and academia get a 3-fold HPC ROI versus industry. Intuitively, you would think that industry, where the result is revenue, would have an ROI advantage versus governments and academia. Subsequently Earl Joseph explained away my confusion: “The ROI can be from profits, revenue generation or cost savings. Many government/public sector sites see massive cost savings around disaster prevention and minimization, e.g. earthquakes, weather, disease, attacks, etc.” I wish my commission plan was based on cost savings!
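Earl Joseph’s point is easier to see with numbers. Here is a minimal sketch of that accounting; the figures below are invented purely for illustration and are not Hyperion’s data:

```python
# Illustrative sketch of ROI accounting where "returns" can include
# avoided costs, not just profit. All numbers are hypothetical.

def roi(returns, investment):
    """Simple ROI: total returns (profit, revenue, cost savings) / investment."""
    return sum(returns) / investment

# Industry: returns come mainly from profit on revenue.
industry_roi = roi(returns=[3.0], investment=1.0)

# Public sector: modest direct returns plus large avoided disaster costs
# (earthquake, weather, disease mitigation).
public_roi = roi(returns=[1.0, 8.0], investment=1.0)

print(industry_roi, public_roi)  # the avoided-cost term dominates
```

Once avoided costs count as returns, a public-sector site facing large disaster exposure can plausibly out-ROI an industrial one, which is the crux of Hyperion’s claim.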
Verne Global sponsored a Frankfurt AI meetup while at ISC19. Over 100 AI professionals converged on the cool WeWork office space on the Tuesday evening for three insightful AI presentations, including one from our CTO Tate Cantrell, and lots of excellent networking.
Back at the show, as expected, there was lots of interest in compute accelerators. The new Nvidia T4 GPU was on display at multiple booths, despite what some (myself included) would call serious mis-branding as an inference-only GPU. Its DNN training benchmarks show solid performance: https://developer.nvidia.com/d... and I expect to see many of these potent devices in data centers performing DNN training, DNN inference and video applications.
IBM had their quantum computer on display and were kept busy explaining its value proposition for HPC applications. I also noticed a team from Graphcore scoping the show and meeting with key decision makers. No doubt they are readying for a late-year marketing push, which will be welcome. They are a fascinating company with some excellent concepts and innovations.
Nvidia T4 GPU
Although I don’t remember seeing one on the exhibit floor (maybe the floor wasn’t strong enough…!), there was much discussion about the Nvidia DGX-2 and its applications. The Summit supercomputing center has added them to its accelerator portfolio, as I mentioned earlier. One question we were asked frequently was “Can you host such a machine, with its 10 - 12 kW of hot air each?” - which likely results in 30 kW racks. I could stand on one foot and talk all day about how well prepared we at Verne Global are for them. 😊
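The arithmetic behind that 30 kW figure can be sketched quickly. This is my own back-of-envelope budget: the 10 - 12 kW per-unit draw comes from the conversations above, while the 30 kW rack cap is an assumed design limit for illustration:

```python
# Back-of-envelope rack power budgeting for DGX-2-class systems.
# Per-unit draw (10-12 kW) is from the discussion above; the 30 kW
# rack budget is an assumed data-center design limit.
import math

def units_per_rack(rack_budget_kw, unit_draw_kw):
    """How many units fit within a rack's power (and cooling) budget."""
    return math.floor(rack_budget_kw / unit_draw_kw)

print(units_per_rack(30, 10))  # low end of the draw: 3 units per rack
print(units_per_rack(30, 12))  # high end of the draw: 2 units per rack
```

Either way, two or three of these boxes fill a 30 kW rack, which is why high-density cooling becomes the gating question for hosting them.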
From our concrete floors ready for any weight, to the high-density cooling infrastructure, to the years of HPC cluster design, to the cost-effective power and great PUE – bring them up to Verne Global in Iceland right now and I’ll provide an extra special deal for the first one. Thanks for reading, and I look forward to meeting more high performance computing faces at SC19 in Denver.
Bob Fletcher, VP Strategy, Verne Global (Email: firstname.lastname@example.org)