MORNING FORECAST: CLEAR - Early High Performance Computing (HPC) took place on first-generation supercomputer mainframes from the likes of Cray and CDC. These machines solved computationally intensive tasks in fields including quantum mechanics, weather forecasting and research, oil and gas exploration, molecular modeling, and physical simulations like the finite element crash simulation example below:
(Above: Finite Element Crash Simulation, courtesy of BMW)
These applications were often written in Fortran (and some continue to evolve in Fortran today) and were batched to run until complete, often consuming all available compute resources for extended periods. Smaller, non-HPC compute tasks would often run in a time-share environment where many of them could execute at the same time. This time-sharing model was the genesis of the present-day 'cloud computing' concept.
As server CPUs became more powerful, supercomputers were challenged on performance by large clusters of servers backed by storage area networks (SANs). Large compute tasks were segmented and distributed across this multitude of server compute resources. In both of these environments, HPC applications had clear, unfettered access to the compute resources - the outlook was clear skies ahead.
AFTERNOON FORECAST: MOSTLY CLOUDY - Today’s cloud computing infrastructures have found a significant following in the HPC community, usually for very specific purposes:
- Massive, very irregular compute tasks like matching potential pharmaceutical compounds to diseases
- Workloads that can exploit off-peak pricing outside business hours
- Traditional HPC applications like computational fluid dynamics (CFD) which are not run continuously or daily
- Prototyping of AI neural networks
Each of these compute tasks exploits the real value proposition of cloud computing: the user spreads the infrastructure cost amongst fellow users rather than shouldering it all alone. Additionally, the cycle-time to first compute is much shorter than procuring your own infrastructure.
Many AI start-ups exploit compute clouds as they experiment with their initial product design and the neural network training techniques that work best for their application. However, once the product's neural network training framework is established and they want to scale out the scope of their innovation, compute clouds often become rather expensive, even after accounting for the in-house IT staff that dedicated infrastructure requires.
A new generation of super workstations is being used for single-user HPC and lightweight AI prototyping. These are far from your father’s desktop machine: they typically include multiple Xeon CPUs and multiple GPUs, with a special blend of cooling infrastructure that allows them to operate in a low-noise office environment. The CPUs often use loop heat pipe cooling (shown below) to move excess heat to an area where large-volume, low-noise fans can distribute it into the office environment. These workstations often suffice for a single engineer performing computer aided engineering (CAE) tasks such as finite element analysis of a new furniture component.
EVENING FORECAST: LARGELY CLEARING - As noted above, cloud computing delivers its highest value to very intermittent compute applications, where the cost of under-utilised compute resources is distributed amongst other users. Hence, as compute tasks evolve toward near-continuous operation, dedicated compute resources once again become very attractive.
At the GPU Technology Conference (GTC) in San Jose this spring, I asked a multitude of hard-core HPC engineers and IT professionals when one should consider moving an HPC application from cloud infrastructure to dedicated hardware. The consensus was at around $15,000/month of cloud expenditure. Of course, this trigger would likely be delayed if you have no IT staff to build, host and maintain the new HPC compute infrastructure. So, depending upon your IT sophistication, once you spend between $15,000 and $30,000/month on cloud services you should be seriously considering investing in your own IT infrastructure.
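As a rough illustration, that rule of thumb can be sketched as a simple decision function. This is a minimal sketch only; the function name and the mapping of the $15,000 and $30,000 figures to "has IT staff" / "no IT staff" thresholds are illustrative assumptions based on the numbers quoted above, not a costing tool.

```python
def should_consider_dedicated_hardware(monthly_cloud_spend: float,
                                       has_it_staff: bool) -> bool:
    """Hypothetical sketch of the rule of thumb discussed above:
    consider dedicated infrastructure at ~$15,000/month of cloud spend
    if you already have IT staff to build and maintain it, or at
    ~$30,000/month if you would need to hire that capability first."""
    threshold = 15_000 if has_it_staff else 30_000
    return monthly_cloud_spend >= threshold

# Example: $20,000/month of cloud spend
print(should_consider_dedicated_hardware(20_000, has_it_staff=True))   # True
print(should_consider_dedicated_hardware(20_000, has_it_staff=False))  # False
```

In practice the decision also depends on utilisation, power and hosting costs, and hardware depreciation, so a real model would compare total cost of ownership rather than a single spend threshold.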
At these cost levels your infrastructure will likely benefit from an HPC optimised hosting location, such as our Verne Global campus in Iceland which combines specialist technical architecture with Iceland's abundant, low-cost power and free-cooling enabled climate.
AI neural network training (like the machine vision neural network training results below) is likely to be a major growth area for Nordic data centers over the next few years, as companies migrate from prototyping to production products that need continuous evolution to stay competitive.
(Above: Machine Vision Neural Network Training Results, courtesy of NVIDIA)
Looking forward, I am eagerly anticipating NVIDIA's next GPU Technology Conference in Munich, 10-12 October, and speaking with my industry colleagues to see whether my HPC forecast is on track with their experiences and forward thinking. It would be great to meet as many HPC developers as possible - please feel free to reach out to me at **firstname.lastname@example.org**