13.10.2023
The Impact of AI on Data Center Design
Designing a data center is a multifaceted challenge. It requires meticulous planning for power provision, cooling, security, reliability, and fast network connectivity, all of which must work together. The advent of artificial intelligence (AI), however, has added an entirely new layer of complexity to the traditional data center blueprint. The compute and network demands of AI models require significantly higher computational density, which accentuates the challenges inherent in conventional data center designs.
AI models, such as those used in medical and climate research, large language modelling, and financial services, demand relentless computational power. Unlike some traditional data center workloads that experience peaks and troughs, AI model training is continuous. The latest AI algorithms are pushing the computational demands on AI-enabled data centers higher still. Most conventional data centers are not equipped to handle the enormous compute needed to train the neural networks behind AI technology.
First and foremost is the need for higher-density racks. Whereas the average rack density a few years ago was 5 kilowatts (kW), the latest generation of AI supercomputers, like the NVIDIA DGX H100, requires much more from data center infrastructure. Just four of these systems in one rack could consume more than 40 kW while occupying only 60% of the space of a typical computing rack. Housing more computational power in a smaller space offers substantial cost efficiency, but it also presents some unique challenges.
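To make the rack arithmetic above concrete, here is a minimal sketch in Python. The per-system power figure of 10.2 kW is an assumption chosen for illustration (it is consistent with four systems exceeding 40 kW); the 5 kW legacy figure and the count of four systems come from the text. The function and constant names are hypothetical.

```python
# Illustrative rack power arithmetic.
# ASSUMPTION: 10.2 kW maximum draw per system; check the vendor's
# specification for the hardware actually being deployed.
SYSTEM_POWER_KW = 10.2   # assumed per-system maximum draw
SYSTEMS_PER_RACK = 4     # from the text: four systems in one rack
LEGACY_RACK_KW = 5.0     # from the text: average rack a few years ago


def rack_power_kw(system_kw: float, systems: int) -> float:
    """Return the total electrical load of one rack in kW."""
    return system_kw * systems


ai_rack_kw = rack_power_kw(SYSTEM_POWER_KW, SYSTEMS_PER_RACK)
density_ratio = ai_rack_kw / LEGACY_RACK_KW

print(f"AI rack load: {ai_rack_kw:.1f} kW")
print(f"Ratio to a legacy 5 kW rack: {density_ratio:.1f}x")
```

Under these assumptions, a single AI rack draws roughly eight times the power of the older average rack, and that is before accounting for networking gear or cooling overhead.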
Traditionally cooled data centers have resorted to widely spaced racks to alleviate heat issues. But at-scale machine learning applications require server racks to be placed close together, because this optimizes a network configuration built from very expensive high-throughput cables between servers and minimizes the overall cost of deployment. This rising rack density is straining conventional air cooling technologies. As a result, liquid cooling is gaining traction in AI data centers.
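A rough way to see why air cooling comes under strain is to estimate the airflow needed to carry away a rack's heat. The sketch below uses the standard sensible-heat relation, airflow = power / (air density × specific heat × temperature rise). The air properties are approximate room-condition values, and the 12 K inlet-to-outlet temperature rise is an assumption picked for illustration, not a figure from this article.

```python
# Airflow needed to remove a rack's heat load with air alone.
# ASSUMPTIONS: approximate air properties at room conditions and a
# hypothetical 12 K temperature rise across the rack.
AIR_DENSITY_KG_M3 = 1.2        # approximate density of air
AIR_SPECIFIC_HEAT_J_KGK = 1005.0  # approximate specific heat of air
DELTA_T_K = 12.0               # assumed inlet-to-outlet temperature rise


def airflow_m3_per_s(heat_load_w: float, delta_t_k: float = DELTA_T_K) -> float:
    """Volumetric airflow (m^3/s) required to absorb heat_load_w watts."""
    return heat_load_w / (AIR_DENSITY_KG_M3 * AIR_SPECIFIC_HEAT_J_KGK * delta_t_k)


legacy_flow = airflow_m3_per_s(5_000)    # 5 kW rack from the text
ai_flow = airflow_m3_per_s(40_000)       # 40 kW rack from the text

print(f"5 kW rack:  {legacy_flow:.2f} m^3/s")
print(f"40 kW rack: {ai_flow:.2f} m^3/s")
```

Because required airflow scales linearly with heat load, a 40 kW rack needs eight times the air volume of a 5 kW rack at the same temperature rise. Liquid carries far more heat per unit volume than air, which is one reason it becomes attractive at these densities.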
With faster and hotter servers, data centers must also adapt their network architecture and connectivity. Low-latency east-west spine-and-leaf networks are essential to support production traffic and machine learning processes. Additionally, investments in cache-coherent interconnects like CXL and NVLink are necessary to ensure efficient data transfer between CPUs, GPUs, and DPUs without causing delays.
AI-optimized data centers must also be structurally robust. They need to support the transportation and installation of exceptionally heavy AI computing cabinets, some weighing more than 1.5 tonnes when fully configured. This structural integrity is essential for the efficient operation of AI infrastructure.
As AI models deal with large and complex datasets, one question being raised is how companies will protect their proprietary data. Instead of training public models, companies will leverage proprietary engines that not only protect access to their data sources but could also help them gain a competitive edge in the marketplace. This takes data center infrastructure to a whole new level, requiring even greater connectivity, agility, and scalability.
At Verne Global, we understand the unique challenges and demands that AI models place on data centers. Our data centers are designed to reliably host, support, and scale some of the most demanding AI models in the world. As AI continues to reshape industries, we are staying one step ahead to meet these demands, adopting optimized designs that prioritize power, density, cooling, strength, and security. We know what powering progress looks like, as we’ve been delivering high-intensity compute for more than a decade. Let us help you take AI center stage for your business.