Heavy metal and tag lines – Rumours from the trade show floor

On the cusp of spring I regularly refresh my GPU technology suntan at the Nvidia GPU Technology Conference (GTC) in San Jose. This year was fascinating, as the speed and scale of both the AI and Virtual Reality industries have leapt forward. Here are my takeaways...

On that note, my favourite announcement this year was the “Heavy Metal” DGX-2 monster GPU box, which under the bonnet provides:

  • 16 GPUs burning 10kW of power
  • 2,000 TFLOPs
  • Consistent memory model across all GPUs connected with the new NVSwitch
  • 300 GB/s chip-to-chip communication at 12 times the speed of PCIe
  • 350 lbs!
  • And all for only $399,000
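To put those headline figures in perspective, here is a minimal sketch that derives a few ratios from them. It uses only the numbers in the list above; the function name and dictionary keys are made up for illustration, and "watts per GPU" is simply whole-box power divided by GPU count, so it also includes CPUs, switches and fans.

```python
LB_TO_KG = 0.45359237  # exact definition of the international pound


def dgx2_ratios(gpus=16, power_kw=10, tflops=2000, price_usd=399_000,
                weight_lb=350, nvswitch_gb_s=300, pcie_speedup=12):
    """Derive simple ratios from the quoted DGX-2 headline figures.

    All defaults are the figures quoted in the list above; nothing here
    is a measurement.
    """
    return {
        # Whole-system power shared across the GPUs (not GPU-only draw).
        "watts_per_gpu": power_kw * 1000 / gpus,
        "tflops_per_gpu": tflops / gpus,
        "usd_per_tflop": price_usd / tflops,
        # TFLOPs -> GFLOPs and kW -> W before dividing.
        "gflops_per_watt": (tflops * 1000) / (power_kw * 1000),
        "weight_kg": weight_lb * LB_TO_KG,
        # PCIe rate implied by the "12 times" claim.
        "implied_pcie_gb_s": nvswitch_gb_s / pcie_speedup,
    }


if __name__ == "__main__":
    for name, value in dgx2_ratios().items():
        print(f"{name}: {value:.1f}")
```

On the quoted figures that works out to roughly 625 W and 125 TFLOPs per GPU, just under $200 per TFLOP, and a box weighing close to 159 kg.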

Not since the days of mainframes and minicomputers have I seen such a heavy solution. An early internet router from Motorola had a similar weight and needed riggers to install it, and the floor had to be certified. Perhaps that is one reason why Cisco made better progress in that market. However, I do sense that there is an appetite for this box, despite its weight, in the AI community.

Jen-Hsun clearly had a marketing dude whispering in his ear because, for each of the ten new products announced, we were exposed to the new tag-line “The more you buy, the more you save” at least twice. It echoed around the hallways for the remainder of the show. Whether it was effective, only time will tell.

In between sessions I was happy to be interviewed by the team at InsideHPC about my take on how AI and GPU use is developing and how AI is helping 'turbo-charge' traditional high performance computing (HPC) applications. You can watch the video interview here.

The two other announcements that resonated with me were the integration with Kubernetes for GPU container management and the DRIVE™ Constellation, available in Q3, which simulates a multitude of sensors for autonomous vehicle testing. It is built on two servers. The first runs NVIDIA DRIVE Sim software to simulate a self-driving vehicle’s sensors, such as cameras, lidar and radar. The second contains a powerful NVIDIA DRIVE Pegasus™ AI car computer that runs the complete autonomous vehicle software stack and processes the simulated data as if it were coming from the sensors of a car driving on the road. With this solution NVIDIA both reduces the amount of road testing necessary to achieve full test regime coverage - a hot topic following the recent unfortunate crash in Arizona - and provides a one-stop GPU environment for autonomous vehicle developers.

Most attendees were heavy-duty technologists keen to learn the newest GPU exploitation techniques. Nevertheless, my 20-year-old EE degree allowed me to ingest, at a high level, lots of exciting advancements. The following NVIDIA slide shows the rapid evolution of deep neural network (DNN) training techniques over the last few years:

Walking the halls and waiting in line for lunch I heard a lot of chatter about capsule networks. Here is a good explanation for science students. If you can understand this, then you are a candidate to become one of the next 1,000,000 GPU developers!

As the conference progressed I sensed an interesting tension resulting from the NVIDIA 1080 Ti End User License Agreement (EULA) update, which restricted its use in data center environments for any activity other than cryptocurrency mining. A few developers had prototyped their solutions using these inexpensive video GPUs and now wanted to scale them up, but felt unable to do so without moving to V100-class GPUs, which cost 10 times as much and deliver lots more performance. Others who had already passed the prototype-to-production milestone were only too keen to embrace the V100, DGX-1 and DGX-2 class devices and exploit their deep neural networks to the full. This was especially so for well-funded larger companies.

A particularly interesting panel discussed data science best practices as a by-product of promoting the DGX-2. I discovered that data scientists have short attention spans and don’t like to be kept waiting, something I can really relate to. Consequently they focus on projects where the DNN training takes less than a couple of days. When it takes much longer, they focus on finding a new gig with the necessary hardware to train the DNN in less than two days. Hence the best practice is to provide a local workstation with a couple of GPUs for prototyping and a heavy-duty GPU cloud for full product DNN training:

Note: Thanks to Deepgram’s Scott Stephenson for his Speech DNN Training Insight.

Additionally, the current fad of moving everything to the cloud does not scale well to DNN training with GPUs. On AWS, adding GPUs improves DNN training speed until the fifth one is added. At that point the solution becomes limited by memory communication between the GPUs and, without InfiniBand or other memory bandwidth accelerators, which are not available on AWS, the training process slows down compared to 4 GPUs and costs more!
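The shape of that trade-off can be sketched with a toy model. This is purely illustrative and not a measurement of AWS or of any real training job: every parameter below (single-GPU hours, the size of the fast-link group, the per-GPU penalty and the hourly rate) is a hypothetical value chosen only so that the curve turns at the fifth GPU, as described above.

```python
def training_hours(n_gpus, single_gpu_hours=40.0, fast_link_gpus=4,
                   slow_link_penalty_hours=3.0):
    """Toy model of wall-clock training time (hypothetical parameters).

    Compute time scales ideally as 1/n. Every GPU added beyond the
    fast-link group adds a fixed communication penalty, standing in for
    the inter-GPU memory bottleneck.
    """
    compute = single_gpu_hours / n_gpus
    gpus_on_slow_link = max(0, n_gpus - fast_link_gpus)
    return compute + gpus_on_slow_link * slow_link_penalty_hours


def training_cost(n_gpus, hourly_rate_per_gpu=3.0, **model_params):
    """Total bill: every GPU is paid for over the whole training run."""
    return n_gpus * hourly_rate_per_gpu * training_hours(n_gpus, **model_params)


if __name__ == "__main__":
    for n in range(1, 9):
        print(f"{n} GPUs: {training_hours(n):5.1f} h, ${training_cost(n):6.1f}")
```

With these made-up numbers the run is fastest at four GPUs; the fifth makes it both slower and more expensive, because the bill is GPU count times hours and the hours have stopped falling.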

I spent two days trying to convince people in the hallways that there was a much better tag-line than “The more you buy, the more you save”, but it’s not clear that Verne Global’s “All AI training roads lead to Iceland” made a dent versus Jen-Hsun’s keynote. Please help me out and tell your GPU friends! In the meantime, here is a great keynote live-blog summary, a good 15-minute video summary, and my video interview with InsideHPC. Enjoy!

Let’s chat at the Rise of AI conference in Berlin on May 17th - bob.fletcher@verneglobal.com

Written by Bob Fletcher

Bob, a veteran of the telecommunications and technology industries, is Verne Global's VP of Strategy. He has a keen interest in HPC and the continuing evolution of AI and deep neural networks (DNN). He's based in Boston, Massachusetts.
