
Expensive GPUs aren't living up to their AI potential

  • Inefficient resource sharing and manual allocation hold back GPU ROI
  • Organizations struggle with underutilized GPUs and lack automation software for scalable, self-service access
  • Telcos are offering GPU clouds to compete with hyperscalers, aiming to optimize AI and non-AI workloads

Businesses seeking return on their AI investments are leaving money on the table due to inefficient GPU usage. 

Organizations are under pressure to make AI resources accessible in a self-service model, said Rafay Systems CEO Haseeb Budhani. These organizations face “urgency to make GPUs readily accessible for developers and data scientists, pushing enterprises to adopt platforms that support scalable and efficient deployment models,” he told Fierce Network. “However, many lack the necessary infrastructure to keep pace with these evolving needs.” Rafay provides software that automates the provisioning of cloud resources, including GPUs.

Inefficient processes and platforms leave expensive GPUs sitting idle. Without proper virtualization and multi-tenancy capabilities, some organizations might allocate entire GPU servers to individual users or teams, even when they need only a fraction of that capacity, Budhani said. “Organizations must either purchase more GPUs than necessary to serve multiple users, run GPU workloads in public clouds at incredibly expensive rates or maintain underutilized GPU hardware in data centers,” he added.
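To illustrate the multi-tenancy Budhani describes: on platforms such as Kubernetes, NVIDIA's Multi-Instance GPU (MIG) feature lets a workload request a fraction of an A100 rather than the entire card. A sketch of such a pod spec is below (it assumes NVIDIA's Kubernetes device plugin with MIG enabled; the pod name and container image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: notebook-gpu-slice          # placeholder name
spec:
  containers:
  - name: workbench
    image: example.com/ds-workbench:latest   # placeholder image
    resources:
      limits:
        # Request one 1g.5gb MIG slice (1/7 of an A100) instead of a
        # whole GPU, so up to seven such workloads can share one card.
        nvidia.com/mig-1g.5gb: 1
```

Without this kind of fractional request, the scheduler would hand the workload a full GPU, leaving the rest of its capacity idle.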

Also, many organizations lack software to automate easy GPU access, delaying development and leaving nearly a third of enterprises using less than 15% of their GPU capacity, per a recent Rafay study. The study found 44% of organizations prioritize enabling self-service compute consumption, but “most” lack the standardized platforms needed to make this a reality. Instead of allowing developers to simply press a button to access GPU resources, organizations struggle with manual allocation processes that create delays and inefficiencies, Budhani said. 

For those companies, building a platform to enable self-service access to accelerated computing hardware and AI/ML workbenches can be a time-intensive process, often taking “one to two years.”

GPUs for non-AI ROI

Data challenges compound the GPU ROI problem, said Molly Presley, CMO of Hammerspace, which provides a platform for unifying unstructured data across multiple clouds. Organizations need to clean, move and label unstructured data before they can leverage GPUs effectively, she told Fierce.

Data-intensive companies like Meta are leading in realizing ROI from GPU investments by leveraging their large data sets and mature AI strategies. The good news for organizations that aren’t far along in their AI strategy? GPUs can generate ROI elsewhere in the meantime.

A recent Hammerspace report showed that GPUs are being used in non-AI, big data and high-performance computing (HPC) applications. Companies can get ahead of the game by optimizing their GPU operations to support both the AI and non-AI use cases, Presley said. The right infrastructure will make it easy to move GPUs to new AI applications when they arise.

For AI to move beyond the exploration phase, companies must invest in optimized infrastructure and unified global data platforms, Presley advised. These can enable efficient data handling and seamless scaling of AI workloads. By addressing bottlenecks, enterprises and telcos can fully harness their GPU investments.

Telcos and GPU clouds

In the thick of the GenAI craze, telcos are embracing GPUs by offering services like GPU clouds to compete with hyperscalers. However, these telcos face hurdles, including competition from public clouds and the challenge of efficiently sharing GPUs across AI and non-AI applications.

Budhani noted a majority of CSPs are now offering GPUs as a service, either as bare metal servers, virtual machines or Kubernetes clusters. However, this approach is “not very competitive since customers may end up selecting the cheapest solution for their needs,” he said. 

On the other hand, some telcos are launching specialized clouds for GPUs (called, naturally enough, GPU clouds). Similarly, IBM and AMD recently announced a collaboration to deploy AMD accelerators as a service on IBM Cloud. The offering, expected to be available in the first half of 2025, aims to enhance performance and power efficiency for generative AI models and HPC applications.

Such GPU clouds are providing “a unique opportunity to telcos to compete with public clouds,” Budhani said.