There’s a big difference between “collecting dust” and “GPUs being underutilized”. Most existing AI data centers are significantly underutilized: GPUs are often busy well under two-thirds of the time, and overall facility capacity is used even less efficiently.
- A 2024–2025 large-scale AI infrastructure survey reports that over 75% of organizations see peak GPU utilization below 70%, meaning most accelerators sit idle a substantial fraction of the time even during “busy” periods.
- Industry practitioners estimate effective model FLOPs utilization (MFU) for many LLM fine-tuning workloads in the roughly 35–45% range, implying that much of the theoretical compute capacity of installed GPUs is never turned into useful training (a worked example follows this list).
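As a rough back-of-the-envelope, here is a minimal sketch of how MFU is typically computed. The model size, token throughput, and the ~989 TFLOPS dense BF16 peak of an H100 are illustrative assumptions, not measurements from the survey above:

```python
# Minimal MFU sketch. All concrete numbers below are hypothetical,
# chosen only to illustrate how the 35-45% range can arise.

def model_flops_utilization(achieved_tokens_per_s: float,
                            flops_per_token: float,
                            peak_flops: float) -> float:
    """MFU = FLOPs actually spent on the model / theoretical peak FLOPs."""
    achieved_flops = achieved_tokens_per_s * flops_per_token
    return achieved_flops / peak_flops

# Common transformer rule of thumb: ~6 * N FLOPs per token for training
# a model with N parameters (forward + backward pass).
n_params = 7e9                       # hypothetical 7B-parameter model
flops_per_token = 6 * n_params
peak = 8 * 989e12                    # 8 GPUs x ~989 TFLOPS dense BF16 (H100)

mfu = model_flops_utilization(achieved_tokens_per_s=75_000,
                              flops_per_token=flops_per_token,
                              peak_flops=peak)
print(f"MFU: {mfu:.1%}")             # ~39.8%, inside the range cited above
```

Note that MFU is a stricter metric than the busy-time utilization in the first bullet: a GPU can report 100% busy while achieving low MFU because kernels are stalled on memory, I/O, or communication.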
The main cause, however, is inefficient use of GPUs, not a lack of work to run on them.
- Common causes include suboptimal job scheduling, fragmentation across many small teams, over‑provisioning against peak demand, and software bottlenecks (I/O, networking, data pipelines) that stall GPUs.
- As a result, even though organizations experience GPU *scarcity* and keep ordering more hardware, the prevailing view in recent analyses is that most AI GPU fleets are materially underutilized rather than consistently running near full capacity (the sketch below shows how such busy-time figures can be sampled).
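For concreteness, here is a minimal sketch of how the busy-time utilization figures above are typically sampled on a single node, using NVIDIA's NVML Python bindings (`pip install nvidia-ml-py`). The sampling interval and averaging are assumptions; production fleets usually rely on exporters such as DCGM feeding a metrics pipeline:

```python
# Sample per-GPU busy-time utilization via NVML and average over a window.
import time
import pynvml

def sample_gpu_utilization(interval_s: float = 1.0, samples: int = 10):
    pynvml.nvmlInit()
    try:
        count = pynvml.nvmlDeviceGetCount()
        handles = [pynvml.nvmlDeviceGetHandleByIndex(i) for i in range(count)]
        totals = [0.0] * count
        for _ in range(samples):
            for i, h in enumerate(handles):
                # .gpu is the percent of time a kernel was executing over
                # the last sampling window -- busy time, not MFU.
                totals[i] += pynvml.nvmlDeviceGetUtilizationRates(h).gpu
            time.sleep(interval_s)
        return [t / samples for t in totals]
    finally:
        pynvml.nvmlShutdown()

if __name__ == "__main__":
    for i, util in enumerate(sample_gpu_utilization()):
        print(f"GPU {i}: avg {util:.0f}% busy")
```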
So there is substantial room for improvement through data-center-level co-optimization and workload management that better match the loads to the GPU resources; the toy scheduler below sketches the idea.
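As one illustration of what “matching loads to GPU resources” can mean in practice, here is a toy best-fit placement policy. `Node`, `Job`, and the GPU counts are hypothetical; real schedulers (Slurm, Kubernetes with Kueue, Ray) handle far richer constraints such as topology, gang scheduling, and preemption:

```python
# Toy best-fit scheduler: place each job on the node with the least
# leftover GPUs, reducing fragmentation from many small jobs.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    free_gpus: int

@dataclass
class Job:
    name: str
    gpus: int

def best_fit(jobs: list[Job], nodes: list[Node]) -> dict[str, str]:
    placement: dict[str, str] = {}
    # Placing large jobs first reduces stranded capacity.
    for job in sorted(jobs, key=lambda j: -j.gpus):
        candidates = [n for n in nodes if n.free_gpus >= job.gpus]
        if not candidates:
            continue  # would queue or trigger autoscaling in practice
        node = min(candidates, key=lambda n: n.free_gpus - job.gpus)
        node.free_gpus -= job.gpus
        placement[job.name] = node.name
    return placement

nodes = [Node("n0", 8), Node("n1", 8)]
jobs = [Job("ft-7b", 4), Job("eval", 1), Job("pretrain", 8), Job("notebook", 1)]
print(best_fit(jobs, nodes))
# {'pretrain': 'n0', 'ft-7b': 'n1', 'eval': 'n1', 'notebook': 'n1'}
```

Best-fit with largest-job-first is a classic anti-fragmentation heuristic; the gains operators chase come from applying such policies fleet-wide, combined with telemetry like the sampling loop above.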
Source: “GPU supply doubled, but AI teams still starved. Why?” (LinkedIn): https://www.linkedin.com/posts/nehi...-why-are-ai-activity-7389840480945205249-B5-n