
AI has rapidly shifted from experimental promise to economic imperative. Across industries, AI is no longer confined to pilot projects or niche applications. It is now embedded in core strategies, shaping competitive advantage and national priorities alike. Yet, as organisations lean harder on AI, a stubborn paradox emerges: the speed of innovation in algorithms and models is outpacing the evolution of the infrastructure meant to sustain them.
A new global study conducted by Keysight Technologies lays this tension bare. Surveying senior executives and engineers from telecoms, cloud providers and large enterprises across the Americas, APAC and Europe, the AI Cluster Networking Report 2025 captures a moment of both momentum and strain.
Nearly 90 per cent of respondents plan to expand or sustain AI infrastructure investment, but 59 per cent cite budget constraints as a major concern. In other words, ambition is soaring but wallets and data centres are struggling to keep up.
The infrastructure strain
The appetite for AI is insatiable. Training and inference workloads for next-generation models such as agentic AI systems or large-scale multimodal platforms demand unprecedented throughput, low latency and reliability. In just a few years, training runs that once took weeks have come to be expected in days, sometimes even hours.
Yet the physical and human infrastructure supporting this demand is faltering. Three challenges dominate:
- Budget pressures — Scaling AI clusters requires immense capital outlay for GPUs, cooling, power, and interconnects. More than half of surveyed organisations admit their infrastructure ambitions are throttled by cost. This reality forces operators to prioritise selective upgrades rather than wholesale expansion.
- Technical bottlenecks — 55 per cent of respondents pointed to limitations in existing networks. Even as 400G dominates, operators are already trialling 800G and 1.6T connections. Networks once considered cutting-edge are now inadequate for model sizes growing by orders of magnitude.
- Skills shortage — A lack of AI-savvy network engineers and data centre architects was cited by 51 per cent of organisations. Without talent to design, validate and optimise these systems, even the best technology risks underperformance.
Together, these constraints create a widening gap between what AI needs and what infrastructure can provide.
Networks: From backbone to bottleneck
Perhaps the starkest finding is that networks are now the new chokepoint. For decades, networks were treated as enabling backbones. Today, bandwidth and latency are make-or-break factors.
While 400G remains the most deployed speed, a third of organisations are already evaluating 800G, and nearly a quarter are testing 1.6T interconnects. Such leaps are not about prestige; they are existential requirements for training trillion-parameter models or supporting complex real-time inference.
Ethernet-based approaches such as Ultra Ethernet, purpose-built for AI and high-performance computing, are gaining traction, with 58 per cent of operators considering adoption. Unlike traditional Ethernet, Ultra Ethernet redefines the stack to guarantee ultra-low latency and deterministic performance. That shift matters: in AI workloads, nanoseconds of delay can snowball into inefficiencies that translate into longer training cycles, higher costs and wasted energy.
The stakes are clear: without faster, smarter and more reliable networking, AI clusters will choke under their own computational hunger. The network is no longer just the backbone — it is the bottleneck, or the breakthrough.
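How small delays snowball into longer training cycles can be made concrete with a back-of-envelope sketch. The step counts and delay figures below are illustrative assumptions, not numbers from the report:

```python
# Back-of-envelope: how per-step network delay compounds over a training run.
# All figures are illustrative assumptions, not data from the report.

def added_wall_time(steps: int, syncs_per_step: int, delay_s: float) -> float:
    """Total extra wall-clock time contributed by network delay alone."""
    return steps * syncs_per_step * delay_s

# Assume a 1M-step run with 10 collective syncs per step.
steps, syncs = 1_000_000, 10

# A 5-microsecond delay per sync adds less than a minute in total...
low = added_wall_time(steps, syncs, 5e-6)

# ...but a 5-millisecond stall per sync (congestion, retransmits)
# adds roughly 14 hours to the same run.
high = added_wall_time(steps, syncs, 5e-3)

print(f"5 us/sync -> {low:.0f} s extra")
print(f"5 ms/sync -> {high / 3600:.1f} h extra")
```

The arithmetic is trivial, but it shows why deterministic, low-latency fabrics matter: the cost of delay is multiplicative in the number of synchronisation points, not additive.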
Smarter, not just faster
Yet brute force is not the answer. The report highlights a decisive shift toward smarter scaling strategies. Rather than pouring billions into new hardware, 61 per cent of respondents are optimising what they already have: tuning configurations, adopting software-defined networking (SDN) and validating workloads through advanced testing tools.
This optimisation-first mindset reflects a pragmatic reality. With energy costs rising and sustainability targets looming, organisations cannot endlessly add more GPUs or servers. Instead, the challenge is to extract maximum efficiency from existing assets, fine-tune workloads and ensure predictability across diverse AI models.
Notably, emulation has emerged as a critical capability: 95 per cent of respondents view real-world emulation as essential, yet many lack the tools to replicate production-scale AI workloads.
Emulators allow operators to simulate how models behave under real-world conditions, exposing bottlenecks and hidden failures before systems are deployed at scale. This bridges the gap between lab performance and production reliability, which is a vital safeguard in high-stakes AI environments.
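As a toy illustration of the kind of bottleneck such emulation surfaces, consider a synchronous training step in which every worker must finish before the step completes. This is a deliberately simplified sketch, not any real emulator, and the worker counts and jitter figures are assumptions:

```python
import random

# Toy emulation of a synchronous AI training step: the step only completes
# when the slowest worker finishes, so tail latency sets the pace.
# Worker counts and timing figures are illustrative assumptions.

def emulate_run(workers: int, steps: int, compute_s: float,
                jitter_s: float, seed: int = 0) -> tuple[float, float]:
    """Return (mean per-worker time, mean cluster step time)."""
    rng = random.Random(seed)
    per_worker, per_step = [], []
    for _ in range(steps):
        times = [compute_s + rng.uniform(0, jitter_s) for _ in range(workers)]
        per_worker.extend(times)
        per_step.append(max(times))  # barrier: everyone waits for the straggler
    return sum(per_worker) / len(per_worker), sum(per_step) / len(per_step)

mean_worker, mean_step = emulate_run(workers=256, steps=200,
                                     compute_s=0.100, jitter_s=0.020)
print(f"mean worker time: {mean_worker * 1000:.1f} ms")
print(f"mean step time:   {mean_step * 1000:.1f} ms")  # dominated by the tail
```

Even in this toy model, the average worker looks healthy while the cluster as a whole runs at nearly the worst-case pace. That gap between per-node metrics and end-to-end behaviour is precisely what production-scale emulation is meant to expose before deployment.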
The energy and sustainability question
Another dimension that cannot be ignored is energy.
AI is already infamous for its power appetite. Training state-of-the-art models consumes megawatts of electricity, at times rivalling the draw of a small town. Rising energy prices and mounting environmental scrutiny mean organisations must pursue not just performance, but sustainable performance.
The report highlights that 41 per cent of operators see reducing power consumption as a strategic imperative. Intelligent infrastructure testing that simulates and monitors power draw under realistic AI loads is no longer optional. If organisations cannot optimise for efficiency, their AI ambitions risk becoming economically and politically untenable.
This energy reality also reframes how organisations think about ROI. Faster model training is not just about speed-to-market; it is also about minimising energy costs per iteration. In this sense, efficiency becomes both a cost-saving mechanism and a sustainability metric.
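The energy-per-iteration framing can be illustrated with simple arithmetic. The cluster power, run length and electricity price below are hypothetical round numbers chosen for clarity, not report data:

```python
# Rough energy-cost arithmetic for a training run.
# Power draw, duration, and price are hypothetical round numbers.

def run_cost(power_mw: float, hours: float, price_per_kwh: float) -> float:
    """Electricity cost of a run: MW converted to kW, times hours, times price."""
    return power_mw * 1_000 * hours * price_per_kwh

# A 10 MW cluster training for two weeks at $0.10/kWh:
baseline = run_cost(power_mw=10, hours=14 * 24, price_per_kwh=0.10)

# The same job finished 20 per cent faster through network and
# workload optimisation draws the same power for fewer hours:
optimised = run_cost(power_mw=10, hours=14 * 24 * 0.8, price_per_kwh=0.10)

print(f"baseline:  ${baseline:,.0f}")
print(f"optimised: ${optimised:,.0f}")
print(f"saving:    ${baseline - optimised:,.0f} per run")
```

On these assumptions, a 20 per cent reduction in training time saves tens of thousands of dollars per run, which is the sense in which faster iteration is simultaneously a cost and a sustainability metric.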
The business implications
For enterprises and governments, the findings carry urgent implications.
First, AI adoption is no longer limited by software innovation; it is increasingly constrained by infrastructure realities. Strategic decisions about networking, orchestration and optimisation will determine which organisations truly harness AI at scale.
Second, the survey underscores that infrastructure optimisation is now a competitive differentiator: 62 per cent of respondents believe squeezing more efficiency out of existing networks directly translates into market advantage. In a world where training timelines, inference speed and uptime define competitiveness, the ability to do “more with less” is strategic, not merely operational.
Third, organisations must invest in talent and tools. The lack of skilled AI network engineers is not just a hiring issue; it is a systemic barrier. Without expertise to design scalable architectures, validate workloads and optimise traffic flows, investments in GPUs and compute will fall short. This calls for a renewed emphasis on workforce development, from training existing staff in AI networking to building new pipelines of specialists.
Finally, collaboration matters. No single enterprise can solve these challenges in isolation. Standardisation efforts, such as those underpinning Ultra Ethernet, will be crucial in avoiding vendor lock-in and ensuring interoperability across ecosystems. Public-private partnerships will also play a role, particularly in regions where energy, policy and infrastructure development intersect.
Toward an intelligent infrastructure era
The report closes on a note of both urgency and opportunity. AI is reshaping every industry, but the infrastructure supporting it must evolve just as quickly. Tomorrow’s AI-ready organisations will not be those that spend the most, but those that build the smartest: embracing Ultra Ethernet, adopting emulation, optimising workloads, and developing resilient, efficient and sustainable networks.
The lesson is clear. As AI models push boundaries, infrastructure must shift from being a silent enabler to a visible strategic priority. Leaders who recognise this will transform bottlenecks into leverage points, ensuring that AI’s promise translates not just into prototypes but into production-level impact.
AI may be outpacing infrastructure today, but with the right strategies, organisations can ensure they are not left behind tomorrow. The future belongs to those who build not just bigger systems, but smarter ones.
