CPUs are the Next GPUs under Agentic AI. Here’s Who May Benefit

The next phase of the AI trade isn't GPUs or memory...

May 24, 2026

Important Disclaimer: The following stock discussion and analysis is subject to The Inferential Investor’s Disclaimer. This article is for informational and educational purposes only and does not constitute financial product advice or a recommendation to buy or sell any financial product or security. It has been prepared without taking into account any individual’s investment objectives, financial situation, or particular needs. Past performance is not a reliable indicator of future performance. All forward-looking statements are scenario-based assessments and not forecasts or investment recommendations. Recipients should seek independent professional financial advice before making any investment decision. AI can make mistakes and should be checked.

“We previously thought the server CPU market would grow around 18% annually. Now, because of agentic AI, we expect the server CPU TAM to grow greater than 35% annually to more than $120 billion by 2030.” Lisa Su, AMD CEO, May 5 2026.

That quote from AMD’s CEO is a useful way to understand the shift occurring right now in AI infrastructure investment towards CPUs. The first AI wave was dominated by GPUs because the core workload was training and serving large models. But AMD’s argument is that agentic AI changes the system architecture. In the chatbot era, AI systems often looked like roughly one CPU for every four to eight GPUs. In agentic deployments, that ratio is moving toward 1:1, with signs that it will shift even higher, because agents require far more CPU-heavy orchestration, retrieval, tool execution, memory management, validation, and security work around the model.

Discussion of this shift has been mounting amongst investors over the last month, but in reality the trend started to become apparent back in late 2025. Most of us, myself included, missed the early signs. In November 2025, AWS and OpenAI announced a massive $38bn compute partnership. We’ve become accustomed to interpreting these deals as yet another deployment of GPUs. However, the text of that deal included a provision allowing OpenAI to source “hundreds of thousands of state-of-the-art NVIDIA GPUs [as expected], with the ability to expand to tens of millions of CPUs to rapidly scale agentic workloads.”

I didn’t immediately comprehend the shift in that sentence when I first saw the deal details. But it is quite profound: hundreds of thousands of GPUs, leading to tens of millions of CPUs, to support agentic AI.

This is why JP Morgan appears even more bullish on the CPU shift than AMD. They state in this report: “Current AI applications have a CPU to GPU ratio of 1:3 or 1:4, but the ideal ratio for agentic AI workloads could be as high as 7:1.”

Here’s Intel CEO Lip-Bu Tan on the subject: “The inference side, in terms of orchestration, control plane, and also managing all the different agents with data, CPU is much more efficient. The ratio of CPU to GPUs used to be 1 to 8, and now it’s 1 to 4, and I think [it’s moving] towards parity or even better. The demand is very strong.”

Consider for a moment the revenue growth implications for the fabless CPU companies over a five-year horizon. If this is correct, and the market moves from 1 CPU for every 4 GPUs to 1:1 or even 7:1, then server CPU demand must grow at a multiple of the GPU growth rate from here. At the market level, however, we do have to account for the 80% of CPUs that are consumed in the low-single-digit-growth consumer PC market. We also have to take into account supply constraints at TSMC, Intel and Samsung Foundry, and the lead times needed to expand production capacity. However, even with that growth dilution, it is apparent from those anecdotes that CPU market growth is accelerating to potentially match or exceed GPU growth. That has major implications for the relative performance of certain stocks.
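
To make that arithmetic concrete, here is a minimal sketch of the annual server CPU unit growth implied by a shift in the CPU-to-GPU ratio. The 30% annual GPU unit growth rate, the five-year window and the function name are illustrative assumptions, not figures from AMD, Intel or JP Morgan. It also assumes the ratio applies to the whole AI server base at the start and end of the period, and it ignores the PC dilution and supply constraints discussed above.

```python
def required_cpu_cagr(gpu_cagr, start_ratio, end_ratio, years):
    """Annual CPU unit growth implied by a changing CPU-per-GPU ratio.

    CPU units = ratio * GPU units, so over the period the CPU base grows by
    (end_ratio / start_ratio) * (1 + gpu_cagr) ** years in total.
    """
    total_multiple = (end_ratio / start_ratio) * (1 + gpu_cagr) ** years
    return total_multiple ** (1 / years) - 1


GPU_CAGR = 0.30  # assumption for illustration only
YEARS = 5        # assumption: the five-year horizon used in the text

# Start from 1 CPU per 4 GPUs (0.25) and move to 1:1 or to 7:1.
for label, end_ratio in [("1:1", 1.0), ("7:1", 7.0)]:
    cagr = required_cpu_cagr(GPU_CAGR, start_ratio=0.25,
                             end_ratio=end_ratio, years=YEARS)
    print(f"CPU:GPU {label} -> implied AI server CPU unit growth of {cagr:.0%} a year")
```

Under these assumptions, a move from 1:4 to 1:1 alone multiplies the CPU base fourfold on top of whatever GPU growth occurs, which is why the implied CPU growth rate sits well above the GPU growth rate fed into the function.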

So that is the key implication for investors. These anecdotes imply a change in the shape of AI infrastructure demand. CPUs are no longer just a support layer attached to GPU clusters. In agentic AI, they become a larger and more valuable part of the production stack. That dynamic supports both price and volume growth within a segment that has traditionally seen price deflation.

AMD and the OpenAI deal both point to a future where CPUs outnumber GPUs within hyperscaler AI environments.

Why Agentic AI Changes the Hardware Demand Profile

First-wave generative AI was mostly prompt-in, answer-out. The user sent a prompt, the transformer model (a neural network) generated tokens on GPUs, and the answer came back. In that architecture, the matrix multiplication required by neural networks did the heavy lifting, and the GPU was the main enabler of that activity.

Agentic AI is different. Agents plan tasks, retrieve information, call tools, browse files, execute code, interact with APIs, validate outputs, apply policy checks, and repeat those steps until they achieve an outcome. This turns AI from a single inference event into a multi-step workflow, much of which is not actually suited to a neural network approach. Inference remains part of the agentic process, because the model interprets instructions and writes code, and that requires GPUs. But a much larger part is the more conventional, logic-related activity that happens outside the neural network itself and runs on the CPU (and memory). This requirement for CPUs only increases if we move towards a more neuro-symbolic hybrid AI structure, which many leading labs are now working on to generate the next big leap forward in AI.

That matters because multi-step workflows place greater infrastructure workloads outside the GPU. Retrieval may be memory and storage intensive. Tool execution may be CPU intensive. Code agents may need Bash, Python, file I/O, and test execution. Enterprise agents may require permissioning, sandboxing, audit trails, and compliance checks. Multi-agent systems add orchestration layers that coordinate specialist agents and model endpoints.
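
The loop described above can be sketched as a toy program. Everything here is hypothetical: the function names, the stop rule and the stage labels are invented for illustration, no real model or tool is called, and the call counts say nothing about real-world latency. The point is only to show the structure, in which each model call is wrapped in several CPU-side steps.

```python
from collections import Counter

stage_counts = Counter()


def model_step(task, observations):
    """Stand-in for a GPU-served LLM call that picks the next action."""
    stage_counts["model (GPU)"] += 1
    # Toy stop rule: finish once two validated observations are collected.
    return "finish" if len(observations) >= 2 else "use_tool"


def retrieve(task):
    """Stand-in for CPU-side retrieval (search, database or file lookup)."""
    stage_counts["retrieval (CPU)"] += 1
    return f"context for {task}"


def run_tool(task, context):
    """Stand-in for CPU-side tool execution (Bash, Python, API calls)."""
    stage_counts["tool execution (CPU)"] += 1
    return f"tool output using {context}"


def validate(output):
    """Stand-in for CPU-side validation and policy checks."""
    stage_counts["validation (CPU)"] += 1
    return bool(output)


def run_agent(task, max_steps=10):
    observations = []
    for _ in range(max_steps):
        if model_step(task, observations) == "finish":
            break
        output = run_tool(task, retrieve(task))
        if validate(output):
            observations.append(output)
    return observations


run_agent("summarise quarterly filings")
for stage, calls in stage_counts.items():
    print(f"{stage}: {calls}")
```

In this toy run, every pass through the loop adds one model call but three CPU-side calls, which is the shape of workload the paragraph above describes.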

Evidence: CPU Stages Are Becoming the Critical Path

Quantitative evidence supports this shift. In representative agentic workloads, CPU-resident stages often dominate total end-to-end completion time. Retrieval-heavy RAG systems can spend 81% to 89% of total latency in retrieval. Coding-agent workloads can spend 25% to 65% of latency in Bash or Python execution. In tool-dominated agentic workflows, tool processing can consume up to 88% of total latency. These stages typically run on the CPU, not the GPU.

That does not mean every AI workload becomes CPU-led. Monolithic inference remains GPU-dominant. In Toolformer-like flows (where the LLM decides when and how to use tools), GPU-heavy inference accounted for 77% to 88% of total latency. Large-model training and high-end reasoning inference also remain accelerator-heavy.

But the evidence shows that once the model is only one stage in a broader workflow, the CPU becomes a larger part of the critical path. It is this style of workflow that is rapidly growing as agentic AI usage expands in 2026. It is also a trend that the major hyperscalers have noted and are planning for, and it is consequently showing up in their equipment orders and internal ASIC plans.
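
One way to see why these latency shares matter is Amdahl's law: speeding up one stage only helps in proportion to that stage's share of total latency. The sketch below applies it to the retrieval shares quoted above. Treating everything outside retrieval as model-side work is a simplifying assumption, and the 2x retrieval speedup is purely illustrative.

```python
import math


def end_to_end_speedup(stage_share, stage_speedup):
    """Amdahl's law: overall speedup when one stage, taking `stage_share`
    of total latency, is made `stage_speedup` times faster."""
    return 1.0 / ((1.0 - stage_share) + stage_share / stage_speedup)


# Retrieval shares of latency quoted above for retrieval-heavy RAG systems.
for retrieval_share in (0.81, 0.89):
    other_share = 1.0 - retrieval_share  # assumption: the rest is model-side
    model_ceiling = end_to_end_speedup(other_share, math.inf)
    faster_cpu = end_to_end_speedup(retrieval_share, 2.0)
    print(f"retrieval at {retrieval_share:.0%} of latency: "
          f"infinitely fast model -> {model_ceiling:.2f}x, "
          f"2x faster retrieval -> {faster_cpu:.2f}x")
```

Under these assumptions, even an infinitely fast model stage improves end-to-end latency by only about 1.1x to 1.2x, while doubling retrieval speed delivers considerably more. That is the sense in which the CPU stage sits on the critical path.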

The Current Demand Tape Still Says GPUs First, but Acceleration is Showing in CPUs Now

Recent market data still shows that GPUs are the dominant AI infrastructure revenue pool. IDC data shows that servers with embedded GPUs represented more than half of server-market revenue in 2025. Vendor results tell the same story. In the latest quarter, NVIDIA’s GPU-dominated revenue grew 86% year-on-year. Meanwhile, AMD’s blended CPU and GPU Data Center segment grew 57% year-on-year, while Intel’s CPU-heavy Data Center segment grew revenue 22% year-on-year. The implication of that comparison is fairly clear: CPU growth is only just taking off, with agentic CPU demand really only starting to build from late 2025.

Cloud infrastructure server shapes are also only just starting to show the shift. Compared with prior server shapes, the CPU component is now increasing rapidly. The following cloud vendor specifications demonstrate this:

Amazon Web Services (AWS)

General Purpose: The older M5 instance family maxed out at 96 vCPUs. By contrast, the M7i and M7a generations push limits up to 192 vCPUs.
Compute Optimized: The older C5 instances topped out at 72 vCPUs. The x86-based C7i and C7a generations reach 192 vCPUs, and the Graviton4-based C8g instances also offer up to 192 vCPUs, with up to three times the vCPUs and memory of the Graviton3-based C7g.

Google Cloud Platform (GCP)

C-series Compute Optimized: The older C2 series topped out at 60 vCPUs. The newer C3 instances pushed up to 192 vCPUs, while the C4D machine types support up to 384 vCPUs alongside 3,024 GB of RAM.

The host CPU is no longer trivial. It is part of the performance envelope of these infrastructure shapes and is taking a growing portion of the bill of materials. This sudden shift has caught the industry by surprise. Both Intel and AMD have stated they are now sold out of their server CPU production capacity as Microsoft, AWS and Google Cloud re-architected their orders this year for agentic workloads.

Server CPU lead times have gone from 1-2 weeks to 8-22 weeks, and even 30 weeks for some AMD products. Price rises have commenced, with wholesale price hikes of 10%-15% for CPUs, which appears to be just the start given the shortage that has emerged.

Specialized CPU Requirements for Agentic AI

Agentic workloads also require a specific kind of CPU capability. It is not enough to say “more CPUs.” The relevant CPUs need power and memory efficiency, high core counts, strong single-thread responsiveness, high memory bandwidth, large memory capacity, fast I/O, and security features.

Core density matters because agents create many concurrent tasks. Memory bandwidth is critical because retrieval, cache management, repeated tool outputs, and CPU-side data processing are often memory-bound. I/O matters because agents constantly move data between GPUs, NICs, NVMe storage, databases, and external systems.

This is why CPU product roadmaps are themselves changing. Intel’s Xeon 6, AMD’s EPYC 9005, Arm-based server CPUs, AWS Graviton, Google Axion, AmpereOne, NVIDIA Grace, and NVIDIA Vera all fit into different parts of the agentic infrastructure story.

Who Benefits if CPU Demand Accelerates?

The clearest beneficiary is AMD. It has exposure to both sides of the heterogeneous stack: EPYC CPUs and Instinct GPUs. If agentic AI increases CPU demand without reducing GPU demand, AMD participates in both layers. This makes Lisa Su’s CPU TAM reset especially important. It gives AMD a second AI growth vector beyond its attempt to compete with NVIDIA in accelerators. In the past, AMD’s CPU exposure was considered a dilution of the GPU story. Now the CPU exposure is becoming a strength, provided AMD can secure more capacity at the foundries and stay ahead of hyperscalers’ own in-house CPU ASIC plans.

Intel has the highest asymmetry to a CPU re-rating. Its accelerator position is weaker, but if AI infrastructure spend broadens toward CPUs, Intel’s Xeon roadmap, AMX acceleration, memory bandwidth improvements, and confidential computing features become more relevant. The caveat is capacity constraints and execution risk. Intel needs to convert the CPU opportunity into competitive growth with predictable margins and is undertaking a $100bn multi-year and multi-node capacity expansion plan. Consensus forecasts have yet to incorporate much of a revenue acceleration into Intel’s outlook, given capacity issues and PC dilution.

NVIDIA remains a winner, but the nature of the thesis changes. A CPU acceleration is not necessarily bad for NVIDIA because NVIDIA is increasingly a full-stack infrastructure supplier. Grace CPUs, Vera CPUs, GB200 systems, networking, storage fabric, Dynamo, and rack-scale platforms allow NVIDIA to capture more of the heterogeneous system. The risk is not that CPUs replace GPUs. The risk is that the market eventually values AI infrastructure less as a pure GPU scarcity story and more as a broader platform story. The Vera and Grace CPUs add another layer to NVIDIA’s growth horizon that the market wasn’t focused on six months ago.

Arm also benefits because agentic AI increases the relevance of efficient, high-density server CPUs that don’t need to be designed around the x86 architectures of Intel and AMD. Arm’s server market share is projected to surge to nearly 20% by the end of 2026 (up from 12.5% in 2025), with long-term forecasts from firms like UBS estimating Arm will capture 40% to 45% of total server CPU unit shipments by 2030. This is partially driven by the shortage in production capacity at Intel and AMD. But it is also about the efficiency of Arm CPUs over x86 rivals. Arm processors inherently yield about 30% higher power efficiency and 20% to 30% greater memory efficiency than x86 alternatives. AWS Graviton, Google Axion, AmpereOne, and NVIDIA Grace are all aligned to the Arm platform, supporting the idea that Arm-based CPUs can play a larger role in inference, orchestration, and control-plane workloads.

Dell, HPE, Lenovo, and Supermicro benefit as enterprise server deployment becomes more important. Agentic AI requires private infrastructure, validated systems, storage integration, security, and management layers. This favours OEMs and integrators that can package AI systems for corporate use. Initial fears that these companies cannot pass on rising memory costs appear to be fading as results show accelerated earnings growth in the sector.

Key Risks to the CPU Thesis

The biggest uncertainty is where agentic work actually happens. If orchestration, retrieval, and tool use are built behind managed APIs (i.e. inside the hyperscaler cloud environment), then some CPU upside may be captured inside hyperscalers, via their own in-house silicon, rather than by merchant CPU vendors like AMD. That would still validate the CPU demand thesis, but it may reduce the direct stock-market benefit for AMD or Intel. Arm is more balanced on this outcome, given that it is the IP layer behind most hyperscaler (and NVIDIA) CPUs and is also expanding its merchant silicon segment with its own Arm AGI CPUs for the first time.

A second sensitivity is model architecture. If GPU-side optimizations, long-context models, and inference engines keep absorbing more of the workflow, then GPUs stay dominant for longer. TensorRT-LLM, NVIDIA Dynamo, FP8, FP4, speculative decoding, KV-cache optimization, and expert parallelism all continually improve GPU utilization. If CPU shortages become a major constraint on the expansion of agentic workflows, could a breakthrough move more of that work onto GPUs, in the same way that the GPU was originally repurposed to enable the AI token generation that formed the foundation of LLMs?

Investor Takeaway

The first AI infrastructure wave was about training scale and GPU scarcity. The next wave may be about the complexity of agentic orchestration.

Agentic AI does not end the GPU cycle. GPUs remain essential for frontier training, large-model inference, reasoning workloads, and reinforcement learning. But agentic AI broadens the infrastructure stack. It increases demand for CPUs, memory, networking, storage, orchestration software, and secure enterprise systems.

That is why there’s so much attention all of a sudden on the good old CPU again. A move from roughly 18% expected annual server CPU growth to greater than 35% reflects a plausible architectural shift. If AI agents become the default enterprise interface to software, data, and workflows, the CPU may move from background attachment to foreground growth driver.
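
The gap between those two growth rates compounds quickly. The sketch below backs out the starting market size implied by Lisa Su's figures and compares the two paths. The five-year compounding window (a 2025 base growing to 2030) is an assumption, since the quote does not state a base year, and because the quote says "greater than" and "more than", the inputs are floors rather than point estimates.

```python
TARGET_TAM_BN = 120.0  # "more than $120 billion by 2030" (from the quote)
NEW_CAGR = 0.35        # "greater than 35% annually" (from the quote)
OLD_CAGR = 0.18        # "around 18% annually" (from the quote)
YEARS = 5              # assumption: compounding from a 2025 base to 2030


def compound(value, rate, years):
    """Value after growing at `rate` per year for `years` years."""
    return value * (1 + rate) ** years


# Back out the base-year market size consistent with the new target.
implied_base_bn = TARGET_TAM_BN / (1 + NEW_CAGR) ** YEARS
# Where the same base would have landed on the previous growth path.
old_path_2030_bn = compound(implied_base_bn, OLD_CAGR, YEARS)

print(f"Implied base-year server CPU TAM: ${implied_base_bn:.0f}bn")
print(f"2030 TAM on the old 18% path:     ${old_path_2030_bn:.0f}bn")
print(f"2030 TAM on the new 35% path:     ${TARGET_TAM_BN:.0f}bn "
      f"({TARGET_TAM_BN / old_path_2030_bn:.1f}x the old path)")
```

On these assumptions, the revised growth rate roughly doubles the 2030 market relative to the previous expectation, which gives a sense of the scale of the reset.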

The right investment frame is therefore not CPU versus GPU. It is GPU plus CPU plus networking plus storage plus software orchestration. The companies best positioned for this world are those that can sell heterogeneous systems, not just standalone chips. This is why NVIDIA, a traditional GPU powerhouse, is ramping its efforts in Vera and Grace CPUs, and why AMD’s stock has suddenly surged again: it is uniquely placed with both CPU and GPU credibility. We also see this shift in some institutional investors’ positioning, with Steve Cohen’s Point72 reported to have reduced NVDA and multiplied its position in AMD a number of times over in recent 13Fs.

For investors, the underappreciated point is simple: the GPU boom may have created the first AI winners, but the agentic AI boom could broaden the winner set. CPUs may be one of the most important second-order beneficiaries.

Andy West

The Inferential Investor
