The Oracle Thesis

The Factory Floor of the AI Economy

AI Data Centers, the Infrastructure Stack, and the Structural Case for Compute Supply

I. The Demand Cascade: From Tokens to Megawatts

Training compute growing at 4.4 times per year requires exponentially more GPUs, which require exponentially more power, cooling, networking, and physical space. Inference compute — continuous, cumulative, and growing with every user and every enterprise deployment — adds a second, even larger demand vector on top of training. The AI data center is the physical infrastructure where both demand curves converge. It is the factory floor of the AI economy.

And it is, overwhelmingly, a seller’s market.

Exhibit 1: Projected Power Growth for Frontier AI Training
Frontier AI training power has grown ~2.2× per year historically; Epoch AI’s forecast implies 2.2–2.9× annual growth through 2030, lifting the power required for a single frontier training run from tens of megawatts toward multi-gigawatt scale — on par with the largest anticipated data center campuses (OpenAI Abilene, Meta Louisiana, Stargate). Source: Epoch AI.
Exhibit 2: Power Demand from Frontier AI Training
Training compute growth alone would lift power demand ~4× per year; longer-duration training runs trim that to ~3×, and improved chip efficiency brings the net growth closer to ~2× per year. Even after mitigation, training power demand roughly doubles annually.
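The arithmetic behind Exhibit 2 is multiplicative: power demand rises with training compute, then is divided down by how much longer each run stretches and by how much more efficient each chip generation becomes. A minimal sketch of that decomposition; the duration and efficiency factors are illustrative assumptions, with only the compute-growth figure taken from the text:

```python
# Decomposition of annual growth in frontier-training power demand (Exhibit 2).
# Only the compute-growth figure comes from the text; the duration and efficiency
# factors are illustrative assumptions chosen to reproduce the exhibit's ~3x
# and ~2x intermediate steps.

compute_growth = 4.4        # training compute per frontier run grows ~4.4x per year
duration_growth = 1.35      # assumed: training runs stretch ~35% longer each year
efficiency_growth = 1.5     # assumed: chips deliver ~50% more FLOP/s per watt each year

after_duration = compute_growth / duration_growth          # longer runs need less peak power
after_efficiency = after_duration / efficiency_growth      # better chips need fewer watts

print(f"Compute-only power growth:    ~{compute_growth:.1f}x per year")
print(f"After longer training runs:   ~{after_duration:.1f}x per year")
print(f"After chip efficiency gains:  ~{after_efficiency:.1f}x per year")
# Net result: training power demand still roughly doubles every year.
```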

Exhibits 1 and 2 capture only the training side of the equation. Inference demand compounds on top — continuous, cumulative, and growing with every user, every enterprise deployment, and every autonomous workflow. As established in Thesis II, inference demand is already substantial and, on current trajectories, is likely to be multiples greater than the demand from training. Both curves — training and inference — ultimately flow through a single physical pipeline: the conversion of electricity into intelligence.

The AI Infrastructure Tech Stack

To appreciate what exponential AI demand means in concrete, investable terms, it helps to see the full process as a tech stack of five dependent layers. At the base sits power: raw electricity, generated and delivered to site. Every watt of AI capability begins here. Above it, the facility: the physical data center — its building, cooling systems, power distribution, and redundancy — that keeps hardware running continuously. Inside the facility sits compute: the GPUs, high-speed interconnects, and memory that perform the trillions of mathematical operations required to process and generate tokens. Above the hardware, models represent the AI itself — frontier systems trained on trillions of tokens and served through inference. At the top, applications are how the world accesses AI: APIs, chatbots, coding assistants, and every product that will be built on these models.

Exhibit 3: The AI Infrastructure Tech Stack
Five dependent layers: each layer requires the one beneath it. No power, no facility. No facility, no compute. No compute, no models. No models, no applications.

Each layer depends entirely on the one beneath it. No power, no facility. No facility, no compute. No compute, no models. No models, no applications. And critically, both sides of the demand curve flow through the same physical infrastructure. Every training run that ingests tokens and every inference call that generates them requires electricity, a facility to house the hardware, and compute to execute the work.

Exhibit 4: The Training-to-Inference Pipeline
Both training and inference workloads flow through the same physical infrastructure stack.

The companies building frontier models and AI applications — the companies driving explosive demand — do not generate their own electricity, build their own facilities, or manufacture their own chips. They must source all of it externally. The AI data center exists to close the gap between the compute these companies need and the compute available to them. At its core, it is a factory that converts electricity into tokens. That gap between supply and demand is enormous, it is widening, and it represents one of the defining infrastructure opportunities of the coming decade.

The logic is straightforward. If demand for training and inference is exponential, then demand at every layer below them must also be exponential. Each layer is a mechanical prerequisite for the one above it. There is no scenario in which compute demand grows while facility and power demand do not. The stack does not allow it.

Now trace the dominos. More training and more inference require more compute — more GPUs, more networking, more memory. GPU procurement has already become a binding constraint for every major AI company. More compute requires more facilities — more data centers to house, cool, and connect the hardware. Data center capacity that was built over years is being consumed in months. More facilities require more power — more electricity to run the hardware and cool the buildings. Power availability has emerged as a binding constraint on new data center development in every major market.

Notice the pattern at each layer: the constraint is not theoretical. It is the present reality. The dominos are not poised to fall — they are falling. What remains is simple arithmetic: for as long as demand at the top of the stack continues to grow, demand at every layer below it will grow in lockstep.
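To see the lockstep claim in numbers, here is a minimal sketch that pushes a single exponential demand assumption down the stack. Every conversion ratio (megawatts of IT load per unit of compute, facility overhead on top of it) is an illustrative placeholder rather than a sourced figure:

```python
# Illustrative demand cascade: if top-of-stack compute demand grows exponentially
# and the conversion ratios between layers stay roughly fixed, facility and power
# demand must grow at the same exponential rate. All figures are placeholders.

compute_demand = 1.0        # normalized compute demand in year 0
annual_growth = 2.0         # assumed: demand at the top of the stack doubles yearly

MW_PER_COMPUTE_UNIT = 100   # assumed: MW of IT load per normalized compute unit
FACILITY_OVERHEAD = 1.3     # assumed: cooling and networking add ~30% on top of IT load

for year in range(6):
    it_load_mw = compute_demand * MW_PER_COMPUTE_UNIT
    facility_mw = it_load_mw * FACILITY_OVERHEAD
    print(f"Year {year}: compute {compute_demand:5.1f}x   "
          f"IT load {it_load_mw:7.0f} MW   facility power {facility_mw:7.0f} MW")
    compute_demand *= annual_growth
# Because the ratios are fixed, every layer grows by the same 2x per year:
# the cascade is mechanical, not incidental.
```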

Exhibit 5: The Demand Cascade
Exponential demand at the top of the stack cascades mechanically through every layer below, from models to facilities to power.

The Supply-Demand Imbalance

Every major cloud provider reports the same condition: demand exceeds supply. Capacity is being monetized as fast as it can be installed. Backlogs are growing, not shrinking, even as the industry builds at an unprecedented pace. The supply-demand imbalance is not a temporary condition. It is a structural feature of an industry in which the demand for compute is growing exponentially while the physical constraints on building — power, land, permitting, construction, and chip manufacturing — are fundamentally linear.

To deliver this compute, the data center industry is building at a scale and speed it has never attempted. The hyperscalers have committed to capital expenditure programs that dwarf anything in the industry’s history.

Amazon CapEx: ~$200B
Google 2026 CapEx: $175–185B
Microsoft run rate: $37B/qtr
Oracle capacity pipeline: 10+ GW
Exhibit 6: Hyperscaler Capital Expenditure
Capital expenditure commitments from the major cloud infrastructure providers.
Exhibit 7: Global AI Power Capacity vs. National Benchmarks
Cumulative global AI data center power capacity has grown from near-zero in 2023 to roughly 31 GW by 2026 — on par with the peak electricity demand of New York State (31 GW) and well above the Netherlands (19 GW) or New Zealand (7 GW). AI chip TDP accounts for roughly a third of total capacity; cooling, networking, and other data-center overhead make up the balance. Source: Epoch AI.
Exhibit 8: Cumulative AI Chip Sales by Vendor (USD)
Cumulative spending on AI accelerators from 2024 Q1 through 2025 Q4, stacked by vendor. NVIDIA (H100/H200, B200, B300) dominates; Google TPUs (v5e, v6e, v7), AMD MI300X, Huawei Ascend 910B/C, and Amazon Trainium2 each contribute a growing but secondary share. Total cumulative spend exceeds $300B by year-end 2025. Source: Epoch AI.
Exhibit 9: Cumulative AI Chip Sales by Vendor (Units Shipped)
Cumulative unit shipments across the same vendor set, reaching roughly 20 million accelerators by Q4 2025. The unit-versus-dollar comparison underlines NVIDIA’s dominance on both measures while showing that TPU unit volume runs proportionally higher than its dollar share — a function of lower average selling prices per chip. Source: Epoch AI.

The Master Builders

Before we go further, we need to recalibrate. The prevailing media narrative around cloud capital expenditure treats these companies like overleveraged startups chasing a trend. That framing is fundamentally wrong. The hyperscale cloud providers did not stumble into infrastructure — they invented it, scaled it, and have been planning buildouts on multi-year horizons for decades. These are not faceless institutions making bets driven by greed. We take confidence not only in the businesses themselves but also in the specific people running them.

Exhibit 10: The Master Builders
The leaders of the companies deploying hundreds of billions into AI infrastructure: track records measured in decades, not quarters.

II. The AI Data Center: A Different Machine Entirely

Consider two data centers, each drawing 100 megawatts of power from the grid. One is a traditional facility running general-purpose enterprise workloads. The other is an AI data center built around NVIDIA Hopper (H100) GPU clusters. They consume the same electricity. They occupy comparable land. They share a label. Beyond that, they are entirely different machines.

The traditional facility houses roughly 10,000 racks of CPU-based servers — running web applications, databases, email, and enterprise software. Power per rack is modest. Cooling is handled with forced air. The AI facility houses roughly 1,500 racks of GPU-dense compute nodes, each drawing 40–60 kilowatts. It requires liquid cooling piped directly to the chips. Its internal network moves data at 50–100× the bandwidth. And it packs roughly 250 times more raw computational throughput into the same power envelope.

Traditional vs. AI: Holding Power Constant

Exhibit 11: Traditional vs. AI Data Center at 100 MW
Holding power constant at 100 MW reveals order-of-magnitude differences across nearly every operational dimension. Illustrative: figures are directionally accurate but vary depending on specific chips, cooling approaches, and facility architectures.

The differences are not incremental. The AI facility contains fewer racks and fewer processors — but each processor is so much more powerful that the facility delivers ~250× the raw compute throughput from the same 100 megawatts. The capital required is ~5–8× higher — not because the facility contains more hardware, but because it contains fundamentally different, far more powerful hardware. A single H100 GPU costs more than an entire traditional server. A single rack of them can cost over $1 million.
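A quick sanity check of the rack arithmetic behind Exhibit 11, using the figures above. Rack counts and per-rack power are the document’s; the IT-versus-overhead split is an illustrative assumption:

```python
# Sanity check on the 100 MW comparison in Exhibit 11.
# Rack counts and per-rack power are taken from the text above;
# the IT-load-vs-overhead split is an illustrative assumption.

FACILITY_POWER_MW = 100

# Traditional facility: ~10,000 racks sharing 100 MW
traditional_racks = 10_000
traditional_kw_per_rack = FACILITY_POWER_MW * 1_000 / traditional_racks
print(f"Traditional: ~{traditional_kw_per_rack:.0f} kW per rack (modest, air-cooled)")

# AI facility: ~1,500 GPU-dense racks at 40-60 kW each
ai_racks = 1_500
ai_kw_per_rack = 50                                  # midpoint of the 40-60 kW range
ai_it_load_mw = ai_racks * ai_kw_per_rack / 1_000
overhead_mw = FACILITY_POWER_MW - ai_it_load_mw      # cooling, networking, losses (assumed)
print(f"AI: ~{ai_it_load_mw:.0f} MW of rack IT load plus ~{overhead_mw:.0f} MW of overhead")

# Same 100 MW envelope, ~250x the raw computational throughput (per the text)
print("Raw throughput advantage at equal power: ~250x")
```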

Economic Output per Kilowatt

Exhibit 12: Economic Output Comparison
For the same 100 MW, the AI facility delivers ~1,500× the useful compute, commands ~30–50× the price per processor-hour, and produces far more economic output per kilowatt, per square foot, and per dollar of capital deployed. Illustrative: figures are directionally accurate but vary depending on specific chips, workloads, and deployment architectures.

Read these in combination. For the same 100 megawatts, the AI facility delivers roughly ~1,500× the useful compute — its ~250× raw throughput advantage compounded by ~5–6× higher utilization, with each individual GPU roughly 500× more powerful than a CPU. Each GPU-hour commands ~30–50× the price of a CPU-hour. The facility produces ~3× more processor-hours per year. And despite the dramatically higher absolute capital cost, the cost per unit of useful computation is 5–10× lower. The equipment is more expensive, but it is disproportionately more productive.
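A hedged sketch of how those multiples combine into revenue per megawatt. Only the ratios come from the text, and midpoints of the quoted ranges are used; these are illustrative ratios, not measured financials:

```python
# Revenue-per-megawatt comparison implied by the multiples quoted above.
# Midpoints of the quoted ranges are used; illustrative only.

processor_hours_ratio = 3      # AI facility produces ~3x more processor-hours per year
price_ratio = 40               # a GPU-hour commands ~30-50x a CPU-hour (midpoint)
capital_ratio = 6.5            # capital cost is ~5-8x higher (midpoint)

revenue_ratio = processor_hours_ratio * price_ratio            # same 100 MW envelope
revenue_per_capital = revenue_ratio / capital_ratio

print(f"Revenue per MW:              ~{revenue_ratio:.0f}x the traditional facility")
print(f"Revenue per dollar of capex: ~{revenue_per_capital:.0f}x")
# Even after paying 5-8x more for the hardware, each megawatt and each dollar
# of capital produces an order of magnitude more revenue.
```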

The result is a facility that produces far more economic output per kilowatt, per square foot, and per dollar of capital deployed — despite requiring more of each in absolute terms.

Critically, the absolute size of the opportunity is also far larger. Traditional enterprise workloads grow at low single-digit percentages annually. AI compute demand is growing at multiples of that — driven by the training and inference dynamics described earlier. This means the AI data center operator is not only running a more productive facility; they are operating in a market where demand consistently exceeds supply.

NVIDIA’s Revenue-per-Gigawatt Roadmap

NVIDIA’s own roadmap puts the trajectory in concrete terms. At GTC 2026, Jensen Huang presented a framework for measuring the economic output of an AI data center in the metric that matters most: annual revenue per gigawatt. The results across three hardware generations — Blackwell (shipping now), Rubin (next generation), and Vera Rubin + LPX (the generation after that) — are staggering.

To appreciate what that metric represents, it helps to understand the unit economics beneath it. The product that an AI data center sells is the token. Every conversation with Claude or ChatGPT, every line of code generated by Copilot, every summary produced by an enterprise agent is a stream of tokens — and every token is priced. Frontier labs sell API access in dollars per million tokens. Hyperscalers bill AI compute capacity in ways that ultimately resolve to tokens served. Tokens are the product. The data center is the factory. Power is the raw input. And the governing economic ratio is tokens per watt: how many units of salable output a facility can produce for every unit of electricity it consumes.

That ratio is not a fixed physical constant — it is an engineering outcome, and it has been improving relentlessly. Each new GPU generation performs more matrix operations per joule of energy consumed; each generation of networking moves more data per watt of interconnect; each generation of software extracts more useful work from the same silicon. The compounding effect is enormous: a 2× improvement in tokens per watt, for a facility with a fixed power budget, is a 2× improvement in salable output, which — because tokens are priced goods — converts almost directly into a 2× improvement in revenue. A 5× improvement is a 5× revenue uplift. The gains stack across generations and they stack across every gigawatt of installed capacity.
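The relationship can be written down directly: revenue per gigawatt is tokens per joule, times joules per year, times price per token. A minimal sketch with illustrative values for efficiency and token price; these inputs are assumptions, not figures from NVIDIA’s roadmap:

```python
# Revenue per gigawatt as a function of tokens-per-watt (equivalently, tokens
# per joule). The efficiency and token-price inputs are illustrative assumptions.

SECONDS_PER_YEAR = 365 * 24 * 3600

def annual_revenue_per_gw(tokens_per_joule: float, price_per_million_tokens: float) -> float:
    """Annual revenue generated by 1 GW of facility power."""
    watts = 1e9
    tokens_per_year = tokens_per_joule * watts * SECONDS_PER_YEAR
    return tokens_per_year / 1e6 * price_per_million_tokens

base = annual_revenue_per_gw(tokens_per_joule=1.0, price_per_million_tokens=1.0)
improved = annual_revenue_per_gw(tokens_per_joule=2.0, price_per_million_tokens=1.0)

print(f"Assumed baseline:    ${base / 1e9:,.0f}B per GW per year")
print(f"2x tokens per watt:  ${improved / 1e9:,.0f}B per GW per year")
# At fixed power and fixed token prices, doubling tokens per watt doubles
# salable output, and therefore roughly doubles revenue per gigawatt.
```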

This is why revenue per gigawatt is the right lens. Power is the scarce, hard-to-permit, years-to-build constraint on the entire AI buildout. The value of a gigawatt is not fixed; it is a function of how much salable intelligence can be extracted from it. Each new hardware generation raises that value, which in turn raises the economic return on every existing and planned facility.

Exhibit 13: NVIDIA Revenue per Gigawatt by Hardware Generation
The hardware roadmap tells us the AI data center is about to become dramatically more productive. Each generation compounds economic output per watt.

The forward-looking economics reinforce the thesis from every angle. The AI data center is already a fundamentally superior business to the traditional data center. The hardware roadmap tells us it is about to become dramatically more so. And the exponential growth in demand ensures that every incremental gain in throughput per watt will be absorbed by the market. The capex invested today is not merely purchasing current capacity. It is purchasing a platform whose economic output will compound with each successive hardware generation.

III. The Durable Asset: Why GPUs Hold Their Value

In traditional computing, hardware depreciates rapidly. A server purchased today is worth a fraction of its cost in three years, and is typically replaced in five. Moore’s Law and the relentless march of chip performance mean that older hardware is not merely less capable — it is uneconomical to operate. The cost of powering and cooling an old server exceeds the value of the computation it provides. Replacement is not a choice. It is an economic necessity.

AI GPUs are behaving differently, and the reason is the supply-demand imbalance we have already established.

Demand for AI compute is so far in excess of supply that even older-generation GPUs remain productive and valuable. An NVIDIA A100, released in 2020 and now two generations behind the frontier, is still actively deployed, still running inference workloads, and still generating revenue for the data center operators that own it. The reason is simple: there are not enough new-generation GPUs to satisfy demand. And because inference workloads — unlike training — do not always require the absolute latest hardware, older GPUs remain perfectly suitable for a large and growing category of work.

Exhibit 14: GPU Economic Lifecycle
The economic useful life of AI GPUs is proving longer than traditional depreciation schedules assume, driven by persistent demand exceeding supply.

This has a direct and material impact on the return profile of AI data center investments. If a GPU that was expected to generate revenue for three to four years instead generates revenue for seven to ten years — and possibly longer — the lifetime return on that asset improves dramatically.
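A hedged sketch of why the extended life matters to returns. All inputs (purchase price, rental rate, utilization, operating cost) are illustrative assumptions; the rental rate is chosen to sit inside the contract range discussed below:

```python
# Illustrative lifetime economics of one GPU under two useful-life assumptions.
# Purchase price, rental rate, utilization, and operating cost are all assumed.

gpu_cost = 30_000          # assumed purchase price per GPU, USD
rental_rate = 2.00         # assumed $/GPU-hour on contract
utilization = 0.70         # assumed average billable utilization
opex_per_hour = 0.40       # assumed power, cooling, and facility cost per GPU-hour

HOURS_PER_YEAR = 24 * 365
net_per_year = HOURS_PER_YEAR * utilization * (rental_rate - opex_per_hour)

for life_years in (4, 8):
    lifetime_net = net_per_year * life_years
    print(f"{life_years}-year useful life: ~${lifetime_net:,.0f} net revenue, "
          f"~{lifetime_net / gpu_cost:.1f}x the purchase price")
# Doubling the revenue-generating life roughly doubles lifetime cash flow on the
# same upfront capital; what changes is the depreciation assumption, not the asset.
```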

Pricing Evidence & Extended Lifecycle

The strongest defense of a structurally longer useful life is not a theoretical argument — it is a market price. SemiAnalysis data on H100 1-year contract pricing reveals that after an initial decline from approximately $3.00/hr/GPU at launch to a trough near $1.70/hr/GPU in October 2025, rental rates reversed course and surged nearly 40% to $2.35/hr/GPU by March 2026 — well into the product’s third year of commercial deployment.

Exhibit 15: H100 Rental Rate Recovery
H100 1-year contract rates surged ~40% from trough to $2.35/hr/GPU by March 2026. Existing contracts renewing at original rates. All new capacity through August 2026 pre-contracted. Source: SemiAnalysis.

More critically, existing H100 contracts are being renewed at the same rates they were originally signed at two to three years ago, with some extended through 2028 on four-year terms. On-demand capacity across all GPU types is fully subscribed, and all new cluster capacity coming online through August–September 2026 has already been contracted. This is not the pricing behavior of an asset approaching economic obsolescence. It is the pricing behavior of an asset class repositioning into a durable, segmented product tier with persistent demand.

The case extends even further back. The NVIDIA A100, launched in May 2020 and discontinued in January 2024, is now in its sixth year of commercial deployment — having remained in service through three successive architectural generations. Yet as of April 2026, AWS continues to actively sell A100 compute. Specialist providers rent A100 80GB units at ~$1.49/hr, with broader market pricing ranging from roughly $1.20 to over $3.40 per GPU-hour depending on provider and configuration.

Despite being out of production for over two years, the A100 remains one of the most widely deployed GPUs in AI infrastructure, and for many inference and fine-tuning workloads it is still considered the most cost-efficient option relative to newer chips. Persistent supply constraints and rising inference demand have kept A100 capacity economically relevant — and in some segments, increasingly scarce — despite the introduction of multiple newer architectures. A chip several generations old remains commercially active across every major cloud provider, serving workloads that did not exist when it was designed.

The Manufacturing Ceiling

The persistence of older-generation GPU value is not an accident of timing. It is a direct consequence of the physical limits on next-generation accelerator manufacturing. Even as demand for AI grows exponentially, the ability of the semiconductor supply chain to deliver new GPUs grows along something much closer to a straight line — because every frontier AI accelerator flows through an extraordinarily narrow set of machines, fabs, and packaging lines at the top of the supply chain.

Every NVIDIA Blackwell, AMD MI-series accelerator, and Google TPU begins as a pattern etched into silicon by an extreme ultraviolet (EUV) lithography machine. These machines are manufactured by exactly one company on Earth — ASML in the Netherlands — and the global fleet grows by only a few dozen systems per year. ASML shipped 48 EUV systems in 2025, up from 44 in 2024. The more advanced High-NA EUV tools required for the next node transition are currently being produced at roughly five or six units per year, with a stated target of 20 per year by 2028. Every leading-edge chip in the world — AI accelerators, smartphone SoCs, advanced memory — is patterned on a tool drawn from that same finite pool.

Exhibit 16: ASML EUV Lithography Systems Shipped per Year
Annual shipments of EUV lithography machines — required for all leading-edge chip production — grew from 31 systems in 2020 to 48 in 2025. Next-generation High-NA EUV tools are currently being produced at roughly five or six units per year. Source: ASML SEC 6-K filings and annual reports.

These machines are installed almost exclusively at TSMC, which holds roughly 92% of the sub-5nm foundry market. But even after a compute die has been fabricated on a TSMC wafer, it is still not a deployable accelerator. The die must be bonded to multiple stacks of high-bandwidth memory (HBM) through TSMC’s CoWoS (Chip-on-Wafer-on-Substrate) advanced packaging process — and both of these inputs are independently supply-constrained. TSMC’s CEO C.C. Wei has stated on successive earnings calls that CoWoS is sold out through 2026, with NVIDIA alone estimated to have secured roughly 60% of TSMC’s 2026 allocation. The HBM that CoWoS packages is itself produced by only three companies worldwide — SK Hynix, Samsung, and Micron — and all three have publicly confirmed that their 2026 HBM capacity is fully subscribed, with HBM3E contract prices rising into the next product cycle rather than falling.

Exhibit 17: TSMC CoWoS Advanced Packaging Capacity
CoWoS capacity — the required packaging step for every NVIDIA, AMD, and Google TPU AI accelerator — is scheduled to expand from 35,000 wafers per month in late 2024 to 130,000 per month by end of 2026. Large in absolute terms, but linear, and fully subscribed through 2026. Source: TSMC earnings calls and industry reporting.

CoWoS capacity is expanding meaningfully in absolute terms — from roughly 35,000 wafers per month at the end of 2024 to 75,000 per month by the end of 2025, with a target of 130,000 per month by the end of 2026. But capacity expansion in capital-intensive semiconductor manufacturing is linear, not exponential. Every expansion step — a new ASML machine, a new TSMC fab phase, a new CoWoS packaging line, a new HBM fabrication line — takes 18 to 36 months of lead time and billions of dollars of capital to bring online, and none of the stages can be skipped or substituted.
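The shape of the mismatch can be made explicit with a toy model: supply that adds a fixed increment each year against demand that multiplies. Starting values and growth rates are illustrative assumptions; only the CoWoS figures above are sourced:

```python
# Toy model of linear supply growth vs. exponential demand growth.
# Units are arbitrary; starting values and growth rates are assumptions chosen
# only to illustrate the widening gap described in the text.

supply = 100.0            # new-generation accelerator supply in year 0
demand = 100.0            # compute demand in year 0, same arbitrary units

SUPPLY_INCREMENT = 60     # assumed: supply adds a fixed ~60 units per year (linear)
DEMAND_MULTIPLE = 2.2     # assumed: demand grows ~2.2x per year (exponential)

for year in range(6):
    gap = max(demand - supply, 0)
    print(f"Year {year}: supply {supply:7.0f}   demand {demand:9.0f}   "
          f"gap absorbed by installed base {gap:9.0f}")
    supply += SUPPLY_INCREMENT
    demand *= DEMAND_MULTIPLE
# The gap is what keeps prior-generation GPUs fully booked: demand the
# new-generation supply chain cannot serve falls back onto the installed base.
```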

This is a structural feature that sustains the commercial relevance of older-generation GPUs. So long as supply expands in the shape of a line and demand expands in the shape of a curve, the gap between the two has to be absorbed somewhere — and it is absorbed by the installed base. Every A100 still in productive service, and every H100 whose contract is being renewed into 2028, is carrying demand that the next-generation supply chain physically cannot meet.

IV. The Limits of Precision: Why Exact Modeling Is Impossible

The preceding sections have established a clear picture. AI data centers operate at a fundamentally different scale than traditional data centers. They produce dramatically more economic output per megawatt. The hardware inside them is improving at a pace that compounds economic value with each generation. And the GPUs they deploy hold their value far longer than traditional depreciation schedules assume.

The natural next question for any investor is: what does the pro forma look like?

The honest answer is that a credible one cannot be built — not with the kind of precision that a responsible financial model demands. The economics are clearly compelling. The direction is unambiguous. But the specific inputs required to construct a defensible revenue or profitability model are, at this point in the market’s development, fundamentally unknowable. This is not a temporary data gap. It is a structural feature of a market that is moving too fast, across too many dimensions, with too little pricing transparency, for traditional financial modeling to produce anything other than false precision.

Exhibit 18: Variables That Defeat the Financial Model
Each variable, in isolation, would introduce meaningful uncertainty into a financial model. In combination, they make the exercise speculative.

The Directional Case

What can be relied upon is the directional understanding of the levers. And those levers all point the same way.

We know that GPU-hours generate meaningfully more economic value per watt and per square foot than CPU-hours. We know that utilization is structurally higher — not because operators are better at scheduling, but because demand exceeds supply and there is no idle capacity. We know that the hardware roadmap delivers compounding improvements in throughput per watt. We know that GPU useful life is proving longer than assumed. We know that the addressable market is growing at multiples of the rate at which traditional data center demand is growing.

The inability to build a precise model does not weaken the investment thesis. It reflects the reality of a market in its earliest innings — where the direction of value creation is clear, even if the exact magnitude cannot yet be pinned down.

The Bet on the Operators

The companies building and operating AI data centers at scale are not speculating. They are among the most sophisticated capital allocators in the world, deploying tens of billions of dollars per quarter into this exact business. They have access to every input that the outside analyst does not: the actual contract pricing, the realized utilization, the blended cost of capital across GPU generations, the forward commitments from their largest customers. They are doing the detailed work — the work that cannot be replicated externally — to ensure that the capital they deploy earns strong returns.

This is not a limitation of the thesis. It is the nature of betting on operators in a market with structural tailwinds. The investor does not need to reconstruct the exact margin profile of a hyperscaler’s AI infrastructure division. The investor needs to assess whether the people making the capital allocation decisions are competent, whether they are economically motivated, and whether the market they are entering rewards the deployment of capital. On all three counts, the answer is unambiguous.

V. Monetization Velocity: Capacity Converts Immediately

The question an investor must ask is whether the capacity, once built, actually monetizes — or whether it sits idle while customers ramp slowly and use cases take longer to materialize than expected.

The answer, drawn from the most recent earnings disclosures of all four major AI infrastructure providers, is unambiguous: capacity is being monetized as fast as it can be delivered. Not eventually. Immediately.

All four companies are experiencing the same underlying condition simultaneously. Capacity is scarce. Utilization is immediate upon deployment. Revenue scales directly with the physical expansion of the data center footprint. There is no absorption lag. There is no ramp period. The demand precedes the supply. The customers have already committed. The revenue follows the megawatts.

Exhibit 19: Monetization Velocity Across Providers
Capacity is being monetized as fast as it comes online across all four major AI infrastructure providers. No absorption lag. No ramp period.

For Oracle, this dynamic is particularly significant because of the company’s operational velocity. A provider that delivers capacity faster monetizes sooner. A provider that monetizes sooner generates cash flow sooner. Cash flow funds the next round of construction. Oracle’s investments in standardized design, tripled manufacturing capacity, and compressed deployment timelines are not merely operational improvements. They are financial accelerants.

And critically, these four companies are not merely competing for the same pool of workloads. They are collectively enabling the expansion of AI itself. The scale of infrastructure required exceeds what any single provider could supply. Every incremental data center, regardless of provider, contributes to the broader advancement of AI capabilities, which generates more demand for compute, which requires more data centers. The providers are not dividing a fixed market. They are expanding it.

VI. The Necessity of Building: Why the Investment Is Not Optional

There is a framing of this investment cycle that treats it as optional — as a choice that companies are making because the returns look attractive, and that they could reverse if conditions change. That framing is wrong. The investment is not optional. It is necessary. And the consequence of stopping is not slower growth. It is the cessation of AI progress itself.

This is not hyperbole. It is the direct implication of the scaling dynamics we have established throughout this paper. Training compute is growing at 4.4 times per year because each generation of frontier model requires exponentially more computation than the last. Inference compute is growing even faster, compounding with every new user, every new enterprise deployment, every autonomous agent added to the global workload. The installed base of AI compute is doubling every seven months. These are not trends that can be paused and resumed. They are the operating requirements of an industry whose output — intelligence — degrades the moment the input — compute — stops scaling.

Consider what happens if the buildout slows. If new data center capacity stops coming online at the current pace, the frontier labs cannot train their next-generation models. The training runs that produce GPT-6, Claude 5, Gemini 4 — the models that will power the next wave of enterprise applications — simply do not happen. The models that enterprises have contracted for, that governments are planning around, that the entire technology industry is building products on top of — those models stop improving.

The inference side is equally unforgiving. Every AI application currently in production — every chatbot, every coding assistant, every customer service agent, every fraud detection system, every clinical documentation tool — consumes inference compute continuously. These are not applications that can be paused. If inference capacity stops expanding, new customers cannot be onboarded. New use cases cannot be deployed. The enterprise AI adoption wave that is just beginning stalls before it starts.

And the agentic future — the autonomous systems that represent the next order of magnitude in compute demand — never arrives. Autonomous agents that operate continuously, that plan and execute across multiple systems, that require persistent compute allocation around the clock — these workloads cannot exist on infrastructure that is not being built. The two-to-three-year lead time from power procurement to operational data center means that the facilities needed in 2028 must be under construction now. A pause today is a gap in 2028 that cannot be closed.

This is why the customers are committing at the scale they are. This is why the backlogs are measured in hundreds of billions. This is why the combined contracted revenue across the four major providers exceeds $1.6 trillion.

$1.6 Trillion in Contracted Demand

Exhibit 20: Combined Remaining Performance Obligations
$1.6 trillion in contracted future revenue across four providers. The beyond-twelve-month portion of Microsoft’s RPO grew 156% year-over-year.

One trillion six hundred billion dollars in contracted future revenue across four providers. The beyond-twelve-month portion of Microsoft’s RPO grew 156% year-over-year. That is not customers hedging for the next quarter. That is enterprises and AI labs making structural, multi-year decisions about where their workloads will live. These are platform bets that shape hiring plans, product roadmaps, R&D pipelines, and competitive strategy for years to come.

The frontier labs face the starkest version of this calculation. OpenAI is serving hundreds of millions of users, training frontier models that require enormous clusters, and competing in a market where falling behind by even a few months can meaningfully erode developer mindshare and enterprise positioning — losses that, while not necessarily permanent, tend to compound over time as ecosystems, integrations, and habits accrete around whichever model is perceived to be ahead. The cost of not having compute dwarfs the premium paid for locking it in. Anthropic, training on Amazon’s Trainium clusters through Project Rainier, is not buying chips. It is securing the infrastructure runway to remain competitive at the frontier.

The economics reinforce the behavior. Cloud providers offer meaningful discounts for committed spend versus on-demand pricing — often 30 to 50 percent or more for multi-year reservations. But the deeper economic logic is planning certainty. You cannot build a credible research plan, a credible product roadmap, or a credible business model on spot market availability. The contract with your infrastructure provider becomes the foundation on which everything else is built.

VII. The Buildout: Gigawatt-Scale Facilities and the Path to 2030

The question that follows from $1.6 trillion in contracted demand is physical: where does the compute actually go? The answer is visible in the construction data, and it tells a story not just about how much is being built, but about how the fundamental unit of construction is changing.

Epoch AI tracks the power capacity of individual frontier data centers — the facilities purpose-built for training and serving the world’s most capable models. What the timeline reveals is not merely that more facilities are being built. It is that each successive generation of facility is dramatically larger than the last. In 2022 and 2023, a frontier data center measured in the tens of megawatts. By mid-2024, individual facilities were reaching 100 to 200 megawatts. By 2025, new campuses crossed 500 megawatts. And the facilities now in planning — the ones operational in 2027 — are approaching 3,500 megawatts. For reference, 3,500 megawatts exceeds the annual average power draw of Los Angeles.
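Taken at face value, that progression implies a steep compounding rate in per-facility scale. A quick calculation, treating the 2022 baseline as roughly 30 MW (an assumption; the text says only "tens of megawatts"):

```python
# Implied annual growth in frontier facility scale, 2022 to 2027.
# The 2022 starting point is an assumption ("tens of megawatts" read as ~30 MW).

start_mw, end_mw = 30, 3_500
years = 2027 - 2022

annual_factor = (end_mw / start_mw) ** (1 / years)
print(f"Implied per-facility scale growth: ~{annual_factor:.1f}x per year")
# Roughly 2.6x per year, compounding across five successive facility generations.
```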

The industry is not building more of the same. It is building something categorically different.

Exhibit 21: Frontier Data Center Scale Over Time
Individual facility power capacity has grown from tens of megawatts (2022) to approaching 3,500 MW (2027). Source: Epoch AI.

Construction Timelines

But scale means nothing if it cannot be executed. The skeptic’s natural objection to gigawatt-scale facilities has always been timeline: surely a campus that draws as much power as a large power plant produces cannot be built in two years. The construction data says otherwise.

Exhibit 22: Construction Timeline by Project
The Anthropic-Amazon campus: 1 GW in 1.9 years. xAI Colossus 2: targeting 1 year. These are projects under construction, not pitch deck timelines.

Competitive Breakout by Organization

The aggregate numbers establish scale. But they obscure a fact that makes the buildout far more significant — and far more credible — than any single company’s investment plan. This is not one organization making a massive bet. It is six or seven organizations, independently, arriving at the same conclusion at the same time.

Exhibit 23: Frontier Data Center Capacity by Organization
Meta, OpenAI, Google DeepMind, Anthropic, xAI, Microsoft, and Alibaba are all building on parallel trajectories. Separate, competitive buildouts arriving at the same conclusion. Source: Epoch AI.
Exhibit 24: Installed Compute Capacity (H100 Equivalents)
Total frontier AI compute is on course to increase by an order of magnitude in roughly three years. Funded plans backed by signed contracts and chip supply commitments.

Scenarios: 35 to 80 Gigawatts by 2030

The project-level data establishes what is being built today. The competitive breakout confirms that the buildout is broad, parallel, and independently validated. The remaining question is trajectory: where does the aggregate capacity land by the end of the decade?

Epoch AI models four scenarios for total U.S. AI data center power capacity through 2030. The most conservative, derived from Bloomberg Intelligence estimates of hyperscaler capital expenditure, reaches roughly 35 gigawatts. The most aggressive, driven by maximum feasible growth in AI chip production, approaches 80 gigawatts. Two middle scenarios land between 40 and 60 gigawatts.

To appreciate what that number means: 80 gigawatts is more power than most G7 nations currently dedicate to their entire industrial base. It rivals the United Kingdom’s total installed generating capacity and exceeds the average electricity demand of countries like the United Kingdom or France. And it covers only the United States. China is building at a directionally comparable pace. The Gulf states are positioning as AI infrastructure hubs with sovereign wealth capital behind them. The global total will be substantially higher.

Exhibit 25: U.S. AI Data Center Capacity Scenarios Through 2030
Four scenarios from Epoch AI ranging from 35 GW (conservative) to 80 GW (aggressive). Even the most aggressive scenario may prove insufficient if agentic workloads scale as projected.

The striking feature of Epoch’s forecast is that even the most aggressive scenario may prove insufficient. If agentic AI workloads scale the way the frontier labs expect — persistent, always-on compute allocation for autonomous systems operating across every enterprise — then the demand curve bends upward again in 2028 and 2029 in ways that none of these four scenarios fully capture. The 80-gigawatt line is not a ceiling. It is a floor for what the next generation of AI applications will require.