The AI Capex Arms Race: Why Big Tech Is Spending Tens of Billions on GPUs and Data Centres



Written By

Joshua
Reading time
» 5 minute read 🤓
Big Tech is burning $10 billion per company on AI – what’s really going on?

A viral Reddit post lays out a stark picture: hyperscalers are pouring tens of billions into AI chips and data centres, while most AI products don’t yet pay their way. The tone is dramatic, but the direction of travel is hard to ignore.

At the heart of it is capex (capital expenditure) on GPUs (graphics processing units) used to train and run large language models. Training is the expensive, one-off process to build a model; inference is the ongoing cost each time the model answers a prompt. Both now require serious hardware and serious money.
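
The difference between the two cost profiles can be sketched with a back-of-envelope model. All rates below are hypothetical placeholders for illustration, not vendor quotes:

```python
# Back-of-envelope cost model: training is a one-off outlay, inference scales
# with usage. All prices here are hypothetical placeholders, not real quotes.

def training_cost(gpu_count: int, gpu_hourly_rate: float, hours: int) -> float:
    """One-off cost of a training run on a rented GPU cluster."""
    return gpu_count * gpu_hourly_rate * hours

def inference_cost(requests: int, tokens_per_request: int,
                   price_per_1k_tokens: float) -> float:
    """Ongoing cost of serving prompts, billed per token."""
    return requests * (tokens_per_request / 1000) * price_per_1k_tokens

# Example: 1,000 GPUs rented for 30 days vs 10m requests of ~1,500 tokens each.
one_off = training_cost(gpu_count=1000, gpu_hourly_rate=4.0, hours=24 * 30)
monthly_serving = inference_cost(requests=10_000_000, tokens_per_request=1500,
                                 price_per_1k_tokens=0.002)
print(f"Training (one-off): ${one_off:,.0f}")        # $2,880,000
print(f"Inference (per month): ${monthly_serving:,.0f}")  # $30,000
```

The point isn't the specific numbers; it's that training is a lump sum while inference compounds with every user, which is why serving costs dominate once a product gets traction.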

“The AI tax is real.”

Key numbers from the Reddit post

These figures are drawn directly from the post and are not independently verified here.

| Category | Figure (USD) | Source (per Reddit post) |
| --- | --- | --- |
| Microsoft AI capex in one quarter | $14 billion (+79% YoY) | Company disclosure (referenced) |
| Google capex in same quarter | $12 billion (+91% YoY) | Company disclosure (referenced) |
| Meta full-year plan | Up to $40 billion | Company guidance (referenced) |
| Training cost per frontier model (now) | ~$100 million | Anthropic CEO (referenced) |
| Training cost (later this year) | ~$1 billion | Anthropic CEO (referenced) |
| Training cost (2026 estimate) | $5–10 billion | Anthropic CEO (referenced) |
| Nvidia H100 unit price | ~$30,000 | Market price (referenced) |
| Meta H100 order volume | ~350,000 units | Company comment (referenced) |
| Cloud rental (H100 cluster) | ~$100/hour | Amazon pricing (referenced) |
| Cloud rental (CPU) | ~$6/hour | Amazon pricing (referenced) |
| Average data centre size | ~412,000 sq ft | Industry estimate (referenced) |
| Data centres globally | ~7,000+ | Industry estimate (referenced) |
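
Two of those figures can be cross-checked with simple arithmetic (remembering the inputs themselves come from the post and are unverified):

```python
# Sanity-check the post's figures: 350,000 H100s at ~$30,000 each, and an
# H100 cluster at ~$100/hour versus a ~$6/hour CPU instance. The inputs are
# taken from the Reddit post, not verified independently.
h100_unit_price = 30_000   # USD per unit, per the post
meta_order = 350_000       # units, per the post

meta_spend = h100_unit_price * meta_order
print(f"Implied Meta H100 spend: ${meta_spend / 1e9:.1f}bn")  # ~$10.5bn

gpu_hourly, cpu_hourly = 100, 6  # USD/hour, per the post
print(f"GPU cluster costs ~{gpu_hourly / cpu_hourly:.0f}x more per hour than CPU")
```

That implied ~$10.5bn for Meta's order alone is consistent with the full-year guidance of up to $40 billion, which is some reassurance that the post's numbers at least hang together.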

Why AI capex is exploding: GPUs, data and data centres

Three forces are pushing costs up:

  • Hardware scarcity – Cutting-edge GPUs such as the Nvidia H100 dominate training/inference. Demand far outstrips supply and prices are elevated.
  • Data centre build-out – Models need power, cooling and space. The post notes much bigger facilities and rapidly rising counts worldwide.
  • Data licensing – Training on high-quality, legally clean data now costs real money. The post cites news licensing and a Google-Reddit deal.

Underpinning this is the “arms race” dynamic: when one lab releases a bigger, better model, peers feel forced to match or risk falling behind.

“This isn’t sustainable but nobody wants to be the first one to blink.”

From $100m to $10bn per model – what that means

The post quotes the Anthropic CEO’s estimate that state-of-the-art models could cost $5–10 billion to train by 2026. If true, the bar to compete at the frontier rises beyond all but a handful of firms. That can entrench incumbents and push everyone else to use their models via APIs, further consolidating power.

There’s also the operational side. Even if a model is trained, serving millions of users with low latency is expensive. This is why some vendors push smaller, cheaper models for specific tasks while keeping giant models for the hardest problems.

Is the AI spend sustainable if products don’t pay yet?

The post argues that monetisation lags far behind spend. Productivity gains are real in some workflows, but broad, proven ROI remains patchy. Meanwhile, platform companies are subsidising growth to win share.

The risk: if revenue doesn’t catch up, prices may rise, capacity may be rationed, or firms may pivot to smaller, more efficient models. The upside: if models materially boost developer output or automate complex tasks, payback could accelerate.

Implications for UK developers, startups and enterprises

  • Cloud costs – Expect GPU capacity to remain tight and premium-priced. Budget for volatility and consider region availability when planning rollouts.
  • Build vs buy – Training from scratch is out of reach for most. Fine-tuning or retrieval-augmented generation (RAG – connecting a model to your own content) on rented GPUs is the pragmatic route.
  • Compliance and data – UK GDPR and sector rules still apply. If you license external data or use user content for training or fine-tuning, ensure consent and auditability.
  • Supplier concentration – Relying on a single model or cloud provider can be risky. Multi-model and multi-cloud strategies help with resilience and pricing leverage.
  • Talent – Salaries for scarce AI roles are rising. Upskill existing teams and be realistic about what you truly need in-house.
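
The RAG route mentioned above is simpler than it sounds. The sketch below shows the shape of it: embed documents, retrieve the closest match for a query, and prepend it to the prompt. The `embed()` function here is a toy bag-of-words stand-in; a real system would call an embedding model:

```python
# Minimal retrieval-augmented generation (RAG) shape: embed documents, find the
# most similar ones to a query, and feed them into the prompt as context.
# embed() is a toy stand-in for illustration; use a real embedding model in practice.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' -- illustration only."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "GPU capacity remains tight and premium-priced.",
    "UK GDPR still applies to training data.",
    "Fine-tuning is cheaper than training from scratch.",
]
context = retrieve("How expensive is GPU capacity?", docs, k=1)
prompt = f"Context: {context[0]}\n\nQuestion: How expensive is GPU capacity?"
```

Swap the toy embedding for a hosted embedding API and a vector store and the structure stays the same, which is why RAG is usually the cheapest first step before any fine-tuning.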

Practical strategies to keep AI costs sane

  • Fit the model to the job – Use small/medium models for routine tasks; reserve large models for edge cases. Latency and cost drop fast on smaller models.
  • RAG over retrain – Start with RAG to ground outputs in your data before considering fine-tuning. It’s cheaper, faster and easier to govern.
  • Prompt discipline – Shorten prompts, cache frequent results, and batch requests. This cuts token usage and inference spend.
  • Right-size infrastructure – Use managed inference endpoints, autoscaling, and time-bound GPU usage. Avoid keeping GPUs idle.
  • Measure ROI early – Treat pilots as experiments with clear success metrics. Kill what doesn’t move the needle.
  • Start small, ship value – For simple automation, you can wire LLMs into everyday tools. Example: connect ChatGPT to Google Sheets for structured outputs and basic workflows.
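
The "prompt discipline" point can be as simple as memoising repeated calls. In this sketch, `call_model()` is a placeholder for whatever client library you actually use:

```python
# Cache repeated LLM calls so identical prompts are only billed once.
# call_model() is a placeholder for a real (billed) API client call.
from functools import lru_cache

def call_model(prompt: str) -> str:
    """Placeholder for a real, paid model API call."""
    return f"response to: {prompt}"

@lru_cache(maxsize=1024)
def cached_call(prompt: str) -> str:
    # Identical prompts hit the cache instead of the paid API.
    return call_model(prompt)

cached_call("Summarise this invoice")  # billed once
cached_call("Summarise this invoice")  # served from cache, no extra tokens
print(cached_call.cache_info())        # hits=1, misses=1
```

In production you would use a shared cache (e.g. Redis) keyed on a hash of the prompt rather than an in-process `lru_cache`, but the principle is identical: never pay twice for the same tokens.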

If you do need to rent GPUs, review current pricing from your provider and monitor pre-emptible/spot options. For reference, see AWS accelerated computing pages for their latest instance families and costs.
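
To see why spot capacity is worth monitoring, compare a fixed workload at on-demand versus discounted spot rates. The discount below is an illustrative assumption, not a current AWS price:

```python
# Compare on-demand vs spot GPU rental for a fixed workload.
# The spot discount is an illustrative assumption, not a quoted provider rate.
hours_needed = 200
on_demand_rate = 100.0   # USD/hour, using the post's cluster figure
spot_discount = 0.6      # spot often trades a large discount for interruption risk

on_demand_cost = hours_needed * on_demand_rate
spot_cost = hours_needed * on_demand_rate * (1 - spot_discount)
print(f"On-demand: ${on_demand_cost:,.0f}  "
      f"Spot: ${spot_cost:,.0f}  "
      f"Saved: ${on_demand_cost - spot_cost:,.0f}")
```

The catch is that spot instances can be reclaimed mid-run, so they suit checkpointed training and batch inference, not latency-sensitive serving.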

What to watch next (2025–2026)

  • Frontier model cadence – If $1–10 billion training runs become the norm, expect fewer players at the top and more focus on efficient inference.
  • Data licensing regimes – Rising content costs could reshape who can train and on what terms.
  • Model specialisation – More task-specific, smaller models competing on price/performance for enterprise workloads.
  • Pricing shifts – API and cloud pricing may change as vendors chase profitability rather than growth at all costs.

Bottom line

The post captures a real tension: extraordinary spend chasing extraordinary capability, with business models still catching up. For most UK teams, the smart move is to stay pragmatic – leverage existing platforms, keep experiments tightly scoped, and be ruthless about ROI. Let Big Tech fight the capex war; you can win on execution, data quality and user experience.

Last Updated

October 12, 2025
