Inside the new silicon race: how custom chips are quietly rewriting the cloud

The hyperscalers are no longer renting Nvidia’s future — they are building their own. A dispatch from the foundries, labs and boardrooms that will decide who owns the next decade of AI.

In a quiet corridor behind Building 42, a pane of polished glass hides what three years of unspoken strategy look like in silicon. The chips on the other side are not for sale — and that, more than raw performance, is the point.

When Amazon, Google, Microsoft and Meta began pouring tens of billions of dollars into custom accelerators, analysts treated it as a hedge. Today it reads more like a declaration of independence. In-house inference silicon now serves more than half of each hyperscaler’s AI traffic, according to three supply-chain executives granted anonymity to discuss private contracts.

From commodity to competitive moat

The arithmetic is brutal. A single generative-AI request can cost ten times more than a traditional web lookup. At the scale of billions of daily calls, the difference between leasing and owning compute is measured not in margins but in survival.
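That arithmetic can be sketched in a few lines. The 10x per-request multiplier and the billions-of-calls scale come from the figures above; the base cost per web lookup and the exact daily call volume below are hypothetical placeholders, not reported numbers.

```python
# Back-of-envelope cost model for AI serving vs. traditional web serving.
# Hypothetical inputs -- only the 10x multiplier is taken from the article.

WEB_LOOKUP_COST = 0.0001      # hypothetical: $ per traditional web lookup
AI_MULTIPLIER = 10            # article's figure: an AI request costs ~10x a lookup
DAILY_CALLS = 2_000_000_000   # hypothetical: 2 billion requests per day

def daily_cost(per_request: float, calls: int) -> float:
    """Total daily spend for a fleet serving `calls` requests."""
    return per_request * calls

web_spend = daily_cost(WEB_LOOKUP_COST, DAILY_CALLS)
ai_spend = daily_cost(WEB_LOOKUP_COST * AI_MULTIPLIER, DAILY_CALLS)

print(f"web-era spend: ${web_spend:,.0f}/day")
print(f"AI-era spend:  ${ai_spend:,.0f}/day")
print(f"annual delta:  ${(ai_spend - web_spend) * 365:,.0f}")
```

Even with a fraction-of-a-cent base cost, the 10x multiplier compounds into hundreds of millions of dollars a year, which is why owning rather than leasing the silicon changes the calculus.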

"We stopped thinking about GPUs as a purchase decision," said one senior engineering director at a top-three cloud. "They became a product decision."

What it means for everyone else

For startups, the shift narrows the toolset. For enterprises, it changes what portability means — your model may run three times faster on one cloud, three times cheaper on another, and nowhere else at all. The next generation of AI infrastructure is being standardised and de-standardised at the same time.

Expect more announcements through Q3, as the industry moves from pilots to platforms.

About Maya Thompson

Senior technology correspondent covering AI, chips and the future of computing.