Google’s new chips are a shot at Nvidia — and a big hint at where AI goes next

Google CEO Sundar Pichai.

  • Google is for the first time splitting its AI chips into two lines.
  • The AI battleground is shifting from training to inference. Google’s new TPUs are a response.
  • It could help Google chip away at Nvidia’s dominance.

Google on Wednesday unveiled the latest generation of its AI chip, and for the first time, it’s splitting the line into two focus areas: training and inference.

Google’s Tensor Processing Units (TPUs) have been growing into a credible rival to Nvidia’s chips, which are still the dominant silicon across the AI industry. Anthropic is a big TPU customer, and Apple has used the chips to train its AI models.

As the AI battleground shifts to inference — the process of actually running the models once they’re deployed — Google is responding accordingly.

Google’s new TPU 8t is designed for training the largest frontier AI models, while its TPU 8i is built for inference. Google says both chips will be available later this year.

The split signals a shift happening across the industry. As models have improved and the gap between leading labs has narrowed, focus is turning to agents and applications that run on top of the models and require more computing power. That’s shifting the economic center of AI up the stack to the inference layer.

Nvidia has also been preparing for the inference explosion. It struck a $20 billion licensing deal with inference chipmaker Groq late last year, and last month debuted a new chip designed for faster inference.

Google says both its new chips are a leap ahead of its seventh-generation Ironwood TPU, which it launched last year. Its new 8i inference chip is making a big jump in high-bandwidth memory (HBM). Google says this solves the “memory wall” — the gap between how fast a processor can make calculations and how fast it can access the data it needs. For running agents, that’s important.
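To make the “memory wall” concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes a simple roofline-style model in which a step takes as long as the slower of two things: doing the math or moving the data. Every number in it (the chip's peak compute, its memory bandwidth, the model size) is a hypothetical placeholder chosen for illustration, not a specification of any Google or Nvidia chip.

```python
def step_times(flops, bytes_moved, peak_flops_per_s, mem_bandwidth_bytes_per_s):
    """Return (compute_time, memory_time) in seconds for one step of work."""
    compute_time = flops / peak_flops_per_s
    memory_time = bytes_moved / mem_bandwidth_bytes_per_s
    return compute_time, memory_time


def bottleneck(flops, bytes_moved, peak_flops_per_s, mem_bandwidth_bytes_per_s):
    """Name whichever resource takes longer, which is the one that limits the step."""
    compute_time, memory_time = step_times(
        flops, bytes_moved, peak_flops_per_s, mem_bandwidth_bytes_per_s
    )
    return "memory" if memory_time > compute_time else "compute"


# Hypothetical chip (illustrative numbers only).
PEAK_FLOPS_PER_S = 1e15          # 1 petaFLOP/s of compute
MEM_BANDWIDTH_BYTES_PER_S = 2e12  # 2 TB/s of memory bandwidth

# Hypothetical workload: producing one token from a 100-billion-parameter
# model stored at 1 byte per parameter. Every weight is read once, and each
# weight takes part in roughly 2 operations (a multiply and an add).
PARAMS = 100e9
BYTES_MOVED = PARAMS * 1
FLOPS = PARAMS * 2

compute_time, memory_time = step_times(
    FLOPS, BYTES_MOVED, PEAK_FLOPS_PER_S, MEM_BANDWIDTH_BYTES_PER_S
)
limit = bottleneck(FLOPS, BYTES_MOVED, PEAK_FLOPS_PER_S, MEM_BANDWIDTH_BYTES_PER_S)

print(f"compute: {compute_time * 1e3:.2f} ms, memory: {memory_time * 1e3:.2f} ms")
print(f"bottleneck: {limit}")
```

Under these assumed numbers, the processor would finish its calculations long before the data arrives, so faster memory, not more raw compute, is what would shorten each step. That is the general reason inference-focused chips tend to emphasize memory.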

From answering questions to taking actions

Speaking to reporters on Monday, Google Cloud CEO Thomas Kurian called the decision to create two new chips a “natural evolution.”

Kurian also said the new chips were designed to be efficient in how much power they use “because we felt that power efficiency would become a constraint as people continue to scale both training and inference.”

The split is also a bet by Google that agents will be the next big leap in AI.

“AI is evolving from answering questions to reasoning and taking action,” said Google infrastructure chiefs Amin Vahdat and Mark Lohmeyer in a blog post announcing the new chips.

Google, Amazon, and Microsoft have all been racing to build custom silicon that could reduce their dependence on Nvidia. At the same time, these companies still rely on Nvidia, either to train their own models or to supply the chips they lease to customers through their data centers.

Google has trained its Gemini models using its own TPU chips, but it sells access to Nvidia’s chips through Google Cloud. Google said it will give customers access to Nvidia’s next-generation Vera Rubin GPUs later this year.

Google has spent more than a decade developing its own silicon but has ramped up those efforts internally over the past couple of years as it tries to court new customers. For example, it has opened up support for tools such as PyTorch that make it easier for companies to adopt TPUs.

That could help Google chip away at Nvidia’s dominance while giving a nice bump to its bottom line. Morgan Stanley said in a December note that 500,000 TPU chips sold could add around $13 billion to Google’s revenue in 2027.
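For a rough sense of scale, the two Morgan Stanley figures can be divided to get an implied revenue per chip. The short Python sketch below does only that arithmetic. The per-chip number is a derived average, not a price quoted by Morgan Stanley or Google, and it assumes the full $13 billion is attributable to those 500,000 chips.

```python
# Figures as reported from the Morgan Stanley note.
chips_sold = 500_000
added_revenue_usd = 13e9  # around $13 billion

# Derived average (an assumption of this sketch, not a quoted price).
implied_revenue_per_chip_usd = added_revenue_usd / chips_sold

print(f"Implied revenue per chip: ${implied_revenue_per_chip_usd:,.0f}")
```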

Read the original article on Business Insider
Ray M. Andersen

Ray M. Andersen is a cryptocurrency researcher and blockchain developer with hands-on experience building smart contracts and decentralized applications. His technical background allows him to break down complex blockchain mechanics into engaging, accessible content for readers of all levels. Ray’s work centers on Ethereum, scalability solutions, and the future of decentralized infrastructure. When not writing, he contributes to open-source Web3 projects and mentors aspiring blockchain developers.