Google’s 400,000-Chip Monster Tensor Processing Unit Just Destroyed NVIDIA's Future!

Daniel Nenni

Admin
Staff member
Google and Samsung just attacked Nvidia from both sides of the map. On one end, Google’s new Ironwood TPU is a seventh-gen accelerator that links over 9,000 chips into a single pod and can chain 43 of those pods into a nearly 400,000-chip cluster—using a custom 3D torus fabric and optical circuit switching instead of traditional GPU racks. Anthropic has already committed to as many as one million TPUs for Claude, a signal that performance, reliability, and economics are strong enough to bet their entire frontier stack on non-NVIDIA silicon.
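
For intuition on why that fabric shape matters: in a 3D torus every chip has six direct neighbors with wrap-around links, so traffic spreads across the mesh instead of funneling through a central switch tier, and the optical circuit switches can re-stitch the torus around failed or repartitioned pods. Here is a minimal sketch of the addressing arithmetic; the grid dimensions are my own illustrative choice (any X*Y*Z = 9,216 arrangement gives the same totals), not Google's published topology.

[CODE]
# Hypothetical 3D-torus addressing sketch; dims are illustrative, not Ironwood's.
def torus_neighbors(x, y, z, dims=(16, 24, 24)):
    """Six wrap-around neighbors of chip (x, y, z) in an X*Y*Z torus."""
    X, Y, Z = dims
    return [((x + 1) % X, y, z), ((x - 1) % X, y, z),
            (x, (y + 1) % Y, z), (x, (y - 1) % Y, z),
            (x, y, (z + 1) % Z), (x, y, (z - 1) % Z)]

pod = 16 * 24 * 24      # 9,216 chips per pod
cluster = 43 * pod      # 396,288 chips across 43 pods -- "nearly 400,000"
print(torus_neighbors(0, 0, 0), pod, cluster)
[/CODE]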

On the other end, Samsung quietly did something just as wild on your phone. Their new pipeline compresses a 30-billion-parameter model—normally needing more than 16GB of memory—down to under 3GB and runs it directly on consumer devices. Instead of loading the whole model, Samsung streams only the pieces needed in real time, uses smart 8-bit and 4-bit quantization, and treats CPU, GPU, and NPU as one coordinated system so latency stays low and responses feel “cloud-level” without leaving the device. If this approach scales, the next AI wave won’t just live in massive data centers—it will sit inside the smartphone already in your pocket.
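
To make those memory numbers concrete, here is some back-of-the-envelope arithmetic; the resident fraction and metadata overhead are my own assumptions for illustration, not Samsung's published figures.

[CODE]
# Rough weight-memory math for a 30B-parameter model. The ~5% overhead for
# quantization scales/metadata and the "fraction resident" are assumptions.
PARAMS = 30e9

def weights_gb(bits_per_weight, fraction_resident=1.0, overhead=1.05):
    return PARAMS * bits_per_weight / 8 * fraction_resident * overhead / 2**30

print(f"fp16, fully loaded : {weights_gb(16):.1f} GB")        # ~58.7 GB
print(f"int4, fully loaded : {weights_gb(4):.1f} GB")         # ~14.7 GB
print(f"int4, ~18% resident: {weights_gb(4, 0.18):.1f} GB")   # ~2.6 GB
[/CODE]

The point: quantization alone gets you from tens of gigabytes down to the mid-teens, and it's the streaming (keeping only the active slices resident) that crosses the line into phone territory.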

In this video, we break down how Ironwood’s fabric actually beats classic GPU clusters, why hyperscalers moving to their own chips is a real threat to Nvidia’s dominance, how Samsung’s compression and scheduling tricks make giant models truly on-device, and what a multi-architecture future means for AI creators, startups, and regular users.

 
The BS in the press about Google selling their TPUs has reached a new pinnacle of stupidity. All of the external TPU deals are really Google Cloud deals, not chip sales. Meta remains a bit of a mystery, since it's possible they might buy racks and systems, but I doubt it; I think the Meta deal will also be a Google Cloud deal. This is really better for Google, and an even greater threat to Nvidia than the chip sales the mainstream press fixates on. Google Cloud is an ecosystem, just like Nvidia's CUDA and its associated networking, and Google Cloud's hold on customers will be even greater because there are no competitors at all at the system level. It's just like AWS's tight hold on customers once you get hooked on their storage, databases, and serverless runtime stacks, except Google's is about AI, not IT applications.

At least Anthropic is honest about it. The Anthropic announcement is just a Google Cloud deal.

I'm a lot more optimistic about what Google is doing by selling Cloud contracts than the silliness of them selling TPU chips.
 
Do you know if Google owns the TSMC contract for TPUs, or does Broadcom?
All of the external TPU deals are really Google Cloud deals, not chip sales.

True. Google will make so much more money selling cloud space than chips. I wonder how many people are asking Google about chip sales? :ROFLMAO:

SemiWiki is on Google Cloud, and while it is expensive, it is easy to use and has everything SemiWiki will ever need. I chose Google Cloud because they were behind Amazon and Microsoft with single-digit market share. That was 5 years ago. SemiWiki turns 15 on January 1st; it's been an amazing journey.
 
Do you know if Google owns the TSMC contract for TPUs, or does Broadcom?
My understanding is that Broadcom & Marvell act as the middleman between the likes of Google and Amazon and foundries like TSMC. They usually do the back-end design work, reserve foundry capacity, etc. That is why I think the custom ASIC business LBT launched at Intel is a brilliant move. That group can iron out a lot of the issues that custom chip design houses (Broadcom, Marvell, MediaTek, etc.) face when working with IFS and Intel IP & PDKs.
 
My understanding is that Broadcom & Marvell act as the middleman between the likes of Google and Amazon and foundries like TSMC.

True, but Google deals directly with TSMC, uses the PDKs, etc. Google does other chips besides the TPU. Google used Avago at first for turnkey designs; over the years Google has built an internal team that does more of the design work, and TSMC does the packaging. Google is currently working on TSMC N2.
 
That is why I think the custom ASIC business LBT launched at Intel is a brilliant move. That group can iron out a lot of the issues that custom chip design houses (Broadcom, Marvell, MediaTek, etc.) face when working with IFS and Intel IP & PDKs.
I like that move too. One of the more difficult problems to solve in ASIC design is keeping your back-end team busy enough; very few companies have the multiple development pipelines it takes, and great back-end people like circuit designers are expensive. Intel's move may help more companies design their own chips.
 
I wonder how many people are asking Google about chip sales? :ROFLMAO:
There is no “there” there for chip sales. TPUs and even NVIDIA AI chips are useless in the data center without the massive interconnection, water cooling, power distribution and data center software stack, because they all have to be co-optimized. The game seems much more wide open for chips doing client side inference. But the Samsung announcement sounds like pure BS.
 
There are some caveats to using TPUs:
»Right now, the single biggest advantage of NVIDIA, and this has been true for the past three companies I worked at, is that AWS, Google Cloud, and Microsoft Azure are the three major cloud companies.

Every company, every customer we have, will have its data in one of these three. All three clouds have NVIDIA GPUs. Sometimes the data is so big, and in a different cloud, that it is a lot cheaper to run our workload in whatever cloud the customer's data is in.

I don't know if you know about egress costs: moving data out of one cloud is one of the bigger costs. In that case, if you have an NVIDIA workload, a CUDA workload, we can just go to Microsoft Azure, get a VM that has an NVIDIA GPU, the same GPU in fact, and just run it there with no code change required.

With TPUs, once you are fully reliant on TPUs and Google says, "You know what? Now you have to pay 10X more," then we would be screwed, because we'd have to go back and rewrite everything. That's the only reason people are afraid of committing too heavily to TPUs. The same reasoning applies to Amazon's Trainium and Inferentia.«
 
There are some caveats to using TPUs:
There's a lot of subtlety in this argument. If you stick to PyTorch development, everyone supports it. If you want to choose a more optimized approach, like TensorFlow, then, yeah, you may end up with a software port. But I think this article is more clickbait than enlightenment.
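
A minimal sketch of what that portability looks like in practice: torch_xla is PyTorch's real XLA/TPU backend, the fallback logic here is my own illustration, and the model code itself is identical on either target.

[CODE]
import torch
import torch.nn as nn

# Use a TPU via torch_xla if present, otherwise fall back to CUDA/CPU.
try:
    import torch_xla.core.xla_model as xm
    device = xm.xla_device()
except ImportError:
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(1024, 1024).to(device)   # same model code either way
x = torch.randn(8, 1024, device=device)
y = model(x)                               # identical call path on GPU or TPU
[/CODE]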

Data export costs from clouds are a valid concern, especially for inter-continental transfers, but unless you're exporting to China the costs look trivial compared to owning your own datacenter. And the worry that you'll wake up one morning to an email from Google Cloud saying your costs are going up 10x next month strikes me as nonsense; even 2x seems like nonsense. A stunt like that might knock hundreds of billions of dollars off their market capitalization, which swamps any revenue increase. Google is just as exposed to price increases as other companies, like for electrical power and equipment, but I suspect it's still better than managing your own datacenter, by a lot.
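
For scale, some quick egress arithmetic; the per-GB rate is an illustrative list-price-style assumption, not any specific provider's quote.

[CODE]
# Egress cost at an assumed $0.09/GB internet rate (illustrative only).
def egress_cost_usd(terabytes, dollars_per_gb=0.09):
    return terabytes * 1024 * dollars_per_gb

for tb in (10, 100, 1000):
    print(f"{tb:>5} TB -> ${egress_cost_usd(tb):>9,.0f}")
# 10 TB -> ~$922, 100 TB -> ~$9,216, 1 PB -> ~$92,160: real money, but small
# next to the capex of owning a datacenter.
[/CODE]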
 
That video is a real mess.

Re: Samsung's "breakthrough" - it sounds like they're using DeepSeek-R1-type techniques, perhaps with hardware assistance. In Nvidia's favor here: the smartphone market is really segmented, and this chip/capability will likely only end up in Samsung's own smartphones. That's not going to drive any sales away from Nvidia.

To be a real threat to Nvidia there needs to be a "local AI computing standard" (full HW and SW stack) that can work seamlessly (UX) across multiple platforms: x86 PCs, Apple and Google smartphones, etc.
 