
Samsung Foundry nabs Nvidia

The dramatic reduction in projected revenue suggests Groq may be facing challenges in securing data center space as it works to sell its hardware to large companies and foreign governments.
The question, to me, is: is any start-up chip designer successful in the AI market without being partnered with a major player (a cloud computing company or a top-tier chip maker)? I can't think of any. While a couple of my friends point to Cerebras, they are a remarkable technical success, but revenue-wise they are still well under $500M, and losing money.
My take is that Groq only had a partial solution for data-center-scale inference. The Groq chips are great at simple, fast, low-latency decode (MoE execution), but don't have enough memory or the right memory management (KV store and caches) for optimized long-context prefill, where raw memory bandwidth isn't as important. Plus they lacked the rack/pod-level resource management, routing, and memory-tiering orchestration to make it all work efficiently.
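The prefill/decode asymmetry described above can be put in rough numbers. This is a toy sketch, not Groq data: the layer count, model width, and byte sizes are illustrative assumptions. The point it shows is that prefill does heavy matrix math and writes the KV cache once, while decode must re-read the entire KV cache for every generated token, so decode's arithmetic intensity collapses and memory bandwidth dominates.

```python
# Back-of-envelope model of prefill vs. decode resource demands for a
# transformer. All sizes below are illustrative assumptions.

def phase_profile(n_layers, d_model, seq_len, new_tokens, bytes_per_param=2):
    """Return (FLOPs, KV-cache bytes touched) for prefill and decode."""
    # Rule of thumb: ~12 * d_model^2 multiply-accumulates per token per layer
    # for the attention and MLP projections (2 FLOPs per MAC).
    flops_per_token = n_layers * 12 * d_model ** 2 * 2
    # KV cache: one K and one V vector of width d_model per token per layer.
    kv_bytes_per_token = n_layers * 2 * d_model * bytes_per_param

    prefill_flops = flops_per_token * seq_len
    prefill_kv = kv_bytes_per_token * seq_len               # written once
    decode_flops = flops_per_token * new_tokens
    decode_kv = kv_bytes_per_token * seq_len * new_tokens   # re-read per step
    return (prefill_flops, prefill_kv), (decode_flops, decode_kv)

(pf, pkv), (df, dkv) = phase_profile(n_layers=32, d_model=4096,
                                     seq_len=8192, new_tokens=256)
print(f"prefill: {pf / pkv:10,.0f} FLOPs per KV-cache byte")  # ~24,576: compute-bound
print(f"decode:  {df / dkv:10,.1f} FLOPs per KV-cache byte")  # 3.0: bandwidth-bound
```

At these toy settings the prefill-to-decode intensity ratio is exactly the context length (8192x), which is why a chip tuned for one phase can be badly matched to the other.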
But inference is where Groq has been aiming all along, correct? Except for Cerebras, are any start-ups competing in training anymore? (I can't think of any.)
Add in that they had to spin up and operate their own data centers to sell their solutions, since there weren't any CSP/hyperscaler takers for their raw boards/racks without trialing on real hardware. I don't think they ran directly into a data-center power/capacity issue, but they did hit a cost-of-operations vs. revenue wall.
Building an inference solution on Groq, or any start-up, is a risky proposition: they may fold from lack of funding, and you'll have wasted precious time to market. Now that Groq has the backing of Nvidia, I think they could become the frontrunner. Not because they're so awesome, but because of Nvidia. Then there's SambaNova and Intel. I think there's hope for Samba-tel too.
 
The question, to me, is: is any start-up chip designer successful in the AI market without being partnered with a major player (a cloud computing company or a top-tier chip maker)? I can't think of any. While a couple of my friends point to Cerebras, they are a remarkable technical success, but revenue-wise they are still well under $500M, and losing money.

I would put Cerebras in the path-to-success category, but they still might need the major partnering tie-in, like Amazon, to go the distance. Other than that, I think you are right, with two caveats:
* There's likely still opportunity on the client side. It's not clear that Apple, Intel, and AMD have gotten their XPUs right for real-world client-side applications, and there are more specialized apps, like autonomous driving, where chips optimized for something completely different (à la AI5, AI6) are required.
* China - who knows what's going to evolve as the home-grown data-center inference win there. It probably still ties in with Huawei or a hyperscaler/CSP like Alibaba. I do find it interesting that Alibaba has been making some major contributions, with respect to Mooncake, to NVIDIA's open-source "OS" for racks/pods, Dynamo.



But inference is where Groq has been aiming all along, correct? Except for Cerebras, are any start-ups competing in training anymore? (I can't think of any.)

I can't, though again, China is the unknown. There isn't much room left, given that TPU, Trainium 3, and AMD are already filling up the rest of the training space.

Building an inference solution on Groq, or any start-up, is a risky proposition: they may fold from lack of funding, and you'll have wasted precious time to market. Now that Groq has the backing of Nvidia, I think they could become the frontrunner. Not because they're so awesome, but because of Nvidia. Then there's SambaNova and Intel. I think there's hope for Samba-tel too.

My guess is that there will be fallout and specialization in data-center rack/pod-level AI inference hardware over the next couple of years. There's just so much that has to come together for the co-optimized general solution: processor systems (two or more different types), connectivity, storage tiers, an orchestration OS, all the different model-serving environments, plus the physical racks with extreme power and cooling considerations. And all the hardware components require new chips that are not general purpose and have to be co-optimized together - so we're back to a vertical focus like IBM mainframes, at least for a while.
 
I would put Cerebras in the path-to-success category, but they still might need the major partnering tie-in, like Amazon, to go the distance. Other than that, I think you are right, with two caveats:
* There's likely still opportunity on the client side. It's not clear that Apple, Intel, and AMD have gotten their XPUs right for real-world client-side applications, and there are more specialized apps, like autonomous driving, where chips optimized for something completely different (à la AI5, AI6) are required.
Agreed on Cerebras and Amazon. Cerebras really gets a win if Google and Azure feel they need to follow suit.

As for clients... I'm confused by what I see happening on Windows PCs. Intel, AMD, and Qualcomm have incompatible NPUs. Microsoft supports all three, but this looks like a silly plan, and gives Apple an advantage. If I were Microsoft I'd be pushing for a common instruction set NPU spec.
* China - who knows what's going to evolve as the home-grown data-center inference win there. It probably still ties in with Huawei or a hyperscaler/CSP like Alibaba. I do find it interesting that Alibaba has been making some major contributions, with respect to Mooncake, to NVIDIA's open-source "OS" for racks/pods, Dynamo.
China remains a mystery to me.
I can't, though again, China is the unknown. There isn't much room left, given that TPU, Trainium 3, and AMD are already filling up the rest of the training space.
Agreed.
My guess is that there will be fallout and specialization in data-center rack/pod-level AI inference hardware over the next couple of years. There's just so much that has to come together for the co-optimized general solution: processor systems (two or more different types), connectivity, storage tiers, an orchestration OS, all the different model-serving environments, plus the physical racks with extreme power and cooling considerations. And all the hardware components require new chips that are not general purpose and have to be co-optimized together - so we're back to a vertical focus like IBM mainframes, at least for a while.
And this is the case for Nvidia continuing to win big. They already have the co-optimization, even in software. Only Google has a comparable solution, but I suspect it only works in Google Cloud (regardless of the Broadcom Anthropic stories we continue to read).
 
As for clients... I'm confused by what I see happening on Windows PCs. Intel, AMD, and Qualcomm have incompatible NPUs. Microsoft supports all three, but this looks like a silly plan, and gives Apple an advantage. If I were Microsoft I'd be pushing for a common instruction set NPU spec.
It's too much work to create a single ISA for an ASIC shared by vendors; that would be difficult, and Microsoft doesn't have the capability to do it. What would be better is a DirectX-like API, so each piece of hardware has to support a fixed set of operations that can then be mapped to the API, and developers can use the API to program against.
If it were me, I would cut the NPU to a minimum and not push it beyond a particular compute capacity.
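A minimal sketch of the "DirectX-like API" idea above: the runtime exposes a small, fixed set of operations, each vendor ships a backend that maps those operations onto its own ISA, and applications program only against the dispatcher. Every name here (the op set, `NPUBackend`, `dispatch`) is invented for illustration; no such Microsoft API is implied.

```python
# Hypothetical fixed-operation NPU API: vendors implement NPUBackend,
# applications call dispatch(). All names are invented for illustration.
from abc import ABC, abstractmethod

FIXED_OPS = {"matmul", "conv2d", "softmax", "layernorm"}  # the fixed op set

class NPUBackend(ABC):
    """Contract every vendor driver must implement."""
    @abstractmethod
    def supports(self, op: str) -> bool: ...
    @abstractmethod
    def run(self, op: str, *tensors): ...

class ReferenceCPUBackend(NPUBackend):
    """Fallback backend so apps still run when no NPU is present."""
    def supports(self, op: str) -> bool:
        return op in FIXED_OPS
    def run(self, op: str, *tensors):
        return f"cpu:{op}"  # stand-in for the real computation

def dispatch(backend: NPUBackend, op: str, *tensors):
    """Route a fixed-set operation to whichever backend is installed."""
    if op not in FIXED_OPS:
        raise ValueError(f"{op} is outside the fixed op set")
    if not backend.supports(op):
        raise NotImplementedError(f"backend lacks {op}")
    return backend.run(op, *tensors)

print(dispatch(ReferenceCPUBackend(), "matmul"))  # -> cpu:matmul
```

The design choice this illustrates: the op set is the stable contract, so vendor ISAs can differ freely underneath, which is exactly how Direct3D decoupled games from GPU hardware.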
 
It's too much work to create a single ISA for an ASIC shared by vendors; that would be difficult, and Microsoft doesn't have the capability to do it. What would be better is a DirectX-like API, so each piece of hardware has to support a fixed set of operations that can then be mapped to the API, and developers can use the API to program against.
If it were me, I would cut the NPU to a minimum and not push it beyond a particular compute capacity.
I think an API is too high-level for NPU processing. I suspect that for Qualcomm's NPUs in the Surface PCs, Microsoft worked closely with Qualcomm, because Microsoft's name is on the Surface. I am wondering how Microsoft supports Intel and AMD. Somewhere, some team is generating detailed requirements and specs for programming NPUs, and some other team is writing NPU code, which I'm assuming is in assembler. The assembler code is probably in libraries, but the libraries would need common functionality and parameters to make them useful. I'm wondering: who writes the NPU code, and to whose specs, for Windows? For more specialized libraries, like VINO, those coders are more likely in Intel and AMD. Apple claims FaceID and various other image-processing software use the NPUs, but I'm wondering: are NPUs more for marketing than functionality at this point?

Apple's client AI strategy is considered weak for the time being. Copilot probably uses Qualcomm NPUs, but does Copilot really use Intel and AMD NPUs for anything significant? Microsoft's web page says mostly nothing specific.


Coming soon to a new marketing campaign...
 
Apple's client AI strategy is considered weak for the time being. Copilot probably uses Qualcomm NPUs, but does Copilot really use Intel and AMD NPUs for anything significant? Microsoft's web page says mostly nothing specific.
It's an artificial restriction on Microsoft's part to upsell you Windows on ARM. You can get the Intel NPU to run AI models with Intel AI Playground; it's capable of all the stuff the Qualcomm NPU is capable of.
 
It's an artificial restriction on Microsoft's part to upsell you Windows on ARM. You can get the Intel NPU to run AI models with Intel AI Playground; it's capable of all the stuff the Qualcomm NPU is capable of.
I can't believe Microsoft is that silly, but I've often suspected that Windows is still haunted by the ghost of Steve Ballmer, and that it's really Office and Azure that keep Microsoft relevant.
 