Cisco launched its Silicon One G300 AI networking chip in a move that aims to compete with Nvidia and Broadcom.

AFAIK Cerebras has not done this for the scaled-up AI case that will drive the entire industry over the next few years (as opposed to the particular benchmarks they chose), and neither has any independent test, including anything on SemiAnalysis -- am I wrong?

I think you are correct - they have found a lucrative sub-market for super-fast response times and token production. For instance, Opus-4.6 from Anthropic on Cerebras is fast but ~5x more expensive than the regular version on Cursor. Not sure if and when Cerebras will benchmark over a broader operating range outside of their sweet spot.

I find this guy's blog posts on the hardware/software challenges of serving coding agents interesting. He explains why different compute paradigms/architectures are needed for the different phases of inference for coding agents.
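
To make the paradigm split concrete, here's a back-of-the-envelope sketch (my illustrative numbers, not his) of why prefill and decode stress hardware differently: prefill pushes the whole prompt through the weights in one batched pass, while decode re-reads the full weight set for every single generated token.

```python
# Back-of-the-envelope sketch: arithmetic intensity of prefill vs decode.
# All numbers are illustrative assumptions, not any vendor's specs.

params = 70e9                 # hypothetical dense model, 70B parameters
bytes_per_param = 2           # fp16/bf16 weights
prompt_tokens = 8192          # a large coding-agent context
flops_per_token = 2 * params  # ~2 FLOPs per parameter per token (matmuls)

# Prefill: all prompt tokens processed in one batched pass.
# Weights are read once but reused across every token -> compute-bound.
prefill_flops = prompt_tokens * flops_per_token
prefill_bytes = params * bytes_per_param
print(f"prefill arithmetic intensity ~ {prefill_flops / prefill_bytes:.0f} FLOPs/byte")

# Decode: one token per step, so the full weight set is re-read for just
# one token's worth of math -> memory-bandwidth-bound.
decode_flops = flops_per_token
decode_bytes = params * bytes_per_param
print(f"decode arithmetic intensity ~ {decode_flops / decode_bytes:.0f} FLOPs/byte")
```

A ~8192 vs ~1 FLOPs/byte gap is why one architecture rarely serves both phases well.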


He's one of the guys who originally developed KV caching, while doing a postdoc at the University of Chicago. He now has a startup focused on making inference far more cost-efficient.
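
For anyone who hasn't seen it, here's a toy illustration of what KV caching buys you during decode: cache each token's key/value projections once and append to them, instead of recomputing projections for the entire history at every step. This is a minimal sketch of the general technique, not his implementation.

```python
import numpy as np

# Minimal single-head attention decode loop with a KV cache (toy example).
d = 64                              # head dimension
rng = np.random.default_rng(0)
Wq = rng.standard_normal((d, d)) * 0.02   # toy projection weights
Wk = rng.standard_normal((d, d)) * 0.02
Wv = rng.standard_normal((d, d)) * 0.02

K_cache, V_cache = [], []           # grows by one row per generated token

def decode_step(x):
    """x: (d,) hidden state of the newest token only."""
    q = x @ Wq
    # Without a cache we'd re-project K and V for every past token here
    # (O(n) work per step); with the cache we project just the new one.
    K_cache.append(x @ Wk)
    V_cache.append(x @ Wv)
    K, V = np.stack(K_cache), np.stack(V_cache)
    scores = K @ q / np.sqrt(d)     # attend over all cached positions
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V              # (d,) attention output

for _ in range(5):                  # generate a few toy tokens
    out = decode_step(rng.standard_normal(d))
print("cached positions:", len(K_cache), "| output shape:", out.shape)
```

The tradeoff is memory: the cache grows linearly with context length, which is exactly what makes long-context coding agents expensive to serve.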
 
We've been talking about inference here -- how about training? Or is this so much smaller as a fraction of the total AI market that it doesn't really matter?
 
Most analyses I've seen show the TAMs for data center inference and training at about 50/50 right now, but tipping toward inference on a roughly 35% vs. 20% CAGR differential. I think most new entrants also view training as the harder problem, with more legacy infrastructure, so they seem willing to cede it to NVIDIA and Google in favor of the faster-growing market.
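
As a quick sanity check on what that differential implies, start from a 50/50 split and compound at 35% vs. 20% (round numbers from the claim above):

```python
# Project the inference/training TAM split from a 50/50 start,
# using the ~35% and ~20% CAGRs cited above (illustrative units).
inference, training = 50.0, 50.0
for year in range(1, 6):
    inference *= 1.35
    training *= 1.20
    share = inference / (inference + training)
    print(f"year {year}: inference share ~ {share:.0%}")
# Drifts to roughly 64/36 after five years -- a steady tilt, not a cliff.
```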
 