Intel CEO Lip-Bu Tan stamps out chip bugs with aggressive new quality standards, says major validation errors can result in termination — 'B0, you keep your job. Anything above that, you are fired'
"One thing about timetable, I have a culture right now I have just implemented. It has to be A0 to production," said Lip-Bu Tan at JP Morgan's Global Technology, Media and Communications Conference. "A0 is when you tape out, first time pass. Intel does not have that culture, so I tell that, first time pass A0. B0, you keep your job. Anything above that, you are fired."
For relatively simple chips (compared to CPUs), A1 or A2 is often possible. For chips which use considerable state machine logic, B0 is aggressive, in my experience and from what I hear more recently from others. Just a guess... but chiplets must change the development norms a lot. Big dies with complex logic make a B0 very aggressive. With chiplets, perhaps B0 is much more realistic.
I'm curious how well or poorly the implementation of this is going to go. I can see bad responses like much longer time to (first) tape-out, or adding a lot more redundancy logic (die costs/size), or even performance missing targets (just remove a potentially buggy feature) to ensure A0 production.
Your concerns are valid. A longer time to tape-out might not be such a bad thing, as long as milestone planning objectives are realistically extended. If not, dumb actions will result. I haven't ever seen cases of redundancy logic, except for adding cores to raise effective yields in production. But I was a development DoE who had to take over a project that unsuccessfully tried to avoid another letter stepping by disabling a performance feature which used state machines. Your concern is legitimate, IMO. My predecessor tried to remove the feature, and that caused other bugs which were not predicted, so I was appointed to fix the project. My answer, after an intensive design review: the letter stepping was required, and the feature had to be reinstated, or the result would be far worse. Let's just say my bad news delivery didn't enhance my Intel career path.
I have seen that "no B step" demand on and off for 30 years. It is hardly new.
I can make an A-10 stepping (the things you can do with edits are amazing).
I can ship a marginal part or do a B... your choice.
Also, I can spend tons of time on pre-silicon or A0 validation, or speed up new steppings. Your choice.
Validation should find issues, but design can always make improvements with new steppings. You can choose to tape it out or not. If you threaten to fire me, then I will not do a B step.
I don't agree. Metal layer steppings are effective only when you can fix a problem just by altering interconnects. If you need to change transistor logic, you're into an all-layer stepping, which means a new mask set, costing many millions for an advanced process and taking a long calendar time. Some kinds of chips, especially in the storage and communications industries, do not have performance requirements as stringent as processors or many accelerators, and don't have as much custom logic in state machines as CPU caches and superscalar processors do, so metal layer fixes work more often. CPUs, GPUs, specialized computing accelerators, and DRAM controllers with multi-level caches are far more dependent on custom logic than I/O chips, so a metal layer fix often may not be applicable.
Intel CPUs also used to avoid licensed IP, which made the design and validation processes slower and less efficient. I don't know what Intel's current policies are. CPUs like the AWS Gravitons are mostly licensed IP blocks, so they don't have these problems. Once you go custom, like Ampere, Apple, Tenstorrent, AMD, and Intel, you're either spending more time in design and validation, or you're going to have long and expensive development projects from all-layer steppings.
The issue is not a B0, it's a C0. The B0 doesn't get you fired, it only gets you a lecture. If you don't do a necessary B-step or you lie about needing one, I'd fire you on the spot.
OT - but IIRC Pentium 3 Coppermine was a very successful chip, but it also required a D0 stepping to get to 1.10/1.13 GHz reliably. This was after Intel launched a C0 version that wasn't completely reliable at that speed. (Linux kernel compilation was the easiest way to reproduce the instability.)