Joe has devoted his career to understanding and designing cache coherent systems and has been granted over 95 patents on the subject. For the past four years, he has been Chief Architect at NetSpeed, a developer of network-on-chip SoC interconnect.
Why did cache and coherency catch your interest in the first place?
There are many, many aspects of caches, and I find them all very interesting. I sometimes joke that caches are just a performance tweak. They are architecturally transparent. And so, they improve your performance but they are supposed to appear as if they don't even exist. That makes both the building and the verification of them complicated because, how do you verify something when you can't actually detect whether it's there or not, if it is transparent? You end up testing that it's transparent. It's kind of an interesting problem. Coherency is very interesting. I find it analogous to strategy board games like chess or Go in that the rules are very simple. Many protocols have only four coherence states (modified, exclusive, shared, and invalid), and yet there is an infinite number of variations in terms of design trade-offs and complexity. And even though you start off with some very simple principles, you can spend a lifetime mastering it.
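To make the "only four states" remark concrete, here is a minimal sketch of the textbook MESI transitions for a single cache line. It is an illustration only, not NetSpeed's implementation: the function names are hypothetical, and real protocols add transient states, data transfers and write-backs that are omitted here.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"   # only copy, dirty
    EXCLUSIVE = "E"  # only copy, clean
    SHARED = "S"     # clean, other caches may hold it too
    INVALID = "I"    # no valid copy


def on_local_read(state, others_have_copy):
    """Next state of this cache's line after its own core reads it."""
    if state is State.INVALID:
        # Read miss: the line is fetched; it is exclusive only if no
        # other cache currently holds a copy.
        return State.SHARED if others_have_copy else State.EXCLUSIVE
    return state  # M, E and S all hit without a state change


def on_local_write(state):
    """Next state after this cache's own core writes the line."""
    # Every write ends in MODIFIED. Starting from S or I, the other
    # copies have to be invalidated first (not modeled here).
    return State.MODIFIED


def on_remote_read(state):
    """Next state when another cache reads the same line."""
    if state in (State.MODIFIED, State.EXCLUSIVE):
        # A MODIFIED line would also supply or write back its dirty data.
        return State.SHARED
    return state


def on_remote_write(state):
    """Next state when another cache writes the same line."""
    return State.INVALID
```

The rules fit on one screen, which is the point of the chess analogy: the complexity comes from everything this sketch leaves out.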
With your focus on cache and coherency, how did you come to join NetSpeed, an interconnect company?
Sundari (NetSpeed’s co-founder) was looking for a coherency expert in this domain. And I was looking for someone who was interested in solving some difficult coherency problems. In terms of the match, bringing coherency to the interconnect is a powerful idea. Coherency is extremely difficult to get right, even though it always sounds simple because, hey, there are only four states, right? It is one of the most complex parts of the system. There are many different cases and you have to understand the whole design to understand any of it, to see how it's all interacting. It's an enormous amount of complexity. It's hard to build and it's hard to verify.
So, this idea of generating a coherency IP that is usable in a wide variety of cases is a particularly challenging problem. It requires solving unique problems that most people don't run into and don't have to deal with. Configurable coherency, that's really tough. We were looking at this as much more of a full SoC solution. It has to be very flexible to deal with all of the different use cases people have, the heterogeneity that people have.
With all this flexibility, how do you approach the problem of so many degrees of freedom and so many permutations?
Our solution not only has to work, but it has to be configurable by the end-user and it has to be in a format that we can trust. So, we rely heavily on our NocStudio software layer. What we've done is we've made the software layer intelligent enough to ensure that the coherent interconnect can be modified and can be controlled by the user. You can configure it the way that you want and NocStudio will guarantee correctness. We provide certain levels of control so you can configure things like topology, physical floorplanning, the number of agents, how much cache capacity you want, even the cache hierarchy. You can control all of these sorts of things, but our software layer allows you to describe these from an architectural point of view. Then the software layer goes and creates the implementation details that ensure that it is correct. And that's really necessary. If we were just doing a coherency IP without the software layer, it would basically have to be a fixed solution, or it would be a solution that we would have to design every time for the customer.
Our vision is that you describe what you need at the architectural level and then our software layer goes in and creates the solution. The software allows you to optimize it and make changes but ensures that your intent, whatever your intent, is going to be functional. That is true for either coherent or noncoherent solutions, the idea of providing maximum flexibility while ensuring correctness. We do all these things that even fixed designs don't do rigorously enough. For example, we have fully automated deadlock detection to make sure that whatever you come up with, we build it in a deadlock-free manner. That's critical.
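The interview does not say how NocStudio detects deadlock. One common textbook approach, shown here purely as an assumption, is to model which channels can wait on which other channels as a directed graph and check that the graph has no cycle. The function name and the channel labels below are hypothetical.

```python
def has_cycle(deps):
    """Return True if the directed graph `deps` contains a cycle.

    `deps` maps each channel to the channels it may wait on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}

    def visit(node):
        color[node] = GREY
        for nxt in deps.get(node, ()):
            c = color.get(nxt, WHITE)
            if c == GREY:
                return True  # back edge: nxt is already on the current path
            if c == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(deps))


# Four channels around a ring, each waiting on the next: a potential deadlock.
ring = {"c0": ["c1"], "c1": ["c2"], "c2": ["c3"], "c3": ["c0"]}

# Same channels with the wrap-around dependency removed.
chain = {"c0": ["c1"], "c1": ["c2"], "c2": ["c3"], "c3": []}
```

In this sketch `has_cycle(ring)` reports a cycle and `has_cycle(chain)` does not. A generator that only emits configurations whose dependency graph passes such a check is one way a tool could build in a deadlock-free manner.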
In EDA we sometimes talk about synthesis, but this seems a little smarter than that, right?
Yes. You know, sometimes there might be confusion about our synthesis approach. Some people view that as primarily an area or performance trade-off. Their view is you can press a button and it'll optimize for you. That's an important aspect, and it is an important part of the product, but the more critical part is you press a button and it provides a functional solution. It satisfies your functionality requirements. And from there, yes, it optimizes it and allows the users to optimize it. It provides all the hooks to allow the end product to be optimized in a variety of different ways. But first and foremost, it builds it in a correct manner.
You mentioned that a big part of NetSpeed's solution is this unique approach to verification. Tell us about that.
There's a lot to it, but there are a few key ideas. One is that we are not verifying a static IP and we are not verifying static components. We are in the business of providing coherent or noncoherent interconnects. So our verification has to be at the interconnect level, and yet we have the classic problem that it is a configurable interconnect. It can vary hugely in terms of the number and kinds of agents, the bus widths, the system requirements, that sort of thing. We have to approach this problem in a different way than most verification approaches. Our approach has to be cognizant of the fact that we are building full interconnects. We actually start from the beginning by building interconnects. We have a method of building them. We actually have an interconnect generator that will provide different interconnect specifications. We run those through our software layer, because the software layer is part of the IP that we provide, and it generates RTL and verification IP. And then we wrap it all up and we verify that with a bunch of different kinds of tests.
We have all the usual verification stuff. But we're dealing with a much larger state space. I think in a smaller state space, where you have only one solution, you hammer that one thing over and over.
Because we have a very large state space, we have to have substantially more automation. We are dealing with many different kinds of topologies, many different kinds of systems. I think we are unique in how we build it and how we verify it. And again there are a lot of pieces there, but it is solving an entirely new problem, which is building radically new IP and being able to test it, and also being able to tell that you have tested it well. I think that's the big picture. Obviously, we could spend huge amounts of time talking about all of the things we do for verification. All the typical things apply to us, but we have the more interesting problem that it's a huge state space.
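The flow described in this answer (generate many interconnect specifications, build each one, test it, and track what has been covered) could be sketched roughly as below. Everything here is hypothetical: the field names, value ranges, and the `build` and `test` callbacks are placeholders for illustration and do not reflect NetSpeed's actual generator or NocStudio's interface.

```python
import random


def random_spec(rng):
    """Draw one hypothetical interconnect specification."""
    return {
        "topology": rng.choice(["mesh", "ring", "tree"]),
        "num_agents": rng.randint(2, 64),
        "bus_width_bits": rng.choice([64, 128, 256, 512]),
        "coherent": rng.choice([True, False]),
    }


def run_campaign(seed, count, build, test):
    """Build and test `count` random specs; return (failures, seen).

    `failures` lists the specs whose design failed `test`.
    `seen` records which (topology, coherent) combinations were exercised,
    a crude stand-in for coverage tracking.
    """
    rng = random.Random(seed)  # fixed seed makes a failing run reproducible
    failures = []
    seen = set()
    for _ in range(count):
        spec = random_spec(rng)
        seen.add((spec["topology"], spec["coherent"]))
        design = build(spec)
        if not test(design):
            failures.append(spec)
    return failures, seen
```

Seeding the generator matters in a flow like this: when one configuration out of thousands fails, the same seed reproduces exactly that configuration.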
You get in front of customers quite a bit. How are they looking at the challenge of coherency?
A lot of people want coherency in some form. Coherency is extremely widespread at this point. In terms of building it themselves, there is kind of a trade-off there. We've found lots of people who don't have the expertise and they don't have the time frame to be able to build the expertise and build the coherency IP themselves. Probably that's the majority of people who are looking at an IP solution.
There are others who traditionally have built their own, or who want to build their own. They may feel that coherency is important enough that they should build it themselves, they should have that level of control for the optimizations that they want. There are always some people that are in that space. We found, even in that space, the biggest problem again is that all of these coherent solutions take many years of development. They take huge teams, they are high risk, and just years and years of building and verification. And when you get it all done and it's time to build the next chip, it has very different requirements than the first chip and your solution doesn't scale or doesn't scale well. So even with the people who are used to building their own, we've run into a different sort of compelling point which is that even though they can build one, it takes a long time. They can build it, they can optimize it for whatever, but it just doesn't provide the flexibility that they need. If you can only do one new coherency solution every three or four years, you can’t really do a family of products. You can't use the same product elsewhere inside the same company. So, we get traction in those situations, people who have a solution but they don't have that level of configurability.
With our stuff you get a substantial time-to-market benefit. It's massive. Instead of years of development, you can license our IP, press a button and in no time you've got a solution. And you can always optimize further, but you've got everything you need very early in the design process. So you get the huge time-to-market benefit, but more importantly, you also have the flexibility to adapt that solution to different spaces. You can have a whole family of products, or reuse it in other areas, so there's this heavy reuse opportunity. And I think these are attractive even for people who, like the classic architect, think they can do everything better than everyone else. Sure. Maybe they're right, but you can't do it all. So, they may be able to do one instance of it well. But for companies, with the design time, the number of tape-outs, the number of products that these things have to go into, all this figures into the equation. People can get kind of stuck if they don't move out of thinking, “I’ll do it myself and it's going to take a long time, but it's okay.” If they don't move out of that mentality, they just can't keep up with the competition. The competition is able to spin new chips and upgrades quickly. We help there. We’re able to give them the flexibility and the rapid time-to-market.