You are currently viewing SemiWiki as a guest which gives you limited access to the site. To view blog comments and experience other SemiWiki features you must be a registered member. Registration is fast, simple, and absolutely free so please, join our community today!

  • Clock Gating Optimization

    You can save a lot of power in a design by gating clocks. For much of the time in a complex multi-function design, many (often most) of the clocks are toggling registers whose input values arenít changing. Which means that those toggles are changing nothing functionally yet they are still burning power. Why not turn off those clock toggles at those registers when theyíre not needed? Thatís the reasoning behind clock gating.

    Article: Introduction to FinFET Technology Part III-power-exploration-flow-min.jpg

    Nice principle, but figuring out how best to apply it to your design in realistic use-cases takes some work. First, modifying the design to add clock gating definitely isnít practical at layout or even pre-layout gate-level. Youíre going to need to add logic and wiring and thatís best done at RTL. But then you worry about accuracy: to know where best to gate, you need to be able to run power estimates. At the layout/gate-level these can be pretty accurate (typically within 5% of silicon). But at RTL thereís more uncertainty so accuracy is more like Ī15% of accuracy at layout.

    Is this uncertainty too big to make power optimization useful at this level? Not at all but you have to think carefully about your approach and the choices you make. An important point is that, in this kind of analysis, relative accuracy is typically much better than absolute accuracy. If estimation shows that gating opportunity X will save twice as much power as gating opportunity Y, you can be pretty sure those two options would rank the same way in final implementation, irrespective of absolute predicted power reduction. And Ė engineering 101 Ė you want to focus on bigger savings. The error bars are still big enough you donít want to waste time on 1% deltas. That said, if you want to delve into tweaking you can improve absolute accuracy in RTL power estimation through correlation with previous similar and implemented designs.

    Building on that base you now have two objectives Ė figuring out where gating has most impact and then figuring out how you should drive that gating. The first takes careful analysis, across the design and across use-case scenarios. Synopsysí SpyGlass Power has been tuned for many years to help with this task. It starts with a hierarchical spreadsheet view of power, leakage and dynamic, so you can figure out where you want to focus. Itís all neatly color-coded so you can easily spot priority problems then drill down through those to find top offenders and so on down. The tool provides more than 150 metrics so you can slice and dice this however you want.

    Article: Introduction to FinFET Technology Part III-clock-gating-efficiency-min.jpg

    Some of the metrics called out in the webinar look at efficiency of clock gating. SpyGlass looks at this through several metrics. One, clock gating ratio, is purely structural Ė how many register clocks (within a block) are actually gated, when you trace back to the root clock for that block? This may be a somewhat crude estimate but it provides a baseline.

    Clock gating efficiency is a more refined estimate, requiring activity (simulation) data. Looking at all registers in the block, how many clock toggles occur on those registers versus clock toggles at the root clock? By this metric if a clock on a register toggles a few times versus the root clock, it is efficiently gated. Of course maybe the clock on a particular register has to toggle a lot because data is frequently changing, but this metric still provides a pointer to root cause for dynamic power.

    The next two metrics, ROADEô and ROADFô, are more refined. ROADF considers the number of Q pin toggles versus the number of clock toggles on a register. This is obviously much closer to a measure of ideal efficiency. If Q doesnít change, in principle the clock didnít need to toggle. ROADE considers the same measure across all registers sharing a common clock enable, so is a metric by enable signal.

    Put these metrics together and you can build a pretty decent sense of where clock gating can have the biggest impact on power. But of course what you are seeing in the dynamic power analysis is metrics for one use-case/scenario; efficiency metrics could change significantly in different use cases.

    The next step is how to drive those clock enables, once you have decided which clocks you want to gate. This step was illuminated by a question that came up in the Q&A. Why wouldnít I just use automatic clock gating where a tool figures out formally or in some other way where best to gate clocks, figures out how to build the enable and maybe even inserts the logic for me? The answer was interesting. There are such tools available and they indeed work but there arenít getting heavy traction. In practice designers seem to be much more interested in making their own decisions around enabling logic, often factoring in considerations (e.g. boundary use-cases not represented in the scenarios you ran) which simply arenít visible to a tool.

    You can get a lot more detail from the recorded Webinar HERE.