RTL Design For Power

by Daniel Payne on 08-11-2013 at 2:25 pm

My Samsung Galaxy Note II lasts about two days on a single battery charge, quite an improvement over the original Galaxy Note, which managed only one day. Battery life constrains mobile SoCs, and consumers love longer-lasting devices.

There are at least two approaches to Design For Power:

  • Gate-level techniques
  • RTL-level techniques



In this blog I’ll focus on RTL-level techniques because they apply earlier in the design process, where larger power savings are easier to achieve. By the gate level it is often too late to make power-related design changes.

Design Intent

Consider an Adder block coded in RTL and then viewed in a tool showing the functional view and also the gate-level, instance view:

Power analysis at the RTL level will identify the adder as a power hotspot, and the designer understands that this function is still an adder, as shown in the functional view on the left. The right side shows the same adder as a gate-level view after logic synthesis, where you no longer have any real idea what the function is because it has been decomposed into gates. While still designing at the RTL level, an SoC designer could choose to shut off this adder to lower the power.

Performance

Another reason to optimize for low power at the RTL level is run time. One mobile graphics processor team ran RTL power analysis on one of their design blocks in just 22 minutes, while gate-level power analysis of the same block took 20 hours (including the wait for logic synthesis and gate-level simulation).

Activity

Power for interconnect is described as P = C * V * V * frequency, where C is the net capacitance and V is the supply voltage, so knowing the frequency or activity level of every net is important in calculating power consumption. RTL simulations are a quick and easy way to determine node toggling.
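To make the formula concrete, here is a minimal Python sketch of a per-net estimate. It assumes the common refinement of scaling C * V * V * frequency by an activity factor, taken here as the number of rising transitions per clock cycle counted in simulation. The function names and the example net values are hypothetical and do not come from any particular tool.

```python
def activity_from_simulation(rising_transitions, clock_cycles):
    """Average number of rising (0 -> 1) transitions per clock cycle for one net."""
    if clock_cycles <= 0:
        raise ValueError("clock_cycles must be positive")
    return rising_transitions / clock_cycles


def switching_power(capacitance_f, voltage_v, clock_hz, activity=1.0):
    """Switching power in watts: activity * C * V * V * frequency.

    With activity = 1.0 this reduces to the plain formula in the text,
    which fits a net that charges and discharges once per clock cycle.
    """
    return activity * capacitance_f * voltage_v ** 2 * clock_hz


# Hypothetical net: 10 fF load, 1.0 V supply, 1 GHz clock,
# 250 rising transitions observed over 1000 simulated cycles.
alpha = activity_from_simulation(250, 1000)
power_w = switching_power(10e-15, 1.0, 1e9, alpha)
print(f"{power_w * 1e6:.2f} uW")  # prints 2.50 uW
```

Summing this estimate over all nets gives a first-order switching power figure. It leaves out leakage and short-circuit power.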

Accuracy

Yes, gate-level power simulations are more accurate than RTL-level ones; however, RTL power numbers are usually within 15% to 20% of what you see at post-layout. That accuracy is sufficient for making power reduction decisions.

Making Power Trade-offs

Because RTL power analysis runs are fast, you can expect a 1 million-instance design to complete in just a few minutes. With this speed you can evaluate how parallel and serial architectures compare, or try out power gating, clock gating and data gating approaches.

Power Profiling

An RTL power profiling tool can show you functional blocks that aren’t being used but are still consuming power, and so should be optimized further. The figure below shows the power of two different blocks, one in red and the other in green. During Web Browsing the green block should not be drawing any power, yet it clearly is, so we’ve uncovered a power bug.

Types of Power

With RTL activity analysis an SoC designer can now see and measure all of the different types of power:

  • Average power
  • Cycle average peak power
  • Transient peak power
  • Sustained worst-case power
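The post does not define these four metrics, so the sketch below uses working definitions that are my own assumptions: average power is the mean over the whole trace, transient peak is the largest single sample, cycle average peak is the largest per-clock-cycle average, and sustained worst-case is the largest average over a sliding window. The trace values, `samples_per_cycle` and `sustained_window` are hypothetical.

```python
def window_averages(samples, window):
    """Average of every run of `window` consecutive samples."""
    if window <= 0 or window > len(samples):
        raise ValueError("window must be between 1 and len(samples)")
    return [sum(samples[i:i + window]) / window
            for i in range(len(samples) - window + 1)]


def power_summary(samples_mw, samples_per_cycle, sustained_window):
    """Summarize a power trace (in mW) using the assumed definitions above."""
    # Per-cycle averages over non-overlapping groups of samples.
    cycle_averages = [
        sum(samples_mw[i:i + samples_per_cycle]) / samples_per_cycle
        for i in range(0, len(samples_mw) - samples_per_cycle + 1, samples_per_cycle)
    ]
    return {
        "average": sum(samples_mw) / len(samples_mw),
        "transient_peak": max(samples_mw),
        "cycle_average_peak": max(cycle_averages),
        "sustained_worst_case": max(window_averages(samples_mw, sustained_window)),
    }


# Hypothetical trace: two samples per clock cycle, four cycles in total.
trace_mw = [2, 4, 10, 4, 6, 6, 2, 2]
summary = power_summary(trace_mw, samples_per_cycle=2, sustained_window=4)
print(summary)
```

For this trace the summary is 4.5 mW average, 10 mW transient peak, 7.0 mW cycle average peak and 6.5 mW sustained worst-case. Each metric answers a different question, from battery life through to supply and thermal limits.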

Here’s another design example showing activity analysis as a function of time, where the architectural behavior is annotated:

Clock Gating Efficiency

There are two ways to measure efficiency for clock gating: static and dynamic. Static CGE looks at the percentage of gated flip-flops, while dynamic CGE looks at the percentage of gated clock cycles. Here’s a timing diagram showing a clock signal at the top, an enable signal for clock gating, the gated clock and finally when data becomes valid.

A designer can only optimize what they can see and measure, so a CGE report will help you identify how efficient your static and dynamic clock gating regions really are.
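The two CGE measures can be sketched in a few lines of Python. This is an illustration under my own assumptions, not how any particular tool computes its report: each flip-flop is represented by a per-cycle list of clock-enable bits (or `None` if it has no clock gate), and dynamic CGE is averaged over all flip-flops, with ungated flops counted as always clocked. All names and data are hypothetical.

```python
def static_cge(enables):
    """Percentage of flip-flops that sit behind a clock gate."""
    gated = sum(1 for trace in enables.values() if trace is not None)
    return 100.0 * gated / len(enables)


def dynamic_cge(enables, cycles):
    """Percentage of flip-flop clock cycles that were actually gated off."""
    gated_off = sum(trace.count(0)
                    for trace in enables.values() if trace is not None)
    return 100.0 * gated_off / (len(enables) * cycles)


# Hypothetical block: four flip-flops observed over eight clock cycles.
# 1 = clock enabled in that cycle, 0 = clock gated off.
enables = {
    "ff_a": None,                       # no clock gate at all
    "ff_b": [1, 0, 0, 0, 1, 0, 0, 0],
    "ff_c": [1, 1, 1, 1, 1, 1, 1, 1],   # gated, but the enable never drops
    "ff_d": [0, 0, 0, 0, 1, 1, 0, 0],
}

print(static_cge(enables))       # prints 75.0
print(dynamic_cge(enables, 8))   # prints 37.5
```

The gap between the two numbers is the point: `ff_c` counts toward static CGE but saves nothing, because its enable is never low.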

Power Hotspots

Apache offers visual debugging tools that show an RTL designer their entire design as a hierarchical tree widget, with power per block and dynamically generated data flow diagrams. The colors correspond to estimated power values, so you can concentrate first on the blocks colored red and orange.

With this graphical approach you can understand both dynamic and standby power per block.

Regressions for Power

Power regression tests are a way to keep track of your power during the design phase, ensuring that power stays within budget and does not creep upwards with each design change. Here’s a visual presentation of Power versus Time for an SoC where each color is a different block.
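A power regression check can be as simple as comparing per-block numbers from the latest run against both a budget and the previous run. The Python sketch below is one possible shape for such a check. The block names, power values and the 5% tolerance are all hypothetical, and in a real flow the dictionaries would be filled from your power analysis reports.

```python
def check_power_regression(baseline_mw, current_mw, budget_mw, tolerance_pct=5.0):
    """Return a list of messages for blocks over budget or creeping upward."""
    failures = []
    for block, power in sorted(current_mw.items()):
        budget = budget_mw.get(block)
        if budget is not None and power > budget:
            failures.append(
                f"{block}: {power:.1f} mW exceeds budget of {budget:.1f} mW")
        previous = baseline_mw.get(block)
        if previous is not None and power > previous * (1 + tolerance_pct / 100):
            failures.append(
                f"{block}: {power:.1f} mW is more than {tolerance_pct:g}% "
                f"above the previous {previous:.1f} mW")
    return failures


# Hypothetical per-block power in mW from two consecutive design revisions.
baseline = {"cpu": 120.0, "gpu": 200.0, "modem": 80.0}
current = {"cpu": 123.0, "gpu": 215.0, "modem": 95.0}
budget = {"cpu": 130.0, "gpu": 220.0, "modem": 90.0}

failures = check_power_regression(baseline, current, budget)
for message in failures:
    print(message)
```

Here the cpu block passes, the gpu block is still within budget but is flagged for growing more than 5%, and the modem block fails both checks. Running a check like this on every design change catches the creep early.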

Summary

RTL design for power is a proven approach to understand and lower SoC power to meet requirements. The older approach of gate-level power reduction is simply not as efficient or as fast as RTL design for power.

Further Reading

Preeti Gupta of Apache wrote a longer article on this topic last month at the Low-Power High-Performance site. Read the full article now.


Preeti Gupta, Apache
