Not Your Daddy's Thermal Management
The fast-paced dotcom boom of the late nineties and early 2000s set off an equally frantic drive to quench heat in electronics and get products out to market. Thermal engineers with stock options kept a net worth calculator dashboard on their computer screens. Heat sink company stocks were high-growth investments. Despite such glory, electronics cooling was an important but downstream task in the product development flow. It meant living within the headroom and footprint allocated for the heat sink. Since changing the board layout was out of the question by the time thermal engineering kicked off, 'Flow Baffle Maker' was a phrase you could type into any Inktomi-powered search engine and find job openings.
Fast forward 10+ years, and some of what we wrote about optimizing within constraints is still an essential part of thermal engineering methodology (sorry, no career in baffle making nowadays). A key difference is that enclosure-level optimization is no longer the only breed of thermal management being practiced. Thermal awareness has not only found its way into board layout, but is also essential at the IC design stage if you are serious about optimizing price, power, and performance.
At the heart of thermally aware IC design is not the concern about mobile phones catching fire. We hope that dynamic frequency scaling (AKA throttling) is adequately taking care of that, with cooperation from patient end users. What's at stake is the viability of hardware platforms, which have a way of becoming obsolete as soon as a better performing, better multitasking competitor is introduced. The inseparable nature of temperature rise, frequency scaling, and performance is now recognized at all tiers of the electronics supply chain, including the consumer level. It is now clear that optimizing for power alone is not a catch-all that automatically covers thermal management. Nowadays, it's normal to find the thermal engineer in an IC-level floor planning huddle. This evolution in engineering methodology may be ascribed to the following drivers:
Dynamic Thermal Management:
Any serious attempt at dynamic thermal management should incorporate the following phenomena affecting heat dissipation and temperature rise:
• Fan speed control - this is a system-level control that has been practiced for many years now. If you hear the fan in your desktop computer whirring for no apparent reason, the heat sink may have collected dust, or the fan is responding to a temperature extrapolated from the actual temperature sensed by the diode.
• Dynamic frequency scaling is also already deployed, with both power conservation (1) and temperature control goals.
• Thermally aware task distribution to cores on the chip is an option that's less punitive to the end user compared to fan speed control (noise concerns) and throttling (performance loss).
• With the advance to smaller technology nodes, leakage is a major factor affecting power consumption. Leakage current increases exponentially with temperature, so a hotter chip burns more power, which in turn makes it hotter. As a consequence of the power minimization objective, the thermal management goal is not merely staying below a certain spec temperature. Instead, it's an intricate art of operating the chip at the lowest possible temperature through the use of cores as well as multiple voltage domains, and by completely shutting off certain blocks (1).
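The leakage point is easier to appreciate with numbers. The sketch below is only an illustration, not a validated device model: it assumes leakage power doubles every 20 °C, lumps the whole cooling path into a single junction-to-ambient resistance of 2 °C/W, and treats 150 °C as the point of no return. All function names and parameter values are hypothetical. It iterates temperature and leakage against each other until they settle, or reports thermal runaway if they never do.

```python
def leakage_power(temp_c, p_leak_ref_w=1.0, t_ref_c=25.0, doubling_c=20.0):
    """Leakage power in watts, assuming it doubles every `doubling_c` degrees.

    The exponential form mirrors the text; the constants are made up.
    """
    return p_leak_ref_w * 2.0 ** ((temp_c - t_ref_c) / doubling_c)


def steady_state_temp(p_dynamic_w, r_ja_c_per_w=2.0, t_ambient_c=25.0,
                      t_limit_c=150.0, tol_c=1e-6, max_iter=500):
    """Solve T = T_ambient + R_ja * (P_dynamic + P_leak(T)) by iteration.

    Returns the settled junction temperature in degrees C, or None if the
    temperature climbs past `t_limit_c` (thermal runaway) or fails to settle.
    """
    temp_c = t_ambient_c
    for _ in range(max_iter):
        total_power_w = p_dynamic_w + leakage_power(temp_c)
        new_temp_c = t_ambient_c + r_ja_c_per_w * total_power_w
        if new_temp_c > t_limit_c:
            return None  # leakage feedback has run away
        if abs(new_temp_c - temp_c) < tol_c:
            return new_temp_c
        temp_c = new_temp_c
    return None


if __name__ == "__main__":
    for p_dyn in (10.0, 20.0, 40.0):
        result = steady_state_temp(p_dyn)
        if result is None:
            print(f"{p_dyn:5.1f} W dynamic -> thermal runaway")
        else:
            leak = leakage_power(result)
            print(f"{p_dyn:5.1f} W dynamic -> {result:6.1f} C, leakage {leak:5.2f} W")
```

Under these assumed numbers, doubling the dynamic power from 10 W to 20 W more than doubles the leakage, and at 40 W the loop never settles. That is why throttling or shutting off a block pays back twice: once in dynamic power and again in the leakage that the cooler die no longer draws.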
TSV based 3D Stacking:
Junction-to-ambient thermal resistance (Rja) used to be a bread-and-butter thermal management tool at the IC design level because of an implied piece of conventional wisdom: heat always originates at the IC and flows outwards to the ambient. TSV (through-silicon via) based 3D stacking is throwing hot coals on that belief, because a hot spot on a neighboring chip can now heat the chip in question. Thermal awareness needs to start way upstream at the stack-level path finding stage and needs to be adhered to all along the design flow, through floor planning and final verification. Skipping this diligence can result in a chip overheating well below its rated power due to a neighboring hot spot. Incorporating packaging as well as cooling solution reference designs at the very early stages is essential to avoid oversimplification based on assumptions that are quickly becoming obsolete under the 3D stacking roadmap.
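To see why a single Rja number falls short in a stack, here is a minimal sketch that swaps the scalar for a small thermal resistance matrix, where the off-diagonal terms describe how much one die heats its neighbor. The two-die stack, the resistance values, and the power numbers are all hypothetical, chosen only to make the coupling visible.

```python
def die_temperatures(powers_w, r_matrix_c_per_w, t_ambient_c=25.0):
    """Steady-state temperature of each die in a stack.

    T_i = T_ambient + sum_j R[i][j] * P[j]
    R[i][i] is die i's own junction-to-ambient resistance; R[i][j] (i != j)
    is the assumed temperature rise of die i per watt dissipated in die j.
    """
    return [
        t_ambient_c + sum(r * p for r, p in zip(row, powers_w))
        for row in r_matrix_c_per_w
    ]


# Hypothetical stack: die 0 is a logic die, die 1 is a memory die on top.
R_STACK = [
    [3.0, 2.5],
    [2.5, 4.0],
]
POWERS_W = [20.0, 1.0]  # logic runs hot, memory barely dissipates anything

if __name__ == "__main__":
    stacked = die_temperatures(POWERS_W, R_STACK)
    # What a standalone Rja calculation would predict for the memory die.
    memory_alone = 25.0 + R_STACK[1][1] * POWERS_W[1]
    print(f"logic die in stack:       {stacked[0]:.1f} C")
    print(f"memory die in stack:      {stacked[1]:.1f} C")
    print(f"memory die, Rja only:     {memory_alone:.1f} C")
```

With these made-up values, the memory die dissipates only 1 W, yet it sits roughly 50 °C hotter than its standalone Rja estimate suggests, purely because of the logic die underneath. A real flow would extract the matrix from a detailed model of the stack, package, and cooling solution rather than assume it.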
In summary, there's a lot more to thermal management than what we used to practice in the 20th century, because the goal is no longer just to stay below a certain temperature spec. It is now aligned with the overall product objective of offering a positive experience to the end user. Performance optimization for combined power and thermal goals, incorporating target packaging and cooling solutions, is the thermal management of the future.
(1) Narayanan, Arvind, "Total Power Optimization in RTL-to-GDSII Implementation Flow", EE Times Design, 3/12/2007.