


July 27, 2015 by Daniel Payne

Designing an IDCT for H.265 using High Level Synthesis

by Daniel Payne on 07-27-2015 at 8:00 pm
Categories: EDA

Math geeks know all about the Inverse Discrete Cosine Transform (IDCT), and one popular use is in the hardware architecture of High Efficiency Video Coding (HEVC), also known as H.265, the new video compression standard that is widely used in consumer and industrial video devices. You could hand-code RTL to create an IDCT function, but that would take many more lines of code and more precious engineering time than using a higher-level language like C++ or SystemC. The promise of High Level Synthesis (HLS) is that you can code your video algorithms in much less time and with much less code than RTL, getting to market quicker with less engineering effort.

Uday Das from Calypto presented a tutorial at the #52DAC event last month in San Francisco titled “Building an IDCT for H.265 Using Catapult”, so I reviewed the 46 slides and share my impressions in this brief blog. The HEVC specification calls for transform units of four sizes (4×4, 8×8, 16×16 and 32×32) to code the prediction residual. The hardware architecture here uses a row-column decomposition approach that performs a 1-D operation on each row, followed by another 1-D operation on each column.
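As a rough illustration of row-column decomposition, here is a minimal C++ sketch. It assumes a floating-point, orthonormal 1-D IDCT as a stand-in for the real kernel. The names Row, Block, idct_1d and idct_2d are mine, not from the tutorial slides, and HEVC itself specifies integer arithmetic that this sketch does not model.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Row = std::vector<double>;
using Block = std::vector<Row>;  // square block of coefficients or samples

// Orthonormal 1-D inverse DCT in floating point (illustrative stand-in only).
Row idct_1d(const Row& X) {
    const std::size_t N = X.size();
    const double pi = std::acos(-1.0);
    Row x(N, 0.0);
    for (std::size_t n = 0; n < N; ++n) {
        for (std::size_t k = 0; k < N; ++k) {
            const double scale = std::sqrt((k == 0 ? 1.0 : 2.0) / static_cast<double>(N));
            const double angle = pi * (2.0 * static_cast<double>(n) + 1.0) *
                                 static_cast<double>(k) / (2.0 * static_cast<double>(N));
            x[n] += scale * X[k] * std::cos(angle);
        }
    }
    return x;
}

// Row-column decomposition: a 1-D pass over every row, then over every column.
Block idct_2d(const Block& coeffs) {
    const std::size_t N = coeffs.size();
    for (const Row& r : coeffs) assert(r.size() == N);  // block must be square

    Block tmp(N);
    for (std::size_t r = 0; r < N; ++r) tmp[r] = idct_1d(coeffs[r]);

    Block out(N, Row(N, 0.0));
    for (std::size_t c = 0; c < N; ++c) {
        Row col(N, 0.0);
        for (std::size_t r = 0; r < N; ++r) col[r] = tmp[r][c];
        const Row res = idct_1d(col);
        for (std::size_t r = 0; r < N; ++r) out[r][c] = res[r];
    }
    return out;
}
```

Feeding in a 4×4 block whose only non-zero coefficient is a DC value of 4.0 should give back a flat block of 1.0, which is a quick sanity check that the two passes compose correctly.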

Algorithm
The IDCT algorithm can be described as a lower-order matrix embedded in a higher-order matrix. A signal flow graph then details an 8-point IDCT, A8, as a 4-point 1-D IDCT, A4, plus an odd matrix, M4.
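One way to write that structure down, using my own symbols rather than anything taken from the slides (X for the input coefficients, x for the outputs, e and o for the even and odd partial results) and leaving scaling constants aside:

$$
\begin{aligned}
e &= A_4\,[X_0,\ X_2,\ X_4,\ X_6]^{T}, \qquad o = M_4\,[X_1,\ X_3,\ X_5,\ X_7]^{T},\\
x_n &= e_n + o_n, \qquad x_{7-n} = e_n - o_n, \qquad n = 0,\dots,3 .
\end{aligned}
$$

For a real-valued DCT the odd matrix has entries $(M_4)_{n,m} = \cos\big((2n+1)(2m+1)\pi/16\big)$. The second line works because $\cos\big((2(7-n)+1)k\pi/16\big) = (-1)^k \cos\big((2n+1)k\pi/16\big)$: even-indexed coefficients contribute symmetrically about the midpoint and odd-indexed ones antisymmetrically, so half of the outputs come from a sum and the other half from a difference of the same two terms.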

Data flow for this algorithm can be designed using two major functions: Butterfly and Mult_odd.

An interface description can then be written in either C or SystemC, where the C code is more compact.

A core class can be written once and then re-used for the 4-, 8-, 16- and 32-point versions of the Mult_odd and Butterfly member functions.
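To make the re-use idea concrete, here is a minimal sketch of such a core class as a C++ template. Only the names Butterfly and Mult_odd come from the tutorial. The class name IdctCore, the recursive layout, the floating-point cosine coefficients and the check_against_direct helper are all my assumptions, so this is not the Catapult source and not the bit-accurate HEVC integer transform.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

template <std::size_t N>
using Vec = std::array<double, N>;

// Hypothetical core class: one definition serves every power-of-two size.
template <std::size_t N>
struct IdctCore {
    static_assert(N >= 2 && (N & (N - 1)) == 0, "N must be a power of two");
    static constexpr std::size_t H = N / 2;

    // Odd matrix: maps the N/2 odd-indexed coefficients to N/2 partial sums.
    static Vec<H> mult_odd(const Vec<H>& odd) {
        const double pi = std::acos(-1.0);
        Vec<H> o{};
        for (std::size_t n = 0; n < H; ++n)
            for (std::size_t m = 0; m < H; ++m)
                o[n] += odd[m] * std::cos(pi * static_cast<double>((2 * n + 1) * (2 * m + 1)) /
                                          (2.0 * static_cast<double>(N)));
        return o;
    }

    // Butterfly: sums give the first half of the outputs, differences the mirrored half.
    static Vec<N> butterfly(const Vec<H>& e, const Vec<H>& o) {
        Vec<N> x{};
        for (std::size_t n = 0; n < H; ++n) {
            x[n] = e[n] + o[n];
            x[N - 1 - n] = e[n] - o[n];
        }
        return x;
    }

    // Even-indexed inputs go to the half-size core, odd-indexed ones to mult_odd.
    static Vec<N> run(const Vec<N>& X) {
        Vec<H> even{}, odd{};
        for (std::size_t m = 0; m < H; ++m) {
            even[m] = X[2 * m];
            odd[m] = X[2 * m + 1];
        }
        return butterfly(IdctCore<H>::run(even), mult_odd(odd));
    }
};

// The recursion ends at a single point: only the DC term, weighted by 1/sqrt(2).
template <>
struct IdctCore<1> {
    static Vec<1> run(const Vec<1>& X) { return {X[0] * std::sqrt(0.5)}; }
};

// Direct matrix-style IDCT with the same scaling, used as a reference model.
template <std::size_t N>
Vec<N> idct_direct(const Vec<N>& X) {
    const double pi = std::acos(-1.0);
    Vec<N> x{};
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t k = 0; k < N; ++k)
            x[n] += (k == 0 ? std::sqrt(0.5) : 1.0) * X[k] *
                    std::cos(pi * static_cast<double>((2 * n + 1) * k) /
                             (2.0 * static_cast<double>(N)));
    return x;
}

// Asserts that the decomposed core agrees with the direct reference.
template <std::size_t N>
void check_against_direct(const Vec<N>& X) {
    const Vec<N> a = IdctCore<N>::run(X);
    const Vec<N> b = idct_direct<N>(X);
    for (std::size_t n = 0; n < N; ++n) assert(std::fabs(a[n] - b[n]) < 1e-9);
}
```

Instantiating IdctCore<4>, IdctCore<8>, IdctCore<16> or IdctCore<32> picks up the same butterfly and mult_odd bodies, and every loop bound is a compile-time constant, which is the kind of code shape an HLS tool can unroll.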

The Butterfly function is common to all sizes. Notice that there is no timing information at this level; the HLS tool, Catapult, will unroll the loop to create hardware for parallel execution.

Our functional model of the 1-D IDCT has instances of function calls and some muxes.

To meet the H.265 specification, we have to make a parallel implementation and create a 2-D IDCT using some hierarchy.

Using HLS
Designers use the HLS tool Catapult by adding design files, clicking on the hierarchy tab to select the top-level blocks, then clicking on libraries to select a specific technology and RAM models. Next you click on mapping and choose a target clock frequency, then map your data_in and data_out as RAM.

You next select your main loop and see which resources are being used in the design.

To schedule when operations are to occur you click on the schedule tab and work with a Gantt chart. Finally, you are ready to generate RTL code.

Verification
To double-check that the generated RTL code actually does what we had in mind with our algorithm, we need to create a testbench and verification flow. Most of this process is now push-button automated for us.

The transactors are what convert function calls into pin-level signal activity.
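To give a feel for what a transactor does, here is a toy C++ sketch under my own assumptions: a hypothetical input port with one valid bit and one 16-bit data word per clock cycle. PinCycle, drive_pins and sample_pins are illustrative names, and the testbench that Catapult generates is certainly more involved than this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One clock cycle on a hypothetical valid/data port.
struct PinCycle {
    bool valid;
    std::int16_t data;
};

// Input transactor: turn the arguments of a function call into per-cycle pin values.
std::vector<PinCycle> drive_pins(const std::vector<std::int16_t>& args) {
    std::vector<PinCycle> wave;
    wave.push_back({false, 0});                 // idle cycle before the transfer
    for (std::int16_t v : args) wave.push_back({true, v});
    wave.push_back({false, 0});                 // idle cycle after the transfer
    assert(wave.size() == args.size() + 2);     // one cycle per word plus two idles
    return wave;
}

// Output transactor: rebuild the function-level result from pin activity
// by collecting every word seen while valid is high.
std::vector<std::int16_t> sample_pins(const std::vector<PinCycle>& wave) {
    std::vector<std::int16_t> out;
    for (const PinCycle& c : wave)
        if (c.valid) out.push_back(c.data);
    return out;
}
```

In a real flow the pin-level waveform would pass through the RTL simulation before being sampled, and the sampled result would be compared against the output of the original C++ function.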

Summary
The tutorial from DAC showed me that C++ and SystemC describe my video hardware more compactly than RTL code. The Catapult HLS tool is used to control micro-architectural decisions so that I can trade off power, performance and area.

Companies like Google have found that using HLS on their VP9 video compression design was 2X faster than the previous approach of hand-coding RTL, while dramatically reducing the number of lines written. Give the folks at Calypto a call to start discussing how appropriate HLS is for your hardware architecture. You may just find that you can get your next IP or SoC to market in less time with fewer engineers, a nice benefit.

