Sonics’ New NoC
by Paul McLellan on 06-23-2015 at 7:00 am

Today Sonics announced the latest version of its network-on-chip (NoC) technology, SonicsGN-3.0. As with any new release there are many improvements of interest mainly to existing users, but the headline capability is the expanded interleaved memory technology (IMT). IMT was first introduced in SonicsSX in 2008 and has been in wide customer use ever since.

The blocks on an SoC often communicate through external DRAM. Bandwidth requirements tend to go up faster than DRAM performance, so the solution is to use more pins and wider memories. But with modern memory technologies such as DDR3/4 the access is burst oriented, and it is really only efficient to use an 8-word burst. That might deliver 256 bytes of which only a few are useful, while the next access has to wait for its own burst. The solution is to interleave the memory: instead of one memory with four banks, use two channels of two banks each. The challenge then is to make sure that memory accesses are balanced so that both channels get used heavily. If accesses go mostly to the first channel, it becomes a bottleneck and the bandwidth of the second channel goes to waste.
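
As a rough worked example of the waste (the 256-byte figure above corresponds to an 8-beat burst on a 256-bit-wide interface; the specific numbers here are illustrative assumptions, not from Sonics):

```python
# Back-of-the-envelope burst efficiency: a wide DDR3/4 interface fetches a full
# 8-beat burst even when the requester only needs a few bytes of it.
bus_width_bits = 256        # assumed wide interface ("more pins and wider memories")
burst_length = 8            # DDR3/4 is only efficient with 8-word bursts
bytes_per_burst = bus_width_bits // 8 * burst_length   # = 256 bytes per access

useful_bytes = 16           # assume the requester only needs a small structure
print(f"fetched {bytes_per_burst} bytes, used {useful_bytes}: "
      f"{useful_bytes / bytes_per_burst:.2%} efficient")
# fetched 256 bytes, used 16: 6.25% efficient
```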

In a PC or Mac this is fairly straightforward to arrange: memory alternates between the two channels, which is why, if you upgrade your memory, the DIMMs come in pairs. Alternating blocks of 64 bytes are stored first in one channel and then in the other. Since most data structures the microprocessor manipulates are larger than this, activity statistically gets balanced across the two channels. However, this does not work for memories that are being used, say, by the video subsystem on an SoC.
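
A minimal sketch of that fixed-granularity interleaving, assuming the 64-byte block size and two channels of the PC example (illustrative only, not Sonics' scheme):

```python
# Fixed-granularity channel interleaving: consecutive 64-byte blocks of the
# physical address space alternate between the two channels.
INTERLEAVE_BYTES = 64       # block size, as in the dual-channel PC example
NUM_CHANNELS = 2

def channel_for(address: int) -> int:
    """Return the channel that owns this physical address."""
    return (address // INTERLEAVE_BYTES) % NUM_CHANNELS

# Any structure larger than 64 bytes automatically spans both channels:
for addr in range(0x1000, 0x1100, 64):
    print(hex(addr), "-> channel", channel_for(addr))
# 0x1000 -> channel 0, 0x1040 -> channel 1, 0x1080 -> channel 0, 0x10c0 -> channel 1
```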

As Drew Wingard, the CTO of Sonics, points out: SoC designers were early in the move to multi-channel memory architectures. Use of multi-channel DRAMs and memory sub-systems is now pervasive in SoC designs, for example, in mobile applications where maximizing memory throughput takes precedence over increasing memory capacity.


In SoC designs with multi-channel DRAM subsystems, the processor speed and the total number of processors have grown to the point that the memory bandwidth bottleneck is the biggest problem. The purpose of using multi-channel memory is to exploit parallel access to memory, while avoiding bandwidth loss due to wide DRAM data buses. By breaking data into smaller word sizes and interleaving the transactions across multiple channels, IMT enables designers to improve concurrency and overall throughput by up to 20 percent using the same external DRAMs.

To be manageable, all this needs to happen transparently to software's view of the address space (just as on your laptop you did not need to know about the interleaving to write code). The traffic needs to be split for delivery to the correct channel. Doing this in the memory controller causes a bottleneck at the arbiter and does not really scale past two channels. IMT, by contrast, is a fully distributed architecture that scales well. Throughput is maximized since the network automatically overlaps channel accesses, and because IMT isolates the channels from the IP cores (including processors), the interleaving is transparent to software and to other hardware.
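
The idea behind that splitting can be sketched in a few lines: a request whose address range crosses an interleave boundary is broken into per-channel chunks that can proceed in parallel. This is a generic illustration of transaction interleaving, not Sonics' IMT implementation:

```python
# Split one logical transaction into smaller chunks and route each chunk to the
# channel that owns its addresses, so the channels can work in parallel.
INTERLEAVE_BYTES = 64
NUM_CHANNELS = 2

def split_transaction(address: int, length: int):
    """Yield (channel, address, size) chunks for one request."""
    end = address + length
    while address < end:
        block = address // INTERLEAVE_BYTES
        channel = block % NUM_CHANNELS
        # bytes left in this interleave block, capped at the end of the request
        chunk = min((block + 1) * INTERLEAVE_BYTES - address, end - address)
        yield channel, address, chunk
        address += chunk

for ch, addr, size in split_transaction(0x2030, 160):
    print(f"channel {ch}: {size:3d} bytes at {hex(addr)}")
```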


Another related concurrency-management problem is the ordering of requests that may be simultaneously outstanding to multiple targets, including multi-channel memories. SGN's flexible reordering buffer architecture enables a single initiator agent to have transactions outstanding to an arbitrary collection of targets, including memory channels, while respecting the protocol-defined ordering.
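
Conceptually, a reordering buffer lets responses come back from the targets in any order while the initiator still sees them in the order it issued them. A toy sketch of that behavior (not SonicsGN's actual microarchitecture):

```python
# A reordering buffer: requests are tagged in issue order; responses that
# arrive early are held back until all earlier responses have been delivered.
from collections import deque

class ReorderBuffer:
    def __init__(self):
        self.pending = deque()   # tags in issue order
        self.responses = {}      # tag -> response data held out of order

    def issue(self, tag):
        self.pending.append(tag)

    def complete(self, tag, data):
        """Record a response; return any responses now deliverable in order."""
        self.responses[tag] = data
        delivered = []
        while self.pending and self.pending[0] in self.responses:
            delivered.append(self.responses.pop(self.pending.popleft()))
        return delivered

rob = ReorderBuffer()
for tag in ("A", "B", "C"):        # three requests to different targets
    rob.issue(tag)
print(rob.complete("B", "data-B")) # [] : A is still outstanding
print(rob.complete("A", "data-A")) # ['data-A', 'data-B']
print(rob.complete("C", "data-C")) # ['data-C']
```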

The other area with a major upgrade is improved links to physical design. Modern SoCs are often so big and so fast that a signal cannot get across the chip in a single clock cycle, requiring registers to be inserted for re-timing. Obviously these registers need to be physically spread across the chip; it is no good putting all the re-timing registers right by the receiver, for example. But with multiple power and clock domains, a register cannot just be dropped down at an arbitrary location. SGN 3.0 includes new, user-controlled hierarchical partitioning and re-timing stage insertion capabilities that give designers control of which domains are associated with each re-timing stage so that they can match the floorplan. Using these features, designers can better structure their SoC designs and optimize RTL netlist generation for physical layout, which results in fewer iterations in the back-end design flow.
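
To see why domain awareness matters, here is a toy sketch: a long route needs enough re-timing registers that no segment exceeds one cycle's reach, and each register necessarily lands in some clock/power domain region of the floorplan. All names and numbers are illustrative assumptions, not the SGN 3.0 feature itself:

```python
# Spread re-timing registers along a long route and report which floorplan
# region (and hence clock/power domain) each register falls into.
import math

def retiming_stages(path_mm: float, reach_per_cycle_mm: float) -> int:
    """Registers needed so no segment exceeds one clock cycle's reach."""
    return max(0, math.ceil(path_mm / reach_per_cycle_mm) - 1)

def place_stages(path_mm, reach_per_cycle_mm, regions):
    """Place stages evenly; `regions` maps (start_mm, end_mm) -> domain name."""
    n = retiming_stages(path_mm, reach_per_cycle_mm)
    placements = []
    for i in range(1, n + 1):
        pos = path_mm * i / (n + 1)
        domain = next(d for (lo, hi), d in regions.items() if lo <= pos < hi)
        placements.append((round(pos, 2), domain))
    return placements

# Hypothetical 12 mm route crossing three domains, 3 mm reachable per cycle:
regions = {(0, 4): "CPU_domain", (4, 9): "always_on", (9, 12): "GPU_domain"}
print(place_stages(path_mm=12, reach_per_cycle_mm=3, regions=regions))
# [(3.0, 'CPU_domain'), (6.0, 'always_on'), (9.0, 'GPU_domain')]
```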

The Sonics press release page with full details is here.
