The average ASIC or ASSP today is on the order of 8 to 10M gates, which comfortably accommodates blocks up to and including an ARM Cortex-A9 processor core. Until recently, however, a design that size has been too big for a single FPGA, forcing the RTL model to be partitioned artificially across several FPGAs before it can fit into an FPGA-based prototyping system. After spending a bunch of time integrating verified RTL IP blocks into a single functional design, it seems a bit counter-productive to split it back up to see if it really works at the validation stage. Depending on the skills of the partitioner, the diamond that was a nice RTL design can be reduced to rubble quickly.
That risk has kept many designers from using FPGA-based prototyping for large, fast designs. They opt instead for virtual platform and simulation techniques, which can handle very large models today. Both are good approaches for verifying functional integrity, but more and more designs are unearthing IP issues that only appear when running with faster I/O and real software (which could take WEEKS in a simulation platform). If a design team doesn’t crank things up and stress the RTL at speed, there’s a bigger chance of failure on the first silicon pass, and that can get brutally expensive in time, money and missed markets.
We’ve seen one major development on the FPGA-based prototyping front recently – from Aldec – and we’ve been pre-briefed on a second one coming shortly from a different vendor. (Insert #ICTYBTIHTKY hashtag here. You’ll read it here as soon as we can talk about it.) Let’s dig a bit into why the Aldec approach gets my attention.
We first learned about the Aldec HES-7 in an earlier post from Daniel Payne about a month ago. I’ve been digging through the white paper co-authored by Aldec and Xilinx, looking beyond the headline that the HES-7 system now goes to 96M gates. While that’s an impressively large size, taking advantage of that capacity requires a design to be partitioned across 8 FPGAs in pairs separated by a PCI Express interconnect.
As the Aldec-Xilinx white paper describes, when you partition RTL to fit FPGA-based prototyping environments, you suddenly need to worry a lot about the clock tree, balancing resources between the FPGA partitions, dealing with which part of the logic gets the memory interface and I/O pins, and more. Some of you out there may be very comfortable with your partitioning skills and might have developed a formula that splits your gigantic RTL design reliably into 8 FPGA-sized pieces without side effects – I’d be thrilled to hear of a real-world example we could share here, especially how much effort this takes.
But let’s face it, the reason Chevy puts Corvettes into showrooms is to sell Silverados, so most people can get real work done while they dream of someday needing a lot more horsepower. Not many SoC designers need 96M gates. I’m betting that the vast majority of SoC designers would love to have a 12M-gate platform, running at around 50 MHz in a single FPGA without RTL partitioning. That is exactly the value proposition the Aldec HES-7 XV2000 offers. Insert one ARM Cortex-A9 based design, no partitioning, and a lot less waiting for results.
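For a rough feel of where the partitioning line falls, here is a back-of-the-envelope sketch in Python. It uses only the figures quoted above (96M gates spread across 8 FPGAs, so about 12M gates each) and assumes raw gate count is the only constraint. That is a big simplification: real capacity also depends on utilization limits, routing, memory, clocking and I/O pins, none of which are modeled here. The function name and the 40M-gate design in the loop are hypothetical, chosen purely for illustration.

```python
def fpgas_needed(design_gates: int, gates_per_fpga: int) -> int:
    """Smallest number of FPGAs whose combined raw gate capacity holds the design.

    Assumption: gate count is the only limit (no utilization, routing or I/O margin).
    """
    if design_gates <= 0 or gates_per_fpga <= 0:
        raise ValueError("gate counts must be positive")
    # Integer ceiling division: round up whenever there is a remainder.
    return -(-design_gates // gates_per_fpga)


# Figures from the post: 96M gates total across 8 FPGAs.
TOTAL_GATES = 96_000_000
FPGA_COUNT = 8
GATES_PER_FPGA = TOTAL_GATES // FPGA_COUNT  # 12M gates per FPGA

# 8M and 10M are the "average ASIC" sizes cited earlier; 40M is a made-up example.
for gates in (8_000_000, 10_000_000, 12_000_000, 40_000_000, 96_000_000):
    n = fpgas_needed(gates, GATES_PER_FPGA)
    verdict = "no partitioning" if n == 1 else f"partition across {n} FPGAs"
    print(f"{gates // 1_000_000:>3}M gates: {verdict}")
```

By this crude measure, anything up to 12M gates stays in one device, which is the whole appeal: the typical 8 to 10M gate design never meets a partitioner at all.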
John Blyler recently blogged about an interesting survey asking why designers turn to FPGA prototyping for SoC design. The HW/SW co-verification bar in the chart he shows is huge. He also hints at the issue we talked about earlier: verified IP falling down during validation once it is integrated into a larger design.
What are your thoughts on the state of FPGA-based prototyping? Does the ability to put an entire 12M gate design in a single large FPGA on a prototyping system open up the methodology for more SoC designers? Or does it just push the envelope so larger RTL designs can fit now and partitioning will still be required? Are the results of FPGA prototyping worth the effort of partitioning? Does the ability to validate with real software in a much shorter time offset the investment in the methodology?