It was spring 2010 and I was asked to attend an HLS (High Level Synthesis) meeting. To be honest I cringed, after my bad relationship with Accel DSP and broken promises my heart was all walled up and needed counseling. But my management had a way of making me an offer I could not refuse, like keeping my job. So reluctantly I went. Does your employer do lunch and learns instead of real training? You know what that equals right? A 1/8 pay cut, but let’s play nice.
Anyways after the usual introductions at the meeting they began to get into the meat of the tool. I quickly diverted and asked if we could see the tool in action and move away from the power point and boy did they. First up was a cookie cutter FIR filter but it worked, really! Then they moved into floating point designs etc. This HLS was the greatest thing since sliced bread. I saw its potential and I needed to try it. We all agreed on an evaluation period. Now I am by no means the best coder in the world, but even the best would have a hard time beating the HLS tool with respect to design time, area and latency.
What HLS is not: It is not a coder in a box, thus sit down the software guy and have him designing FPGAs. You need to understand the FPGA, no exceptions or you will have a fat, slow design. The C or its variant will need to be restructured, smartly, thus helping the tool out so it can perform better. It is not a button you press and you have a bit image. I know how program managers think .
I leverage HLS tools in this fashion. I view it as Xilinx Corgen on steroids which are driven by a C file. The speed up in design time is not in the translation from C to VHDL but really is in the simulation domain. You are no longer verifying designs piece by piece using RTL. For example, I design a Beamfomer in C. I compile it and then run ‘a.exe’ and verify that the answer matches the expects. That took about a second. For many PRIs of data that could of taken hours in ModelSim. Catching on? I then bring up the HLS tool and pull in the C file and the tool reports the latency, area, clock frequency etc. From that information I can determine which FPGA to use. I then start using directives to optimize the area / latency by using unrolls and pipeline directives. About an hour later my beamformer is done. I then simulate the RTL at my top level but I already know the math works and the tool took care of the boundary conditions. The goal of this article is by no means a recipe on HLS usage but hopefully entices you to check it out, you won’t be sorry.