
  • Being Intelligent about AI ASICs

    The progression from CPU to GPU, FPGA and then ASIC brings increasing throughput and performance, but at the price of decreasing flexibility and generality. Like most new areas of computing, artificial intelligence (AI) began with implementations based on CPUs and software. As so many other applications have done, it has been moving along this chain toward the optimal trade-off between flexibility and performance. However, unlike cryptocurrency mining, AI relies on constantly changing algorithms and architectures, which are often specific to a given application area.

    [Figure: cpu-asic-evolution.jpg]


    AI, when implemented for ADAS, search, VR, voice recognition, image analysis and so on, calls for unique solutions in each space. The inability of a general-purpose ASIC to handle such a variety of applications well has created a roadblock to the adoption of ASICs for AI, which leaves potentially huge gains in power efficiency and performance on the table. eSilicon just announced an offering called neuASIC that promises to solve this problem by giving developers of AI ASICs much more flexibility and a well-thought-out implementation methodology. Before the official announcement at the Machine Learning and AI Developer's Conference, I had a chance to talk with eSilicon's Carlos Macián, Sr. Director of Innovation, about the new approach.

    The key insight driving this approach is that, while there is a common notion of what an AI ASIC looks like, the parallel processing elements are the part most likely to vary with the application. There is therefore a benefit to dividing the system in two. One part is a chassis consisting of IOs, data and control path interconnect, a CPU for control, and internal and external memory. The other is an array of AI tiles, which contain the neural network processing elements and their local memory. These AI tiles, and the array that connects them, are the elements most likely to change with new system requirements and new applications. Of course, the base elements may need to change too, but in general they will prove more durable and hence more reusable.

    [Figure: esilicon-neuasic.jpg]


    eSilicon is a pioneer in applying 2.5D technology to build efficient, high-performance designs, so it comes as no surprise that they are using this technology in their solution for AI-based systems. They divide the design into two parts: the AI core die and the ASIC Chassis die. The ASIC Chassis contains scratchpad memory, a CPU, a NoC-based scalable control path, 3D data path interconnect, bus interfaces and external memory IOs. The AI core die consists of AI tiles custom designed for the AI application that the system is targeting. Together they go into a package along with HBM to create a full system for AI.
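
To make the split concrete, here is a minimal Python sketch of the partitioning described above. It is not eSilicon's tooling or data model. The class names, tile parameters (`processing_elements`, `local_memory_kib`) and the `hbm_stacks` field are hypothetical, chosen only to show a durable chassis being reused while the AI core die is swapped for a new application.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class AsicChassis:
    """Reusable die: the blocks the article lists for the ASIC Chassis."""
    blocks: Tuple[str, ...] = (
        "scratchpad memory",
        "CPU",
        "NoC-based control path",
        "3D data path interconnect",
        "bus interfaces",
        "external memory IOs",
    )


@dataclass(frozen=True)
class AiTile:
    """Application-specific tile: processing elements plus local memory."""
    name: str                 # hypothetical label
    processing_elements: int  # hypothetical count
    local_memory_kib: int     # hypothetical size


@dataclass(frozen=True)
class AiCoreDie:
    """Custom die: an array of AI tiles for one target application."""
    application: str
    tiles: Tuple[AiTile, ...]


@dataclass(frozen=True)
class Package:
    """Chassis die plus AI core die plus HBM in one package."""
    chassis: AsicChassis
    ai_core: AiCoreDie
    hbm_stacks: int = 1  # hypothetical default


def retarget(package: Package, new_core: AiCoreDie) -> Package:
    """Return a new package that keeps the chassis and swaps the AI core."""
    return replace(package, ai_core=new_core)


# Two illustrative AI core dies built from made-up tile parameters.
vision = AiCoreDie("image analysis",
                   tuple(AiTile("conv", 64, 256) for _ in range(4)))
voice = AiCoreDie("voice recognition",
                  tuple(AiTile("rnn", 32, 512) for _ in range(2)))

first = Package(AsicChassis(), vision)
second = retarget(first, voice)
```

In this sketch only `ai_core` differs between `first` and `second`, which mirrors the claim that the tiles change with the application while the chassis is the more reusable part.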

    Designers are not left on their own to design each of the pieces. eSilicon provides its Chassis Builder software, AI tile cells and the IP for the ASIC Chassis itself. It offers Giga Cells and Mega Cells, which contain full AI subsystems and AI primitives respectively. Together these provide everything needed to design and implement optimized ASICs for AI, with the ability to modify selected portions in future designs as algorithms and approaches change and improve.

    They also have two very interesting pieces of IP that further improve system performance. The first is their Word All Zero Power Saving memory. It offers 1RW operation (1 read or 1 write access per cycle) along with an 80% reduction in power. The power for an all-zero word read/write is 20% lower than for a worst-case (WC) read/write. This is useful with the sparse matrices used in AI. These benefits come with a very nominal 2% overhead (for a 16Kx64 instance) compared to conventional memory. The second is their Pseudo Four Port memory. It uses a foundry 8T bit cell and performs 2 reads and 2 writes per cycle. It can be configured as a 2R2W, 1R2W or 2R1W multiport memory.
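
The reason an all-zero optimization pays off with sparse data can be illustrated with a toy behavioral model. The Python sketch below is not eSilicon's circuit. It assumes a hypothetical design in which each word carries a one-bit "all zero" flag, and an access to a flagged word costs a fraction `zero_cost` of a full access. The value 0.2 used here is purely an illustrative assumption, not a figure taken from the memory's specification, and the one-access-per-cycle (1RW) timing is not modeled.

```python
class ZeroWordMemory:
    """Toy memory model with a per-word 'all zero' flag.

    Assumption: an access to an all-zero word touches only the flag and
    costs `zero_cost` energy units, versus 1.0 for a full array access.
    """

    def __init__(self, depth, zero_cost):
        self.data = [0] * depth
        self.is_zero = [True] * depth
        self.zero_cost = zero_cost
        self.energy = 0.0
        self.accesses = 0

    def _charge(self, zero):
        # Account for one access at the cheap or the full rate.
        self.energy += self.zero_cost if zero else 1.0
        self.accesses += 1

    def write(self, addr, word):
        zero = (word == 0)
        self.is_zero[addr] = zero
        if not zero:
            self.data[addr] = word  # full array write only for nonzero words
        self._charge(zero)

    def read(self, addr):
        zero = self.is_zero[addr]
        self._charge(zero)
        return 0 if zero else self.data[addr]

    def relative_energy(self):
        """Average energy per access, relative to a conventional memory."""
        return self.energy / self.accesses


# A sparse weight vector: 6 of the 8 words are zero.
weights = [0, 0, 7, 0, 0, 0, 3, 0]

mem = ZeroWordMemory(depth=len(weights), zero_cost=0.2)  # 0.2 is assumed
for addr, word in enumerate(weights):
    mem.write(addr, word)
readback = [mem.read(addr) for addr in range(len(weights))]

print(readback == weights)               # True
print(round(mem.relative_energy(), 3))   # 0.4
```

With 75% zero words and the assumed cost ratio, the average access energy drops to about 40% of the conventional figure. The saving scales with sparsity, which is why this kind of memory suits sparse AI matrices.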

    [Figure: neuasic-platform.jpg]


    As AI becomes more prevalent in silicon design, we are seeing an explosion of innovation in system implementation. Nobody is standing still (witness the GPU companies adapting their silicon to meet AI market needs), but it's interesting to see several technologies, such as 2.5D, advanced data and control paths, and AI tile design, brought together to deliver flexibility, improved turnaround time and better performance. eSilicon has a large team working on AI IP and system integration, and the results look impressive. For more information on eSilicon's neuASIC, look at their website.