Altera FPGAs Accelerate Microsoft Neural Network Engine

23 February 2015

Altera Corp has said that Microsoft is using its Arria 10 FPGAs to host convolutional neural network (CNN) algorithms for such tasks as image classification, image recognition and natural language processing within data centers.

Altera said that Microsoft researchers have been using samples of Arria 10 FPGAs and achieving performance of 40 GFLOPS per watt, about three times the performance per watt achieved when running CNNs on general-purpose GPUs. Microsoft has been using OpenCL or VHDL to program the Arria 10 FPGAs, which include on-chip floating-point DSP blocks.
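To put the efficiency claim in concrete terms, the short sketch below works through the arithmetic. It uses only the two figures quoted above (40 GFLOPS per watt and a ratio of about three); the 1,000-GFLOPS workload and the function names are hypothetical, chosen purely for illustration.

```python
# Figures quoted in the article.
FPGA_GFLOPS_PER_WATT = 40.0   # Arria 10 engineering samples
FPGA_TO_GPU_RATIO = 3.0       # "about three times" the GPU figure

# Hypothetical sustained workload, used only for illustration.
WORKLOAD_GFLOPS = 1000.0


def implied_gpu_efficiency(fpga_eff: float, ratio: float) -> float:
    """GPU efficiency implied by the FPGA figure and the quoted ratio."""
    return fpga_eff / ratio


def power_for_throughput(gflops: float, gflops_per_watt: float) -> float:
    """Watts needed to sustain a given throughput at a given efficiency."""
    return gflops / gflops_per_watt


gpu_eff = implied_gpu_efficiency(FPGA_GFLOPS_PER_WATT, FPGA_TO_GPU_RATIO)
fpga_watts = power_for_throughput(WORKLOAD_GFLOPS, FPGA_GFLOPS_PER_WATT)
gpu_watts = power_for_throughput(WORKLOAD_GFLOPS, gpu_eff)

print(f"Implied GPU efficiency: {gpu_eff:.1f} GFLOPS/W")
print(f"Power for {WORKLOAD_GFLOPS:.0f} GFLOPS: FPGA {fpga_watts:.0f} W, GPU {gpu_watts:.0f} W")
```

On these assumptions the quoted ratio implies roughly 13 GFLOPS per watt for the GPU, so the same workload would draw about three times the power.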

Microsoft has been working with Altera for some time on the use of FPGAs to produce more energy-efficient hardware for the software-defined data center (see Microsoft, Bing Bet on Programmable Logic for Servers). That work was the Catapult project, which demonstrated that FPGAs in the data center could accelerate Bing ranking by a factor of nearly two. Microsoft Research has now developed a high-throughput CNN FPGA accelerator that achieves excellent performance while consuming a small fraction of server power.

Top-level architecture of convolutional neural network accelerator. Source: Microsoft Research.

According to a research paper from Microsoft Research, its CNN FPGA accelerator engine is characterized by three features: (1) a software-configurable engine that can support multiple layer configurations at run time (without requiring hardware recompilation), (2) an efficient data-buffering scheme and on-chip redistribution network that minimize traffic to off-chip memory, and (3) a spatially distributed array of processing elements (PEs) that can be scaled easily to thousands of units.
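The first of these features, run-time configurability, can be illustrated in software. The following is a minimal sketch and not Microsoft's design: a single convolution routine whose kernel size and stride come from a configuration record supplied at call time, so the same engine serves differently shaped layers without being rebuilt. The names LayerConfig and conv2d are hypothetical.

```python
from dataclasses import dataclass
from typing import List

Matrix = List[List[float]]


@dataclass(frozen=True)
class LayerConfig:
    """Hypothetical per-layer settings, supplied at run time."""
    kernel_size: int
    stride: int


def conv2d(image: Matrix, kernel: Matrix, cfg: LayerConfig) -> Matrix:
    """Valid (no padding) 2-D convolution in the CNN sense (no kernel flip)."""
    k, s = cfg.kernel_size, cfg.stride
    out_h = (len(image) - k) // s + 1
    out_w = (len(image[0]) - k) // s + 1
    out: Matrix = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # In hardware, each multiply-accumulate would map to a PE.
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    acc += image[i * s + di][j * s + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


# A 4x4 input holding the values 0..15, row by row.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]

# Two layer shapes run through the same routine, selected by config only.
out_a = conv2d(image, [[1.0] * 2 for _ in range(2)], LayerConfig(2, 2))
out_b = conv2d(image, [[1.0] * 3 for _ in range(3)], LayerConfig(3, 1))

print(out_a)  # [[10.0, 18.0], [42.0, 50.0]]
print(out_b)  # [[45.0, 54.0], [81.0, 90.0]]
```

The point of the sketch is only that the layer shape is data rather than code; the buffering scheme and the PE array described in the paper are not modeled here.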

Microsoft's efforts in machine learning can be applied to services such as Bing, Cortana, OneDrive, Skype Translator, and Microsoft Band.

"We are seeing a significant leap forward in CNN performance and power efficiency with Arria 10 engineering samples," said Doug Burger, director of client and cloud apps at Microsoft Research.

"The FPGA has an architectural advantage for neural algorithms with the ability to convolve and do pooling very efficiently with a flexible data path which enables many OpenCL kernels to pass data directly to each other without having to go to external memory," said Michael Strickland, director of the compute and storage business unit at Altera. "Arria 10 has an additional architectural advantage of supporting hard floating point for both multiplication and addition – this hard floating point enables more logic and a faster clock speed than traditional FPGA products."
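The idea of kernels passing data directly to one another can be sketched in software with chained generators. This is an analogy under stated assumptions, not OpenCL and not Altera's implementation: each stage consumes values as the previous stage produces them, so no full intermediate result is ever stored. The function names and the one-dimensional data are hypothetical simplifications.

```python
from typing import Iterable, Iterator, List


def conv1d_stream(samples: Iterable[float], taps: List[float]) -> Iterator[float]:
    """Sliding-window weighted sum (CNN-style, no tap flip), one output at a time."""
    window: List[float] = []
    for x in samples:
        window.append(x)
        if len(window) > len(taps):
            window.pop(0)  # keep only the most recent len(taps) samples
        if len(window) == len(taps):
            yield sum(w * t for w, t in zip(window, taps))


def max_pool_stream(values: Iterable[float], size: int) -> Iterator[float]:
    """Non-overlapping max pooling over groups of `size` incoming values."""
    group: List[float] = []
    for v in values:
        group.append(v)
        if len(group) == size:
            yield max(group)
            group = []


samples = [1.0, 3.0, 2.0, 5.0, 4.0]
taps = [1.0, 1.0]

# The pooling stage pulls directly from the convolution stage.
pooled = list(max_pool_stream(conv1d_stream(samples, taps), 2))
print(pooled)  # [5.0, 9.0]
```

Each stage holds only a small local window, which loosely mirrors keeping data on chip instead of writing intermediate results to external memory.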

Related links and articles:

More on convolutional neural networks

IHS MCU and MPU research

News articles:

Ceva Has Vision For Neural Network Processing

Microsoft, Bing Bet on Programmable Logic for Servers

IBM Seeks Customers For Neural Network Breakthrough

"BrainCard" Maker Project Marks Startup's Market Entry

Intel Follows Qualcomm Down Neural Network Path
