Neural networks have enabled many recent advances in artificial intelligence systems like speech- and face-recognition programs. These artificial systems are loosely modeled on biological brains, representing neurons and synapses (the junctions between neurons) as a large network of nodes linked by connections of varying weights.
Neural networks learn to perform tasks by training on large sets of example data. During training, a network adjusts its internal weights until it produces correct solutions to the problems in the training data; those learned weights then let it find solutions to new problems.
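To make that idea concrete, here is a minimal sketch in Python of the weight-adjustment process for a single weight and a single training example. It is purely illustrative and is not the training method or code used in the work described here.

```python
# Illustrative sketch only: nudge one weight so the output of a single
# node moves closer to the known answer for one training example.
def train_step(weight, x, target, learning_rate=0.1):
    prediction = weight * x                      # the node's output for this example
    error = prediction - target                  # how far it is from the correct solution
    return weight - learning_rate * error * x    # adjust the weight to shrink the error

w = 0.0
for _ in range(50):                              # repeated passes over the training example
    w = train_step(w, x=2.0, target=6.0)
print(round(w, 3))                               # approaches 3.0, since 3.0 * 2.0 = 6.0
```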
This process is computationally intensive and consumes a lot of energy, which makes it impractical to run neural networks directly on handheld devices like smartphones. Instead, such devices currently send data over the internet to be processed on remote servers in the cloud before receiving the results.
A new special-purpose chip developed by researchers at MIT slashes neural network power consumption by 94 to 95 percent, while increasing computation speed by three to seven times compared to older processors.
The chip accomplishes this by streamlining the way neural network computations are carried out to more closely mimic a real brain. Equipped with the new chip, battery-powered devices like smartphones could run neural networks locally, opening up new possibilities for enhancing apps with artificial intelligence.
Connection Weights
Neural networks contain many processing nodes, each receiving data from several nodes in the layer below it and passing data on to several nodes in the layer above it. Each connection between nodes is assigned a weight that determines how much one node's output contributes to the computation of the node that receives it.
As the network processes data, it carries out an enormous number of dot-product calculations: each node multiplies every input by the corresponding connection weight and sums the results.
On conventional hardware, this operation involves retrieving an input and its associated weight from memory, multiplying the two values, storing the result, and repeating the operation for every input of every node. With potentially many millions of nodes in a network, this process becomes computationally and energy intensive.
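As a rough illustration of that step-by-step computation (a sketch in Python, not the researchers' code), the multiply-accumulate loop for a single node looks like this:

```python
# Conventional fetch-multiply-accumulate for one node: each input/weight pair
# is fetched, multiplied, and added to a running total, one step at a time.
def node_output(inputs, weights):
    total = 0.0
    for x, w in zip(inputs, weights):   # one input and its weight per step
        total += x * w                  # multiply, then accumulate
    return total

# Example: a node with three inputs
print(node_output([0.5, -1.0, 2.0], [0.2, 0.4, -0.1]))  # 0.1 - 0.4 - 0.2 = -0.5
```

Repeating this loop for every node in every layer is what drives the heavy memory traffic described above.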
Improved Efficiency
The new chip improves efficiency by performing dot product operations for up to 16 nodes at a time without transferring data between a processor and memory for every calculation.
The chip simply converts each node’s input values into voltages, multiplies them by their corresponding weights, and converts only the combined voltages back to digital values to be stored for further processing.
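The following is a loose software analogy of that data flow, assuming for illustration 16 nodes that share 8 inputs; it models only the arithmetic, not the chip’s analog circuitry or its actual interfaces.

```python
import numpy as np

# Loose analogy: the inputs are converted to analog "voltages" once, the
# weighted sums for all 16 nodes accumulate at the same time, and only the
# 16 combined values are converted back to digital form, rather than one
# memory round-trip per multiplication.
def in_memory_dot_products(digital_inputs, weight_matrix):
    voltages = np.asarray(digital_inputs, dtype=float)  # one conversion per input value
    combined = weight_matrix @ voltages                  # weighted sums formed together
    return combined                                      # one conversion back per node output

rng = np.random.default_rng(0)
inputs = rng.random(8)                          # 8 input values
weights = rng.standard_normal((16, 8))          # weights stored alongside the memory
print(in_memory_dot_products(inputs, weights))  # 16 node outputs
```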
“The general processor model is that there is a memory in some part of the chip, and there is a processor in another part of the chip, and you move the data back and forth between them when you do these computations,” said Avishek Biswas, an MIT graduate student in electrical engineering and computer science, who led the new chip’s development under thesis advisor Anantha Chandrakasan, dean of MIT’s School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science.
“Since these machine-learning algorithms need so many computations, this transferring back and forth of data is the dominant portion of the energy consumption. But the computation these algorithms do can be simplified to one specific operation, called the dot product. Our approach was, can we implement this dot-product functionality inside the memory so that you don’t need to transfer this data back and forth?”
Accurate Solutions
A crucial aspect of the chip is that it represents all weights as either 1 or -1, allowing them to be implemented in memory as simple switches that either close or open a circuit. When the researchers compared this approach with a conventional neural network using a full range of weights, the new method’s results were within 2 to 3 percent of the traditional one’s.
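A small Python sketch of the binary-weight idea follows. It is illustrative only: the sign-based binarization and the optional scaling factor are common tricks with binarized weights, not details drawn from the article.

```python
import numpy as np

# Replace each real-valued weight with its sign (+1 or -1), which can be
# stored as a single open/closed switch, and compare the resulting dot
# product against the full-precision version.
rng = np.random.default_rng(1)
inputs = rng.standard_normal(100)
full_weights = rng.standard_normal(100)

binary_weights = np.sign(full_weights)     # every weight becomes +1 or -1
scale = np.mean(np.abs(full_weights))      # assumed scaling factor, a common companion to binary weights

full = inputs @ full_weights
approx = scale * (inputs @ binary_weights)
print(f"full precision: {full:.3f}, binary approximation: {approx:.3f}")
```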
"This is a promising real-world demonstration of SRAM-based in-memory analog computing for deep-learning applications,” said Dario Gil, vice president of artificial intelligence at IBM. "The results show impressive specifications for the energy-efficient implementation of convolution operations with memory arrays. It certainly will open the possibility to employ more complex convolutional neural networks for image and video classifications in IoT [the internet of things] in the future."