Leverage physical systems to perform computations more efficiently
Like many historical developments in artificial intelligence, the widespread adoption of deep neural networks (DNNs) has been made possible in part by synergistic hardware. In 2012, building on previous work, Krizhevsky et al. showed that the backpropagation algorithm could be run efficiently with graphics processing units to train large DNNs for image classification. Since 2012, the computational requirements of DNN models have increased rapidly, surpassing Moore’s Law. Now, DNNs are increasingly limited by hardware power efficiency.
The emerging problem of DNN energy has inspired specialized hardware: DNN “accelerators”, most of which are based on a direct mathematical isomorphism between hardware physics and mathematical operations in DNNs. Several accelerator proposals use physical systems beyond conventional electronics, such as optics and analog electronic crossbar networks. Most devices target the inference phase of deep learning, which accounts for up to 90% of deep learning energy costs in commercial deployments, although increasingly devices also address the training phase.
However, implementing trained mathematical transformations by designing hardware for strict mathematical isomorphism, operation by operation, is not the only way to perform effective machine learning. Instead, one can directly train the physical transformations of the hardware to perform the desired calculations.
Cornell researchers have found a way to train physical systems, ranging from computer speakers and lasers to simple electronic circuits, to perform machine learning calculations, such as identifying handwritten numbers and spoken vowel sounds.
By turning these physical systems into the same kind of neural networks that drive services like Google Translate and online searches, researchers have demonstrated an early but viable alternative to conventional electronic processors – one that has the potential to be orders of magnitude faster and more power-efficient than the power-hungry chips in data centers and server farms that support many AI applications.
“Many different physical systems have enough complexity to be able to perform a wide range of calculations,” said Peter McMahon, assistant professor of applied and engineering physics in the College of Engineering, who led the project. “The systems we’ve demonstrated with look nothing alike, and they seem to have nothing to do with recognizing handwritten digits or classifying vowels, yet you can train them to do that.”
For this project, the researchers focused on one type of computation: machine learning. The goal was to discover how to use different physical systems to perform machine learning in a generic way that can be applied to any system. The researchers developed a training procedure that allowed demonstrations with three different types of physical systems – mechanical, optical and electrical. All it took was a little tweaking and a suspension of disbelief.
“Artificial neural networks work mathematically by applying a series of parameterized functions to input data. The dynamics of a physical system can also be thought of as the application of a function to the input of data into that physical system,” McMahon said. “This mathematical connection between neural networks and physics is, in a way, what makes our approach possible, even though the idea of creating neural networks from unusual physical systems may at first seem really ridiculous.”
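The connection McMahon describes can be illustrated with a small sketch. It is not the researchers' code: the damped-oscillator simulation, the function names, and the choice to add the trainable parameters to the drive signal are all assumptions made for illustration. The only point is that a physical response can be treated as a parameterized function, and that several such functions can be applied in series like the layers of a neural network.

```python
def oscillator_response(drive, damping=0.2, stiffness=1.0, dt=0.1):
    """Toy stand-in for a physical system: a damped oscillator driven by a time series.

    Hypothetical simulation for illustration; a real experiment would send the
    drive to hardware and record the measured response instead.
    """
    position, velocity = 0.0, 0.0
    response = []
    for force in drive:
        acceleration = force - damping * velocity - stiffness * position
        velocity += acceleration * dt
        position += velocity * dt
        response.append(position)
    return response


def physical_layer(inputs, params):
    """One 'layer': output = f(inputs, params), where f is the system's dynamics.

    Assumption: params has the same length as inputs and is simply added
    to the drive signal.
    """
    drive = [x + p for x, p in zip(inputs, params)]
    return oscillator_response(drive)


def network(inputs, layer_params):
    """Apply a series of parameterized physical functions, one after another."""
    signal = inputs
    for params in layer_params:
        signal = physical_layer(signal, params)
    return signal


if __name__ == "__main__":
    data = [1.0, 0.0, 0.0, 0.0]
    params = [[0.5, 0.5, 0.5, 0.5], [0.0, 0.0, 0.0, 0.0]]
    print(network(data, params))
```

Changing the values in `params` changes the output for the same input data, which is the property that training relies on.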
The researchers placed a titanium plate over a commercially available loudspeaker for the mechanical system, creating what is known in physics as a driven multi-mode mechanical oscillator. The optical system consisted of a laser beam shone through a nonlinear crystal that converted the colors of incoming light into new colors by combining pairs of photons. The third experiment used a small electronic circuit with only four components – a resistor, a capacitor, an inductor and a transistor – of the type that a middle school student might assemble in science class.
In each experiment, the pixels of an image of a handwritten number were encoded in a pulse of light or an electrical voltage that was fed into the system. The system processed the information and gave its output as a similar type of optical pulse or voltage. Most importantly, the systems had to be trained before they could perform the proper processing. So the researchers changed specific input parameters and ran multiple samples – such as different numbers in different handwriting – through the physical system, then used a laptop computer to determine how the parameters should be adjusted to get the greatest accuracy for the task. This hybrid approach leveraged the standard training algorithm of conventional artificial neural networks, called backpropagation, in a way that is resistant to noise and experimental imperfections.
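The hybrid loop described above can be sketched in a few lines. This is a minimal toy under stated assumptions, not the researchers' released code: the "physical system" is a simulated function with added noise and an attenuation the digital model does not know about, and every name and constant (`physical_system`, `model_gradient`, `train`, the learning rate, the target setting) is hypothetical. The outputs come from the (simulated) hardware, while the parameter updates are computed on a computer from an approximate, differentiable model.

```python
import math
import random


def physical_system(x, theta, rng):
    """Hypothetical 'hardware': attenuated and noisy, so the digital model is imperfect."""
    return 0.9 * math.tanh(theta * x) + rng.gauss(0.0, 0.01)


def model_gradient(x, theta):
    """d(output)/d(theta) taken from an idealized digital model, tanh(theta * x)."""
    return x * (1.0 - math.tanh(theta * x) ** 2)


def train(steps=2000, lr=0.2, seed=0):
    rng = random.Random(seed)
    theta = 0.2          # controllable parameter, arbitrary starting value
    target_theta = 1.5   # assumed setting that yields the desired behavior
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)                     # one training sample
        target = 0.9 * math.tanh(target_theta * x)     # desired output for this sample
        y = physical_system(x, theta, rng)             # forward pass: measured output
        error = y - target                             # error uses the measurement
        grad = 2.0 * error * model_gradient(x, theta)  # backward pass: digital model only
        theta -= lr * grad                             # adjust the physical parameter
    return theta


if __name__ == "__main__":
    print("trained parameter:", train())
```

In this toy, the update uses the measured error, so the model's gradient only needs to point in roughly the right direction. That is one way to see how such a scheme can tolerate noise and a mismatch between model and hardware.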
The researchers were able to train the optical system to classify handwritten numbers with 97% accuracy. Although this accuracy is below the state of the art for conventional neural networks running on a standard electronic processor, the experiment shows that even a very simple physical system, with no obvious connection to conventional neural networks, can be taught to perform machine learning and could potentially do so much faster and using much less energy than conventional electronic neural networks.
The optical system has also been successfully trained to recognize spoken vowel sounds.
The researchers have posted their Physics-Aware-Training code online so that others can turn their own physical systems into neural networks. The training algorithm is generic enough to be applied to almost any such system, even fluids or exotic materials, and various systems can be chained together to exploit the most useful processing capabilities of each.
“It turns out you can turn just about any physical system into a neural network,” McMahon said. “However, not every physical system will be a good neural network for every task, so there is an important question of which physical systems work best for important machine learning tasks. But now there is a way to try to find out – that’s what my lab is currently pursuing.”
- Logan G. Wright, Tatsuhiro Onodera, Martin M. Stein, Tianyu Wang, Darren T. Schachter, Zoey Hu, and Peter L. McMahon. Deep physical neural networks trained with backpropagation. Nature 601, 549-555 (2022). DOI: 10.1038/s41586-021-04223-6