Review 6: Spiking Neural Nets
- Spiking Neural Networks by Anil Ananthaswamy.
- Citation: Anil Ananthaswamy, “Spiking Neural Networks,” Simons Institute, Dec 13, 2021.
I believe it is immensely understated how critical understanding the dynamics of spiking neural networks (SNNs) is to advancing energy-efficient machine intelligence. This quote from the article sums it up perfectly:
“My main motivation to think about spiking neural networks is because it’s the language the brain speaks.” (computational neuroscientist Friedemann Zenke, qtd. in Ananthaswamy 2021)
Because SNNs are not researched as much as other types of artificial neural networks (ANNs), I really enjoyed the historical perspective Anil presents here. The piece shows how deep ANNs like AlexNet put deep learning on center stage while SNNs have stayed backstage due to a variety of tractability issues. A review of those issues follows:
- Backprop and non-differentiability of SNNs: the first challenge is that spiking neurons typically use discontinuous, discrete functions like the step function, which prevents researchers from performing backprop. See Dr. Sander Bohte’s approach to this problem, and the sketch after this list.
- Memory storage and backprop through time: assuming we can compute gradients (or approximate ones), we now need to deal with the fact that neural networks in the brain show spiking latency at several different temporal scales. This is a problem because encoding neuron states at the scale of milliseconds for an hour of activity is incredibly memory inefficient, especially considering that neurons exhibit “lifetime sparseness” (Quiroga and Kreiman 2016), meaning they don’t actually fire that often in the brain. As a result, we end up encoding a lot of useless information about a neuron’s membrane potential.
- Single-cell activation functions: This issue isn’t directly cited here, but I’d like to use it to group a few issues together. The issue it relates to most in the text is homogeneous vs. heterogeneous cell types: the idea that having neurons that vary from each other, with different firing rules and thresholds, allows the network to encode more information and actually results in improved prediction on temporally structured tasks, such as SNN Audio MNIST from the Zenke Lab. I’d like to add some considerations beyond homogeneity and heterogeneity. There are two other factors not mentioned here that produce heterogeneity. First, I think synapse types are important. Many researchers’ work on excitation and inhibition (E/I) balance in the brain has led to a better understanding of how attractor states might occur (see my previous review). Second, I don’t have many references on hand for this, but any experienced EEG practitioner will tell you how much background activity and local field potentials impact single-neuron firing rates. I think these ideas have been experimented with by SNN researchers, but they haven’t been quite as impactful as the cited results from the Goodman lab, where heterogeneous neuron types produced a 15-20% increase in accuracy on Audio MNIST.
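To ground the first and third issues, here is a minimal numpy sketch of a leaky integrate-and-fire (LIF) layer. It is an illustrative toy, not anything from the article: the forward-Euler update, the constants, and the per-neuron (heterogeneous) time constants are assumptions of mine. The hard threshold is the non-differentiable step function from the first issue, and the spike raster is the mostly-empty state one would have to hold onto for backprop through time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes: 4 neurons simulated for 200 steps at 1 ms resolution (all values illustrative).
n_neurons, n_steps, dt = 4, 200, 1e-3
tau = rng.uniform(5e-3, 50e-3, size=n_neurons)  # heterogeneous membrane time constants
v_thresh, v_reset = 1.0, 0.0

v = np.zeros(n_neurons)                  # membrane potentials
spikes = np.zeros((n_steps, n_neurons))  # mostly-empty state we'd store for backprop through time

for t in range(n_steps):
    input_current = rng.uniform(0.0, 2.0, size=n_neurons)
    # Leaky integration (forward Euler): decay toward rest, accumulate input.
    v += dt / tau * (-v + input_current)
    # Hard threshold: a step function whose derivative is zero almost everywhere,
    # which is exactly what blocks vanilla backprop.
    fired = v >= v_thresh
    spikes[t] = fired
    v[fired] = v_reset  # reset neurons that spiked
```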
Anil also presents an array of ingenious research solutions to these problems, such as EventProp (Dr. Pehle 2021) and Surrogate Gradients (Neftci, Mostafa, and Zenke). These are covered pretty well in the original article. However, looking at this from a broader perspective, these approaches don’t seem to have led to anything that performs so well on MNIST or any other SOTA ML benchmark that researchers enamored with transformers are going to jump ship anytime soon. Looking at this article within the broader context of consolidation in ML/AI, written about by Andrej Karpathy, it seems most ML and AI researchers are focused on ANN-based deep learning and are unlikely to migrate over just because of surrogate gradients. I don’t think this is a bad thing for either focus – in fact, it’s a good thing for both. No matter how open-minded we frame ourselves, most researchers are biased toward their own research direction. We have to keep reinforcing a narrative that studying X is more interesting than studying Y for reasons J, K, and L. This means SNN researchers are either going to have to show that SNNs can be significantly more energy efficient, or, better yet, convincingly more accurate. More likely, the case for SNNs is as a worthwhile tradeoff between accuracy and energy efficiency.
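To make the surrogate-gradient idea concrete, here is a minimal PyTorch sketch, assuming a fast-sigmoid pseudo-derivative in the spirit of the Neftci, Mostafa, and Zenke line of work; the class name and the sharpness constant are my own, not from the article:

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass, smooth pseudo-derivative in the backward pass."""

    scale = 10.0  # sharpness of the surrogate; an arbitrary choice for this sketch

    @staticmethod
    def forward(ctx, membrane_potential):
        ctx.save_for_backward(membrane_potential)
        # Spike wherever the (threshold-shifted) membrane potential is positive.
        return (membrane_potential > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        (membrane_potential,) = ctx.saved_tensors
        # Fast-sigmoid surrogate derivative stands in for the step function's true gradient.
        surrogate = 1.0 / (SurrogateSpike.scale * membrane_potential.abs() + 1.0) ** 2
        return grad_output * surrogate

spike_fn = SurrogateSpike.apply  # drop-in nonlinearity for an otherwise ordinary training loop
```

The forward pass still emits hard, discrete spikes; only the gradient is a convenient fiction, which is what lets standard autodiff machinery train the network end to end.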
As an enthusiast of biologically inspired networks, I find myself rooting for the future promise of SNNs. However, at the end of the day, for a field of machine intelligence that is heavily based on comparison and benchmarks, the world is not so simple. Mean absolute error (MAE) is important, but not all-encompassing. Dr. Tara Hamilton’s presentation at SNUFA 2021 really helped me realize that there are plenty of interesting use cases for SNNs, for example cochlear implants or heart rate monitoring. However, when it comes to self-driving cars, convolutional neural networks (CNNs) remain an obvious choice. This dynamic is actually a really amazing thing. However, as a researcher, I have to go back to the original quote, that SNNs are the language of the brain. For someone interested in reverse engineering the brain, I think it’s hard not to consider SNNs an engaging model problem. My own experience with SNNs is very much at the beginner level, but it’s posts like this one by Anil that set the stage for an exciting future in the field.