Computers are awesome. They let us perform a huge variety of complicated tasks easily and quickly, from crunching complex math to playing video games to writing articles like this one (what, do you think I write these by hand?). Since their inception back in World War 2 with Alan Turing's Enigma-cracking machine, followed by his post-war research, computers have only gotten better over the better part of a century. One of the first electronic digital computers was ENIAC (the Electronic Numerical Integrator and Computer). It was designed during the war to calculate values for artillery range tables, though it wasn't completed until 1945 (there was a long road from there to the technology we now take for granted, like smart toasters and Grande Vegas casino bonuses).
Being fully electronic and programmed by plugging cables into patch panels, it was roughly a thousand times faster than the electromechanical machines of its day, which had to be fed data via punch cards or other physical means (although reprogramming ENIAC by swapping those plugs around could take days of work). The point stands: for the longest time, digital was the way to go.
Since then, everything has slowly become digitized. Every piece of tech you own probably uses digital data in some way, shape, or form. Cell phones are just smaller computers. Your TV plays digital video, text messages are just strings of ones and zeros, and video games render entire worlds out of digital assets.
However, there is one new field of research that’s pushing the limits of what digital technology is capable of handling: Machine Learning Algorithms, more colloquially known as AI.
What Is Machine Learning?
Machine Learning algorithms are one of the most important advancements in computer science and the key behind some of the most sophisticated artificial intelligence systems in use every day. You probably use some without even realizing it. The search results on just about every major website, YouTube included, rely on Machine Learning to serve you better and better results. It's how those search bars can predict what you might be interested in searching for, even if you've only typed in one letter.
A lot of people use AI and Machine Learning interchangeably because of these algorithms' origins: they're actually modeled on the way neurons in our brains work. That idea was outlined as early as 1949 by Donald Hebb in his book "The Organization of Behavior".
Another scientist, Arthur Samuel, built a checkers-playing program that could improve over time; he started work on it in 1952 and went on to coin the term "machine learning" in 1959. Following these two men, we reach the hero of our story, Frank Rosenblatt, who built the Perceptron in 1957. He combined Hebb's and Samuel's work to create an algorithm capable of identifying simple shapes, and he did it by teaching the machine by rote: showing it example after labeled example.
Like the neurons Hebb wrote about, Machine Learning algorithms have their own "neurons". The algorithm takes an input (usually an image), with a neuron corresponding to each pixel. These input neurons activate with a value between 0 and 1, depending on how bright the pixel is. Each input neuron's connection to the output carries its own weight. All of these input neurons then connect to a single output neuron, which fires depending on this formula:
(activation₁ × weight₁) + (activation₂ × weight₂) + … + (activationₙ × weightₙ) > threshold
That may seem complicated, but I promise it's not so bad. It's just a bunch of multiplication and addition. If the weighted sum comes out greater than some chosen threshold, say 6, then the output neuron fires. In short, if the output fires, the algorithm has identified something!
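To make that firing rule concrete, here's a minimal sketch in Python. The pixel activations, weights, and threshold below are made-up numbers for illustration; in a real system the weights would be learned, not hard-coded.

```python
# A minimal sketch of the single-output "neuron" described above.
# All values here are invented purely for illustration.

def output_fires(activations, weights, threshold):
    """Fire (return True) if the weighted sum of the inputs exceeds the threshold."""
    weighted_sum = sum(a * w for a, w in zip(activations, weights))
    return weighted_sum > threshold

# Four "pixels", each with a brightness between 0 and 1...
activations = [0.9, 0.2, 0.8, 0.1]
# ...and one weight per pixel connection.
weights = [2.0, -1.0, 3.0, 0.5]

print(output_fires(activations, weights, threshold=3.0))  # True: 4.05 > 3.0
```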
The trick that makes all of these algorithms so useful is what happens when the algorithm makes a mistake. We compare the algorithm’s answer to the real answer, and if the algorithm is wrong, then we adjust the weights on the pixels.
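As a sketch of that adjustment step, here's the classic perceptron update rule in Python; modern networks use more elaborate methods (backpropagation with gradient descent), but the spirit is the same. The numbers are again made up.

```python
def update_weights(weights, activations, prediction, label, learning_rate=0.1):
    """Nudge each weight when the prediction disagrees with the correct label.

    prediction and label are each 0 or 1; if they match, nothing changes.
    """
    error = label - prediction  # -1, 0, or +1
    return [w + learning_rate * error * a for w, a in zip(weights, activations)]

# The network said "yes" (1) but the true answer was "no" (0), so every weight
# gets nudged down in proportion to how active its pixel was.
new_weights = update_weights([2.0, -1.0, 3.0, 0.5], [0.9, 0.2, 0.8, 0.1],
                             prediction=1, label=0)
print(new_weights)  # roughly [1.91, -1.02, 2.92, 0.49]
```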
The result is an algorithm that improves in accuracy over time, so long as enough correctly labeled images are fed into it. To solve more complicated problems and handle higher-resolution images, we just need to crunch more numbers through more and more neurons, which comes down to some not-very-difficult (but very tedious) matrix multiplication, as sketched below.
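Here's what that "more neurons, more matrix multiplication" step looks like with NumPy; the layer sizes are arbitrary and the numbers are random, just to show the shape of the computation.

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up 28x28 grayscale image, flattened into 784 input activations.
inputs = rng.random(784)

# One fully connected layer: 784 inputs feeding 100 neurons means a
# 100 x 784 weight matrix plus one bias per neuron.
weights = rng.standard_normal((100, 784))
biases = rng.standard_normal(100)

# The whole layer boils down to one matrix multiplication plus a
# threshold-like step (here a ReLU instead of a hard yes/no threshold).
activations = np.maximum(0, weights @ inputs + biases)
print(activations.shape)  # (100,)
```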
The problem is that the more capable you want these algorithms to be, the more numbers there are to crunch, and the count balloons quickly. Like, trillions of mathematical operations, without exaggeration.
The Limits of Digital Technology
Modern computers are pretty damn powerful, all things considered. That said, even modern hardware needs serious time to chew through trillions of calculations, and the more crunching required, the more power the computer draws.
There's a competition called the ImageNet Large Scale Visual Recognition Challenge. Every year, teams design new algorithms that can more accurately identify what's in a photo, right down to telling apart over a hundred breeds of dog. The scale used to judge these algorithms is the top-5 error rate: the percentage of images for which the correct answer was NOT among the algorithm's five best guesses.
In 2010, the winning entry had a top-5 error rate of about 30%, meaning that roughly 30% of the time, the correct label wasn't anywhere in its five best guesses. A breakthrough in 2012 dropped that number to 16.4%! This was done by using a massive algorithm, called AlexNet, consisting of eight layers and around 650,000 neurons. Training it meant updating over 60 million weights and biases, and with all those matrix operations, running a single image through the network works out to over 700 million individual mathematical operations!
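For what it's worth, the top-5 error metric itself is easy to compute. Here's a sketch with two "images" and ten made-up class scores each; everything here is invented just to show the bookkeeping.

```python
import numpy as np

def top5_error(scores, true_labels):
    """Fraction of examples whose true label is NOT among the five highest-scoring guesses."""
    top5 = np.argsort(scores, axis=1)[:, -5:]  # indices of the 5 best guesses per image
    hits = [label in guesses for label, guesses in zip(true_labels, top5)]
    return 1.0 - np.mean(hits)

# Two "images", ten made-up class scores each; the true labels are classes 3 and 7.
scores = np.array([
    [0.1, 0.2, 0.05, 0.9, 0.3, 0.4, 0.01, 0.02, 0.6, 0.5],
    [0.7, 0.1, 0.2, 0.05, 0.3, 0.6, 0.4, 0.02, 0.5, 0.01],
])
print(top5_error(scores, true_labels=[3, 7]))  # 0.5: the second image misses its label
```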
The team managed it by using graphics cards working in sync to perform the calculations; GPUs are designed to run lots and lots of calculations in parallel (they're made for video games, after all, which have to render complex lighting, shaders, and potentially thousands of 3D models at a time). Other competitors followed AlexNet's lead, and in 2015 an AI called ResNet achieved a top-5 error rate of a mere 3.6%, which is better than the average human.
So what’s the problem?
The problem is size and power. This setup uses big gaming-class graphics cards, each drawing between 100 and 250 watts. Training one of these AIs can consume as much electricity as three average households use in a year. The transistors inside these chips are already only a few dozen atoms across, so we can't keep cramming more into the same space forever. On top of that, a lot of time and energy is wasted shuttling data back and forth from memory instead of actually performing computations.
The Analogous Solution
Before we continue, I want to define the key difference between digital and analog systems. Digital works with ones and zeros: every signal is either "on" or "off". Analog systems don't have that limitation; their inputs and outputs can take any value along a continuous range.
We’re reaching some pretty hard limits of digital technology. Digital tech, by its binary nature, has to use all sorts of tricks and ingenious solutions to perform computations more complicated than basic arithmetic.
Analog systems don't have this limitation. If you want to add two numbers together analog-style, you can simply send two currents, whose magnitudes represent the values you want to add, into the same wire, and the combined current flowing out is just the two inputs added together.
This is Kirchhoff's current law (his first law), which you might have learned in high school physics. No binary operations required. Other arithmetic can be done through similar tricks, and an analog system built for the job can tackle far more complex problems, like differential equations, directly, where digital systems can only approximate them (through "cheats" like Taylor series, which reduce difficult computations to long runs of addition and approximation).
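Here's a rough numerical sketch of that idea. If each input is encoded as a voltage and each weight as a conductance, Ohm's law gives each branch a current of conductance × voltage, and Kirchhoff's current law sums those currents at a shared node, which is exactly a multiply-accumulate. This simulates the principle only; it's not how any particular chip is laid out.

```python
import numpy as np

# Inputs encoded as voltages, weights encoded as conductances (1 / resistance).
voltages = np.array([0.3, 0.8, 0.5])      # arbitrary input values
conductances = np.array([2.0, 1.0, 4.0])  # arbitrary "weights"

# Ohm's law per branch: I = G * V, then Kirchhoff's current law sums the
# branch currents where the wires meet.
branch_currents = conductances * voltages
node_current = branch_currents.sum()

# The same multiply-accumulate done the "digital" way:
digital_result = np.dot(conductances, voltages)

print(node_current, digital_result)  # both come out to about 3.4
```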
HOWEVER: while analog is capable of handling lots of these complex computations quickly, it sacrifices accuracy… slightly. Think a difference on the order of 1%. Depending on the use case, that's either a deal-breaker or a non-issue. In the case of Neural Networks, the difference is perfectly acceptable (being 97% confident in an answer instead of 98% works just fine).
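As a toy illustration of why a roughly 1% wobble is tolerable here, you can add that much noise to a network's output scores and check whether the winning answer changes; with the made-up numbers below, it doesn't.

```python
import numpy as np

rng = np.random.default_rng(42)

# A very confident, entirely made-up prediction over five classes.
scores = np.array([0.97, 0.01, 0.01, 0.005, 0.005])

# Perturb every score by about 1%, crudely mimicking analog imprecision.
noisy = scores * (1 + 0.01 * rng.standard_normal(scores.shape))

print(np.argmax(scores) == np.argmax(noisy))  # True: the top answer is unchanged
```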
So, since Neural Networks are essentially massive stacks of matrix multiplication problems, analog systems that can match or beat the best digital hardware on speed and efficiency suddenly look very desirable.
Veritasium (the YouTuber whose video inspired this article) visited a Texas start-up called Mythic AI. They want to make analog chips capable of running these computations. And from the video… they’ve done it. They demonstrate algorithms running live on chips about the size of your thumb, capable of performing 25 trillion computations a second. He explains how these chips work a lot better than I can (and with some snazzy visual aids that I cannot reproduce in a written article).
The CEO claims that his chips perform comparably to the best graphics cards on the market while consuming only about 3 watts of power. For comparison, the best digital systems can do anywhere from 25 to 100 trillion operations per second, but they're big, bulky, expensive, and draw anywhere from 50 to 250 watts.
That said, even that roughly 1% error would add up over trillions of computations, so the results have to be converted back into digital form between stages, then turned back into analog for the next round of calculations, to keep those small errors from compounding.
So is analog the solution for AI systems? Yes and no. Every tool is made for a single purpose. When you need nails nailed, you use a hammer. When you need screws screwed, you use a screwdriver. But when you need a Neural Network to run complex algorithmic processing requiring trillions of mathematical operations every second…?
Digital or analog? The answer is looking like both! They each have their strengths and weaknesses, and, perhaps, the computers of tomorrow (your cell phone, your car, your wristwatch) are all going to be capable of things we can only imagine right now!