The human brain operates with a “grow and prune” strategy, starting off with a massive number of neural connections and then pruning away the unused connections over time. Recently, a team of AI researchers applied this approach to AI systems and found that it could substantially reduce the amount of energy required to train an AI.
A team of researchers from Princeton University recently created a new method of training artificial intelligence systems. This new training method seems able to meet or surpass the industry standards for accuracy, but it’s able to accomplish this while consuming much less computational power, and therefore less energy, than traditional machine learning models. Over the course of two different papers, the Princeton researchers demonstrated how to grow a network by adding neurons and connections to it. The unused connections were then pruned away over time, leaving just the most effective and efficient portions of the model.
Niraj Jha, professor of Electrical Engineering at Princeton, explained to Princeton news that the model developed by the researchers operates on a “grow-and-prune paradigm”. Jha explained that a human’s brain is the most complex it ever will be at around three years of age, and after this point the brain begins trimming away unneeded synaptic connections. The result is that the fully developed brain is able to carry out all the extraordinarily complex tasks we do every day, but it uses about half of all the synapses it had at its peak. Jha and the other researchers mimicked this strategy to enhance the training of AI.
“Our approach is what we call a grow-and-prune paradigm. It’s similar to what a brain does from when we are a baby to when we are a toddler. In its third year, the human brain starts snipping away connections between brain cells. This process continues into adulthood, so that the fully developed brain operates at roughly half its synaptic peak. The adult brain is specialized to whatever training we’ve provided it. It’s not as good for general-purpose learning as a toddler brain.”
Thanks to the growing and pruning technique, equally good predictions can be made about patterns in data using just a fraction of the computational power that was previously required. Researchers are aiming to find methods of reducing energy consumption and computational cost, as doing so is key to bringing machine learning to small devices like phones and smartwatches. Reducing the amount of energy consumed by machine learning algorithms can also help the industry reduce its carbon footprint. Xiaoliang Dai, the first author on the papers, explained that the models need to be trained locally because transmitting data to the cloud requires a lot of energy.
During the course of the first study, the researchers tried to develop a neural network creation tool that they could use to engineer neural networks and recreate some of the highest performing networks from scratch. The tool was called NeST (Neural network Synthesis Tool), and when it is provided with just a few neurons and connections it rapidly increases in complexity by adding more neurons to the network. Once the network meets a selected benchmark it begins pruning itself over time. While previous network models have used pruning techniques, the method engineered by the Princeton researchers was the first to take a network and simulate stages of development, going from “baby” to “toddler” and finally to “adult brain”.
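The grow-then-prune cycle described above can be illustrated with a toy sketch. This is not NeST itself: the article does not spell out the tool's growth or pruning criteria, so the function names (`grow`, `prune`), the random weight initialization, and the magnitude-based pruning rule are all assumptions chosen purely for illustration. A real system would also train the network between the two steps.

```python
import random

def grow(weights, n_new, rng):
    """Add n_new connections with random weights (hypothetical growth rule)."""
    next_id = max(weights, default=-1) + 1
    for i in range(n_new):
        weights[next_id + i] = rng.uniform(-1.0, 1.0)
    return weights

def prune(weights, keep_fraction):
    """Keep only the largest-magnitude connections (magnitude pruning, an assumption)."""
    n_keep = max(1, int(len(weights) * keep_fraction))
    ranked = sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return dict(ranked[:n_keep])

rng = random.Random(0)                      # fixed seed so the sketch is repeatable
seed_network = {0: 0.5, 1: -0.3}            # "baby": just a few connections
grown = grow(dict(seed_network), 98, rng)   # "toddler": densely connected
adult = prune(grown, keep_fraction=0.5)     # "adult": about half the peak

print(len(seed_network), len(grown), len(adult))
```

The 0.5 keep fraction simply mirrors the "roughly half its synaptic peak" figure quoted for the human brain; it is not a number taken from the papers.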
In the second paper, the researchers collaborated with a team from the University of California, Berkeley and Facebook to improve upon their technique using a tool called Chameleon. Chameleon is capable of starting with the desired endpoint, the wanted outcomes, and working backward to construct the right type of neural network. This eliminates much of the guesswork involved in tweaking a network manually, giving engineers starting points that are likely to be immediately useful. Chameleon predicts the performance of different architectures under different conditions. Combining Chameleon and the NeST framework could help research organizations that lack heavy computation resources take advantage of the power of neural networks.
AI Developed to Translate Brain Activity into Words
Researchers at the University of California, San Francisco have developed artificial intelligence (AI) that can translate brain activity into text. The system works on neural patterns that are detected when someone is speaking, but experts hope that it can eventually be used on individuals who are unable to speak, like people suffering from locked-in syndrome.
Dr. Joseph Makin was co-author of the research.
“We are not there yet but we think this could be the basis of a speech prosthesis,” said Makin.
The research was published in the journal Nature Neuroscience.
Testing the System
Joseph Makin and his team relied on deep learning algorithms to study the brain signals of four women as they spoke. All of the women have epilepsy, and electrodes were attached to their brains to monitor seizures.
After the electrodes were attached, each woman then read aloud a set of sentences while her brain activity was measured. The largest number of unique words used was 250. They could choose from a set of 50 different sentences, including “Tina Turner is a pop singer,” and “Those thieves stole 30 jewels.”
The brain activity data was then fed to a neural network algorithm, and it was trained to identify regularly occurring patterns. These patterns could then be linked to repeated aspects of speech like vowels or consonants. They were then fed to a second neural network that attempted to convert them into words to form a sentence.
Each woman was asked to repeat the sentences at least twice, with the final repetition not making it into the training data. This allowed the researchers to test the system.
“Memorising the brain activity of these sentences wouldn’t help, so the network instead has to learn what’s similar about them so that it can generalise to this final example,” says Makin.
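The hold-out scheme described here, where the final repetition of each sentence is kept out of training, can be sketched as follows. The function name `split_by_last_repetition`, the tuple layout, and the signal values are hypothetical placeholders; the article does not describe the study's actual data format.

```python
def split_by_last_repetition(recordings):
    """Split (sentence, repetition_index, signal) tuples into train and test sets.

    The last repetition of every sentence goes to the test set;
    all earlier repetitions go to the training set.
    """
    last_rep = {}
    for sentence, rep, _ in recordings:
        last_rep[sentence] = max(rep, last_rep.get(sentence, rep))
    train = [r for r in recordings if r[1] < last_rep[r[0]]]
    test = [r for r in recordings if r[1] == last_rep[r[0]]]
    return train, test

# Made-up signal values standing in for recorded brain activity.
recordings = [
    ("Tina Turner is a pop singer", 0, [0.1, 0.4]),
    ("Tina Turner is a pop singer", 1, [0.2, 0.3]),
    ("Those thieves stole 30 jewels", 0, [0.9, 0.7]),
    ("Those thieves stole 30 jewels", 1, [0.8, 0.6]),
    ("Those thieves stole 30 jewels", 2, [0.7, 0.8]),
]
train, test = split_by_last_repetition(recordings)
print(len(train), len(test))
```

Because every sentence appears in both sets but no individual recording does, the network can only score well on the test set by learning what the repetitions have in common.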
The first results from the system did not make sentences that made sense, but it improved as the system compared each sequence of words with the sentences that were read aloud.
The team then tested the system by generating written text only from the brain activity during speech.
There were a lot of mistakes in the translation, but the accuracy rate was still very impressive and much better than previous approaches. Accuracy varied from person to person, but for one individual only 3% of each sentence on average needed corrections.
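A figure like "3% of each sentence needed corrections" is the kind of number a word error rate produces: the count of word insertions, deletions, and substitutions needed to turn the decoded sentence into the reference, divided by the length of the reference. The sketch below assumes that interpretation; the function name and the example hypothesis are illustrative, not taken from the study.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

reference = "Those thieves stole 30 jewels"
print(word_error_rate(reference, "Those thieves stole 30 jewels"))  # exact match
print(word_error_rate(reference, "Those thieves stole jewels"))     # one word missing
```

The first call returns 0.0 and the second returns 0.2, since one of the five reference words has to be restored.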
The team also learned that training the algorithm on one individual’s data first meant that the final user needed to provide much less training data.
According to Dr. Christian Herff, who is from Maastricht University but not involved in the study, it is impressive that the system required less than 40 minutes of training data for each participant and a limited collection of sentences, compared to the millions of hours normally required.
“By doing so they achieve levels of accuracy that haven’t been achieved so far,” he said.
“Of course this is fantastic research but those people could just use ‘OK Google’ as well,” he said. “This is not translation of thought [but of brain activity involved in speech].”
Another challenge could be that people with speech disabilities might have different brain activity.
“We want to deploy this in a patient with an actual speech disability,” Makin says, “although it is possible their brain activity may be different from that of the women in this study, making this more difficult.”
There is still a long way to go to translate brain signal data comprehensively. Humans use a massive number of words, and the study only used a very restricted set of speech.
Neural Hardware and Image Recognition
Artificial intelligence (AI) is traditionally based on software, but researchers from the Vienna University of Technology have created faster intelligent hardware. The newly developed chip is able to analyze images and provide the correct output in a matter of nanoseconds.
In today’s world, automatic image recognition is used for a variety of different applications, and certain computer programs can accurately diagnose health problems like skin cancer, navigate self-driving vehicles, and control robots. This is normally done by the evaluation of image data that is delivered by cameras, but one downside is that it is time-consuming. For example, when the number of images recorded per second is high, the large volume of data that is generated often cannot be handled.
Special 2D Material
The scientists at TU Wien decided to use a special 2D material. They developed an image sensor that can recognize certain objects through training. The chip is based on an artificial neural network, and the chip can provide data about what it is seeing within nanoseconds.
The research was presented in the scientific journal Nature.
Artificial neural networks are loosely modeled on the brain, where each nerve cell is connected to many other nerve cells. One cell can affect many others, and artificial learning on a computer works in a similar way.
“Typically, the image data is first read out pixel by pixel and then processed on the computer,” says Thomas Mueller. “We, on the other hand, integrate the neural network with its artificial intelligence directly into the hardware of the image sensor. This makes object recognition many orders of magnitude faster.”
The chip, which is based on photodetectors made of tungsten diselenide, was developed and manufactured at TU Wien. Tungsten diselenide is an ultra-thin material that consists of just three atomic layers. Each of the individual photodetectors, or the “pixels” of the camera, is connected to output elements, which provide the results of object recognition.
“In our chip, we can specifically adjust the sensitivity of each individual detector element — in other words, we can control the way the signal picked up by a particular detector affects the output signal,” says Lukas Mennel, first author of the publication. “All we have to do is simply adjust a local electric field directly at the photodetector.”
They make this adaptation externally and through the use of a computer program. The sensor can be used to record different letters and adjust the sensitivities of the individual pixels. There will always be a corresponding output signal.
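One way to picture the idea is that each output is a weighted sum of the light falling on the pixels, with the adjustable sensitivities acting as the weights of a single-layer network. The sketch below assumes that linear model, signed sensitivities, and hand-set values on a 3x3 pixel grid; on the chip the sensitivities are set through training, and none of the names or numbers here come from the publication.

```python
def sensor_outputs(sensitivities, light):
    """sensitivities[k][i] is how strongly pixel i contributes to output k."""
    return [sum(s * p for s, p in zip(row, light)) for row in sensitivities]

def classify(sensitivities, light):
    """Return the index of the output element with the largest signal."""
    outputs = sensor_outputs(sensitivities, light)
    return max(range(len(outputs)), key=outputs.__getitem__)

# Two toy 3x3 images, flattened row by row (1 = lit pixel, 0 = dark pixel).
vertical_bar = [0, 1, 0,
                0, 1, 0,
                0, 1, 0]
horizontal_bar = [0, 0, 0,
                  1, 1, 1,
                  0, 0, 0]

# Hand-set sensitivities (an assumption): +1 where the pattern is lit, -1 elsewhere.
sensitivities = [
    [2 * p - 1 for p in vertical_bar],
    [2 * p - 1 for p in horizontal_bar],
]

print(sensor_outputs(sensitivities, vertical_bar))
print(classify(sensitivities, vertical_bar), classify(sensitivities, horizontal_bar))
```

Since the weighting happens where the light is detected, there is no separate step of reading out pixels and processing them on a computer, which is the speed advantage the researchers describe.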
Neural Network Takes Over
After the completion of the learning process, the computer is not needed. The neural network is capable of operating alone, and it can produce an output signal within 50 nanoseconds.
“Our test chip is still small at the moment, but you can easily scale up the technology depending on the task you want to solve,” says Thomas Mueller. “In principle, the chip could also be trained to distinguish apples from bananas, but we see its use more in scientific experiments or other specialized applications.”
This technology will be most useful in areas that require extremely high speed, such as fracture mechanics and particle detection.
Noah Schwartz, Co-Founder & CEO of Quorum AI – Interview Series
Noah is an AI systems architect. Prior to founding Quorum AI, Noah spent 12 years in academic research, first at the University of Southern California and most recently at Northwestern as the Assistant Chair of Neurobiology. His work focused on information processing in the brain and he has translated his research into products in augmented reality, brain-computer interfaces, computer vision, and embedded robotics control systems.
Your interest in AI and robotics started as a little boy. How were you first introduced to these technologies?
The initial spark came from science fiction movies and a love for electronics. I remember watching the movie Tron as an 8-year-old, followed by Electric Dreams, Short Circuit, DARYL, War Games, and others over the next few years. Although it was presented through fiction, the very idea of artificial intelligence blew me away. And even though I was only 8 years old, I felt this immediate connection and an intense pull toward AI that has never diminished in the time since.
How did your passions for both evolve?
My interest in AI and robotics developed in parallel with a passion for the brain. My dad was a biology teacher and would teach me about the body, how everything worked, and how it was all connected. Looking at AI and looking at the brain felt like the same problem to me – or at least, they had the same ultimate question, which was, How is that working? I was interested in both, but I didn’t get much exposure to AI or robotics in school. For that reason, I initially pursued AI on my own time and studied biology and psychology in school.
When I got to college, I discovered the Parallel Distributed Processing (PDP) books, which was huge for me. They were my first introduction to actual AI, which then led me back to the classics such as Hebb, Rosenblatt, and even McCulloch and Pitts. I started building neural networks based on neuroanatomy and what I learned from biology and psychology classes in school. After graduating, I worked as a computer network engineer, building complex, wide-area-networks, and writing software to automate and manage traffic flow on those networks – kind of like building large brains. The work reignited my passion for AI and motivated me to head to grad school to study AI and neuroscience, and the rest is history.
Prior to founding Quorum AI, you spent 12 years in academic research, first at the University of Southern California and most recently at Northwestern as the Assistant Chair of Neurobiology. At the time your work focused on information processing in the brain. Could you walk us through some of this research?
In a broad sense, my research was trying to understand the question: How does the brain do what it does using only what it has available? For starters, I don’t subscribe to the idea that the brain is a type of computer (in the von Neumann sense). I see it as a massive network that mostly performs stimulus-response and signal-encoding operations. Within that massive network there are clear patterns of connectivity between functionally specialized areas. As we zoom in, we see that neurons don’t care what signal they’re carrying or what part of the brain they’re in – they operate based on very predictable rules. So if we want to understand the function of these specialized areas, we need to ask a few questions: (1) As an input travels through the network, how does that input converge with other inputs to produce a decision? (2) How does the structure of those specialized areas form as a result of experience? And (3) how do they continue to change as we use our brains and learn over time? My research tried to address these questions using a mixture of experimental research combined with information theory and modeling and simulation – something that could enable us to build artificial decision systems and AI. In neurobiology terms, I studied neuroplasticity and microanatomy of specialized areas like the visual cortex.
You then translated your work into augmented reality, and brain-computer interfaces. What were some of the products you worked on?
Around 2008, I was working on a project that we would now call augmented reality, but back then, it was just a system for tracking and predicting eye movements, and then using those predictions to update something on the screen. To make the system work in realtime, I built a biologically-inspired model that predicted where the viewer would look based on their microsaccades – tiny eye movements that occur just before you move your eye. Using this model, I could predict where the viewer would look, then update the frame buffer in the graphics card while their eyes were still in motion. By the time their eyes reached that new location on the screen, the image was already updated. This ran on an ordinary desktop computer in 2008, without any lag. The tech was pretty amazing, but the project didn’t get through to the next round of funding, so it died.
In 2011, I made a more focused effort at product development and built a neural network that could perform feature discovery on streaming EEG data that we measured from the scalp. This is the core function of most brain-computer interface systems. The project was also an experiment in how small of a footprint could we get this running on? We had a headset that read a few channels of EEG data at 400Hz that were sent via Bluetooth to an Android phone for feature discovery and classification, then sent to an Arduino-powered controller that we retrofitted into an off-the-shelf RC car. When in use, an individual who was wearing the EEG headset could drive and steer the car by changing their thoughts from doing mental math to singing a song. The algorithm ran on the phone and created a personalized brain “fingerprint” for each user, enabling them to switch between a variety of robotic devices without having to retrain on each device. The tagline we came up with was “Brain Control Meets Plug-and-Play.”
In 2012, we extended the system so it operated in a much more distributed manner on smaller hardware. We used it to control a multi-segment, multi-joint robotic arm in which each segment was controlled by an independent processor that ran an embedded version of the AI. Instead of using a centralized controller to manipulate the arm, we allowed the segments to self-organize and reach their target in a swarm-like, distributed manner. In other words, like ants forming an ant bridge, the arm segments would cooperate to reach some target in space.
We continued moving in this same direction when we first launched Quorum AI – originally known as Quorum Robotics – back in 2013. We quickly realized that the system was awesome because of the algorithm and architecture, not the hardware, so in late 2014, we pivoted completely into software. Now, 8 years later, Quorum AI is coming full-circle, back to those robotics roots by applying our framework to the NASA Space Robotics Challenge.
Quitting your job as a professor to launch a start-up had to have been a difficult decision. What inspired you to do this?
It was a massive leap for me in a lot of ways, but once the opportunity came up and the path became clear, it was an easy decision. When you’re a professor, you think in multi-year timeframes and you work on very long-range research goals. Launching a start-up is the exact opposite of that. However, one thing that academic life and start-up life have in common is that both require you to learn and solve problems constantly. In a start-up, that could mean trying to re-engineer a solution to reduce product development risk or maybe studying a new vertical that could benefit from our tech. Working in AI is the closest thing to a “calling” as I’ve ever felt, so despite all the challenges and the ups and downs, I feel immensely lucky to be doing the work that I do.
You’ve since then developed Quorum AI, which develops realtime, distributed artificial intelligence for all devices and platforms. Could you elaborate on what exactly this AI platform does?
The platform is called the Environment for Virtual Agents (EVA), and it enables users to build, train, and deploy models using our Engram AI Engine. Engram is a flexible and portable wrapper that we built around our unsupervised learning algorithms. The algorithms are so efficient that they can learn in realtime, as the model is generating predictions. Because the algorithms are task-agnostic, there is no explicit input or output to the model, so predictions can be made in a Bayesian manner for any dimension without retraining and without suffering from catastrophic forgetting. The models are also transparent and decomposable, meaning they can be examined and broken apart into individual dimensions without losing what has been learned.
Once built, the models can be deployed through EVA to any type of platform, ranging from custom embedded hardware up to the cloud. EVA (and the embeddable host software) also contain several tools to extend the functionality of each model. A few quick examples: Models can be shared between systems through a publication/subscription system, enabling distributed systems to achieve federated learning over both time and space. Models can also be deployed as autonomous agents to perform arbitrary tasks, and because the model is task-agnostic, the task can be changed during runtime without retraining. Each individual agent can be extended with a private “virtual” EVA, enabling the agent to simulate models of other agents in a scale-free manner. Finally, we’ve created some wrappers for deep learning and reinforcement learning (Keras-based) systems to enable these models to operate on the platform, in concert with more flexible Engram-based systems.
You’ve previously described the Quorum AI algorithms as “mathematical poetry”. What did you mean by this?
When you’re building a model, whether you’re modeling the brain or you’re modeling sales data for your enterprise, you start by taking an inventory of your data, then you try out known classes of models to try and approximate the system. In essence, you are creating rough sketches of the system to see what looks best. You don’t expect things to fit the data very well, and there’s some trial and error as you test different hypotheses about how the system works, but with some finesse, you can capture the data pretty well.
As I was modeling neuroplasticity in the brain, I started with the usual approach of mapping out all the molecular pathways, transition states, and dynamics that I thought would matter. But I found that when I reduced the system to its most basic components and arranged those components in a particular way, the model got more and more accurate until it fit the data almost perfectly. It was like every operator and variable in the equations was exactly what it needed to be, there was nothing extra, and everything was essential to fitting the data.
When I plugged the model into larger and larger simulations, like visual system development or face recognition, for instance, it was able to form extremely complicated connectivity patterns that matched what we see in the brain. Because the model was mathematical, those brain patterns could be understood through mathematical analysis, giving new insight into what the brain is learning. Since then, we’ve solved and simplified the differential equations that make up the model, improving computational efficiency by multiple orders of magnitude. It may not be actual poetry, but it sure felt like it!
Quorum AI’s platform toolkit enables devices to connect to one another to learn and share data without needing to communicate through cloud-based servers. What are the advantages of doing it this way versus using the cloud?
We give users the option of putting their AI anywhere they want, without compromising the functionality of the AI. The status quo in AI development is that companies are usually forced to compromise security, privacy, or functionality because their only option is to use cloud-based AI services. If companies do try to build their own AI in-house, it often requires a lot of money and time, and the ROI is rarely worth the risk. If companies want to deploy AI to individual devices that are not cloud-connected, the project quickly becomes impossible. As a result, AI adoption becomes a fantasy.
Our platform makes AI accessible and affordable, giving companies a way to explore AI development and adoption without the technical or financial overhead. And moreover, our platform enables users to go from development to deployment in one seamless step.
Our platform also integrates with and extends the shelf-life of other “legacy” models like deep learning or reinforcement learning, helping companies repurpose and integrate existing systems into newer applications. Similarly, because our algorithms and architectures are unique, our models are not black boxes, so anything that the system learns can be explored and interpreted by humans, and then extended to other areas of business.
It’s believed by some that Distributed Artificial Intelligence (DAI) could lead the way to Artificial General Intelligence (AGI). Do you subscribe to this theory?
I do, and not just because that’s the path we’ve set out for ourselves! When you look at the brain, it’s not a monolithic system. It’s made up of separate, distributed systems that each specialize in a narrow range of brain functions. We may not know what a particular system is doing, but we know that its decisions depend significantly on the type of information it’s receiving and how that information changes over time. (This is why neuroscience topics like the connectome are so popular.)
In my opinion, if we want to build AI that is flexible and that behaves and performs like the brain, then it makes sense to consider distributed architectures like those that we see in the brain. One could argue that deep learning architectures like multi-layer networks or CNNs can be found in the brain, and that’s true, but those architectures are based on what we knew about the brain 50 years ago.
The alternative to DAI is to continue iterating on monolithic, inflexible architectures that are tightly coupled to a single decision space, like those that we see in deep learning or reinforcement learning (or any supervised learning method, for that matter). I would suggest that these limitations are not just a matter of parameter tweaking or adding layers or data conditioning – these issues are fundamental to deep learning and reinforcement learning, at least as we define them today, so new approaches are required if we’re going to continue innovating and building the AI of tomorrow.
Do you believe that achieving AGI using DAI is more likely than reinforcement learning and/or deep learning methods that are currently being pursued by companies such as OpenAI and DeepMind?
Yes, although from what they’re blogging about, I suspect OpenAI and DeepMind are using more distributed architectures than they let on. We’re starting to hear more about multi-system challenges like transfer learning or federated/distributed learning, and coincidentally, about how deep learning and reinforcement learning approaches aren’t going to work for these challenges. We’re also starting to hear from pioneers like Yoshua Bengio about how biologically-inspired architectures could bridge the gap! I’ve been working on biologically-inspired AI for almost 20 years, so I feel very good about what we’ve learned at Quorum AI and how we’re using it to build what we believe is the next generation of AI that will overcome these limitations.
Is there anything else that you would like to share about Quorum AI?
We will be previewing our new platform for distributed and agent-based AI at the Federated and Distributed Machine Learning Conference in June 2020. During the talk, I plan to present some recent data on several topics, including sentiment analysis as a bridge to achieving empathic AI.
I would like to give a special thank you to Noah for these amazing answers, and I would recommend that you visit Quorum AI to learn more.