In what is a monumental moment in both AI and physics, a neural network has “rediscovered” that Earth orbits the Sun. The new development could be critical in solving quantum-mechanics problems, and the researchers hope that it can be used to discover new laws of physics by identifying patterns within large data sets.
The neural network, named SciNet, was fed measurements showing how the Sun and Mars appear from Earth. Scientists at the Swiss Federal Institute of Technology then tasked SciNet with predicting where the Sun and Mars would be at different times in the future.
The research will be published in Physical Review Letters.
Designing the Algorithm
The team, including physicist Renato Renner, set out to make the algorithm capable of distilling large data sets into basic formulae. This is the same approach physicists use when coming up with equations. To do this, the researchers modeled the neural network on the human brain.
The formulas that were generated by SciNet placed the Sun at the center of our solar system. One of the remarkable aspects of this research was that SciNet did this similarly to how astronomer Nicolaus Copernicus discovered heliocentricity.
The team highlighted this in a paper published on the preprint repository arXiv.
“In the 16th century, Copernicus measured the angles between a distant fixed star and several planets and celestial bodies and hypothesized that the Sun, and not the Earth, is in the centre of our solar system and that the planets move around the Sun on simple orbits,” the team wrote. “This explains the complicated orbits as seen from Earth.”
The team tried to get SciNet to predict the movements of the Sun and Mars in the simplest way possible, so SciNet uses two sub-networks to send information back and forth. One of the networks analyzes the data and learns from it, and the other one makes predictions and tests accuracy based on that knowledge. Because these networks are connected together by just a few links, information is compressed and communication is simpler.
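The two-sub-network layout can be sketched in a few lines of Python. This is only an illustration of the bottleneck idea, not the SciNet code: the layer sizes, the untrained random weights, the use of a query time as the second input, and the names `encode`, `decode`, and `LATENT_SIZE` are all assumptions made for the example.

```python
import math
import random

random.seed(0)

OBSERVATION_SIZE = 10  # assumed: number of measured angles fed to the first sub-network
LATENT_SIZE = 2        # assumed: the "few links" connecting the two sub-networks

def random_matrix(rows, cols):
    """Untrained placeholder weights; a real model would learn these from data."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def dense(weights, inputs):
    """One fully connected layer with a tanh activation."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]

ENCODER_WEIGHTS = random_matrix(LATENT_SIZE, OBSERVATION_SIZE)
DECODER_WEIGHTS = random_matrix(1, LATENT_SIZE + 1)  # latent values plus the query time

def encode(observations):
    """First sub-network: compress the observations into a few numbers."""
    return dense(ENCODER_WEIGHTS, observations)

def decode(latent, query_time):
    """Second sub-network: predict from the compressed numbers and a query."""
    return dense(DECODER_WEIGHTS, latent + [query_time])[0]

observations = [0.1 * i for i in range(OBSERVATION_SIZE)]
latent = encode(observations)
prediction = decode(latent, query_time=0.5)
```

Because everything the second sub-network knows about the observations has to pass through `LATENT_SIZE` numbers, training such a model pushes it toward a compact description of the data.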
Conventional neural networks learn to identify and recognize objects by training on huge data sets, generating features as they go. Those features are then encoded in mathematical ‘nodes,’ which are considered the artificial equivalent of neurons. Unlike physicists, however, neural networks store what they learn in ways that are unpredictable and difficult to interpret.
Artificial Intelligence and Scientific Discoveries
One of the tests involved giving the network simulated data about the movements of Mars and the Sun, as seen from Earth. From Earth, the path of Mars across the sky appears unpredictable and periodically reverses its course. In the 1500s, Nicolaus Copernicus discovered that the movements of the planets could be predicted with simpler formulas if the planets orbit the Sun.
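A short Python sketch shows why a Sun-centered model explains the reversals. It assumes circular, coplanar orbits with approximate radii and periods, and it starts both planets on the same side of the Sun. These are simplifications chosen for the example, not the simulated data used in the research.

```python
import math

# Assumed circular orbits: (radius in astronomical units, period in years).
EARTH = (1.0, 1.0)
MARS = (1.524, 1.881)

def position(orbit, t):
    """Position of a planet at time t (years) on a Sun-centered circle."""
    radius, period = orbit
    phase = 2.0 * math.pi * t / period
    return radius * math.cos(phase), radius * math.sin(phase)

def apparent_angle_of_mars(t):
    """Direction of Mars as seen from Earth, in radians."""
    ex, ey = position(EARTH, t)
    mx, my = position(MARS, t)
    return math.atan2(my - ey, mx - ex)

def angle_change(t, dt):
    """Change in the apparent angle over dt, wrapped into [-pi, pi)."""
    delta = apparent_angle_of_mars(t + dt) - apparent_angle_of_mars(t)
    return (delta + math.pi) % (2.0 * math.pi) - math.pi

ONE_DAY = 1.0 / 365.0
# Days (over three years) on which Mars appears to move backwards.
retrograde_days = [
    day for day in range(3 * 365) if angle_change(day * ONE_DAY, ONE_DAY) < 0
]
```

Both planets move steadily in the same direction around the Sun, yet `retrograde_days` is not empty: whenever Earth overtakes Mars on its inner, faster orbit, the apparent angle of Mars decreases for a while.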
When the neural network “discovered” similar formulas for Mars’s trajectory, it rediscovered one of the most important pieces of knowledge in history.
Mario Krenn is a physicist at the University of Toronto in Canada, and he works on using artificial intelligence to make scientific discoveries.
SciNet rediscovered “one of the most important shifts of paradigms in the history of science,” he said.
According to Renner, humans are still needed to interpret the equations and determine how they are connected to the movement of the planets around the Sun.
Hod Lipson is a roboticist at Columbia University in New York City.
“This work is important because it is able to single out the crucial parameters that describe a physical system,” he says. “I think that these kinds of techniques are our only hope of understanding and keeping pace with increasingly complex phenomena, in physics and beyond.”
Neural Hardware and Image Recognition
Artificial intelligence (AI) is traditionally based on software, but researchers from the Vienna University of Technology have created faster intelligent hardware. The newly developed chip is able to analyze images and provide the correct output in a matter of nanoseconds.
In today’s world, automatic image recognition is used for a variety of different applications, and certain computer programs can accurately diagnose health problems like skin cancer, navigate self-driving vehicles, and control robots. This is normally done by the evaluation of image data that is delivered by cameras, but one downside is that it is time-consuming. For example, when the number of images recorded per second is high, the large volume of data that is generated often cannot be handled.
Special 2D Material
The scientists at TU Wien decided to use a special 2D material. They developed an image sensor that can be trained to recognize certain objects. The chip is based on an artificial neural network and can report what it is seeing within nanoseconds.
The research was presented in the scientific journal Nature.
Neural networks are artificial systems modeled on the nerve cells in our brain, each of which is connected to many other nerve cells. One cell can affect many others, and artificial learning on a computer works in a similar way.
“Typically, the image data is first read out pixel by pixel and then processed on the computer,” says Thomas Mueller. “We, on the other hand, integrate the neural network with its artificial intelligence directly into the hardware of the image sensor. This makes object recognition many orders of magnitude faster.”
The chip, which is based on photodetectors made of tungsten diselenide, was developed and manufactured at TU Wien. Tungsten diselenide is an ultra-thin material that consists of just three atomic layers. Each of the individual photodetectors, or “pixels” of the camera, is connected to output elements, which provide the results of object recognition.
“In our chip, we can specifically adjust the sensitivity of each individual detector element — in other words, we can control the way the signal picked up by a particular detector affects the output signal,” says Lukas Mennel, first author of the publication. “All we have to do is simply adjust a local electric field directly at the photodetector.”
The researchers make this adaptation externally, with the help of a computer program. During training, the sensor records different letters and the sensitivities of the individual pixels are adjusted until each letter reliably produces its corresponding output signal.
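The idea that each detector's sensitivity controls how its signal affects the output can be sketched as a weighted sum. The following Python is a minimal illustration, not the chip's actual readout: the four-pixel layout, the two output elements, the sensitivity values (including the negative ones), and the function name `sensor_outputs` are all assumptions.

```python
def sensor_outputs(responsivities, light):
    """Each output sums every pixel's light intensity times its sensitivity."""
    return [sum(r * p for r, p in zip(row, light)) for row in responsivities]

# Hypothetical sensitivities for a 4-pixel sensor with two output elements.
RESPONSIVITIES = [
    [1.0, -1.0, 1.0, -1.0],  # output 0 favors light on pixels 0 and 2
    [-1.0, 1.0, -1.0, 1.0],  # output 1 favors light on pixels 1 and 3
]

# A pattern that illuminates pixels 0 and 2 only.
pattern = [1.0, 0.0, 1.0, 0.0]
outputs = sensor_outputs(RESPONSIVITIES, pattern)

# The output element with the largest signal names the recognized pattern.
recognized = outputs.index(max(outputs))
```

In this picture, training means changing the numbers in `RESPONSIVITIES`; once they are fixed, recognition is a single weighted sum with no further computation.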
Neural Network Takes Over
After the completion of the learning process, the computer is not needed. The neural network is capable of operating alone, and it can produce an output signal within 50 nanoseconds.
“Our test chip is still small at the moment, but you can easily scale up the technology depending on the task you want to solve,” says Thomas Mueller. “In principle, the chip could also be trained to distinguish apples from bananas, but we see its use more in scientific experiments or other specialized applications.”
This technology will be most useful in areas that require extremely high speed, such as fracture mechanics and particle detection.
Noah Schwartz, Co-Founder & CEO of Quorum – Interview Series
Noah is an AI systems architect. Prior to founding Quorum, Noah spent 12 years in academic research, first at the University of Southern California and most recently at Northwestern as the Assistant Chair of Neurobiology. His work focused on information processing in the brain and he has translated his research into products in augmented reality, brain-computer interfaces, computer vision, and embedded robotics control systems.
Your interest in AI and robotics started as a little boy. How were you first introduced to these technologies?
The initial spark came from science fiction movies and a love for electronics. I remember watching the movie Tron as an 8-year-old, followed by Electric Dreams, Short Circuit, DARYL, War Games, and others over the next few years. Although it was presented through fiction, the very idea of artificial intelligence blew me away. And even though I was only 8 years old, I felt this immediate connection and an intense pull toward AI that has never diminished in the time since.
How did your passions for both evolve?
My interest in AI and robotics developed in parallel with a passion for the brain. My dad was a biology teacher and would teach me about the body, how everything worked, and how it was all connected. Looking at AI and looking at the brain felt like the same problem to me – or at least, they had the same ultimate question, which was, How is that working? I was interested in both, but I didn’t get much exposure to AI or robotics in school. For that reason, I initially pursued AI on my own time and studied biology and psychology in school.
When I got to college, I discovered the Parallel Distributed Processing (PDP) books, which was huge for me. They were my first introduction to actual AI, which then led me back to the classics such as Hebb, Rosenblatt, and even McCulloch and Pitts. I started building neural networks based on neuroanatomy and what I learned from biology and psychology classes in school. After graduating, I worked as a computer network engineer, building complex, wide-area-networks, and writing software to automate and manage traffic flow on those networks – kind of like building large brains. The work reignited my passion for AI and motivated me to head to grad school to study AI and neuroscience, and the rest is history.
Prior to founding Quorum, you spent 12 years in academic research, first at the University of Southern California and most recently at Northwestern as the Assistant Chair of Neurobiology. At the time your work focused on information processing in the brain. Could you walk us through some of this research?
In a broad sense, my research was trying to understand the question: How does the brain do what it does using only what it has available? For starters, I don’t subscribe to the idea that the brain is a type of computer (in the von Neumann sense). I see it as a massive network that mostly performs stimulus-response and signal-encoding operations. Within that massive network there are clear patterns of connectivity between functionally specialized areas. As we zoom in, we see that neurons don’t care what signal they’re carrying or what part of the brain they’re in – they operate based on very predictable rules. So if we want to understand the function of these specialized areas, we need to ask a few questions: (1) As an input travels through the network, how does that input converge with other inputs to produce a decision? (2) How does the structure of those specialized areas form as a result of experience? And (3) how do they continue to change as we use our brains and learn over time? My research tried to address these questions using a mixture of experimental research combined with information theory and modeling and simulation – something that could enable us to build artificial decision systems and AI. In neurobiology terms, I studied neuroplasticity and microanatomy of specialized areas like the visual cortex.
You then translated your work into augmented reality, and brain-computer interfaces. What were some of the products you worked on?
Around 2008, I was working on a project that we would now call augmented reality, but back then, it was just a system for tracking and predicting eye movements, and then using those predictions to update something on the screen. To make the system work in realtime, I built a biologically-inspired model that predicted where the viewer would look based on their microsaccades – tiny eye movements that occur just before you move your eye. Using this model, I could predict where the viewer would look, then update the frame buffer in the graphics card while their eyes were still in motion. By the time their eyes reached that new location on the screen, the image was already updated. This ran on an ordinary desktop computer in 2008, without any lag. The tech was pretty amazing, but the project didn’t get through to the next round of funding, so it died.
In 2011, I made a more focused effort at product development and built a neural network that could perform feature discovery on streaming EEG data that we measured from the scalp. This is the core function of most brain-computer interface systems. The project was also an experiment in how small of a footprint could we get this running on? We had a headset that read a few channels of EEG data at 400Hz that were sent via Bluetooth to an Android phone for feature discovery and classification, then sent to an Arduino-powered controller that we retrofitted into an off-the-shelf RC car. When in use, an individual who was wearing the EEG headset could drive and steer the car by changing their thoughts from doing mental math to singing a song. The algorithm ran on the phone and created a personalized brain “fingerprint” for each user, enabling them to switch between a variety of robotic devices without having to retrain on each device. The tagline we came up with was “Brain Control Meets Plug-and-Play.”
In 2012, we extended the system so it operated in a much more distributed manner on smaller hardware. We used it to control a multi-segment, multi-joint robotic arm in which each segment was controlled by an independent processor that ran an embedded version of the AI. Instead of using a centralized controller to manipulate the arm, we allowed the segments to self-organize and reach their target in a swarm-like, distributed manner. In other words, like ants forming an ant bridge, the arm segments would cooperate to reach some target in space.
We continued moving in this same direction when we first launched Quorum AI – originally known as Quorum Robotics – back in 2013. We quickly realized that the system was awesome because of the algorithm and architecture, not the hardware, so in late 2014, we pivoted completely into software. Now, 8 years later, Quorum AI is coming full-circle, back to those robotics roots by applying our framework to the NASA Space Robotics Challenge.
Quitting your job as a professor to launch a start-up had to have been a difficult decision. What inspired you to do this?
It was a massive leap for me in a lot of ways, but once the opportunity came up and the path became clear, it was an easy decision. When you’re a professor, you think in multi-year timeframes and you work on very long-range research goals. Launching a start-up is the exact opposite of that. However, one thing that academic life and start-up life have in common is that both require you to learn and solve problems constantly. In a start-up, that could mean trying to re-engineer a solution to reduce product development risk or maybe studying a new vertical that could benefit from our tech. Working in AI is the closest thing to a “calling” as I’ve ever felt, so despite all the challenges and the ups and downs, I feel immensely lucky to be doing the work that I do.
You’ve since then developed Quorum AI, which develops realtime, distributed artificial intelligence for all devices and platforms. Could you elaborate on what exactly this AI platform does?
The platform is called the Environment for Virtual Agents (EVA), and it enables users to build, train, and deploy models using our Engram AI Engine. Engram is a flexible and portable wrapper that we built around our unsupervised learning algorithms. The algorithms are so efficient that they can learn in realtime, as the model is generating predictions. Because the algorithms are task-agnostic, there is no explicit input or output to the model, so predictions can be made in a Bayesian manner for any dimension without retraining and without suffering from catastrophic forgetting. The models are also transparent and decomposable, meaning they can be examined and broken apart into individual dimensions without losing what has been learned.
Once built, the models can be deployed through EVA to any type of platform, ranging from custom embedded hardware or up to the cloud. EVA (and the embeddable host software) also contain several tools to extend the functionality of each model. A few quick examples: Models can be shared between systems through a publication/subscription system, enabling distributed systems to achieve federated learning over both time and space. Models can also be deployed as autonomous agents to perform arbitrary tasks, and because the model is task-agnostic, the task can be changed during runtime without retraining. Each individual agent can be extended with a private “virtual” EVA, enabling the agent to simulate models of other agents in a scale-free manner. Finally, we’ve created some wrappers for deep learning and reinforcement learning (Keras-based) systems to enable these models to operate on the platform, in concert with more flexible Engram-based systems.
You’ve previously described the Quorum AI algorithms as “mathematical poetry”. What did you mean by this?
When you’re building a model, whether you’re modeling the brain or you’re modeling sales data for your enterprise, you start by taking an inventory of your data, then you try out known classes of models to try and approximate the system. In essence, you are creating rough sketches of the system to see what looks best. You don’t expect things to fit the data very well, and there’s some trial and error as you test different hypotheses about how the system works, but with some finesse, you can capture the data pretty well.
As I was modeling neuroplasticity in the brain, I started with the usual approach of mapping out all the molecular pathways, transition states, and dynamics that I thought would matter. But I found that when I reduced the system to its most basic components and arranged those components in a particular way, the model got more and more accurate until it fit the data almost perfectly. It was like every operator and variable in the equations were exactly what they needed to be, there was nothing extra, and everything was essential to fitting the data.
When I plugged the model into larger and larger simulations, like visual system development or face recognition, for instance, it was able to form extremely complicated connectivity patterns that matched what we see in the brain. Because the model was mathematical, those brain patterns could be understood through mathematical analysis, giving new insight into what the brain is learning. Since then, we’ve solved and simplified the differential equations that make up the model, improving computational efficiency by multiple orders of magnitude. It may not be actual poetry, but it sure felt like it!
Quorum AI’s platform toolkit enables devices to connect to one another to learn and share data without needing to communicate through cloud-based servers. What are the advantages of doing it this way versus using the cloud?
We give users the option of putting their AI anywhere they want, without compromising the functionality of the AI. The status quo in AI development is that companies are usually forced to compromise security, privacy, or functionality because their only option is to use cloud-based AI services. If companies do try to build their own AI in-house, it often requires a lot of money and time, and the ROI is rarely worth the risk. If companies want to deploy AI to individual devices that are not cloud-connected, the project quickly becomes impossible. As a result, AI adoption becomes a fantasy.
Our platform makes AI accessible and affordable, giving companies a way to explore AI development and adoption without the technical or financial overhead. And moreover, our platform enables users to go from development to deployment in one seamless step.
Our platform also integrates with and extends the shelf-life of other “legacy” models like deep learning or reinforcement learning, helping companies repurpose and integrate existing systems into newer applications. Similarly, because our algorithms and architectures are unique, our models are not black boxes, so anything that the system learns can be explored and interpreted by humans, and then extended to other areas of business.
It’s believed by some that Distributed Artificial Intelligence (DAI) could lead the way to Artificial General Intelligence (AGI). Do you subscribe to this theory?
I do, and not just because that’s the path we’ve set out for ourselves! When you look at the brain, it’s not a monolithic system. It’s made up of separate, distributed systems that each specialize in a narrow range of brain functions. We may not know what a particular system is doing, but we know that its decisions depend significantly on the type of information it’s receiving and how that information changes over time. (This is why neuroscience topics like the connectome are so popular.)
In my opinion, if we want to build AI that is flexible and that behaves and performs like the brain, then it makes sense to consider distributed architectures like those that we see in the brain. One could argue that deep learning architectures like multi-layer networks or CNNs can be found in the brain, and that’s true, but those architectures are based on what we knew about the brain 50 years ago.
The alternative to DAI is to continue iterating on monolithic, inflexible architectures that are tightly coupled to a single decision space, like those that we see in deep learning or reinforcement learning (or any supervised learning method, for that matter). I would suggest that these limitations are not just a matter of parameter tweaking or adding layers or data conditioning – these issues are fundamental to deep learning and reinforcement learning, at least as we define them today, so new approaches are required if we’re going to continue innovating and building the AI of tomorrow.
Do you believe that achieving AGI using DAI is more likely than reinforcement learning and/or deep learning methods that are currently being pursued by companies such as OpenAI and DeepMind?
Yes, although from what they’re blogging about, I suspect OpenAI and DeepMind are using more distributed architectures than they let on. We’re starting to hear more about multi-system challenges like transfer learning or federated/distributed learning, and coincidentally, about how deep learning and reinforcement learning approaches aren’t going to work for these challenges. We’re also starting to hear from pioneers like Yoshua Bengio about how biologically-inspired architectures could bridge the gap! I’ve been working on biologically-inspired AI for almost 20 years, so I feel very good about what we’ve learned at Quorum AI and how we’re using it to build what we believe is the next generation of AI that will overcome these limitations.
Is there anything else that you would like to share about Quorum AI?
We will be previewing our new platform for distributed and agent-based AI at the Federated and Distributed Machine Learning Conference in June 2020. During the talk, I plan to present some recent data on several topics, including sentiment analysis as a bridge to achieving empathic AI.
I would like to give a special thank you to Noah for these amazing answers, and I would recommend that you visit Quorum AI to learn more.
New Neural Tangent Library From Google Gives Data Scientists “Unprecedented” Insight Into Models
Google has designed a new open-source library intended to crack open the black box of machine learning and give engineers more insight into how their machine learning systems operate. As reported by VentureBeat, the Google research team says that the library could grant “unprecedented” insight into how machine learning models operate.
Neural networks operate through neurons containing mathematical functions that transform the data in various ways. The neurons in the network are joined together in layers, and neural networks have depth and width. The depth of a neural network is determined by how many layers it has, and the different layers of the network adjust the connections between neurons, impacting how the data is handled as it moves between layers. The number of neurons in a layer is that layer’s width. According to Google research engineer Roman Novak and Google senior research scientist Samuel S. Schoenholz, the width of a model is tightly correlated with regular, repeatable behavior. In a blog post, the two researchers explained that making neural networks wider makes their behavior more regular and easier to interpret.
There exists a different type of machine learning model called a Gaussian process. A Gaussian process is a stochastic process in which any finite collection of variables follows a multivariate normal distribution. Equivalently, every finite linear combination of those variables is normally distributed. This means it is possible to represent extraordinarily complex interactions between variables as interpretable linear algebra equations, and therefore it’s possible for an AI’s behavior to be studied through this lens. How exactly are machine learning models related to Gaussian processes? As the width of a neural network grows toward infinity, the model converges to a Gaussian process.
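To make the "interpretable linear algebra" point concrete, here is a minimal Python sketch of a Gaussian process conditioned on a single training point, where the prediction and its uncertainty come out in closed form. The squared-exponential kernel, the zero prior mean, and the names `rbf_kernel` and `posterior` are assumptions for the example and have nothing to do with the kernels that Neural Tangents derives.

```python
import math

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))

def posterior(x_train, y_train, x_query, noise=0.0):
    """Mean and variance at x_query after observing one (x_train, y_train) pair.

    Assumes a zero-mean prior. With a single training point the usual matrix
    inverse reduces to a division.
    """
    k_tt = rbf_kernel(x_train, x_train) + noise
    k_qt = rbf_kernel(x_query, x_train)
    k_qq = rbf_kernel(x_query, x_query)
    mean = k_qt / k_tt * y_train
    variance = k_qq - k_qt ** 2 / k_tt
    return mean, variance

near_mean, near_variance = posterior(0.0, 2.0, 0.0)  # query at the training point
far_mean, far_variance = posterior(0.0, 2.0, 10.0)   # query far from the data
```

At the training point the model reproduces the observation with zero variance; far away it falls back to the prior, with mean near zero and variance near one. Both answers are read directly off the formula rather than obtained by training.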
However, while it’s possible to interpret machine learning models through the lens of a Gaussian process, it requires deriving the infinite-width limit of a model. This is a complex series of calculations that must be done for each separate architecture. In order to make these calculations easier and quicker, the Google research team designed Neural Tangents. Neural Tangents enables a data scientist to use just a few lines of code and train multiple infinite-width networks at one time. Multiple neural networks are often trained on the same datasets and their predictions are averaged, in order to get a more robust prediction immune to the problems that might occur in any individual model. Such a technique is called ensemble learning. One of the drawbacks to ensemble learning is that it is often computationally expensive. Yet when a network that is infinitely wide is trained, the ensemble is described by a Gaussian process and the variance and mean can be calculated.
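Ensemble learning itself is simple to sketch. The Python below averages the predictions of three stand-in models and reports their spread. It illustrates a finite ensemble only and does not use the Neural Tangents API: the three lambda "models" are hypothetical placeholders for separately trained networks.

```python
from statistics import mean, pvariance

def ensemble_predict(models, x):
    """Return the mean and variance of the models' predictions at input x."""
    predictions = [model(x) for model in models]
    return mean(predictions), pvariance(predictions)

# Hypothetical stand-ins for three networks trained on the same data set.
models = [
    lambda x: 2.0 * x,
    lambda x: 2.0 * x + 1.0,
    lambda x: 2.0 * x - 1.0,
]

average, spread = ensemble_predict(models, 3.0)
```

A finite ensemble has to train and evaluate every member to obtain `average` and `spread`, which is where the computational cost comes from. In the infinite-width case described above, the same two quantities are computed from the Gaussian process instead.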
Three different infinite-width neural network architectures were compared as a test, and the results of the comparison were published in the blog post. In general, the results of ensemble networks driven by Gaussian processes are similar to regular, finite neural network performance.
As the research team explains in a blog post:
“We see that, mimicking finite neural networks, infinite-width networks follow a similar hierarchy of performance with fully-connected networks performing worse than convolutional networks, which in turn perform worse than wide residual networks. However, unlike regular training, the learning dynamics of these models is completely tractable in closed-form, which allows [new] insight into their behavior.”
The release of Neural Tangents seems timed to coincide with the TensorFlow Dev Summit, where machine learning engineers who use Google’s TensorFlow platform meet. The Neural Tangents announcement also comes not long after TensorFlow Quantum was announced.
Neural Tangents has been made available via GitHub and there is a Google Colaboratory notebook and tutorial that those interested can access.