Google has designed a new open-source library intended to crack open the black box of machine learning and give engineers more insight into how their machine learning systems operate. As reported by VentureBeat, the Google research team says that the library could grant “unprecedented” insight into how machine learning models operate.
Neural networks operate through neurons containing mathematical functions that transform the data in various ways. The neurons are joined together in layers, and every network has both depth and width. The depth of a neural network is determined by how many layers it has, and the weights on the connections between neurons in adjacent layers govern how the data is transformed as it moves from one layer to the next. The number of neurons in a layer is that layer’s width. According to Google research engineer Roman Novak and Google senior research scientist Samuel S. Schoenholz, the width of a model is tightly correlated with regular, repeatable behavior. In a blog post, the two researchers explained that making neural networks wider makes their behavior more regular and easier to interpret.
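To make depth and width concrete, here is a minimal sketch in plain Python of a small fully-connected network. The helper names (init_network, forward), the tanh activation, and the layer sizes are illustrative assumptions and do not come from the Google blog post or from Neural Tangents.

```python
import math
import random


def init_network(input_dim, hidden_widths, output_dim, seed=0):
    """Return one weight matrix per layer, stored as a list of rows."""
    rng = random.Random(seed)
    sizes = [input_dim] + list(hidden_widths) + [output_dim]
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(fan_in)  # keeps activations at a stable magnitude
        layers.append([[rng.gauss(0.0, scale) for _ in range(fan_in)]
                       for _ in range(fan_out)])
    return layers


def forward(layers, x):
    """Pass the input through every layer, applying tanh between layers."""
    for index, weights in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) for row in weights]
        if index < len(layers) - 1:  # no activation after the output layer
            x = [math.tanh(v) for v in x]
    return x


# Two hidden layers of 8 neurons each, followed by a single output neuron.
layers = init_network(input_dim=3, hidden_widths=[8, 8], output_dim=1)
depth = len(layers)                          # number of weight layers
widths = [len(weights) for weights in layers]  # neurons in each layer
output = forward(layers, [0.5, -1.0, 2.0])
```

Here depth counts the weight layers (two hidden layers plus the output layer) and widths lists the number of neurons in each one. Making the network wider means raising the entries of hidden_widths, while making it deeper means adding more entries.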
There exists a different type of machine learning model called a Gaussian process. A Gaussian process is a stochastic process in which any finite collection of its random variables follows a multivariate normal distribution. Equivalently, every finite linear combination of those variables is normally distributed. This means it is possible to represent extraordinarily complex interactions between variables as interpretable linear algebra equations, and therefore it’s possible for an AI’s behavior to be studied through this lens. How exactly are neural networks related to Gaussian processes? As the width of a neural network grows toward infinity, the network converges to a Gaussian process.
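The convergence toward a Gaussian can be observed numerically. The sketch below is an illustration under stated assumptions, not the researchers' experiment: it draws many randomly initialized one-hidden-layer tanh networks, evaluates each at one fixed input, and measures how far the distribution of outputs is from Gaussian using excess kurtosis (zero for a true Gaussian). The function names, the 1/sqrt(width) output scaling, and the chosen widths are all hypothetical choices made for this example.

```python
import math
import random


def random_network_output(width, x, rng):
    """Output of a one-hidden-layer tanh network with fresh random weights.

    Dividing by sqrt(width) keeps the output variance fixed as width grows.
    """
    total = 0.0
    for _ in range(width):
        w = rng.gauss(0.0, 1.0)  # input-to-hidden weight
        v = rng.gauss(0.0, 1.0)  # hidden-to-output weight
        total += v * math.tanh(w * x)
    return total / math.sqrt(width)


def excess_kurtosis(samples):
    """Zero for a Gaussian; positive for heavier-tailed distributions."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m4 / (m2 ** 2) - 3.0


rng = random.Random(0)
# Each sample is a different random network evaluated at the same input.
narrow = [random_network_output(1, 1.0, rng) for _ in range(4000)]
wide = [random_network_output(64, 1.0, rng) for _ in range(4000)]

kurt_narrow = excess_kurtosis(narrow)
kurt_wide = excess_kurtosis(wide)
print("width 1 :", kurt_narrow)
print("width 64:", kurt_wide)
```

With a single hidden neuron the output distribution is noticeably heavier-tailed than a Gaussian, while at width 64 the excess kurtosis should already sit close to zero. This is the central limit theorem at work, and it is the finite-width shadow of the infinite-width result described above.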
However, while it’s possible to interpret machine learning models through the lens of a Gaussian process, doing so requires deriving the infinite-width limit of a model. This is a complex series of calculations that must be done separately for each architecture. To make these calculations easier and quicker, the Google research team designed Neural Tangents. Neural Tangents enables a data scientist to train multiple infinite-width networks at one time with just a few lines of code. Multiple neural networks are often trained on the same dataset and their predictions averaged, in order to get a more robust prediction that is less sensitive to the problems that might occur in any individual model. Such a technique is called ensemble learning. One of the drawbacks of ensemble learning is that it is often computationally expensive. Yet when an infinitely wide network is trained, the ensemble is described by a Gaussian process, so its mean and variance can be calculated directly.
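To show what calculating the mean and variance of a Gaussian process looks like in practice, here is a minimal pure-Python sketch of Gaussian process regression. It does not use Neural Tangents and is not the library's API: the squared-exponential kernel stands in for the kernel that would be derived from a network architecture, and the helper names (rbf_kernel, solve, gp_posterior), the toy data, and the noise value are assumptions made for illustration.

```python
import math


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel, a stand-in for a network-derived kernel."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    """Closed-form posterior mean and variance at a single test input."""
    k_train = [[rbf_kernel(a, b) + (noise if i == j else 0.0)
                for j, b in enumerate(x_train)]
               for i, a in enumerate(x_train)]
    k_star = [rbf_kernel(x_test, a) for a in x_train]
    alpha = solve(k_train, y_train)  # K^-1 y
    beta = solve(k_train, k_star)    # K^-1 k_star
    mean = sum(k * a for k, a in zip(k_star, alpha))
    variance = rbf_kernel(x_test, x_test) - sum(k * b for k, b in zip(k_star, beta))
    return mean, variance


x_train = [-1.0, 0.0, 1.0]
y_train = [math.sin(x) for x in x_train]

mean_seen, var_seen = gp_posterior(x_train, y_train, 0.0)  # at a training point
mean_far, var_far = gp_posterior(x_train, y_train, 5.0)    # far from the data
```

No model is trained in a loop and no ensemble members are sampled: the mean and variance come from solving two small linear systems. At a training input the variance collapses to nearly zero, while far from the data it returns to the prior variance, which is the kind of uncertainty estimate an ensemble would otherwise have to approximate by training many networks.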
Three different infinite-width neural network architectures were compared as a test, and the results of the comparison were published in the blog post. In general, ensembles of infinite-width networks described by Gaussian processes perform similarly to regular, finite neural networks.
As the research team explains in the blog post:
“We see that, mimicking finite neural networks, infinite-width networks follow a similar hierarchy of performance with fully-connected networks performing worse than convolutional networks, which in turn perform worse than wide residual networks. However, unlike regular training, the learning dynamics of these models is completely tractable in closed-form, which allows [new] insight into their behavior.”
The release of Neural Tangents seems timed to coincide with the TensorFlow Dev Summit, which brings together machine learning engineers who use Google’s TensorFlow platform. The Neural Tangents announcement also comes not long after TensorFlow Quantum was announced.
Neural Tangents is available on GitHub, along with a Google Colaboratory notebook and tutorial for those interested.