A new study led by researchers at Linköping University demonstrates how an artificial neural network (ANN) can reveal patterns in large amounts of gene expression data and lead to the discovery of groups of disease-related genes. The study was published in Nature Communications, and the scientists hope the method can be applied in precision medicine and individualized treatment.
Scientists are currently developing maps of biological networks based on how different proteins or genes interact with each other. The new study uses artificial intelligence (AI) to find out whether such biological networks can be discovered through deep learning. Artificial neural networks, which are trained on experimental data in the process of deep learning, are able to find patterns within massive amounts of complex data. Because of this, they are often used in applications such as image recognition. Despite its seemingly enormous potential, this machine learning method has so far seen only limited use in biological research.
Sanjiv Dwivedi is a postdoc in the Department of Physics, Chemistry and Biology (IFM) at Linköping University.
“We have for the first time used deep learning to find disease-related genes. This is a very powerful method in the analysis of huge amounts of biological information, or ‘big data’,” says Dwivedi.
The scientists relied on a large database containing the expression patterns of 20,000 genes in a large number of people. The artificial neural network was not told which gene expression patterns came from people with diseases and which came from healthy individuals. The AI model was then trained to find patterns of gene expression.
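To make the idea of training without labels concrete, here is a minimal sketch in Python using NumPy. It assumes an autoencoder, a network trained only to reconstruct its own input, which is one common way to learn from unclassified data; the article does not specify the architecture the researchers used. The data are synthetic, and the function names, layer size and learning rate are hypothetical choices made for illustration.

```python
import numpy as np

def make_synthetic_expression(n_samples=200, n_genes=50, n_factors=3, seed=0):
    """Synthetic stand-in for an expression matrix (samples x genes).
    Assumption: a few hidden factors drive many genes at once."""
    rng = np.random.default_rng(seed)
    factors = rng.normal(size=(n_samples, n_factors))
    loadings = rng.normal(size=(n_factors, n_genes))
    X = factors @ loadings + 0.1 * rng.normal(size=(n_samples, n_genes))
    # Standardize each gene to zero mean and unit variance.
    return (X - X.mean(axis=0)) / X.std(axis=0)

def encode(X, params):
    """Hidden-layer activations for each sample."""
    W1, b1, _, _ = params
    return np.tanh(X @ W1 + b1)

def train_autoencoder(X, n_hidden=8, lr=0.1, epochs=500, seed=0):
    """Train a one-hidden-layer autoencoder by gradient descent.
    No disease/healthy labels are used: the target is the input itself."""
    rng = np.random.default_rng(seed)
    n_genes = X.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_genes, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_genes))
    b2 = np.zeros(n_genes)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)           # hidden representation
        err = (H @ W2 + b2) - X            # reconstruction error
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared reconstruction error.
        g_out = 2.0 * err / err.size
        g_hid = (g_out @ W2.T) * (1.0 - H ** 2)
        W2 -= lr * (H.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_hid)
        b1 -= lr * g_hid.sum(axis=0)
    return (W1, b1, W2, b2), losses

if __name__ == "__main__":
    X = make_synthetic_expression()
    params, losses = train_autoencoder(X)
    print("reconstruction error, first vs last epoch:", losses[0], losses[-1])
    print("hidden representation shape:", encode(X, params).shape)
```

The hidden-layer activations returned by `encode` are the kind of internal representation one could then compare against known biological groupings, although a real analysis would need real expression data and a much larger model.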
One of the mysteries surrounding machine learning is that it is currently not possible to see exactly how an artificial neural network arrives at its final result. The information that goes in and the information that comes out are visible, but everything in between consists of several layers of mathematically processed information. These inner workings of an artificial neural network cannot yet be deciphered. The scientists wanted to know whether there were any similarities between the structure of the neural network and the familiar biological networks.
Mika Gustafsson is a senior lecturer at IFM and led the study.
“When we analysed our neural network, it turned out that the first hidden layer represented to a large extent interactions between various proteins. Deeper in the model, in contrast, on the third level, we found groups of different cell types. It’s extremely interesting that this type of biologically relevant grouping is automatically produced, given that our network has started from unclassified gene expression data,” says Gustafsson.
The scientists then investigated whether their model of gene expression could be used to determine which gene expression patterns are associated with disease and which are normal. They confirmed that the model finds relevant patterns that agree with biological mechanisms in the body. Because it was trained on unclassified data, the artificial neural network may also have found entirely new patterns. The researchers will now investigate whether these previously unknown patterns are biologically relevant.
“We believe that the key to progress in the field is to understand the neural network. This can teach us new things about biological contexts, such as diseases in which many factors interact. And we believe that our method gives models that are easier to generalise and that can be used for many different types of biological information,” says Gustafsson.
Through collaborations with medical researchers, Gustafsson hopes to apply the method in precision medicine. This could help determine which specific treatments individual patients should receive.
The study was financially supported by the Swedish Foundation for Strategic Research (SSF) and the Swedish Research Council.