A team of scientists at the University of Kentucky that was able to read a scroll which, as The Guardian says, was found in the holy ark of a synagogue in En-Gedi in Israel and contained text from the biblical book of Leviticus, is now involved in an even harder and more complex task – reading the carbonised scrolls left after the eruption of Mount Vesuvius in AD 79 in the Italian town of Herculaneum.
While the team led by Prof Brent Seales was able to read the parchment found in a synagogue in En-Gedi, Israel, with ‘just’ high-energy x-rays, this time around, due to the manner in which the Herculaneum scrolls were made and written, they will have to use machine learning to try to solve the mysteries hidden in these scrolls.
They will test their process on two unopened scrolls that belong to the Institut de France in Paris and are part of a collection of about 1,800 scrolls that was first discovered in 1752 during excavations of Herculaneum. As The Guardian points out, they make up the only known intact library from antiquity, with the majority of the collection now preserved in a museum in Naples.
Professor Seales explained the problem his team faces – “although you can see on every flake of papyrus that there is writing, to open it up would require that papyrus to be really limber and flexible – and it is not anymore.” The problem also lies in the fact that “while the En-Gedi scroll contained a metal-based ink which shows up in x-ray data, the inks used on the Herculaneum scrolls are thought to be carbon-based, made using charcoal or soot, meaning there is no obvious contrast between the writing and the papyrus in x-ray scans.”
To resolve the problem, the team has decided to use both high-energy x-rays and artificial intelligence. The method they are using involves photographs of scroll fragments with writing visible to the naked eye. These are then used to “teach machine learning algorithms where ink is expected to be in x-ray scans of the same fragments, collected using a number of techniques.”
The team is guided by the concept that “the system will pick out and learn subtle differences between inked and blank areas in the x-ray scans, such as differences in the structure of papyrus fibers.” After the system is trained on these fragments, the idea is to apply it to the data from the intact scrolls and hopefully, that will reveal the text that is contained in the scrolls.
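The training idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Kentucky team's method: each x-ray patch is reduced to two hypothetical features (density and fiber roughness), all data is synthetic, and a plain logistic regression stands in for whatever model the team actually trains. The labels play the role of ink positions taken from photographs of the fragments.

```python
import math
import random

def make_patches(n, inked, rng):
    """Synthetic stand-in for x-ray patches: (density, fiber_roughness) pairs.
    Inked patches are ASSUMED to be slightly shifted in both features."""
    shift = 1.0 if inked else 0.0
    return [(rng.gauss(shift, 0.2), rng.gauss(shift, 0.2)) for _ in range(n)]

def make_dataset(n_per_class, rng):
    """Patches plus labels; labels play the role of ink seen in photographs."""
    xs = make_patches(n_per_class, True, rng) + make_patches(n_per_class, False, rng)
    ys = [1] * n_per_class + [0] * n_per_class
    return xs, ys

def train_logistic(xs, ys, epochs=200, lr=0.5):
    """Fit weights w and bias b by batch gradient descent on the log loss."""
    w, b, n = [0.0, 0.0], 0.0, len(xs)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for (x1, x2), y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w[0] * x1 + w[1] * x2 + b)))
            gw[0] += (p - y) * x1
            gw[1] += (p - y) * x2
            gb += p - y
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b

def predict_ink(w, b, patch):
    """True if the model scores the patch as more likely inked than blank."""
    return w[0] * patch[0] + w[1] * patch[1] + b > 0

def accuracy(w, b, xs, ys):
    """Fraction of patches whose predicted label matches the true label."""
    hits = sum(predict_ink(w, b, x) == bool(y) for x, y in zip(xs, ys))
    return hits / len(xs)

rng = random.Random(0)
train_x, train_y = make_dataset(100, rng)   # "fragments" with known ink
test_x, test_y = make_dataset(100, rng)     # stand-in for unseen scan data
w, b = train_logistic(train_x, train_y)
print(f"held-out accuracy: {accuracy(w, b, test_x, test_y):.2f}")
```

The point of the sketch is the workflow: learn from fragments where the answer is known, then apply the fitted model to data where it is not. Real scan data would be far noisier and the ink signal far subtler than in this toy setup.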
Seales added that the team has finished collecting the x-ray data and is now in the process of training the designated algorithms, which will then be applied to the scrolls in the coming months. “The first thing we are hoping to do is perfect the technology so that we can simply repeat it on all 900 scrolls that remain [unwrapped].”
Talking about the importance of possible discoveries, Dr. Dirk Obbink, a papyrologist and classicist at the University of Oxford who is also involved in the project, said that there is a possibility that the text might be in Latin. He added that “a new historical work by Seneca the Elder was discovered among the unidentified Herculaneum papyri only last year, thus showing what uncontemplated rarities remain to be discovered there.”
Researchers Develop Computer Algorithm Inspired by Mammalian Olfactory System
Researchers from Cornell University have created a computer algorithm inspired by the mammalian olfactory system. Scientists have long sought out explanations of how mammals learn and identify smells. The new algorithm provides insight into the workings of the brain, and applying it to a computer chip allows it to quickly and reliably learn patterns better than current machine learning models.
Thomas Cleland is a professor of psychology and senior author of the study titled “Rapid Learning and Robust Recall in a Neuromorphic Olfactory Circuit,” published in Nature Machine Intelligence on March 16.
“This is a result of over a decade of studying olfactory bulb circuitry in rodents and trying to figure out essentially how it works, with an eye towards things we know animals can do that our machines can’t,” Cleland said.
“We now know enough to make this work. We’ve built this computational model based on this circuitry, guided heavily by things we know about the biological systems’ connectivity and dynamics,” he continued. “Then we say, if this were so, this would work. And the interesting part is that it does work.”
Intel Computer Chip
Cleland was joined by co-author Nabil Imam, a researcher at Intel, and together they applied the algorithm to an Intel computer chip. The chip is called Loihi, and it is neuromorphic, which means it is inspired by the functions of the brain. The chip has digital circuits that mimic the way in which neurons learn and communicate.
The Loihi chip relies on parallel cores that communicate via discrete spikes, and each one of these spikes has an effect that can change depending on local activity. This requires different strategies for algorithm design than what is used in existing computer chips.
Through the use of neuromorphic computer chips, machines could work a thousand times faster than a computer’s central or graphics processing units at identifying patterns and carrying out certain tasks.
The Loihi research chip can also run certain algorithms while using around a thousand times less power than traditional methods. This is well-suited for the algorithm, which can accept input patterns from various different sensors, learn patterns quickly and sequentially, and identify each of the meaningful patterns even with strong sensory interference. The algorithm is capable of successfully identifying odors, and it can do so when the pattern is an astounding 80% different from the pattern originally learned by the computer.
“The pattern of the signal has been substantially destroyed,” Cleland said, “and yet the system is able to recover it.”
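A toy example helps show why recall can survive that much corruption. The sketch below is not the Cornell and Intel algorithm and does not use spiking neurons; it is a simple nearest-match lookup, and the sensor count, number of signal levels and odor names are all hypothetical. Up to 80% of the readings in a learned pattern are overwritten with random values, yet the untouched readings are still enough to single out the right stored pattern.

```python
import random

SENSORS = 200   # hypothetical number of sensor channels
LEVELS = 16     # hypothetical number of discrete signal levels per channel

def corrupt(pattern, fraction, levels, rng):
    """Overwrite a random `fraction` of the readings with random values."""
    noisy = list(pattern)
    for i in rng.sample(range(len(pattern)), int(fraction * len(pattern))):
        noisy[i] = rng.randrange(levels)
    return noisy

def recall(stored, probe):
    """Return the name of the stored pattern that agrees with the probe
    at the largest number of sensor positions."""
    def matches(name):
        return sum(a == b for a, b in zip(stored[name], probe))
    return max(stored, key=matches)

rng = random.Random(0)
names = ("odor_a", "odor_b", "odor_c", "odor_d", "odor_e")
stored = {name: [rng.randrange(LEVELS) for _ in range(SENSORS)] for name in names}

probe = corrupt(stored["odor_c"], 0.8, LEVELS, rng)
print(recall(stored, probe))
```

With 200 channels, the 40 untouched readings always agree with the original pattern, while an unrelated pattern agrees only by chance on roughly one reading in sixteen. The gap between those two counts is what makes recovery possible in this simplified setting.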
The Mammalian Brain
The brain of a mammal is able to identify and remember smells extremely well, and there can be thousands of olfactory receptors and complex neural networks working to analyze the patterns associated with odors. One of the things that mammals can do better than artificial intelligence systems is retain what they’ve learned, even after there is new knowledge. In deep learning approaches, the network must be presented with everything at once, since new information can affect or even destroy what the system previously learned.
“When you learn something, it permanently differentiates neurons,” Cleland said. “When you learn one odor, the interneurons are trained to respond to particular configurations, so you get that segregation at the level of interneurons. So on the machine side, we just enhance that and draw a firm line.”
Cleland spoke about how the team came up with new experimental approaches.
“When you start studying a biological process that becomes more intricate and complex than you can just simply intuit, you have to discipline your mind with a computer model,” he said. “You can’t fuzz your way through it. And that led us to a number of new experimental approaches and ideas that we wouldn’t have come up with just by eyeballing it.”
Human Genome Sequencing and Deep Learning Could Lead to a Coronavirus Vaccine – Opinion
The AI community must collaborate with geneticists to find a treatment for those deemed most at risk from coronavirus. A potential treatment could involve removing a person’s cells, editing the DNA and then injecting the cells back in, now hopefully armed with a successful immune response. This is currently being worked on for some other vaccines.
The first step would be sequencing the entire human genome from a sizeable segment of the human population.
Sequencing Human Genomes
Sequencing the first human genome cost $2.7 billion and took nearly 15 years to complete. The current cost of sequencing an entire human genome has dropped dramatically. As recently as 2015 the cost was $4000; now it is less than $1000 per person. This cost could drop a few percentage points more when economies of scale are taken into consideration.
We need to sequence the genome of two different types of patients:
- Infected with Coronavirus; but healthy
- Infected with Coronavirus; but poor immune response
It is impossible to predict which data point will be most valuable, but each sequenced genome would provide a dataset. The more data there is, the more options there are to locate DNA variations that increase a body’s resistance to the disease vector.
Nations are currently losing trillions of dollars to this outbreak, so the cost of $1000 per human genome is minor in comparison. A minimum of 1,000 volunteers for both segments of the population would arm researchers with significant volumes of big data. Should the trial increase in size by one order of magnitude, the AI would have even more training data, which could substantially improve the odds of success. The more data the better, which is why a target of 10,000 volunteers should be aimed for.
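To put those figures side by side, here is a small back-of-the-envelope calculation. It assumes the $1000 per-genome figure as an upper bound and reads the volunteer targets as applying to each of the two patient groups; that second reading is an assumption, since the text does not say whether 10,000 is a per-group or overall target.

```python
COST_PER_GENOME_USD = 1_000   # upper-bound per-genome figure quoted above
PATIENT_GROUPS = 2            # infected but healthy; infected with poor immune response

def sequencing_cost(volunteers_per_group,
                    cost_per_genome=COST_PER_GENOME_USD,
                    groups=PATIENT_GROUPS):
    """Total cost in USD of sequencing every volunteer in every group."""
    return volunteers_per_group * groups * cost_per_genome

for volunteers in (1_000, 10_000):
    print(f"{volunteers:>6,} per group: ${sequencing_cost(volunteers):,}")
```

Under these assumptions the minimum trial would cost about $2 million in sequencing and the larger one about $20 million, before any recruitment, storage or compute costs.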
While multiple functionalities of machine learning would be present, deep learning would be used to find patterns in the data. For instance, there might be an observation that certain DNA variants correspond to high immunity, while others correspond to high mortality. At a minimum we would learn which segments of the human population are more susceptible and should be quarantined.
To decipher this data an Artificial Neural Network (ANN) would be hosted in the cloud, and sequenced human genomes from around the world would be uploaded. With time being of the essence, parallel computing will reduce the time required for the ANN to work its magic.
We could even take it one step further and feed the output data sorted by the ANN into a separate system called a Recurrent Neural Network (RNN). The RNN uses reinforcement learning to identify which gene selected by the initial ANN is most successful in a simulated environment. The reinforcement learning agent would gamify the entire process of creating a simulated setting, to test which DNA changes are more effective.
A simulated environment is like a virtual game environment, something many AI companies are well positioned to take advantage of based on their previous success in designing AI algorithms to win at esports. This includes companies such as DeepMind and OpenAI.
These companies can use their underlying architecture, optimized for mastering video games, to create a simulated environment, test gene edits, and learn which edits lead to specific desired changes.
Once a gene is identified, another technology is used to make the edits.
Recently, the first ever study using CRISPR to edit DNA inside the human body was approved. This was to treat a rare type of genetic disorder that affects one of every 100,000 newborns. The condition can be caused by mutations in as many as 14 genes that play a role in the growth and operation of the retina. In this case, CRISPR sets out to carefully target DNA and to cause slight temporary damage to the DNA strand, causing the cell to repair itself. It is this restorative healing process which has the potential to restore eyesight.
While we are still waiting for results on whether this treatment will work, the precedent of having CRISPR approved for trials in the human body is transformational. Potential applications include improving a body’s immune response to specific disease vectors.
Potentially, we can manipulate the body’s natural genetic resistance to a specific disease. The diseases that could potentially be targeted are diverse, but the community should be focusing on the treatment of the new global epidemic, coronavirus, a threat that if unchecked could be a death sentence for a large percentage of our population.
While there are many potential routes to success, it will require that geneticists, epidemiologists, and machine learning specialists unify. A potential treatment option may be as described above, or may be revealed to be unimaginably different; either way, the opportunity lies in the genome sequencing of a large segment of the population.
Deep learning is the best analysis tool that humans have ever created; we need to at a minimum attempt to use it to create a vaccine.
When we take into consideration what is currently at risk with this current epidemic, these three scientific communities need to come together to work on a cure.
Allan Hanbury, Co-Founder of contextflow – Interview Series
Allan Hanbury is Professor for Data Intelligence at the TU Wien, Austria, and Faculty Member of the Complexity Science Hub, where he leads research and innovation to make sense of unstructured data. He is initiator of the Austrian ICT Lighthouse Project, Data Market Austria, which is creating a Data-Services Ecosystem in Austria. He was scientific coordinator of the EU-funded Khresmoi Integrated Project on medical and health information search and analysis, and is co-founder of contextflow, the spin-off company commercialising the radiology image search technology developed in the Khresmoi project. He also coordinated the EU-funded VISCERAL project on evaluation of algorithms on big data, and the EU-funded KConnect project on technology for analysing medical text.
contextflow is a spin-off from the Medical University of Vienna and European research project KHRESMOI. Could you tell us about the KHRESMOI project?
Sure! The goal of Khresmoi was to develop a multilingual, multimodal search and access system for biomedical information and documents, which required us to effectively automate the information extraction process, develop adaptive user interfaces and link both unstructured and semi-structured text information to images. Essentially, we wanted to make the information retrieval process for medical professionals reliable, fast, accurate and understandable.
What’s the current dataset which is powering the contextflow deep learning algorithm?
Our dataset contains approximately 8000 lung CTs. As our AI is rather flexible, we’re moving towards brain MRIs next.
Have you seen improvements with how the AI performs as the dataset has become larger?
We’re frequently asked this question, and the answer is likely not satisfying to most readers. To a certain extent, yes, the quality improves with more scans, but after a particular threshold, you don’t gain much more simply from having more. How much is enough really depends on various factors (organ, modality, disease pattern, etc), and it’s impossible to give an exact number. What’s most important is the quality of the data.
Is contextflow designed for all cases, or to simply be used for determining differential diagnosis for difficult cases?
Radiologists are really good at what they do. For the majority of cases, the findings are obvious, and external tools are unnecessary. contextflow has differentiated itself in the market by focusing on general search rather than automated diagnosis. There are a few use cases for our tools, but the main one is for helping with difficult cases where the findings aren’t immediately apparent. Here radiologists must consult various resources, and that process takes time. contextflow SEARCH (our 3D image-based search engine) aims to reduce the time it takes for radiologists to search for information during image interpretation by allowing them to search via the image itself. Because we also provide reference information helpful for differential diagnosis, training new radiologists is another promising use case.
Can you walk us through the process of how a radiologist would use the contextflow platform?
contextflow SEARCH Lung CT is completely integrated into the radiologist’s workflow (or else they would not use it). The radiologist performs their work as usual, and when they require additional information for a particular patient, they simply select a region of interest in that scan and click the contextflow icon in their workstation to open up our system in a new browser window. From there, they will receive reference cases from our database of patients with disease patterns similar to those of the patient they are currently evaluating, plus statistics and medical literature (e.g. radiopedia). They can scroll through their patient in our system normally, selecting additional regions to search for additional information and compare side-by-side with the reference cases. There are also heatmaps providing a visualization of the overall distribution of disease patterns, which helps with reporting findings as well. We really tried to put everything a radiologist needs to write a report in one place and available within seconds.
This was designed initially for lung CT scans, will contextflow be expanding to other types of scans?
Yes! We have a list of organs and modalities requested by radiologists that we are eager to add. The ultimate goal is to provide a system that covers the entire human body, regardless of organ or type of scan.
contextflow has received the support of two amazing incubator programs INiTS and i2c TU Wien. How beneficial have these programs been and what have you learned from the process?
We owe a lot of gratitude to these incubators. Both connected us with mentors, consultants and investors who challenged our business model and ultimately clarified our who/why/how. They also act very practically, providing funding and office space so that we could really focus on the work and not worry SO much about administrative topics. We truly could not have come as far as we have without their support. The Austrian startup ecosystem is still small, but there are programs out there to help bring innovative ideas to fruition.
You are also the initiator of the Austrian ICT Lighthouse Project which aims to build a sustainable Data-Services Ecosystem in Austria. Could you tell us more about this project and about your role in it?
The amount of data produced daily is exponentially growing, and its importance to most industries is also exploding…it’s really one of the world’s most important resources! Data Market Austria’s Lighthouse project aims to develop or reform requirements for successful data-driven businesses, ensuring low cost, high quality and interoperability. I coordinated the project for the first year in 2016-2017. This led to the creation of the Data Intelligence Offensive where I am on the board of directors. The DIO’s mission is to exchange information and know-how between members regarding data management and security.
Is there anything else that you would like to share with our readers about contextflow?
Radiology workflows are not on the average citizen’s mind, and that’s how it should be. The system should just work. Unfortunately, once you become a patient, you realize that is not always the case. contextflow is working to transform that process for both radiologists and patients. You can expect a lot of exciting developments from us in the coming years, so stay tuned!
Please visit contextflow to learn more.