Amazon’s annual re:Invent conference in Las Vegas began this week with three major AI announcements. The company presented the public with Transcribe Medical, SageMaker Operators for Kubernetes, and DeepComposer.
What is being called the biggest announcement of the three, Transcribe Medical is the newest addition to the company’s Transcribe speech recognition service. It will transcribe medical speech for primary care. The program can handle medical terminology as well as standard conversational speech.
According to the company, Transcribe Medical can be used across thousands of healthcare facilities, and it will help medical professionals take notes and record other important information. It offers an API and can be used with most smart devices containing a microphone. Once the program has processed the audio, it returns text in real time.
Transcribe Medical is currently being used by SoundLines and Amgen.
Vadim Khazan is the president of technology at SoundLines.
“For the 3,500 health care partners relying on our care team optimisation strategies for the past 15 years, we’ve significantly decreased the time and effort required to get to insightful data,” he said in a statement.
DeepComposer is an AI-enabled piano keyboard that will allow AWS customers to use AI and a MIDI controller to compose music. Amazon is calling the new technology the “world’s first” machine learning-enabled musical keyboard. It is a 32-key, two-octave keyboard.
Composers who use the program can choose whether to record a short musical tune or use a prerecorded one. They then select a model for their desired genre and set the model’s architecture parameters. They can also set the loss function, which measures the difference between the algorithm’s output and the expected value. The composer can also choose hyperparameters and a validation sample. DeepComposer then creates a composition that can be played in the AWS console, exported, or shared on SoundCloud.
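To make the idea of a loss function concrete, here is a minimal sketch of one common choice, mean squared error. It is an illustration only: the article does not say which loss functions DeepComposer offers, and the note values below are made up.

```python
def mean_squared_error(predicted, expected):
    """Average of the squared differences between model output and target values."""
    if len(predicted) != len(expected):
        raise ValueError("sequences must have the same length")
    if not predicted:
        raise ValueError("sequences must not be empty")
    return sum((p - e) ** 2 for p, e in zip(predicted, expected)) / len(predicted)


# Hypothetical example: MIDI pitch numbers a model produced vs. the target melody.
generated = [60, 62, 64, 65]
target = [60, 62, 65, 67]
loss = mean_squared_error(generated, target)  # (0 + 0 + 1 + 4) / 4 = 1.25
```

The closer the generated notes are to the target, the smaller the loss; a perfect match gives zero.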
DeepComposer uses a generative adversarial network (GAN) to fill in compositional gaps in songs. A generator component takes random data and uses it to create samples, which are forwarded to a discriminator component. The discriminator tries to separate the real samples from the fake ones, and the generator improves along with the discriminator. Over time, the generator gets progressively better at creating samples that are as close to the genuine ones as possible.
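The tug-of-war between generator and discriminator can be sketched with the two loss terms commonly used to train GANs. This is the textbook formulation, assumed here for illustration; the article does not describe DeepComposer’s actual training objective, and the scores below are invented.

```python
import math


def discriminator_loss(real_scores, fake_scores):
    """Binary cross-entropy: low when real samples score near 1 and fakes near 0.

    Scores are the discriminator's probabilities and must lie strictly in (0, 1).
    """
    real_term = -sum(math.log(s) for s in real_scores) / len(real_scores)
    fake_term = -sum(math.log(1.0 - s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


def generator_loss(fake_scores):
    """Non-saturating generator loss: low when the discriminator scores fakes near 1."""
    return -sum(math.log(s) for s in fake_scores) / len(fake_scores)


# Made-up discriminator outputs early and late in training.
real = [0.9, 0.8]
early_fakes = [0.1, 0.2]   # fakes are easy to spot
late_fakes = [0.6, 0.7]    # fakes have become more convincing

early_g, late_g = generator_loss(early_fakes), generator_loss(late_fakes)
early_d, late_d = discriminator_loss(real, early_fakes), discriminator_loss(real, late_fakes)
```

As the fakes become more convincing, the generator’s loss falls while the discriminator’s loss rises, which is the dynamic the paragraph above describes.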
SageMaker Operators for Kubernetes
AWS also launched Amazon SageMaker Operators for Kubernetes, which allows data scientists to train, tune, and deploy AI models in Amazon’s SageMaker machine learning development platform. AWS customers can install SageMaker Operators on Kubernetes clusters and then create Amazon SageMaker jobs natively using the Kubernetes API and command-line Kubernetes tools.
Aditya Bindal is the AWS Deep Learning senior product manager.
“Now with Amazon SageMaker Operators for Kubernetes, customers can continue to enjoy the portability and standardization benefits of Kubernetes … along with integrating the many additional benefits that come out-of-the-box with Amazon SageMaker, no custom code required,” Bindal wrote in a press release.
Kubernetes is an open-source general-purpose container orchestration system that is used to deploy and manage containerized applications. This is often done via a managed service like Amazon Elastic Kubernetes Service (EKS). With Kubernetes, scientists and developers are able to gain greater control over their training and inference workloads.
Researchers Develop Computer Algorithm Inspired by Mammalian Olfactory System
Researchers from Cornell University have created a computer algorithm inspired by the mammalian olfactory system. Scientists have long sought explanations of how mammals learn and identify smells. The new algorithm provides insight into the workings of the brain, and when it is applied to a computer chip, the chip learns patterns more quickly and reliably than current machine learning models.
Thomas Cleland is a professor of psychology and senior author of the study titled “Rapid Learning and Robust Recall in a Neuromorphic Olfactory Circuit,” published in Nature Machine Intelligence on March 16.
“This is a result of over a decade of studying olfactory bulb circuitry in rodents and trying to figure out essentially how it works, with an eye towards things we know animals can do that our machines can’t,” Cleland said.
“We now know enough to make this work. We’ve built this computational model based on this circuitry, guided heavily by things we know about the biological systems’ connectivity and dynamics,” he continued. “Then we say, if this were so, this would work. And the interesting part is that it does work.”
Intel Computer Chip
Cleland was joined by co-author Nabil Imam, a researcher at Intel, and together they applied the algorithm to an Intel computer chip. The chip is called Loihi, and it is neuromorphic, which means it is inspired by the functions of the brain. The chip has digital circuits that mimic the way in which neurons learn and communicate.
The Loihi chip relies on parallel cores that communicate via discrete spikes, and each one of these spikes has an effect that can change depending on local activity. This requires different strategies for algorithm design than what is used in existing computer chips.
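Communication by discrete spikes can be illustrated with a leaky integrate-and-fire neuron, a standard textbook model. This is a minimal sketch under that assumption; it is not Loihi’s actual neuron model, and the threshold, leak, and input values are arbitrary.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, reset=0.0):
    """Simulate one leaky integrate-and-fire neuron; return a 0/1 spike per time step."""
    potential = reset
    spikes = []
    for current in input_current:
        # The membrane potential decays (leaks) and then integrates the new input.
        potential = leak * potential + current
        if potential >= threshold:
            spikes.append(1)       # emit a discrete spike
            potential = reset      # and start over
        else:
            spikes.append(0)
    return spikes


weak = simulate_lif([0.05] * 20)    # input too weak to ever reach the threshold
strong = simulate_lif([0.4] * 20)   # input strong enough to fire periodically
```

The neuron stays silent for weak input and fires at regular intervals for strong input, so information is carried by when spikes occur rather than by continuous values.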
Through the use of neuromorphic computer chips, machines could work a thousand times faster than a computer’s central or graphics processing units at identifying patterns and carrying out certain tasks.
The Loihi research chip can also run certain algorithms while using around a thousand times less power than traditional methods. This makes it well suited to the algorithm, which can accept input patterns from a variety of sensors, learn patterns quickly and sequentially, and identify each of the meaningful patterns even with strong sensory interference. The algorithm can successfully identify odors even when the pattern is 80% different from the pattern originally learned by the computer.
“The pattern of the signal has been substantially destroyed,” Cleland said, “and yet the system is able to recover it.”
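Why a heavily corrupted pattern can still be recovered is easier to see with a toy example. The sketch below uses a plain best-match lookup, not the researchers’ neuromorphic algorithm, and the sensor counts and value ranges are made up.

```python
import random


def corrupt(pattern, fraction, rng, levels):
    """Replace the given fraction of entries with random sensor readings."""
    noisy = list(pattern)
    for i in rng.sample(range(len(pattern)), int(fraction * len(pattern))):
        noisy[i] = rng.randrange(levels)
    return noisy


def recall(noisy, stored):
    """Return the label of the stored pattern that shares the most entries with the input."""
    def matches(label):
        return sum(a == b for a, b in zip(noisy, stored[label]))
    return max(stored, key=matches)


rng = random.Random(0)
SENSORS, LEVELS = 100, 100  # hypothetical: 100 sensors, each reading one of 100 levels
stored = {
    f"odor_{k}": [rng.randrange(LEVELS) for _ in range(SENSORS)]
    for k in range(10)
}

# Overwrite 80% of one learned pattern with noise, then try to identify it.
noisy = corrupt(stored["odor_3"], 0.8, rng, LEVELS)
recalled = recall(noisy, stored)
```

The roughly 20 intact readings still agree with the original pattern, while an unrelated stored pattern agrees with the input only by chance, so the right odor stands out despite the damage.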
The Mammalian Brain
The brain of a mammal is able to identify and remember smells extremely well, and there can be thousands of olfactory receptors and complex neural networks working to analyze the patterns associated with odors. One of the things that mammals can do better than artificial intelligence systems is retain what they’ve learned, even after there is new knowledge. In deep learning approaches, the network must be presented with everything at once, since new information can affect or even destroy what the system previously learned.
“When you learn something, it permanently differentiates neurons,” Cleland said. “When you learn one odor, the interneurons are trained to respond to particular configurations, so you get that segregation at the level of interneurons. So on the machine side, we just enhance that and draw a firm line.”
Cleland spoke about how the team came up with new experimental approaches.
“When you start studying a biological process that becomes more intricate and complex than you can just simply intuit, you have to discipline your mind with a computer model,” he said. “You can’t fuzz your way through it. And that led us to a number of new experimental approaches and ideas that we wouldn’t have come up with just by eyeballing it.”
Human Genome Sequencing and Deep Learning Could Lead to a Coronavirus Vaccine – Opinion
The AI community must collaborate with geneticists to find a treatment for those deemed most at risk of coronavirus. A potential treatment could involve removing a person’s cells, editing the DNA and then injecting the cells back in, now hopefully armed with a successful immune response. This is currently being worked on for some other vaccines.
The first step would be sequencing the entire human genome from a sizeable segment of the human population.
Sequencing Human Genomes
Sequencing the first human genome cost $2.7 billion and took nearly 15 years to complete. The cost of sequencing an entire human genome has since dropped dramatically. As recently as 2015 the cost was $4,000; now it is less than $1,000 per person. This cost could drop a few percentage points more when economies of scale are taken into consideration.
We need to sequence the genome of two different types of patients:
- Infected with Coronavirus; but healthy
- Infected with Coronavirus; but poor immune response
It is impossible to predict which data point will be most valuable, but each sequenced genome would provide a dataset. The more data there is, the more options there are to locate DNA variations which increase a body’s resistance to the disease vector.
Nations are currently losing trillions of dollars to this outbreak; the cost of $1,000 per human genome is minor in comparison. A minimum of 1,000 volunteers for each of the two segments of the population would arm researchers with significant volumes of big data. Should the trial increase in size by one order of magnitude, the AI would have even more training data, which would increase the odds of success by several orders of magnitude. The more data the better, which is why a target of 10,000 volunteers should be aimed for.
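A quick back-of-the-envelope calculation puts those numbers in perspective. The sketch assumes the $1,000 upper bound per genome and reads the 10,000 target as volunteers per group; both are assumptions for illustration, not a budget.

```python
COST_PER_GENOME_USD = 1_000  # upper bound quoted in the article
PATIENT_GROUPS = 2           # healthy responders and poor responders


def sequencing_cost(volunteers_per_group,
                    cost_per_genome=COST_PER_GENOME_USD,
                    groups=PATIENT_GROUPS):
    """Total sequencing cost in US dollars for a trial of the given size."""
    return volunteers_per_group * groups * cost_per_genome


minimum_trial = sequencing_cost(1_000)    # 1,000 volunteers per group
target_trial = sequencing_cost(10_000)    # assumed: 10,000 volunteers per group

print(f"Minimum trial: ${minimum_trial:,}")
print(f"Target trial:  ${target_trial:,}")
```

Under these assumptions the minimum trial costs about $2 million and the larger one about $20 million in sequencing alone.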
While multiple functionalities of machine learning would be present, deep learning would be used to find patterns in the data. For instance, there might be an observation that certain DNA variables correspond to a high immunity, while others correspond to a high mortality. At a minimum we would learn which segments of the human population are more susceptible and should be quarantined.
To decipher this data, an Artificial Neural Network (ANN) would be located on the cloud, and sequenced human genomes from around the world would be uploaded. With time being of the essence, parallel computing will reduce the time required for the ANN to work its magic.
We could even take it one step further and feed the output data sorted by the ANN into a separate system called a Recurrent Neural Network (RNN). The RNN uses reinforcement learning to identify which gene selected by the initial ANN is most successful in a simulated environment. The reinforcement learning agent would gamify the entire process of creating a simulated setting to test which DNA changes are more effective.
A simulated environment is like a virtual game environment, something many AI companies are well positioned to take advantage of based on their previous success in designing AI algorithms to win at esports. This includes companies such as DeepMind and OpenAI.
These companies can use their underlying architecture, optimized for mastering video games, to create a simulated environment, test gene edits, and learn which edits lead to specific desired changes.
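The trial-and-error loop at the heart of reinforcement learning can be sketched with one of its simplest forms, an epsilon-greedy bandit. Everything here is hypothetical: the edit names and success rates are invented stand-ins for a simulator, and a real system would be far more involved.

```python
import random


def epsilon_greedy(success_rates, trials, epsilon, rng):
    """Estimate each option's success rate by trial and error; return the estimates."""
    names = list(success_rates)
    pulls = {name: 0 for name in names}
    estimates = {name: 0.0 for name in names}
    for _ in range(trials):
        if rng.random() < epsilon:
            choice = rng.choice(names)                           # explore a random option
        else:
            choice = max(names, key=lambda n: estimates[n])      # exploit the best so far
        # The "simulated environment": success with a hidden, made-up probability.
        reward = 1.0 if rng.random() < success_rates[choice] else 0.0
        pulls[choice] += 1
        # Incremental update of the running average reward for this option.
        estimates[choice] += (reward - estimates[choice]) / pulls[choice]
    return estimates


rng = random.Random(42)
simulated_rates = {"edit_A": 0.1, "edit_B": 0.4, "edit_C": 0.9}  # invented values
estimates = epsilon_greedy(simulated_rates, trials=10_000, epsilon=0.1, rng=rng)
best = max(estimates, key=estimates.get)
```

The agent never sees the hidden success rates; it learns which option works best purely from the outcomes of repeated simulated trials.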
Once a gene is identified, another technology is used to make the edits.
Recently, the first ever study using CRISPR to edit DNA inside the human body was approved. This was to treat a rare type of genetic disorder that affects one of every 100,000 newborns. The condition can be caused by mutations in as many as 14 genes that play a role in the growth and operation of the retina. In this case, CRISPR sets out to carefully target DNA and to cause slight temporary damage to the DNA strand, causing the cell to repair itself. It is this restorative healing process which has the potential to restore eyesight.
While we are still waiting for results on whether this treatment will work, the precedent of having CRISPR approved for trials in the human body is transformational. Potential applications include improving a body’s immune response to specific disease vectors.
Potentially, we can manipulate the body’s natural genetic resistance to a specific disease. The diseases that could potentially be targeted are diverse, but the community should be focusing on the treatment of the new global epidemic, coronavirus, a threat that, if unchecked, could be a death sentence for a large percentage of our population.
While there are many potential routes to success, it will require that geneticists, epidemiologists, and machine learning specialists unify. A potential treatment option may be as described above, or may turn out to be unimaginably different; the opportunity lies in the genome sequencing of a large segment of the population.
Deep learning is the best analysis tool that humans have ever created; we need to at a minimum attempt to use it to create a vaccine.
When we take into consideration what is currently at risk with this current epidemic, these three scientific communities need to come together to work on a cure.
Allan Hanbury, Co-Founder of contextflow – Interview Series
Allan Hanbury is Professor for Data Intelligence at the TU Wien, Austria, and Faculty Member of the Complexity Science Hub, where he leads research and innovation to make sense of unstructured data. He is initiator of the Austrian ICT Lighthouse Project, Data Market Austria, which is creating a Data-Services Ecosystem in Austria. He was scientific coordinator of the EU-funded Khresmoi Integrated Project on medical and health information search and analysis, and is co-founder of contextflow, the spin-off company commercialising the radiology image search technology developed in the Khresmoi project. He also coordinated the EU-funded VISCERAL project on evaluation of algorithms on big data, and the EU-funded KConnect project on technology for analysing medical text.
contextflow is a spin-off from the Medical University of Vienna and European research project KHRESMOI. Could you tell us about the KHRESMOI project?
Sure! The goal of Khresmoi was to develop a multilingual, multimodal search and access system for biomedical information and documents, which required us to effectively automate the information extraction process, develop adaptive user interfaces and link both unstructured and semi-structured text information to images. Essentially, we wanted to make the information retrieval process for medical professionals reliable, fast, accurate and understandable.
What’s the current dataset which is powering the contextflow deep learning algorithm?
Our dataset contains approximately 8000 lung CTs. As our AI is rather flexible, we’re moving towards brain MRIs next.
Have you seen improvements with how the AI performs as the dataset has become larger?
We’re frequently asked this question, and the answer is likely not satisfying to most readers. To a certain extent, yes, the quality improves with more scans, but after a particular threshold, you don’t gain much more simply from having more. How much is enough really depends on various factors (organ, modality, disease pattern, etc), and it’s impossible to give an exact number. What’s most important is the quality of the data.
Is contextflow designed for all cases, or to simply be used for determining differential diagnosis for difficult cases?
Radiologists are really good at what they do. For the majority of cases, the findings are obvious, and external tools are unnecessary. contextflow has differentiated itself in the market by focusing on general search rather than automated diagnosis. There are a few use cases for our tools, but the main one is for helping with difficult cases where the findings aren’t immediately apparent. Here radiologists must consult various resources, and that process takes time. contextflow SEARCH (our 3D image-based search engine) aims to reduce the time it takes for radiologists to search for information during image interpretation by allowing them to search via the image itself. Because we also provide reference information helpful for differential diagnosis, training new radiologists is another promising use case.
Can you walk us through the process of how a radiologist would use the contextflow platform?
contextflow SEARCH Lung CT is completely integrated into the radiologist’s workflow (or else they would not use it). The radiologist performs their work as usual, and when they require additional information for a particular patient, they simply select a region of interest in that scan and click the contextflow icon in their workstation to open up our system in a new browser window. From there, they will receive reference cases from our database of patients with disease patterns similar to those of the patient they are currently evaluating, plus statistics and medical literature (e.g. Radiopaedia). They can scroll through their patient in our system normally, selecting additional regions to search for additional information and compare side-by-side with the reference cases. There are also heatmaps providing a visualization of the overall distribution of disease patterns, which helps with reporting findings as well. We really tried to put everything a radiologist needs to write a report in one place and available within seconds.
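For readers curious how searching "via the image itself" can work in principle, here is a minimal sketch of similarity search over feature vectors. It is a generic illustration and an assumption on our part; the interview does not describe contextflow’s actual method, and the case names and feature values are invented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query, reference_cases, top_k=2):
    """Return the ids of the top_k reference cases most similar to the query."""
    ranked = sorted(
        reference_cases,
        key=lambda case: cosine_similarity(query, case[1]),
        reverse=True,
    )
    return [case_id for case_id, _ in ranked[:top_k]]


# Hypothetical feature vectors, e.g. produced by a model from image regions.
reference_cases = [
    ("case_A", [0.9, 0.1, 0.0]),
    ("case_B", [0.1, 0.8, 0.1]),
    ("case_C", [0.8, 0.2, 0.1]),
]
region_of_interest = [0.9, 0.1, 0.02]
results = most_similar(region_of_interest, reference_cases)
```

The selected region is turned into a vector, and the reference cases whose vectors point in the most similar direction are returned first.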
This was designed initially for lung CT scans, will contextflow be expanding to other types of scans?
Yes! We have a list of organs and modalities requested by radiologists that we are eager to add. The ultimate goal is to provide a system that covers the entire human body, regardless of organ or type of scan.
contextflow has received the support of two amazing incubator programs INiTS and i2c TU Wien. How beneficial have these programs been and what have you learned from the process?
We owe a lot of gratitude to these incubators. Both connected us with mentors, consultants and investors who challenged our business model and ultimately clarified our who/why/how. They also act very practically, providing funding and office space so that we could really focus on the work and not worry SO much about administrative topics. We truly could not have come as far as we have without their support. The Austrian startup ecosystem is still small, but there are programs out there to help bring innovative ideas to fruition.
You are also the initiator of the Austrian ICT Lighthouse Project which aims to build a sustainable Data-Services Ecosystem in Austria. Could you tell us more about this project and about your role in it?
The amount of data produced daily is exponentially growing, and its importance to most industries is also exploding…it’s really one of the world’s most important resources! Data Market Austria’s Lighthouse project aims to develop or reform requirements for successful data-driven businesses, ensuring low cost, high quality and interoperability. I coordinated the project for the first year in 2016-2017. This led to the creation of the Data Intelligence Offensive where I am on the board of directors. The DIO’s mission is to exchange information and know-how between members regarding data management and security.
Is there anything else that you would like to share with our readers about contextflow?
Radiology workflows are not on the average citizen’s mind, and that’s how it should be. The system should just work. Unfortunately, once you become a patient, you realize that is not always the case. contextflow is working to transform that process for both radiologists and patients. You can expect a lot of exciting developments from us in the coming years, so stay tuned!
Please visit contextflow to learn more.