Google recently announced a new cloud platform intended to make it easier to gain insight into how an AI program renders decisions, which should make debugging easier and enhance transparency. As reported by The Register, the cloud platform is called Explainable AI, and it marks a major attempt by Google to invest in AI explainability.
Artificial neural networks are used in many, perhaps most, of the major AI systems deployed in the world today. The neural networks that run major AI applications can be extraordinarily complex and large, and as a system’s complexity grows it becomes harder and harder to intuit why the system made a particular decision. As Google explains in its white paper, as AI systems become more powerful, they also become more complex and hence harder to debug. Transparency is lost as well, which means that biased algorithms can be difficult to recognize and address.
The fact that the reasoning which drives the behavior of complex systems is so hard to interpret often has drastic consequences. In addition to making it hard to combat AI bias, it can make it extraordinarily difficult to tell spurious correlations from genuinely important and interesting correlations.
Many companies and research groups are exploring how to address the “black box” problem of AI and create a system that adequately explains why certain decisions have been made by an AI. Google’s Explainable AI platform represents its own bid to tackle this challenge. Explainable AI comprises three different tools. The first tool is a system that describes which features have been selected by an AI and displays an attribution score representing the amount of influence that a particular feature has on the final prediction. Google’s report on the tool gives an example of predicting how long a bike ride will last based on variables like rainfall, current temperature, day of the week, and start time. After the network renders the decision, feedback is given that displays which features had the most impact on the predictions.
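To make the idea of an attribution score concrete, here is a minimal sketch in Python. It is not Google's Explainable AI API: the model is a hand-written linear function, and the weights, baseline, and feature values are all invented for illustration. The feature names simply mirror the bike-ride example. For a linear model, multiplying each weight by the feature's distance from a baseline splits the change in the prediction exactly across the features.

```python
# Illustrative only: a hand-written linear model, not Google's Explainable AI API.
# Feature names follow the bike-ride example; every number here is made up.

FEATURES = ["rainfall_mm", "temperature_c", "day_of_week", "start_hour"]

# Hypothetical weights (minutes of ride time per unit of each feature) and bias.
WEIGHTS = {
    "rainfall_mm": -1.5,
    "temperature_c": 0.8,
    "day_of_week": 0.5,
    "start_hour": -0.2,
}
BIAS = 20.0


def predict(x):
    """Predicted ride duration in minutes for a feature dict x."""
    return BIAS + sum(WEIGHTS[name] * x[name] for name in FEATURES)


def attributions(x, baseline):
    """Per-feature attribution relative to a baseline input.

    For a linear model, weight * (value - baseline value) splits the
    difference predict(x) - predict(baseline) exactly across the features.
    """
    return {name: WEIGHTS[name] * (x[name] - baseline[name]) for name in FEATURES}


# Hypothetical "typical" ride used as the reference point.
baseline = {"rainfall_mm": 0.0, "temperature_c": 15.0, "day_of_week": 3.0, "start_hour": 12.0}
# Hypothetical ride we want explained.
ride = {"rainfall_mm": 4.0, "temperature_c": 25.0, "day_of_week": 6.0, "start_hour": 8.0}

scores = attributions(ride, baseline)

# List features from most to least influential, by absolute attribution.
for name, score in sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name:15s} {score:+.2f} min")
```

Real neural networks are not linear, so production tools rely on more involved attribution methods, but the output has the same shape: one signed score per feature, ranked by influence.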
How does this tool provide such feedback in the case of image data? In this case, the tool produces an overlay that highlights the regions of the image that weighted most heavily on the rendered decision.
Another tool found in the toolkit is the “What-If” tool, which displays potential fluctuations in model performance as individual attributes are manipulated. Finally, the last tool can be set up to give sample results to human reviewers on a consistent schedule.
Dr. Andrew Moore, Google’s chief scientist for AI and machine learning, described the inspiration for the project. Moore explained that around five years ago the academic community started to become concerned about the harmful byproducts of AI use and that Google wanted to ensure its systems were only being used in ethical ways. Moore described an incident where the company was trying to design a computer vision program to alert construction workers if someone wasn’t wearing a helmet, but they became concerned that the monitoring could be taken too far and become dehumanizing. Moore said there was a similar reason that Google decided not to release a general face recognition API, as the company wanted to have more control over how its technology was used and ensure it was only being used in ethical ways.
Moore also highlighted why it is so important for an AI’s decisions to be explainable:
“If you’ve got a safety critical system or a societally important thing which may have unintended consequences if you think your model’s made a mistake, you have to be able to diagnose it. We want to explain carefully what explainability can and can’t do. It’s not a panacea.”
Neural Network Makes it Easier to Identify Different Points in History
One area that is not covered as much in terms of artificial intelligence (AI) potential is how it can be used in history, anthropology, archaeology, and other similar fields. This is being demonstrated by new research that shows how machine learning can act as a tool for archaeologists to differentiate between two major periods: the Middle Stone Age (MSA) and the Later Stone Age (LSA).
This differentiation may seem like something that academia and archaeologists already have established, but that is far from the case. In many instances, it is not easy to distinguish between the two.
MSA and LSA
Around 300 thousand years ago, the first MSA toolkits appeared at the same time as the earliest fossils of Homo sapiens. Those same toolkits were used all the way up until about 30 thousand years ago. A major shift in behavior took place around 67 thousand years ago when there were changes in stone tool production, and the resulting toolkits were LSA.
LSA toolkits were still being used in the recent past, and it is now becoming clear that the shift from MSA to LSA was anything but a linear process. The changes took place throughout different times and in different places, which is why researchers are so focused on this process that can help explain cultural innovation and creativity.
The foundation of this understanding is the differentiation between MSA and LSA.
Dr. Jimbob Blinkhorn is an archaeologist from the Pan African Evolution Research Group, Max Planck Institute for the Science of Human History and the Centre for Quaternary Research, Department of Geography, Royal Holloway.
“Eastern Africa is a key region to examine this major cultural change, not only because it hosts some of the youngest MSA sites and some of the oldest LSA sites, but also because a large number of well excavated and dated sites make it ideal for research using quantitative methods,” Dr. Blinkhorn says. “This enabled us to pull together a substantial database of changing patterns to stone tool production and use, spanning 130 to 12 thousand years ago, to examine the MSA-LSA transition.”
Artificial Neural Networks (ANNs)
The study is based on 16 alternate tool types across 92 stone tool assemblages, with a focus on their presence or absence. The study emphasizes the constellations of tool forms that often occur together rather than each individual tool.
Dr. Matt Grove is an archaeologist at the University of Liverpool.
“We’ve employed an Artificial Neural Network (ANN) approach to train and test models that differentiate LSA assemblages from MSA assemblages, as well as examining chronological difference between older (130-71 thousand years ago) and younger (71-28 thousand years ago) MSA assemblages with a 94% success rate,” Dr. Grove says.
Artificial Neural Networks (ANNs) mimic certain information processing features of the human brain, and the processing power is heavily reliant on the action of many simple units acting together.
“ANNs have sometimes been described as a ‘black box’ approach, as even when they are highly successful, it may not always be clear exactly why,” Grove says. “We employed a simulation approach that breaks open this black box to understand which inputs have a significant impact on the results. This enabled us to identify how patterns of stone tool assemblage composition vary between the MSA and LSA, and we hope this demonstrates how such methods can be used more widely in archaeological research in the future.”
“The results of our study show that MSA and LSA assemblages can be differentiated based on the constellation of artifact types found within an assemblage alone,” Blinkhorn says. “The combined occurrence of backed pieces, blade and bipolar technologies together with the combined absence of core tools, Levallois flake technology, point technology and scrapers robustly identifies LSA assemblages, with the opposite pattern identifying MSA assemblages. Significantly, this provides quantified support to qualitative differences noted by earlier researchers that key typological changes do occur with this cultural transition.”
The team will now use the newly developed method to look further into cultural change in the African Stone Age.
“The approach we’ve employed offers a powerful toolkit to examine the categories we use to describe the archaeological record and to help us examine and explain cultural change amongst our ancestors,” Blinkhorn says.
Researchers Develop “DeepTrust” Tool to Help Increase AI Trustworthiness
The safety and trustworthiness of artificial intelligence (AI) is one of the most important concerns surrounding the technology. Top experts across different fields are constantly working to improve it, and it will be crucial to the full implementation of AI throughout society.
Some of that new work is coming out of the University of Southern California, where USC Viterbi Engineering researchers have developed a new tool capable of generating automatic indicators for whether or not AI algorithms are trustworthy in their data and predictions.
The research was published in Frontiers in Artificial Intelligence, titled “There is Hope After All: Quantifying Opinion and Trustworthiness in Neural Networks”. The authors of the paper include Mingxi Cheng, Shahin Nazarian, and Paul Bogdan of the USC Cyber Physical Systems Group.
Trustworthiness of Neural Networks
One of the biggest tasks in this area is getting neural networks to generate predictions that can be trusted. In many cases, this is what stops the full adoption of technology that relies on AI.
For example, self-driving vehicles are required to act independently and make accurate decisions on auto-pilot. They need to be capable of making these decisions extremely quickly, while deciphering and recognizing objects on the road. This is crucial, especially in scenarios where the technology would have to decipher the difference between a speed bump, some other object, or a living being.
Other scenarios include the self-driving vehicle deciding what to do when another vehicle faces it head-on, and the most complex decision of all is if that self-driving vehicle needs to decide between hitting what it perceives as another vehicle, some object, or a living being.
This all means we are putting an extreme amount of trust into the capability of the self-driving vehicle’s software to make the correct decision in just fractions of a second. It becomes even more difficult when there is conflicting information from different sensors, such as computer vision from cameras and Lidar.
Lead author Mingxi Cheng decided to take this project up after thinking, “Even humans can be indecisive in certain decision-making scenarios. In cases involving conflicting information, why can’t machines tell us when they don’t know?”
The tool that was created by the researchers is called DeepTrust, and it is able to quantify the amount of uncertainty, according to Paul Bogdan, an associate professor in the Ming Hsieh Department of Electrical and Computer Engineering.
The team spent nearly two years developing DeepTrust, primarily using subjective logic to assess the neural networks. In one example of the tool working, it was able to look at the 2016 presidential election polls and predict that there was a greater margin of error for Hillary Clinton winning.
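For readers unfamiliar with subjective logic, the following Python sketch shows its basic building block, the binomial opinion. It is not DeepTrust's implementation; it only illustrates the general formalism, in which an opinion splits into belief, disbelief, and uncertainty that sum to one. The mapping from evidence counts with a prior weight of 2 is the usual textbook formulation, and the class and function names are made up for this example.

```python
from dataclasses import dataclass

# Prior weight conventionally used for binomial opinions in subjective logic.
PRIOR_WEIGHT = 2.0


@dataclass(frozen=True)
class Opinion:
    """A binomial opinion: belief + disbelief + uncertainty == 1."""
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5  # prior probability used when evidence is lacking

    def projected_probability(self):
        """Point estimate: belief plus the base rate's share of the uncertainty."""
        return self.belief + self.base_rate * self.uncertainty


def opinion_from_evidence(positive, negative, base_rate=0.5):
    """Build an opinion from counts of supporting and contradicting observations."""
    total = positive + negative + PRIOR_WEIGHT
    return Opinion(
        belief=positive / total,
        disbelief=negative / total,
        uncertainty=PRIOR_WEIGHT / total,
        base_rate=base_rate,
    )


# Same 3:1 ratio of evidence, but very different amounts of it.
few = opinion_from_evidence(positive=3, negative=1)
many = opinion_from_evidence(positive=300, negative=100)

for label, op in [("few", few), ("many", many)]:
    print(f"{label:5s} belief={op.belief:.3f} disbelief={op.disbelief:.3f} "
          f"uncertainty={op.uncertainty:.3f} P={op.projected_probability():.3f}")
```

The two opinions favor the same outcome, but the one built on little evidence carries far more uncertainty, which is the kind of "I don't know" signal Cheng describes wanting from a machine.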
The DeepTrust tool also makes it easier to test the reliability of AI algorithms normally trained on up to millions of data points. The other way to do this is by independently checking each one of the data points to test accuracy, which is an extremely time consuming task.
According to the researchers, the architecture of these neural network systems is more accurate, and accuracy and trust can be maximized simultaneously.
“To our knowledge, there is no trust quantification model or tool for deep learning, artificial intelligence and machine learning. This is the first approach and opens new research directions,” Bogdan says.
Bogdan also believes that DeepTrust could help push AI forward to the point where it is “aware and adaptive.”
AI Researchers Design Program To Generate Sound Effects For Movies and Other Media
Researchers from the University of Texas San Antonio have created an AI-based application capable of observing the actions taking place in a video and creating artificial sound effects to match those actions. The sound effects generated by the program are reportedly so realistic that when human observers were polled, they typically thought the sound effects were legitimate.
The program responsible for generating the sound effects, AutoFoley, was detailed in a study recently published in IEEE Transactions on Multimedia. According to IEEE Spectrum, the AI program was developed by Jeff Prevost, professor at UT San Antonio, and Ph.D. student Sanchita Ghose. The researchers created the program by joining multiple machine learning models together.
The first task in generating sound effects appropriate to the actions on a screen was recognizing those actions and mapping them to sound effects. To accomplish this, the researchers designed two different machine learning models and tested their different approaches. The first model operates by extracting frames from the videos it is fed and analyzing these frames for relevant features like motions and colors. Afterward, a second model was employed to analyze how the position of an object changes across frames, to extract temporal information. This temporal information is used to anticipate the next likely actions in the video. The two models have different methods of analyzing the actions in the clip, but they both use the information contained in the clip to guess what sound would best accompany it.
The next task is to synthesize the sound, and this is accomplished by matching activities/predicted motions to possible sound samples. According to Ghose and Prevost, AutoFoley was used to generate sound for 1000 short clips, featuring actions and items like a fire, a running horse, ticking clocks, and rain falling on plants. While AutoFoley was most successful in creating sound for clips where there didn’t need to be a perfect match between the actions and sounds, and it had trouble matching clips where actions happened with more variation, the program was still able to fool many human observers into picking its generated sounds over the sound that originally accompanied a clip.
Prevost and Ghose recruited 57 college students and had them watch different clips. Some clips contained the original audio, some contained audio generated by AutoFoley. When the first model was tested, approximately 73% of the students selected the synthesized audio as the original audio, neglecting the true sound that accompanied the clip. The other model performed slightly worse, with only 66% of the participants selecting the generated audio over the original audio.
Prevost explained that AutoFoley could potentially be used to expedite the process of producing movies, television, and other pieces of media. Prevost notes that a realistic Foley track is important to making media engaging and believable, but that the Foley process often takes a significant amount of time to complete. Having an automated system that could handle the creation of basic Foley elements could make producing media cheaper and quicker.
Currently, AutoFoley has some notable limitations. For one, while the model seems to perform well while observing events that have stable, predictable motions, it suffers when trying to generate audio for events with variation in time (like thunderstorms). Beyond this, it also requires that the classification subject is present in the entire clip and doesn’t leave the frame. The research team is aiming to address these issues with future versions of the application.