By Balakrishna D R, Senior Vice President, Service Offering Head – Energy, Communications, Services, and AI and Automation services, at Infosys.
On January 9, 2020, the World Health Organization notified the public of the coronavirus outbreak in China. Three days prior, the US Centers for Disease Control and Prevention had gotten the word out. But a Canadian health monitoring platform had beaten them both to the punch, sending word of the outbreak to its customers as early as December 31, 2019. The platform, BlueDot, uses artificial intelligence-driven algorithms that scour foreign-language news reports, animal and plant disease networks, and official proclamations to give its clients advance warning to avoid danger zones like Wuhan.
Over the past few years, artificial intelligence has become a key source of transformation, disruption and competitive advantage in today’s fast-changing economy. From epidemic tracking to defense to healthcare to autonomous vehicles and everything in between, AI is gaining widespread adoption. PwC predicts that AI could contribute up to $15.7 trillion to the global economy by 2030, at its current growth rate.
Yet, for all the hope that AI brings, it still poses unanswered questions around transparency and trustworthiness. The need to understand, predict and trust the decision-making ability of AI systems is important particularly in areas that are critical to life, death, and personal wellness.
Into the unknown
When automated reasoning systems were first introduced to support decision-making, they relied on hand-crafted rules. While this made their behavior easy to interpret and modify, such systems were not scalable. Machine learning-based models arrived to address the latter need; they did not require hand-written rules and could be trained from data, the more the better. While deep learning models are unsurpassed in their modelling capacity and scope of applicability, the fact that these models are for the most part black boxes raises disturbing questions regarding their veracity, trustworthiness and biases, given how widely they are used.
There is currently no direct mechanism to trace the reasoning implicitly used by deep learning models. With machine learning models that have a black-box nature, the primary kind of explainability is known as post-hoc explainability, implying that the explanations are derived from the nature and properties of the outputs generated by the model. Early attempts to extract rules from neural networks (as deep learning was earlier known) are not currently pursued since the networks have become too large and diverse for tractable rule extraction. There is, therefore, an urgent need to introduce interpretability and transparency into the very fabric of AI modelling.
Exit night, enter light
This concern has created a need for transparency in machine learning, which has led to the growth of explainable AI, or XAI. It seeks to address the major issues that hinder our ability to fully trust AI decision-making — including bias and transparency. This new field of AI brings accountability to ensure that AI benefits society with better outcomes for all involved.
XAI will be critical in addressing the bias inherent to AI systems and algorithms, which are programmed by people whose backgrounds and experiences can unintentionally lead to the development of AI systems that exhibit bias. Unwanted biases, such as discrimination against a particular nationality or ethnicity, may creep in because the system assigns weight to that attribute based on real data. To illustrate, it may be found that typical loan defaulters come from a particular ethnic background; however, implementing any restrictive policy based on this may be against fair practices. Erroneous data is another cause of bias. For example, if a particular face recognition scanner is inaccurate 5% of the time because of the complexion of the person or the light falling on the face, it could introduce bias. Lastly, if your sample data isn’t a true representation of the whole population, bias is inevitable.
XAI aims to address how black box decisions of AI systems are arrived at. It inspects and tries to understand the steps and models involved in making decisions. It answers crucial questions such as: Why did the AI system make a specific prediction or decision? Why didn’t the AI system do something else? When did the AI system succeed or fail? When do AI systems give enough confidence in the decision that you can trust it, and how can the AI system correct errors?
Explainable, predictable and traceable AI
One way to gain explainability in AI systems is to use machine learning algorithms that are inherently explainable. Examples include simpler forms of machine learning such as decision trees, Bayesian classifiers, and other algorithms that have a certain amount of traceability and transparency in their decision making. They can provide the visibility needed for critical AI systems without sacrificing too much performance or accuracy.
Noticing the need to provide explainability for deep learning and other more complex algorithmic approaches, the US Defense Advanced Research Projects Agency (DARPA) is pursuing efforts to produce explainable AI solutions through a number of funded research initiatives. DARPA describes AI explainability in three parts: prediction accuracy, which means models will explain how conclusions are reached to improve future decision making; decision understanding and trust from human users and operators; and inspection and traceability of actions undertaken by the AI systems.
Traceability will empower humans to get into AI decision loops and to stop or control the system's tasks whenever the need arises. An AI system is expected not only to perform a certain task or impose decisions, but also to provide a transparent report of why it took specific decisions, with the supporting rationale.
Standardization of algorithms or even XAI approaches isn’t currently possible, but it might well be possible to standardize levels of transparency or explainability. Standards organizations are trying to arrive at a common understanding of these levels of transparency to facilitate communication between end users and technology vendors.
As governments, institutions, enterprises and the general public come to depend on AI-based systems, winning their trust through clearer transparency of the decision-making process is going to be fundamental. The launch of the first global conference exclusively dedicated to XAI, the International Joint Conference on Artificial Intelligence: Workshop on Explainable Artificial Intelligence, is further proof that the age of XAI has come.
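To make the idea of an inherently traceable model concrete, here is a minimal sketch of a hand-built decision tree that returns its prediction together with every test it applied along the way. The loan-screening features, thresholds and labels are invented for illustration and do not come from any real system.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str


@dataclass
class Node:
    feature: str
    threshold: float
    below: Union["Node", Leaf]  # branch taken when value <= threshold
    above: Union["Node", Leaf]  # branch taken when value > threshold


def predict_with_trace(tree, sample):
    """Return (label, trace), where trace lists every test that was applied."""
    trace = []
    node = tree
    while isinstance(node, Node):
        value = sample[node.feature]
        if value <= node.threshold:
            trace.append(f"{node.feature} = {value} <= {node.threshold}")
            node = node.below
        else:
            trace.append(f"{node.feature} = {value} > {node.threshold}")
            node = node.above
    return node.label, trace


# Hypothetical tree: feature names and thresholds are assumptions.
TREE = Node(
    "debt_to_income", 0.4,
    below=Node("missed_payments", 1, below=Leaf("approve"), above=Leaf("review")),
    above=Leaf("review"),
)

label, trace = predict_with_trace(TREE, {"debt_to_income": 0.25, "missed_payments": 0})
print(label)
for step in trace:
    print("  because", step)
```

Every prediction here carries its own rationale, which is the property that deep networks lack and that post-hoc methods try to approximate.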
Researchers Create AI Model Capable Of Singing In Both Chinese and English
A team of researchers from Microsoft and Zhejiang University has recently created an AI model capable of singing in multiple languages. As VentureBeat reported, the DeepSinger AI developed by the team was trained on data from various music websites, using algorithms that captured the timbre of the singer’s voice.
Generating the “voice” of an AI singer requires algorithms that are capable of predicting and controlling both the pitch and duration of audio. When people sing, the noises they produce have vastly more complex rhythms and patterns compared to simple speech. Another problem for the team to overcome was that while there is a fair amount of speaking/speech training data available, singing training data sets are fairly rare. Combine these challenges with the fact that songs need to have both sound and lyrics analyzed, and the problem of generating singing is incredibly complex.
The DeepSinger system created by the researchers overcame these challenges with a data pipeline that mined and transformed audio data. Clips of singing were extracted from various music websites, and the singing was then isolated from the rest of the audio and divided into sentences. The next step was to determine the duration of every phoneme within the lyrics, resulting in a series of samples each representing a unique phoneme in the lyrics. Finally, the lyrics and accompanying audio samples were sorted according to confidence score, and the data was cleaned to deal with any distorted training samples.
The same methods seem to work for a variety of languages. DeepSinger was trained on Chinese, Cantonese, and English vocal samples from 89 different singers, totaling over 92 hours of singing. The study found that the DeepSinger system was able to reliably generate high-quality “singing” samples according to metrics like accuracy of pitch and how natural the singing sounded. The researchers had 20 people rate both songs generated by DeepSinger and the training songs according to these metrics, and the gap between scores for the generated samples and genuine audio was quite small. The participants gave DeepSinger a mean opinion score that deviated by between 0.34 and 0.76.
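The final cleaning step can be pictured roughly as follows. This is only a sketch of the general idea of ranking aligned samples by a confidence score and discarding the weakest ones; the `AlignedSample` type, the score range and the cutoff value are assumptions, not details of the DeepSinger pipeline.

```python
from dataclasses import dataclass


@dataclass
class AlignedSample:
    sentence_id: str
    confidence: float  # hypothetical lyrics-to-audio alignment score in [0, 1]


def keep_confident(samples, min_confidence):
    """Sort by confidence (highest first) and drop samples below the cutoff."""
    ranked = sorted(samples, key=lambda s: s.confidence, reverse=True)
    return [s for s in ranked if s.confidence >= min_confidence]


samples = [
    AlignedSample("a", 0.91),
    AlignedSample("b", 0.42),  # likely a distorted or misaligned clip
    AlignedSample("c", 0.77),
]
kept = keep_confident(samples, min_confidence=0.7)
print([s.sentence_id for s in kept])
```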
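A mean opinion score is simply the average of listener ratings, and the gap reported above is the difference between the average for genuine recordings and the average for generated ones. The sketch below assumes a 1-to-5 rating scale and uses made-up ratings; neither comes from the study.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average of listener ratings (assumed here to be on a 1-5 scale)."""
    return mean(ratings)


def mos_gap(real_ratings, generated_ratings):
    """How far generated audio falls below genuine audio, in MOS points."""
    return mean_opinion_score(real_ratings) - mean_opinion_score(generated_ratings)


# Invented ratings from four hypothetical listeners.
real = [5, 4, 4, 5]
generated = [4, 4, 3, 5]
gap = mos_gap(real, generated)
print(gap)
```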
Looking forward, the researchers want to try and improve the quality of the generated voices by jointly training the various submodels that comprise DeepSinger, done with the assistance of speciality technologies like WaveNet that are designed specifically for the task of generating natural sounding speech through audio waveforms.
The DeepSinger system could be used to help singers and other musical artists make corrections to work without having to head back into the studio for another recording session. It could also potentially be used to create audio deepfakes, making it seem like an artist sang a song they never actually sang. While it could be used for parody or satire, it’s also of dubious legality.
DeepSinger is just one of a wave of new AI-based music and audio systems that could transform how music and software interact. OpenAI recently released their own AI system, dubbed JukeBox, that is capable of producing original music tracks in the style of a certain genre or even a specific artist. Other musical AI tools include Google’s Magenta and Amazon’s DeepComposer. Magenta is an open source audio (and image) manipulation library that can be used to produce everything from automated drum backing to simple music-based video games. Meanwhile, Amazon’s DeepComposer is targeted at those who want to train and customize their own music-based deep learning models, allowing the user to take pre-trained sample models and tweak the models to their needs.
You can listen to some of the audio samples generated by DeepSinger at this link.
New Study Attempts to Improve Hate Speech Detection Algorithms
Social media companies, especially Twitter, have long faced criticism for how they flag speech and decide which accounts to ban. The underlying problem almost always has to do with the algorithms that they use to monitor online posts. Artificial intelligence systems are far from perfect when it comes to this task, but there is work constantly being done to improve them.
Included in that work is a new study coming out of the University of Southern California that attempts to reduce certain errors that could result in racial bias.
Failure to Recognize Context
One of the issues that doesn’t receive as much attention has to do with algorithms that are meant to stop the spread of hateful speech but actually amplify racial bias. This happens when the algorithms fail to recognize context and end up flagging or blocking tweets from minority groups.
The biggest problem with the algorithms in regard to context is that they are oversensitive to certain group-identifying terms like “black,” “gay,” and “transgender.” The algorithms treat these terms as indicators of hate speech, but they are often used by members of those groups themselves, so the setting in which they appear is important.
In an attempt to resolve this issue of context blindness, the researchers created a more context-sensitive hate speech classifier. The new algorithm is less likely to mislabel a post as hate speech.
The researchers developed the new algorithms with two new factors in mind: the context in regard to the group identifiers, and whether there are also other features of hate speech present in the post, like dehumanizing language.
Brendan Kennedy is a computer science Ph.D. student and co-lead author of the study, which was published on July 6 at ACL 2020.
“We want to move hate speech detection closer to being ready for real-world application,” said Kennedy.
“Hate speech detection models often ‘break,’ or generate bad predictions, when introduced to real-world data, such as social media or other online text data, because they are biased by the data on which they are trained to associate the appearance of social identifying terms with hate speech.”
The reason the algorithms are oftentimes inaccurate is that they are trained on imbalanced datasets with extremely high rates of hate speech. Because of this, the algorithms fail to learn how to handle what social media actually looks like in the real world.
Professor Xiang Ren is an expert in natural language processing.
“It is key for models to not ignore identifiers, but to match them with the right context,” said Ren.
“If you teach a model from an imbalanced dataset, the model starts picking up weird patterns and blocking users inappropriately.”
To test the algorithm, the researchers used a random sample of text from two social media sites that have a high rate of hate speech. The text was first hand-flagged by humans as prejudiced or dehumanizing. The state-of-the-art model was then compared with the researchers’ own model on how often each inappropriately flagged non-hate speech, using 12,500 New York Times articles with no hate speech present. While the state-of-the-art models were able to achieve 77% accuracy in identifying hate vs non-hate, the researchers’ model was higher at 90%.
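The two measurements described here, overall accuracy and the rate of wrongly flagged non-hate text, can be sketched as below. The function names and the tiny label lists are illustrative assumptions; `True` stands for "hate speech" and `False` for "not hate speech".

```python
def accuracy(predictions, labels):
    """Share of items where the model's flag matches the human label."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def false_positive_rate(predictions, labels):
    """Share of non-hate items (label False) that the model flagged anyway."""
    flags_on_negatives = [p for p, y in zip(predictions, labels) if not y]
    if not flags_on_negatives:
        raise ValueError("no non-hate examples to evaluate")
    return sum(flags_on_negatives) / len(flags_on_negatives)


# Invented toy data, not results from the study.
labels      = [True, False, False, False, True, False]
predictions = [True, True,  False, False, False, False]

print(accuracy(predictions, labels))
print(false_positive_rate(predictions, labels))
```

On a corpus that contains no hate speech at all, such as the news articles mentioned above, every label is `False`, so accuracy is simply one minus the false positive rate.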
“This work by itself does not make hate speech detection perfect, that is a huge project that many are working on, but it makes incremental progress,” said Kennedy.
“In addition to preventing social media posts by members of protected groups from being inappropriately censored, we hope our work will help ensure that hate speech detection does not do unnecessary harm by reinforcing spurious associations of prejudice and dehumanization with social groups.”
Researchers Use AI To Investigate How Reflections Differ From Original Images
Researchers at Cornell University recently utilized machine learning systems to investigate how reflections of images are different from the original images. As reported by ScienceDaily, the algorithms created by the team of researchers found that there were telltale signs, differences from the original image, that an image had been flipped or reflected.
Associate professor of computer science at Cornell Tech, Noah Snavely, was the study’s senior author. According to Snavely, the research project started when the researchers became intrigued by how images were different in both obvious and subtle ways when they were reflected. Snavely explained that even things that appear very symmetrical at first glance can usually be distinguished as a reflection when studied. “I’m intrigued by the discoveries you can make with new ways of gleaning information,” said Snavely, according to ScienceDaily.
The researchers focused on images of people, using them to train their algorithms. This was done because faces don’t seem obviously asymmetrical. When trained on data that distinguished flipped images from original images, the AI reportedly achieved an accuracy of between 60% and 90% across various types of images.
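A training set for this kind of flip detection can be built from unlabeled images alone, because the label is created by the flipping itself. The sketch below represents an image as a plain list of pixel rows and flips each one with probability 0.5; the representation, the function names and the fixed seed are assumptions made for illustration.

```python
import random


def flip_horizontal(image):
    """Mirror an image left-to-right; an image is a list of pixel rows."""
    return [list(reversed(row)) for row in image]


def make_flip_dataset(images, seed=0):
    """Return (image, is_flipped) pairs, flipping each image with probability 0.5."""
    rng = random.Random(seed)
    dataset = []
    for image in images:
        flipped = rng.random() < 0.5
        dataset.append((flip_horizontal(image) if flipped else image, flipped))
    return dataset


# Two tiny 2x3 "images" with integer pixel values.
images = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [0, 0, 1]]]
dataset = make_flip_dataset(images)
for img, is_flipped in dataset:
    print(is_flipped, img)
```

A classifier trained on such pairs has to discover for itself which visual cues betray a mirrored image.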
Many of the visual hallmarks of a flipped image the AI learned are quite subtle and difficult for humans to discern when they look at the flipped images. In order to better interpret the features that the AI was using to distinguish between flipped and original images, the researchers created a heatmap. The heatmap showed regions of the image that the AI tended to focus on. According to the researchers, one of the most common clues the AI used to distinguish flipped images was text. This was unsurprising, and the researchers removed images containing text from their training data in order to get a better idea of the more subtle differences between flipped and original images.
After images containing text were dropped from the training set, the researchers found that the AI classifier focused on features of the images like shirt collars, cell phones, wristwatches, and faces. Some of these features have obvious, reliable patterns that the AI can home in on, such as the fact that people often carry cell phones in their right hand and that the buttons on shirt collars are often on the left. However, facial features are typically highly symmetrical, with differences being small and very hard for a human observer to detect.
The researchers created another heatmap that highlighted the areas of faces that the AI tended to focus on. The AI often used people’s eyes, hair, and beards to detect flipped images. For reasons that are unclear, people often look slightly to the left when they have photos taken of them. As for why hair and beards are indicators of flipped images, the researchers are unsure but they theorize that a person’s handedness could be revealed by the way they shave or comb. While these indicators can be unreliable, by combining multiple indicators together the researchers can achieve greater confidence and accuracy.
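One simple way to see why several weak cues together can be more convincing than any single one is to add up their log-odds, which is what a naive Bayes style combination does when the cues are treated as independent. This is a swapped-in illustration, not the method used in the study, and the cue probabilities are invented.

```python
import math


def combine_indicators(probabilities, prior=0.5):
    """Fuse per-cue probabilities that an image is flipped.

    Assumes the cues are independent and that every probability,
    including the prior, lies strictly between 0 and 1.
    """
    def logit(p):
        return math.log(p / (1 - p))

    prior_logit = logit(prior)
    # Each cue contributes its evidence relative to the prior.
    total = prior_logit + sum(logit(p) - prior_logit for p in probabilities)
    return 1 / (1 + math.exp(-total))


# Three hypothetical cues (for example gaze, hair part, beard), each only 60% sure.
combined = combine_indicators([0.6, 0.6, 0.6])
print(round(combined, 3))
```

The independence assumption rarely holds exactly for real image cues, so the combined figure should be read as an upper-end illustration rather than a calibrated probability.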
More research along these lines will need to be carried out, but if the findings are consistent and reliable then it could help researchers find more efficient ways of training machine learning algorithms. Computer vision AI is often trained using reflections of images, as it is an effective and quick way of increasing the amount of available training data. It’s possible that analyzing how the reflected images are different could help machine learning researchers gain a better understanding of the biases present in machine learning models that might cause them to inaccurately classify images.
As Snavely was quoted by ScienceDaily:
“This leads to an open question for the computer vision community, which is, when is it OK to do this flipping to augment your dataset, and when is it not OK? I’m hoping this will get people to think more about these questions and start to develop tools to understand how it’s biasing the algorithm.”