Researchers at Cornell University recently used machine learning to investigate how reflections of images differ from the originals. As reported by ScienceDaily, the algorithms created by the research team found telltale signs that an image had been flipped or reflected.
Noah Snavely, an associate professor of computer science at Cornell Tech, was the study’s senior author. According to Snavely, the project began when the researchers became intrigued by how reflected images differ from their originals in both obvious and subtle ways. Snavely explained that even things that appear very symmetrical at first glance can usually be distinguished as a reflection when studied closely. “I’m intrigued by the discoveries you can make with new ways of gleaning information,” said Snavely, according to ScienceDaily.
The researchers trained their algorithms on images of people, chosen because faces do not appear obviously asymmetrical. When trained to distinguish flipped images from originals, the AI reportedly achieved an accuracy of between 60% and 90%, depending on the type of image.
Many of the visual hallmarks of a flipped image that the AI learned are subtle and difficult for humans to discern. To better interpret the features the AI used to distinguish flipped from original images, the researchers created a heatmap showing the regions of the image the AI tended to focus on. According to the researchers, one of the most common clues the AI relied on was text. This was unsurprising, so the researchers removed images containing text from their training data to get a better look at the subtler differences between flipped and original images.
After images containing text were dropped from the training set, the researchers found that the classifier focused on features such as shirt collars, cell phones, wristwatches, and faces. Some of these features have obvious, reliable patterns the AI can home in on, such as the fact that people often hold cell phones in their right hand and that the buttons on shirt collars are usually on the left. Facial features, however, are typically highly symmetrical, with differences that are small and very hard for a human observer to detect.
The researchers created another heatmap highlighting the areas of faces the AI tended to focus on. The AI often used people’s eyes, hair, and beards to detect flipped images. For reasons that remain unclear, people often look slightly to the left when having their photo taken. As for why hair and beards indicate flipped images, the researchers are unsure, but they theorize that a person’s handedness could be revealed by the way they shave or comb. While individual indicators can be unreliable, combining multiple indicators allows the researchers to achieve greater confidence and accuracy.
More research along these lines will need to be carried out, but if the findings prove consistent and reliable, they could help researchers find more efficient ways of training machine learning algorithms. Computer vision models are often trained on reflections of images, as flipping is a quick and effective way to increase the amount of available training data. Analyzing how reflected images differ could help machine learning researchers better understand the biases this practice introduces into models, biases that might cause them to misclassify images.
As Snavely was quoted by ScienceDaily:
“This leads to an open question for the computer vision community, which is, when is it OK to do this flipping to augment your dataset, and when is it not OK? I’m hoping this will get people to think more about these questions and start to develop tools to understand how it’s biasing the algorithm.”
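The flipping Snavely refers to is horizontal-flip data augmentation, which doubles a training set by adding a mirror of every image. A minimal sketch, assuming images are represented as row-major lists of pixel rows (a real pipeline would use a library such as NumPy or torchvision, and the function names here are illustrative):

```python
def hflip(image):
    """Mirror an image left-to-right (the 'reflection' studied here)."""
    return [list(reversed(row)) for row in image]

def augment_with_flips(dataset):
    """Double a dataset by adding the mirror of every image."""
    return dataset + [hflip(img) for img in dataset]

img = [[1, 2, 3],
       [4, 5, 6]]
augmented = augment_with_flips([img])
print(len(augmented))   # 2
print(augmented[1][0])  # [3, 2, 1]
```

The study’s finding is that the two halves of such an augmented dataset are not statistically interchangeable, which is exactly the bias question Snavely raises.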
Researchers Create AI Model Capable Of Singing In Both Chinese and English
A team of researchers from Microsoft and Zhejiang University have recently created an AI model capable of singing in multiple languages. As VentureBeat reported, the DeepSinger AI developed by the team was trained on data from various music websites, using algorithms that captured the timbre of each singer’s voice.
Generating the “voice” of an AI singer requires algorithms capable of predicting and controlling both the pitch and duration of audio. When people sing, the sounds they produce have vastly more complex rhythms and patterns than ordinary speech. Another problem for the team was that while a fair amount of speech training data is available, singing datasets are fairly rare. Combine these challenges with the fact that songs require both their audio and lyrics to be analyzed, and generating singing becomes an incredibly complex problem.
The DeepSinger system overcame these challenges with a data pipeline that mined and transformed audio data. Clips of singing were extracted from various music websites, then the singing was isolated from the rest of the audio and divided into sentences. The next step was to determine the duration of every phoneme within the lyrics, producing a series of samples, each representing a unique phoneme. After the lyrics and their accompanying audio samples were sorted by confidence score, the data was cleaned to remove distorted training samples.
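The final filtering step above can be sketched as a confidence cutoff over aligned samples. This is a toy illustration only; the field names and threshold are assumptions, not values from the DeepSinger paper:

```python
def filter_samples(samples, threshold=0.5):
    """Keep only samples whose alignment confidence meets the threshold,
    sorted best-first, so distorted clips never reach training."""
    kept = [s for s in samples if s["confidence"] >= threshold]
    return sorted(kept, key=lambda s: s["confidence"], reverse=True)

samples = [
    {"phoneme": "a", "duration_ms": 310, "confidence": 0.92},
    {"phoneme": "n", "duration_ms": 80,  "confidence": 0.31},  # likely distorted
    {"phoneme": "i", "duration_ms": 150, "confidence": 0.77},
]
clean = filter_samples(samples)
print([s["phoneme"] for s in clean])  # ['a', 'i']
```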
The same methods appear to work across a variety of languages. DeepSinger was trained on Chinese, Cantonese, and English vocal samples from 89 different singers, totaling over 92 hours of singing. The study found that DeepSinger could reliably generate high-quality “singing” samples according to metrics such as pitch accuracy and how natural the singing sounded. The researchers had 20 people rate both songs generated by DeepSinger and the original training songs on these metrics, and the gap between the scores was quite small: DeepSinger’s mean opinion scores deviated from those of the genuine audio by between 0.34 and 0.76.
Looking forward, the researchers want to improve the quality of the generated voices by jointly training the various submodels that comprise DeepSinger, assisted by specialized technologies like WaveNet that are designed specifically for generating natural-sounding speech from audio waveforms.
The DeepSinger system could help singers and other musical artists correct their work without heading back into the studio for another recording session. It could also potentially be used to create audio deepfakes, making it seem as though an artist sang a song they never actually did. While such uses might qualify as parody or satire, their legality is dubious.
DeepSinger is just one of a wave of new AI-based music and audio systems that could transform how music and software interact. OpenAI recently released its own AI system, dubbed Jukebox, which is capable of producing original music tracks in the style of a certain genre or even a specific artist. Other musical AI tools include Google’s Magenta and Amazon’s DeepComposer. Magenta is an open-source audio (and image) manipulation library that can be used to produce everything from automated drum backing to simple music-based video games. Meanwhile, Amazon’s DeepComposer is targeted at those who want to train and customize their own music-based deep learning models, allowing the user to take pre-trained sample models and tweak them to their needs.
You can listen to some of the audio samples generated by DeepSinger at this link.
New Study Attempts to Improve Hate Speech Detection Algorithms
Social media companies, especially Twitter, have long faced criticism for how they flag speech and decide which accounts to ban. The underlying problem almost always has to do with the algorithms that they use to monitor online posts. Artificial intelligence systems are far from perfect when it comes to this task, but there is work constantly being done to improve them.
Included in that work is a new study coming out of the University of Southern California that attempts to reduce certain errors that could result in racial bias.
Failure to Recognize Context
One of the issues that doesn’t receive as much attention has to do with algorithms that are meant to stop the spread of hateful speech but actually amplify racial bias. This happens when the algorithms fail to recognize context and end up flagging or blocking tweets from minority groups.
The biggest problem the algorithms have with context is oversensitivity to group-identifying terms like “black,” “gay,” and “transgender.” The algorithms treat these terms as markers of hate speech, but they are often used by members of those groups themselves, so the setting in which they appear matters.
In an attempt to resolve this issue of context blindness, the researchers created a more context-sensitive hate speech classifier. The new algorithm is less likely to mislabel a post as hate speech.
The researchers developed the new algorithms with two new factors in mind: the context in regard to the group identifiers, and whether there are also other features of hate speech present in the post, like dehumanizing language.
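The difference between the two approaches can be shown with a deliberately tiny rule-based sketch: an identifier alone should not trigger a flag; it should only count alongside another hateful cue such as dehumanizing language. The word lists and function names below are illustrative stand-ins, not the USC model, which is a trained neural classifier:

```python
# Toy word lists; a real system learns these signals from data.
GROUP_IDENTIFIERS = {"black", "gay", "transgender"}
DEHUMANIZING = {"vermin", "subhuman", "infest"}

def naive_flag(post):
    """The over-sensitive baseline: any group identifier triggers a flag."""
    words = set(post.lower().split())
    return bool(words & GROUP_IDENTIFIERS)

def context_aware_flag(post):
    """Flag only when an identifier co-occurs with another hateful cue."""
    words = set(post.lower().split())
    return bool(words & GROUP_IDENTIFIERS) and bool(words & DEHUMANIZING)

post = "proud to be black"
print(naive_flag(post))          # True  (a false positive)
print(context_aware_flag(post))  # False
```

The second function captures, in miniature, the two factors the researchers describe: the presence of a group identifier and the presence of other hate-speech features.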
Brendan Kennedy is a computer science Ph.D. student and co-lead author of the study, which was published on July 6 at ACL 2020.
“We want to move hate speech detection closer to being ready for real-world application,” said Kennedy.
“Hate speech detection models often ‘break,’ or generate bad predictions, when introduced to real-world data, such as social media or other online text data, because they are biased by the data on which they are trained to associate the appearance of social identifying terms with hate speech.”
The reason the algorithms are oftentimes inaccurate is that they are trained on imbalanced datasets with extremely high rates of hate speech. Because of this, the algorithms fail to learn how to handle what social media actually looks like in the real world.
Professor Xiang Ren, the study’s other lead researcher, is an expert in natural language processing.
“It is key for models to not ignore identifiers, but to match them with the right context,” said Ren.
“If you teach a model from an imbalanced dataset, the model starts picking up weird patterns and blocking users inappropriately.”
To test the algorithm, the researchers used a random sample of text from two social media sites with a high rate of hate speech. The text was first hand-flagged by humans as prejudiced or dehumanizing. The state-of-the-art model was then measured against the researchers’ own model on inappropriately flagging non-hate speech, using 12,500 New York Times articles containing no hate speech. While the state-of-the-art models achieved 77% accuracy in distinguishing hate from non-hate speech, the researchers’ model reached 90%.
“This work by itself does not make hate speech detection perfect, that is a huge project that many are working on, but it makes incremental progress,” said Kennedy.
“In addition to preventing social media posts by members of protected groups from being inappropriately censored, we hope our work will help ensure that hate speech detection does not do unnecessary harm by reinforcing spurious associations of prejudice and dehumanization with social groups.”
Researchers Believe AI Can Be Used To Help Protect People’s Privacy
Two professors of information science have recently published a piece in The Conversation, arguing that AI could help preserve people’s privacy, rectifying some of the issues that it has created.
Zhiyuan Chen and Aryya Gangopadhyay argue that artificial intelligence algorithms could be used to defend people’s privacy, counteracting some of the many privacy concerns other uses of AI have created. Chen and Gangopadhyay acknowledge that many of the AI-driven products we use for convenience wouldn’t work without access to large amounts of data, which at first glance seems at odds with attempts to preserve privacy. Furthermore, as AI spreads out into more and more industries and applications, more data will be collected and stored in databases, making breaches of those databases tempting. However, Chen and Gangopadhyay believe that when used correctly, AI can help mitigate these issues.
Chen and Gangopadhyay explain in their post that the privacy risks associated with AI come from at least two different sources. The first source is the large datasets collected to train neural network models, while the second privacy threat is the models themselves. Data can potentially “leak” from these models, with the behavior of the models giving away details about the data used to train them.
Deep neural networks are composed of multiple layers of neurons, with each layer connected to the layers around it. The individual neurons, as well as the links between them, encode bits of the training data. A model may prove too good at remembering patterns in its training data, even when it isn’t overfitting. Traces of the training data persist within the network, and malicious actors may be able to reconstruct aspects of it, as researchers at Cornell University found in one of their studies. The Cornell researchers found that facial recognition models could be exploited by attackers to reveal which images, and therefore which people, were used to train them. They also discovered that even if an attacker doesn’t have access to the original model, the attacker may still be able to probe the network and determine whether a specific person was included in the training data, simply by using models trained on highly similar data.
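The core of such a membership-inference attack is that models tend to be more confident on examples they were trained on. A toy sketch of the idea, assuming the attacker can only query the model for a confidence score; the threshold and names are stand-ins, not the Cornell attack itself:

```python
def infer_membership(confidence, threshold=0.95):
    """Guess that an input was in the training set when the model is
    suspiciously confident about it. A real attack calibrates the
    threshold using shadow models trained on similar data."""
    return confidence >= threshold

# The attacker probes the face-recognition model with candidate photos
# and reads off the returned confidence scores:
print(infer_membership(0.99))  # True  -> likely a training-set member
print(infer_membership(0.62))  # False -> likely not
```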
Some AI models are currently being used to protect against data breaches and try to ensure people’s privacy. AI models are frequently used to detect hacking attempts by recognizing the patterns of behavior that hackers use to penetrate security methods. However, hackers often change up their behavior to try and fool pattern-detecting AI.
New methods of AI training and development aim to make AI models and applications less vulnerable to hacking and security-evasion tactics. Adversarial learning trains AI models on simulated malicious or harmful inputs, making the models more robust to exploitation, hence the “adversarial” in the name. According to Chen and Gangopadhyay, their research has uncovered methods of combating malware designed to steal people’s private information. One of the methods they found most effective at resisting malware was introducing uncertainty into the model, making it more difficult for bad actors to anticipate how the model will react to any given input.
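The two ideas above can be sketched side by side: adversarial training augments each batch with deliberately perturbed copies of the inputs, and adding noise to the model’s outputs makes its behavior harder for an attacker to predict. The model, perturbation, and noise level here are toy assumptions, not the researchers’ actual methods:

```python
import random

def perturb(x, epsilon=0.1):
    """Simulate a malicious input by nudging each feature outward."""
    return [xi + epsilon * (1 if xi >= 0 else -1) for xi in x]

def adversarial_batch(batch):
    """Train on both the clean and perturbed version of every example,
    so the model learns to resist the perturbation."""
    return batch + [(perturb(x), y) for x, y in batch]

def randomized_score(score, noise=0.05, rng=random.Random(0)):
    """Add small noise to the model's output so repeated probes by an
    attacker do not reveal the exact decision boundary."""
    return score + rng.uniform(-noise, noise)

batch = [([0.5, -0.2], 1), ([0.1, 0.9], 0)]
print(len(adversarial_batch(batch)))  # 4
```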
Other methods of using AI to protect privacy include minimizing data exposure while the model is created and trained, as well as probing networks to discover their vulnerabilities. When it comes to preserving data privacy, federated learning can help protect sensitive data: it allows a model to be trained without the training data ever leaving the local devices that hold it, insulating the data, and much of the model’s parameters, from spying.
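A minimal sketch of the federated averaging scheme described above: each device computes a model update on its own data, and only the updates (never the raw data) are sent to a server, which averages them into a new global model. The “local step” here is a toy stand-in for real gradient descent:

```python
def local_update(weights, local_data, lr=0.1):
    """Toy local training step: nudge each weight toward the mean of the
    device's data. local_data never leaves this function (the device)."""
    mean = sum(local_data) / len(local_data)
    return [w + lr * (mean - w) for w in weights]

def federated_average(updates):
    """Server-side step: average the per-device weight vectors."""
    n = len(updates)
    return [sum(ws) / n for ws in zip(*updates)]

global_weights = [0.0, 0.0]
device_data = [[1.0, 2.0], [3.0], [5.0, 7.0]]  # stays on each device
updates = [local_update(global_weights, d) for d in device_data]
global_weights = federated_average(updates)
print(global_weights)
```

Note that the server only ever sees the `updates` list, which is the insulation from spying the authors describe.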
Ultimately, Chen and Gangopadhyay argue that while the proliferation of AI has created new threats to people’s privacy, AI can also help protect privacy when designed with care and consideration.