A neuroscience graduate student at Northwestern University recently created a text-based video game where the text the user reads is entirely generated by AI. The AI responsible for generating the text is based on the GPT-2 algorithm created by OpenAI earlier this year.
Many early computer games had no graphics; instead, they used a text-based interface. These text adventure games would take in user commands and deliver a series of pre-programmed responses. The user would have to use text commands to solve puzzles and advance farther in the game, a task which could prove challenging depending on the sophistication of the text parser. Early text-based adventure games had a very limited range of potential commands the game could respond to.
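To make the idea of pre-programmed responses concrete, here is a minimal sketch of such a parser in Python. The command table, the room description, and the `respond` function are all hypothetical illustrations, not code from any actual game.

```python
# Hypothetical command table: every response is written in advance.
RESPONSES = {
    ("look", None): "You are in a dim cellar. A rusty key lies on the floor.",
    ("take", "key"): "You pick up the rusty key.",
    ("open", "door"): "The door is locked.",
}


def respond(command: str) -> str:
    """Map a 'verb [noun]' command to a canned response."""
    words = command.lower().split()
    if not words:
        return "Say something."
    verb = words[0]
    # Treat the last word as the noun, so "take the key" works like "take key".
    noun = words[-1] if len(words) > 1 else None
    return RESPONSES.get((verb, noun), "I don't understand that.")
```

Anything outside the table, such as `respond("dance")`, falls through to the same fallback line, which is the limitation described above.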
As reported by ZME Science, Nathan Whitmore, a neuroscience graduate student at Northwestern University, has revitalized this game concept, using AI algorithms to generate responses in real time rather than relying on pre-programmed responses. Whitmore was apparently inspired to create the project by the Mind Game that appeared in the sci-fi novel Ender’s Game, which responded to user input and reshaped the game world around the user.
The algorithm that drives the text-based adventure game is the GPT-2 algorithm, which was created by OpenAI. The predictive text algorithm was trained on a text dataset, dubbed WebText, which was over 40 GB in size and pulled from Reddit links. The result was an extremely effective predictive text algorithm that could generate shockingly realistic and natural-sounding paragraphs, achieving state-of-the-art performance on a number of different language tests. The OpenAI algorithm was apparently so effective at generating fake news stories that OpenAI was hesitant to release the algorithm to the public, fearing its misuse. Thankfully, Whitmore has used the algorithm for something much more benign than making fake news articles.
Whitmore explained to Digital Trends that in order to produce the game he had to modify GPT-2 by training it extensively on a number of adventure game scripts, using various algorithms to adjust the model’s parameters until the text it produced resembled the text of adventure games.
What’s particularly interesting about the game is that it is genuinely creative. The user can input almost any text that they can think of, regardless of the particular setting or context of the game, and the game will try to adapt and determine what should happen next. Whitmore explained that you can enter almost any random prompt you would like because the model has enough “common sense” to adapt to the input.
Whitmore’s custom GPT-2 algorithm does have some limitations. It has a short “memory” and easily forgets things the user has already told it. In other words, it doesn’t preserve the context of earlier commands as a traditional pre-programmed text adventure game would. And of course, like many passages of text generated by AI, the generated text doesn’t always make sense.
However, the program does markedly well at simulating the structure and style of text adventure games, providing the user with descriptions of the setting and even providing them with various options they can select to interact with the environment it has created.
“I think it’s creative in a very basic way, like how a person playing ‘Apples to Apples’ is creative,” Whitmore explained. “It’s taking things from old adventure games and rearranging them into something that’s new and interesting and different every time. But it’s not actually generating an overall plot or overarching idea. There are a lot of different kinds of creativity and I think it’s doing one: Generating novel environments, but not the other kinds: Figuring out an intriguing plot for a game.”
Whitmore’s project also seems to confirm that the GPT-2 algorithms are robust enough to be used for a wide variety of purposes beyond generating text intended only to be read. Whitmore demonstrates that the algorithms can be used in a system that responds to user input and feedback, and it will be interesting to see what other responsive applications of GPT-2 will surface in the future.
TextFooler Algorithm Fools NLP AI
As impressive as natural language processing algorithms and systems have become in recent years, they are still vulnerable to a kind of exploit known as an “adversarial example”. Adversarial examples are carefully engineered phrases that can cause an NLP system to behave in unexpected and undesirable ways. AI programs can be made to misbehave with these strange examples, and as a result, AI researchers are trying to design ways to protect against the effects of adversarial examples.
Recently, a team of researchers from both the University of Hong Kong and the Agency for Science, Technology, and Research in Singapore collaborated to create an algorithm that demonstrates the danger of adversarial examples. As Wired reported, the algorithm was dubbed TextFooler by the research team and it functions by subtly changing parts of a sentence, impacting how an NLP classifier might interpret the sentence. As an example, the algorithm converted one sentence to another similar sentence and the sentence was fed into a classifier designed to determine if a review was negative or positive. The original sentence was:
“The characters, cast in impossibly contrived situations, are totally estranged from reality.”
It was converted to this sentence:
“The characters, cast in impossibly engineered circumstances, are fully estranged from reality.”
These subtle changes prompted the text classifier to classify the review as positive instead of negative. The research team tested the same approach (swapping certain words with synonyms) on several different datasets and text classification algorithms. The research team reports that they were able to drop an algorithm’s classification accuracy to just 10%, down from 90%. This is despite the fact that people reading these sentences would interpret them to have the same meaning.
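The synonym-swapping idea can be sketched with a toy example. This is not the TextFooler implementation: the word weights, the synonym table, and the greedy search below are hypothetical stand-ins for a trained classifier and a real synonym source, chosen only so the example is self-contained.

```python
# Toy word weights standing in for a trained sentiment classifier.
# All values and word lists here are hypothetical.
WEIGHTS = {"contrived": -3, "totally": -1, "estranged": -1,
           "engineered": 1, "fully": 1}
SYNONYMS = {"contrived": ["engineered"],
            "situations": ["circumstances"],
            "totally": ["fully"]}


def score(words):
    return sum(WEIGHTS.get(w, 0) for w in words)


def classify(words):
    return "positive" if score(words) > 0 else "negative"


def attack(sentence):
    """Greedily swap words for synonyms until the predicted label flips."""
    words = sentence.lower().split()
    original_label = classify(words)
    # Direction in which the score must move to leave the original label.
    direction = 1 if original_label == "negative" else -1
    for i in range(len(words)):
        for candidate in SYNONYMS.get(words[i], []):
            trial = words[:i] + [candidate] + words[i + 1:]
            # Keep a swap only if it moves the score toward the other label.
            if direction * (score(trial) - score(words)) > 0:
                words = trial
                break
        if classify(words) != original_label:
            break
    return " ".join(words)


original = ("the characters cast in impossibly contrived situations "
            "are totally estranged from reality")
adversarial = attack(original)
```

In this toy setup the original sentence is classified as negative and the rewritten one as positive, even though a human reader would take both to mean the same thing. Note that every candidate swap requires querying the classifier again.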
These results are concerning in an era where NLP algorithms and AI are being used more and more frequently, and for important tasks like assessing medical claims or analyzing legal documents. It’s unknown just how much of a danger adversarial examples pose to the algorithms currently in use. Research teams around the world are still trying to ascertain just how much of an impact they can have. Recently, a report published by the Stanford Human-Centered AI group suggested that adversarial examples could deceive AI algorithms and be used to perpetrate tax fraud.
There are some limitations to the recent study. For instance, while Sameer Singh, an assistant professor of computer science at UC Irvine, notes that the adversarial method used was effective, it relies on some knowledge of the AI’s architecture. The AI has to be repeatedly probed until an effective group of words can be found, and such repeated attacks might be noticed by security programs. Singh and colleagues have done their own research on the subject and found that advanced systems like OpenAI algorithms can deliver racist, harmful text when prompted with certain trigger phrases.
Adversarial examples are also a potential issue when dealing with visual data like photos or video. One famous example involves applying certain subtle digital transformations to an image of a kitten, prompting the image classifier to interpret it as a monitor or desktop PC. In another example, research done by UC Berkeley professor Dawn Song found that adversarial examples can be used to change how road signs are perceived by computer vision systems, which could potentially be dangerous for autonomous vehicles.
Research like the kind done by the Hong Kong-Singapore team could help AI engineers better understand what kinds of vulnerabilities AI algorithms have, and potentially design ways to safeguard against these vulnerabilities. As an example, ensemble classifiers can be used to reduce the chance that an adversarial example will be able to deceive a computer vision system. With this technique, a number of classifiers are used and slight transformations are made to the input image. The majority of the classifiers will typically discern aspects of the image’s true content, which are then aggregated together. The result is that even if a few of the classifiers are fooled, most of them won’t be and the image will be properly classified.
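A minimal sketch of the voting step, assuming simple majority voting as the aggregation rule (the article does not specify one). The classifiers, transforms, and pixel values below are hypothetical placeholders rather than real vision models.

```python
from collections import Counter


def ensemble_classify(classifiers, transforms, image):
    """Classify every transformed copy with every classifier; return the majority label."""
    votes = [clf(transform(image))
             for clf in classifiers
             for transform in transforms]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical stand-ins: an "image" is just a list of pixel values.
def identity(img):
    return img


def brighten(img):
    return [min(1.0, p + 0.05) for p in img]


def robust_a(img):
    return "kitten"


def robust_b(img):
    return "kitten"


def fooled(img):
    # Stands in for a classifier deceived by the adversarial input.
    return "monitor"


label = ensemble_classify([robust_a, robust_b, fooled],
                          [identity, brighten],
                          [0.2, 0.4, 0.6])
```

Here four of the six votes say "kitten", so the ensemble still returns the correct label despite the one fooled classifier.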
Google’s New Meena Chatbot Can Hold Sensible, Specific Conversations About Almost Anything
As impressive and useful as virtual assistants like Siri, Alexa, and Google Assistant are, their conversational skills are typically limited to receiving certain commands and delivering pre-defined responses. Companies like Google and Amazon have been pursuing methods of AI training and development that can make AI chatbots more robust and flexible, able to carry on conversations with users in a much more natural way. As reported by Digital Trends, Google has recently published a paper demonstrating the capabilities of its new chatbot, dubbed “Meena”. According to a blog post from the researchers, Meena can engage in conversation with its users on just about any topic.
Meena is an open-domain chatbot, meaning that it responds to the context of the conversation so far and adapts to inputs in order to deliver more natural responses. Most other chatbots are closed-domain, which means that their responses are themed around certain ideas and limited to accomplishing specific tasks.
According to Google’s report, Meena’s flexibility was the result of a massive training dataset. Meena was trained on around 40 billion words pulled from social media conversations and filtered for the most relevant and representative words. Google aimed to deal with some of the problems that are found in most voice assistants, such as an inability to handle topics and commands that unfold over multiple turns in the conversation, with the user providing additional inputs after the bot has responded to one input. This means that many chatbots are unable to prompt the user for clarification, and when there is a query that can’t be interpreted they often just default to web results.
In order to deal with this particular problem, Google’s researchers enabled the model to keep track of the context of the conversation, meaning that it can generate specific answers. The model used an encoder that processes what has already been said in the conversation and a decoder that creates a response based on the context. The model was trained on specific and non-specific data. Specific data consists of words that are closely related to the preceding statement. As the Google post explained:
“For example, if A says, ‘I love tennis,’ and B responds, ‘That’s nice,’ then the utterance should be marked, ‘not specific’. That reply could be used in dozens of different contexts. But if B responds, ‘Me too, I can’t get enough of Roger Federer!’, then it is marked as ‘specific’ since it relates closely to what is being discussed.”
The data that was used to train the model consisted of seven “turns” in the conversation. The model has 2.6 billion parameters and was trained to find patterns in 341 GB of text data, a dataset around 8.5 times larger than the dataset used to train the GPT-2 model created by OpenAI.
Google reported how Meena performed at the Sensibleness and Specificity Average (SSA) metric. The SSA is a metric designed by Google researchers and it’s intended to quantify the ability of a conversational entity to reply with specific, relevant responses as a conversation goes on.
SSA scores are calculated by testing a model against a fixed number of prompts, and the number of sensible responses that the model gives is tracked. The model’s score is derived based on the percentage of sensible/specific responses the model was able to give with respect to the prompts. Generic responses are penalized. According to Google, an average person scores about 86% on the SSA, while Meena was able to score a 79%. Another famous AI model, an agent created by Pandora Bots, won the Loebner Prize in recognition of the fact that their AI bots achieved sophisticated human-like communication. The Pandora Bots agent achieved approximately 56% in the SSA test.
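As a rough sketch of how such a score could be computed, assume (as the metric’s name suggests) that each response is labeled by human raters as sensible or not and specific or not, and that the final score is the average of the two rates. The labels below are made-up illustration data, not results from Google’s evaluation.

```python
def ssa(labels):
    """Compute a Sensibleness and Specificity Average.

    labels: one (sensible, specific) pair of booleans per response.
    A response only counts as specific if it is also sensible.
    """
    n = len(labels)
    sensibleness = sum(1 for sensible, _ in labels if sensible) / n
    specificity = sum(1 for sensible, specific in labels
                      if sensible and specific) / n
    return (sensibleness + specificity) / 2


# Hypothetical ratings for four responses.
ratings = [(True, True), (True, False), (False, False), (True, True)]
score = ssa(ratings)  # sensibleness 0.75, specificity 0.5
```

A generic reply such as “That’s nice” would be rated sensible but not specific, so it raises only one of the two rates, which is how generic responses end up penalized.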
Microsoft and Amazon are also trying to make more flexible and natural chatbots. Microsoft has been attempting to create multiturn dialogue in chatbots for two years, acquiring Semantic Machines, an AI startup, to improve Cortana. Amazon recently ran the Alexa Prize challenge, which prompted participants to design a bot capable of conversing for approximately 20 minutes.
AI Opens Up New Ways To Fight Illegal Opioid Sales And Other Cybercrime
The US HHS (Department of Health and Human Services) and the National Institute on Drug Abuse (NIDA) are investing in the use of AI to curb the illegal sale of opioids and hopefully reduce drug abuse. As Vox reported, NIDA’s AI tool will endeavor to track illegal internet pharmaceutical markets, but the approaches used by the AI could easily be applied to other forms of cybercrime.
One of the researchers responsible for developing the tool, Timothy Mackey, recently told Vox that the AI algorithms used to track the illegal sale of opioids could also be used to detect other forms of illegal sales, such as counterfeit products and illegal wildlife trafficking.
NIDA’s AI tool must be able to distinguish between general discussion of opioids and attempts to negotiate the sale of opioids. According to Mackey, only a relatively small percentage of tweets referencing opioids are actually related to the illegal sale of opioids. Mackey explained that out of approximately 600,000 tweets referencing one of several different opioids, only about 2,000 actually marketed those drugs in any way. The AI tool must also be robust enough to keep up with changes in the language used to illegally market opioids. People who illegally sell drugs frequently use coded language and non-obvious keywords to sell them, and they quickly change strategies. Mackey explains that misspelled aliases for the names of drugs are commonly used and that images of things other than the drugs in question are often used to create listings on websites like Instagram.
While Instagram and Facebook ban the marketing of drugs and encourage users to report instances of abuse, the illegal content can be very difficult to catch, precisely because drug sellers tend to change strategies and code words quickly. Mackey explained that these coded posts and hashtags on Instagram typically contain information about how to contact the dealer and purchase illegal drugs from them. Mackey also explained that some illegal sellers represent themselves as legitimate pharmaceutical companies and link to e-commerce platforms. While the FDA has often tried to crack down on these sites, they remain an issue.
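The two requirements above, tolerating misspelled drug names and separating sales from general discussion, can be illustrated with a small sketch. This is not the NIDA tool’s method: the watch list, the sales vocabulary, the similarity cutoff, and the `is_possible_sale` helper are all assumptions made for illustration, using fuzzy string matching from Python’s standard library.

```python
import difflib

# Hypothetical watch list and sales vocabulary, for illustration only.
DRUG_NAMES = ["fentanyl", "percocet", "oxycodone"]
SALES_CUES = {"dm", "order", "price", "shipping"}


def mentions_drug(words, cutoff=0.8):
    """True if any word is a close (possibly misspelled) match for a drug name."""
    return any(difflib.get_close_matches(word, DRUG_NAMES, n=1, cutoff=cutoff)
               for word in words)


def is_possible_sale(post):
    """Flag a post only if it mentions a drug AND uses sales language."""
    words = post.lower().split()
    return mentions_drug(words) and any(word in SALES_CUES for word in words)
```

With these assumptions, `is_possible_sale("percoset available dm for price")` is flagged despite the misspelling, while `is_possible_sale("my doctor prescribed percocet after surgery")` is not. The figures quoted above, roughly 2,000 marketing tweets out of 600,000 (about 0.3%), show why the second condition matters: most mentions are not sales.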
In designing AI tools to detect illegal drug marketing, Mackey and the rest of the research team utilized a combination of deep learning and topic modeling. The research team designed a deep learning model that made use of a Long Short-Term Memory network trained on the text of Instagram posts, with the goal of creating a text classifier that could automatically flag posts that could be related to illegal drug sales. The research team also made use of topic modeling, letting their AI model discern keywords associated with opioids like Fentanyl and Percocet. This can make the model more robust and sophisticated, and it is able to match topics and conversations, not just single words. The topic modeling helped the research team reduce a dataset of around 30,000 tweets regarding fentanyl to just a handful of tweets that seemed to be marketing it.
Mackey and the rest of the research team may have developed their AI application for use by NIDA, but social media companies like Facebook, Twitter, Reddit, and YouTube are also investing heavily in the use of AI to flag content that violates their policies. Mackey said he has been in talks with Twitter and Facebook about such an application before, but right now the focus is on creating a commercially available application based on his research for NIDA, and he hopes the tool could be used by social media platforms, regulators, and more.
Mackey explained that the approach developed for the NIDA research could be generalized to fight other forms of cybercrime, such as the trafficking of animals or the illegal sale of firearms. Instagram has had problems with illegal animal trafficking before, banning the advertising of all animal sales in 2017 as a response. The company also tries to remove any posts related to animal trafficking as soon as they pop up, but despite this there is a continued black market for exotic pets, and advertisements for them still show up in Instagram searches.
There are some ethical issues that will have to be negotiated if the NIDA tool is to be implemented. Drug policy experts warn that it could enable the over-criminalization of sales by low-level drug sellers and that it could also give the false impression that the problem is being solved even though such AI tools may not reduce the overall demand for the substance. Nonetheless, if properly used the AI tools could help law enforcement agencies establish links between online sellers and offline supply chains, helping them quantify the scope of the problem. In addition, similar techniques to those used by NIDA could be utilized to help combat opioid addiction, directing people towards rehabilitative sources when searches are made. As with any innovation, there are both risks and opportunities.