There is an ongoing debate among AI researchers over whether artificial intelligence, as The Next Web (TNW) notes, “will soon be able to develop the kind of general intelligence that humans have,” with heated arguments for and against.
But there is another field where AI is making giant steps forward: Natural Language Processing (NLP), a branch of machine learning with “an aim to assess, extract and evaluate information from textual data.” To that effect, TNW points to a paper recently published in Nature which reports that an AI has now “managed to predict future scientific discoveries by simply extracting meaningful data from research publications.”
Researching and understanding a specific scientific question requires the obvious step of consulting books, specialized publications, web pages, and any other relevant sources. Of course, this can be an extremely time-consuming exercise, particularly if we have a very complex problem or question at hand. That is where NLP comes in. By using “sophisticated methods and techniques, computer programs can identify concepts, mutual relationships, general topics and specific properties from large textual datasets.”
As is discussed in the aforementioned study, “so far, most of the existing automated NLP-based methods are supervised, requiring input from humans. Despite being an improvement compared to a purely manual approach, this is still a labor-intensive job.” But researchers who prepared this paper were able to create an AI system that “could accurately identify and extract information independently. It used sophisticated techniques based on statistical and geometrical properties of data to identify chemical names, concepts, and structures. This was based on about 1.5 million abstracts of scientific papers on material science.”
Then, this machine learning program “classified words in the data based on specific features such as “elements”, “energetics” and “binders”. For example, “heat” was classified as part of “energetics”, and “gas” as “elements”. This helped connect certain compounds with types of magnetism and similarity with other materials among other things, providing insight on how the words were connected with no human intervention required.”
This method made it possible for the AI to “capture complex relationships and identify different layers of information, which would be virtually impossible to carry out by humans.” As a result, it could offer insights well ahead of what scientists working in the field can currently achieve. The AI actually recommended materials “for functional applications several years before their actual discovery. There were five such predictions, all based on papers published before the year 2009. For example, the AI managed to identify a substance known as CsAgGa2Se4 as a thermoelectric material, which scientists only discovered in 2012. So if the AI had been around in 2009, it could have speeded up the discovery.”
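One way to picture how “geometrical properties of data” can surface candidate materials is to rank words by how close their vectors lie to the vector of a property word. The sketch below assumes cosine similarity as the closeness measure and uses made-up three-dimensional vectors and placeholder material names; none of these values come from the study, which worked with representations learned from about 1.5 million abstracts.

```python
import math

# Hypothetical toy vectors. Real word vectors would be learned from the
# abstracts and would have far more dimensions.
TOY_VECTORS = {
    "thermoelectric": [0.9, 0.1, 0.2],
    "material_a": [0.8, 0.2, 0.1],
    "material_b": [0.1, 0.9, 0.3],
    "material_c": [0.5, 0.5, 0.5],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidates(property_word, candidates, vectors):
    """Sort candidate words by similarity to the property word, best first."""
    target = vectors[property_word]
    scored = [(name, cosine_similarity(vectors[name], target)) for name in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_candidates(
    "thermoelectric", ["material_a", "material_b", "material_c"], TOY_VECTORS
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup the highest-ranked name is the one whose vector points most nearly in the same direction as “thermoelectric”, which is the kind of signal a system could use to flag a material before anyone has tested it.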
Text-Based Video Game Created With OpenAI’s Powerful GPT-2 Algorithm
A neuroscience graduate student at Northwestern University recently created a text-based video game where the text the user reads is entirely generated by AI. The AI responsible for generating the text is based on the GPT-2 algorithm created by OpenAI earlier this year.
Many early computer games had no graphics; instead, they used a text-based interface. These text adventure games would take in user commands and deliver a series of pre-programmed responses. The user would have to use text commands to solve puzzles and advance farther in the game, a task which could prove challenging depending on the sophistication of the text parser. Early text-based adventure games had a very limited range of potential commands the game could respond to.
As reported by ZME Science, Nathan Whitmore, a neuroscience graduate student at Northwestern University, has revitalized this game concept, using AI algorithms to generate responses in real time, as opposed to pre-programmed responses. Whitmore was apparently inspired to create the project by the Mind Game that appeared in the sci-fi novel Ender’s Game, a game that responded to user input and reformed the game world around the user.
The algorithm that drives the text-based adventure game is the GPT-2 algorithm, which was created by OpenAI. The predictive text algorithm was trained on a text dataset, dubbed WebText, which was over 40 GB in size and pulled from Reddit links. The result was an extremely effective predictive text algorithm that could generate shockingly realistic and natural-sounding paragraphs, achieving state-of-the-art performance on a number of different language tests. The OpenAI algorithm was apparently so effective at generating fake news stories that OpenAI was hesitant to release the algorithm to the public, fearing its misuse. Thankfully, Whitmore has used the algorithm for something much more benign than making fake news articles.
Whitmore explained to Digital Trends that in order to produce the game he had to modify the GPT-2 output by training it extensively on a number of adventure game scripts, using various algorithms to adjust the parameters of GPT-2 until the text output by the algorithm resembled the text of adventure games.
What’s particularly interesting about the game is that it is genuinely creative. The user can input almost any text that they can think of, regardless of the particular setting or context of the game, and the game will try to adapt and determine what should happen next. Whitmore explained that you can enter almost any random prompt you would like because the model has enough “common sense” to adapt to the input.
Whitmore’s custom GPT-2 algorithm does have some limitations. It has a short “memory” and easily forgets things the user has already told it. In other words, it doesn’t preserve the context of earlier commands the way a traditional pre-programmed text adventure game would. And of course, like many passages of text generated by AI, the generated text doesn’t always make sense.
However, the program does remarkably well at simulating the structure and style of text adventure games, providing the user with descriptions of the setting and even providing them with various options they can select to interact with the environment it has created.
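The short “memory” described above can be pictured as a fixed-size window over the conversation. The sketch below assumes that only the most recent turns are handed to the text generator; that is one plausible cause of the forgetting, not a description of how Whitmore's game is actually built. The class name and the sample turns are hypothetical.

```python
from collections import deque


class ShortMemoryContext:
    """Keeps only the most recent turns, standing in for a limited context window."""

    def __init__(self, max_turns):
        # A deque with maxlen silently discards the oldest item once it is full.
        self.turns = deque(maxlen=max_turns)

    def add_turn(self, text):
        self.turns.append(text)

    def prompt(self):
        # The text that would be passed to the language model for the next reply.
        return "\n".join(self.turns)


context = ShortMemoryContext(max_turns=3)
for turn in [
    "> take lamp",
    "You pick up the lamp.",
    "> go north",
    "You enter a dark cave.",
]:
    context.add_turn(turn)

print(context.prompt())
```

After the fourth turn the original “take lamp” command has fallen out of the window, so a generator that sees only this prompt has no record that the player ever issued it.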
“I think it’s creative in a very basic way, like how a person playing ‘Apples to Apples’ is creative,” Whitmore explained. “It’s taking things from old adventure games and rearranging them into something that’s new and interesting and different every time. But it’s not actually generating an overall plot or overarching idea. There are a lot of different kinds of creativity and I think it’s doing one: Generating novel environments, but not the other kinds: Figuring out an intriguing plot for a game.”
Whitmore’s project also seems to confirm that the GPT-2 algorithms are robust enough to be used for a wide variety of other purposes outside of generating text intended only to be read. Whitmore demonstrates the algorithms can be used in a system that enables user responses and feedback, and it will be interesting to see what other responsive applications of GPT-2 will surface in the future.
Researchers Develop JL2P Computer Model to Translate Film Scripts Into Animations
Researchers at Carnegie Mellon University have developed a computer model that is capable of translating text that describes physical movements into simple computer-generated animations. These new developments could make it possible for movies and other animations to be created directly from a computer model reading the scripts.
Scientists have been making progress in getting computers both to understand natural language and to generate physical poses from a script. This new computer model could be the link between the two.
Louis-Philippe Morency, an associate professor in the Language Technologies Institute (LTI), and Chaitanya Ahuja, an LTI Ph.D. student, have been using a neural architecture that is called Joint Language-to-Pose (JL2P). The JL2P model is capable of jointly embedding sentences and physical motions. This allows it to learn how language is connected to action, gestures, and movements.
“I think we’re in an early stage of this research, but from a modeling, artificial intelligence and theory perspective, it’s a very exciting moment,” Morency said. “Right now, we’re talking about animating virtual characters. Eventually, this link between language and gestures could be applied to robots; we might be able to simply tell a personal assistant robot what we want it to do.
“We also could eventually go the other way — using this link between language and animation so a computer could describe what is happening in a video,” he added.
The Joint Language-to-Pose model will be presented by Ahuja on September 19 at the International Conference on 3D Vision. That conference will be taking place in Quebec City, Canada.
The JL2P model was created by a curriculum-learning approach. The first important step was for the model to learn short, easy sequences. That would be something like “A person walks forward.” It then moved on to longer and harder sequences such as “A person steps forward, then turns around and steps forward again,” or “A person jumps over an obstacle while running.”
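The curriculum idea described above can be sketched as ordering the training sentences from easy to hard and feeding them to the model in stages. The code below assumes word count as a stand-in for difficulty and uses hypothetical helper names; how the JL2P authors actually measured difficulty or scheduled training is not stated here.

```python
def build_curriculum(examples, num_stages):
    """Split examples into stages of increasing difficulty.

    Difficulty is approximated by word count, an assumption made for this sketch.
    """
    if not examples:
        return []
    ordered = sorted(examples, key=lambda sentence: len(sentence.split()))
    stage_size = -(-len(ordered) // num_stages)  # ceiling division
    return [ordered[i:i + stage_size] for i in range(0, len(ordered), stage_size)]


def train_with_curriculum(examples, num_stages, train_step):
    """Call train_step once per stage, adding harder examples each time."""
    seen = []
    for stage in build_curriculum(examples, num_stages):
        seen.extend(stage)
        train_step(list(seen))


sentences = [
    "A person steps forward, then turns around and steps forward again.",
    "A person walks forward.",
    "A person jumps over an obstacle while running.",
]

stages = build_curriculum(sentences, num_stages=3)
for number, stage in enumerate(stages, start=1):
    print(f"stage {number}: {stage}")
```

Here `train_step` is a placeholder for whatever routine updates the model; the point of the sketch is only the ordering, with the short “walks forward” sentence seen before the multi-step ones.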
When the model is using the sequences, it looks at verbs and adverbs. These describe the action and speed/acceleration of the action. Then, it looks at nouns and adjectives which describe locations and directions. According to Ahuja, the end goal for the model is to animate complex sequences with multiple actions that are happening simultaneously or in sequence.
As of right now, the animations are limited to stick figures, but the scientists are going to keep developing the model. One of the complications, according to Morency, is that a lot of things happen at the same time, even in simple sequences.
“Synchrony between body parts is very important,” Morency said. “Every time you move your legs, you also move your arms, your torso and possibly your head. The body animations need to coordinate these different components, while at the same time achieving complex actions. Bringing language narrative within this complex animation environment is both challenging and exciting. This is a path toward better understanding of speech and gestures.”
If the Joint Language-to-Pose model is able to develop to the point at which it can create complex animations and actions based on language, the possibilities are huge. Not only can it be used in areas such as film and animation, but it will also help lead to developments in understanding speech and gestures.
Turning to artificial intelligence, this JL2P model could be used on robots. For example, robots could be told what to do in plain language, and they would be able to understand it and respond accordingly.
These new developments will impact many different fields, and the model will keep getting more capable of understanding complex language.
Researchers Develop Set of Questions Able to Confuse the Best Computers
Researchers from the University of Maryland were able to create a set of questions that are easy for people to answer but hard for some of the best computer answering systems that exist today. The team generated the questions through a human-computer collaboration, and they were able to create a dataset of more than 1,200 questions. If a computer system is able to learn and master these questions, it will have the best understanding of human language among any computer systems that currently exist.
The work was published in an article in the journal Transactions of the Association for Computational Linguistics.
Jordan Boyd-Graber, an associate professor of computer science at UMD and senior author of the paper, spoke about the new developments.
“Most question-answering computer systems don’t explain why they answer the way they do, but our work helps us see what computers actually understand,” he said. “In addition, we have produced a dataset to test on computers that will reveal if a computer language system is actually reading and doing the same sorts of processing that humans are able to do.”
As of right now, questions for these programs and systems are generated by human authors or computers. The problem is that when humans are the ones generating questions, they aren’t aware of all of the different elements of a question that are confusing to computers. Computer systems, on the other hand, use formulas, write fill-in-the-blank questions, or make mistakes, all of which can generate nonsense.
To let humans and computers cooperate in generating the questions, Boyd-Graber and the team of researchers created a special computer interface. According to them, it is able to tell what a computer is “thinking” while a human types out a question. The writer is then able to edit and change the question based on the computer’s weaknesses, making it confusing for the computer.
As the writer types the question, the computer’s guesses are put in a ranked order. The words that are responsible for the computer’s guesses are highlighted.
When the system correctly answers a question, the interface highlights the words or phrases that led to the answer. With that information, the author is then able to edit the question to make it more difficult for the computer while keeping the same meaning. While the computer will eventually be confused, expert humans would still be able to answer.
When the humans and computers worked together, they were able to develop 1,213 questions that the computer was not able to answer. The researchers tested the questions in a competition between human players and the computers. The human players included high school trivia teams and “Jeopardy!” champions. The weakest human team was able to defeat the strongest computer system.
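The ranking and highlighting described above can be illustrated with a toy scorer. The sketch below assumes a keyword-overlap “model” and a leave-one-out test for which words matter; the keyword sets, function names, and sample questions are hypothetical, and the real interface works with an actual question-answering system rather than this stand-in.

```python
# Hypothetical keyword profiles standing in for a real question-answering model.
ANSWER_KEYWORDS = {
    "Paris": {"capital", "france", "eiffel", "seine"},
    "London": {"capital", "england", "thames", "big", "ben"},
    "Rome": {"capital", "italy", "colosseum", "tiber"},
}


def tokenize(question):
    """Lowercase the question and strip basic punctuation from each word."""
    return [word.strip(".,?!").lower() for word in question.split()]


def score(words, keywords):
    """Count how many question words appear in an answer's keyword set."""
    return sum(1 for word in words if word in keywords)


def ranked_guesses(question):
    """Return (answer, score) pairs, best guess first."""
    words = tokenize(question)
    scores = {answer: score(words, kw) for answer, kw in ANSWER_KEYWORDS.items()}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


def highlighted_words(question):
    """Find words whose removal lowers the top guess's score (leave-one-out)."""
    words = tokenize(question)
    top_answer, top_score = ranked_guesses(question)[0]
    keywords = ANSWER_KEYWORDS[top_answer]
    important = []
    for index, word in enumerate(words):
        remaining = words[:index] + words[index + 1:]
        if score(remaining, keywords) < top_score:
            important.append(word)
    return top_answer, important


easy = "Which capital city on the Seine has the Eiffel Tower?"
harder = "Which city hosts the iron tower built for the 1889 World's Fair?"

print(ranked_guesses(easy))
print(highlighted_words(easy))
print(ranked_guesses(harder))
```

The rewritten question avoids every highlighted word while still pointing a human to the same answer, which is the editing move the interface is meant to encourage.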
Shi Feng, a computer science graduate student from UMD and co-author of the paper, spoke about the new research.
“For three or four years, people have been aware that computer question-answering systems are very brittle and can be fooled very easily,” she said. “But this is the first paper we are aware of that actually uses a machine to help humans break the model itself.”
The questions revealed six different language phenomena that confuse computers, which fall into two categories. The first is linguistic phenomena, which include paraphrasing, distracting language, and unexpected contexts. The second is reasoning skills, which include logic and calculation, mental triangulation of elements in a question, and putting together multiple steps to form a conclusion.
“Humans are able to generalize more and to see deeper connections,” Boyd-Graber said. “They don’t have the limitless memory of computers, but they still have an advantage in being able to see the forest for the trees. Cataloguing the problems computers have helps us understand the issues we need to address, so that we can actually get computers to begin to see the forest through the trees and answer questions in the way humans do.”
This research lays the foundation for computer systems to eventually master the human language. It will undoubtedly keep getting developed and improved.
“This paper is laying out a research agenda for the next several years so that we can actually get computers to answer questions well,” Boyd-Graber said.