
Natural Language Processing

GPT-2, Artificial Intelligence Text-Generator Is Being Released In Full


As TheNextWeb (TNW) reports, OpenAI, the non-profit organization behind a number of artificial intelligence projects, has just published the final model in the planned staged release of GPT-2, a text generator that has caused quite a debate since its release was announced in February.

Based on OpenAI’s research paper titled Language Models are Unsupervised Multitask Learners, “GPT-2 uses machine learning to generate novel text based on limited input.” What that means is that a user can type in a sentence or two about any subject and the AI generator will come up with a text that has some relation to the original input. In essence, as TNW notes, “unlike most ‘text generators’ it doesn’t output pre-written strings. GPT-2 makes up text that didn’t previously exist.”

In his tweet, Scott B. Weingart, program director of Carnegie Mellon University Libraries, gives a concrete example:

 

OpenAI was initially concerned about possible malicious uses of its system, so back in February 2019 it decided to release GPT-2 in four parts over eight months. As the organization explained in its blog, “Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with, as well as a technical paper.”

As explained, the full model contains 1.5 billion parameters. “The more parameters a model is trained with, the ‘smarter’ it appears to be – just like humans, practice makes perfect.”

TNW notes that OpenAI initially released a model with 124 million parameters, followed by releases with 355 million and 774 million parameters. According to TNW, after testing the released models, “each iteration showed a significant improvement in capability over previous iterations.”
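The parameter counts quoted above can be reproduced with a little arithmetic. The sketch below is illustrative only: the layer counts, hidden widths, vocabulary size, and context length are assumptions taken from the publicly released GPT-2 configurations and are not stated in this article, and `gpt2_parameter_count` is a hypothetical helper name.

```python
# Assumed GPT-2 hyperparameters (from the public model configs, not from this article).
VOCAB_SIZE = 50257      # size of the byte-pair-encoding vocabulary
CONTEXT_LENGTH = 1024   # maximum number of input positions

# model name -> (number of transformer blocks, hidden width)
GPT2_CONFIGS = {
    "small": (12, 768),
    "medium": (24, 1024),
    "large": (36, 1280),
    "xl": (48, 1600),
}


def gpt2_parameter_count(n_layers, d_model):
    """Count the weights and biases of a GPT-2 style transformer."""
    # Token embeddings plus learned position embeddings.
    embeddings = VOCAB_SIZE * d_model + CONTEXT_LENGTH * d_model
    # Query/key/value projection plus output projection, with biases.
    attention = 4 * d_model * d_model + 4 * d_model
    # Two feed-forward layers with a 4x wider middle layer, with biases.
    feed_forward = 8 * d_model * d_model + 5 * d_model
    # Two layer norms per block, each with a scale and a shift vector.
    layer_norms = 4 * d_model
    per_block = attention + feed_forward + layer_norms
    final_layer_norm = 2 * d_model
    return embeddings + n_layers * per_block + final_layer_norm


PARAMETER_COUNTS = {
    name: gpt2_parameter_count(*config) for name, config in GPT2_CONFIGS.items()
}

for name, count in PARAMETER_COUNTS.items():
    print(f"{name}: {count / 1e6:.0f} million parameters")
```

Under these assumptions the three smaller models come out at roughly 124, 355, and 774 million parameters, and the largest at about 1.56 billion, which OpenAI rounds to 1.5 billion.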

To prevent misuse, OpenAI also released GPT-2 detection models that are supposed “to preemptively combat misuse.” By its own admission in a blog post, these detection models still need additional work to reach the quality level achieved so far in GPT-2 itself.

Those interested can download the GPT-2 model on GitHub, check out the model card, and read OpenAI’s blog post.


Former diplomat and translator for the UN, currently freelance journalist/writer/researcher, focusing on modern technology, artificial intelligence, and modern culture.

AI 101

New AI Powered Tool Enables Video Editing From Themed Text Documents


A team of computer science researchers from Tsinghua University and Beihang University in China, IDC Herzliya in Israel, and Harvard University has recently created a tool that generates edited videos based on a text description and a repository of video clips.

Massive amounts of video footage are recorded every day by professional videographers, hobbyists, and regular people. Yet editing this video down into a presentation that makes sense is still a costly time investment, often requiring the use of complex editing tools that can manipulate raw footage. The international team of researchers recently developed a tool that takes themed text descriptions and generates videos based on them. The tool is capable of examining video clips in a repository and selecting the clips that correspond with the input text describing the storyline. The goal is that the tool is user-friendly and powerful enough to produce quality videos without the need for extensive video editing skills or expensive video editing software.

While current video editing platforms require knowledge of video editing techniques, the tool created by the researchers lets novice video creators put together compositions that tell stories in a more natural, intuitive fashion. “Write-A-Video”, as it is dubbed by its creators, lets users edit videos by just editing the text that accompanies the video. If a user deletes text, adds text, or moves sentences around, these changes will be reflected in the video. Corresponding shots are cut or added as the user manipulates the text, and the resulting video is tailored to the user’s description.

Ariel Shamir, the Dean of the Efi Arazi School of Computer Science at IDC Herzliya, explained that the Write-A-Video tool lets the user interact with the video mainly through text, using natural language processing techniques to match video shots based on the provided semantic meaning. An optimization algorithm is then used to assemble the video by cutting and swapping shots. The tool allows users to experiment with different visual styles as well, tweaking how scenes are presented by using specific film idioms that will speed up or slow down the action, or make more or fewer cuts.
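To make the idea of matching text to shots more concrete, here is a deliberately simple sketch. It is not the researchers’ method: it swaps their visual-semantic matching and optimization for plain word overlap (Jaccard similarity) between each sentence and a set of hand-written tags, and the clip names, tags, and function names are all hypothetical.

```python
def tokenize(text):
    """Lowercase a sentence and return its alphabetic words as a set."""
    cleaned = "".join(ch if ch.isalpha() else " " for ch in text.lower())
    return set(cleaned.split())


def jaccard(a, b):
    """Overlap between two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_shots(sentences, shots):
    """For each sentence, pick the shot whose tags overlap it the most."""
    plan = []
    for sentence in sentences:
        words = tokenize(sentence)
        best = max(shots, key=lambda name: jaccard(words, shots[name]))
        plan.append(best)
    return plan


# Hypothetical clip repository: file name -> descriptive tags.
SHOTS = {
    "clip_beach.mp4": {"beach", "sea", "waves", "sand"},
    "clip_market.mp4": {"market", "food", "street", "crowd"},
    "clip_sunset.mp4": {"sunset", "sky", "sea", "evening"},
}

STORY = [
    "We spent the morning walking on the sand by the waves.",
    "At noon we tried street food at the market.",
    "The evening ended with a sunset over the sea.",
]

EDIT_PLAN = match_shots(STORY, SHOTS)
print(EDIT_PLAN)
```

Reordering or deleting sentences in `STORY` changes the order or number of clips in `EDIT_PLAN`, which mirrors in miniature how editing the text drives the edit of the video.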

The program selects possible shots based on their aesthetic appeal, considering how the shots are framed, focused, and lit. The tool will select shots that are well focused rather than blurry or unstable, and it will also prioritize shots that are well lit. According to the creators of Write-A-Video, the user can render the generated video at any point and preview it with a voice-over narration that describes the text used to select the clips.

According to the research team, their experiment demonstrated that digital techniques that combine aspects of computer vision and natural language processing can assist users in creative processes like the editing of videos.

“Our work demonstrates the potential of automatic visual-semantic matching in idiom-based computational editing, offering an intelligent way to make video creation more accessible to non-professionals,” explained Shamir to TechXplore.

The researchers tested their tool on different video repositories combined with themed text documents. User studies and quantitative evaluations were performed to interpret the results of the experiment. The user studies found that non-professionals could sometimes produce high-quality edited videos with the tool faster than professionals using frame-based editing software. As reported by TechXplore, the team will be presenting their work in a few days at the ACM SIGGRAPH Asia conference held in Australia. Other entities are also using AI to augment video editing. Adobe has been working on its own AI-powered extensions for Premiere Pro, its editing platform. Adobe’s tool helps people ensure that changes in aspect ratio don’t cut out important pieces of video.


AI 101

What is Natural Language Processing?


Natural Language Processing (NLP) is the study and application of techniques and tools that enable computers to process, analyze, interpret, and reason about human language. NLP is an interdisciplinary field and it combines techniques established in fields like linguistics and computer science. These techniques are used in concert with AI to create chatbots and digital assistants like Google Assistant and Amazon’s Alexa.

Let’s take some time to explore the rationale behind Natural Language Processing, some of the techniques used in NLP, and some common use cases for NLP.

Why Is Natural Language Processing Important?

In order for computers to interpret human language, it must be converted into a form that a computer can manipulate. However, this isn’t as simple as converting text data into numbers. In order to derive meaning from human language, patterns have to be extracted from the hundreds or thousands of words that make up a text document. This is no easy task. There are few hard and fast rules that can be applied to the interpretation of human language. For instance, the exact same set of words can mean different things depending on the context. Human language is a complex and often ambiguous thing, and a statement can be uttered with sincerity or sarcasm.

Despite this, there are some general guidelines that can be used when interpreting words and characters, such as the character “s” being used to denote that an item is plural. These general guidelines have to be used in concert with each other to extract meaning from the text, to create features that a machine learning algorithm can interpret.

Natural Language Processing involves the application of various algorithms capable of taking unstructured data and converting it into structured data. If these algorithms are applied in the wrong manner, the computer will often fail to derive the correct meaning from the text. This can often be seen in the translation of text between languages, where the precise meaning of the sentence is often lost. While machine translation has improved substantially over the past few years, machine translation errors still occur frequently.

Natural Language Processing Techniques


Photo: Tamur via WikiMedia Commons, Public Domain (https://commons.wikimedia.org/wiki/File:ParseTree.svg)

Many of the techniques that are used in natural language processing can be placed in one of two categories: syntax or semantics. Syntax techniques are those that deal with the ordering of words, while semantic techniques are the techniques that involve the meaning of words.

Syntax NLP Techniques

Examples of syntax techniques include:

  • Lemmatization
  • Morphological Segmentation
  • Part-of-Speech Tagging
  • Parsing
  • Sentence Breaking
  • Stemming
  • Word Segmentation

Lemmatization refers to distilling the different inflections of a word down to a single form. Lemmatization takes things like tenses and plurals and simplifies them. For example, “feet” might become “foot” and “stripes” may become “stripe”. This simplified word form makes it easier for an algorithm to interpret the words in a document.
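As a rough illustration, lemmatization can be thought of as a dictionary lookup from inflected forms to base forms. The sketch below uses a tiny hand-written table, which is an assumption made for the example; real lemmatizers rely on large lexicons and part-of-speech information.

```python
# Tiny hand-written lemma table, for illustration only.
LEMMA_TABLE = {
    "feet": "foot",
    "stripes": "stripe",
    "mice": "mouse",
    "ran": "run",
    "was": "be",
}


def lemmatize(word):
    """Return the base form of a word, or the word itself if it is unknown."""
    word = word.lower()
    return LEMMA_TABLE.get(word, word)


LEMMAS = [lemmatize(w) for w in ["Feet", "stripes", "ran", "cat"]]
print(LEMMAS)
```

Words missing from the table, such as “cat”, pass through unchanged, which is why coverage of the lexicon matters so much in practice.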

Morphological segmentation is the process of dividing words into morphemes, the smallest meaningful units of a word. These units include free morphemes (which can stand alone as words) as well as prefixes and suffixes.

Part-of-speech tagging is simply the process of identifying which part of speech every word in an input document is.

Parsing refers to analyzing all the words in a sentence and correlating them with their formal grammar labels or doing grammatical analysis for all the words.

Sentence breaking, or sentence boundary segmentation, refers to deciding where a sentence begins and ends.

Stemming is the process of reducing words down to the root form of the word. For instance, connected, connection, and connections would all be stemmed to “connect”.
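The “connect” example can be reproduced with a very small suffix-stripping routine. This is a toy sketch, not the Porter stemmer or any other published algorithm, and the suffix list is an assumption chosen to fit the example.

```python
# Suffixes to strip, checked in this order so longer ones win.
SUFFIXES = ["ions", "ion", "ing", "ed", "s"]


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


STEMS = [stem(w) for w in ["connected", "connection", "connections"]]
print(STEMS)
```

Unlike a lemmatizer, this kind of rule has no notion of vocabulary: `stem("feet")` simply returns `"feet"`, because no suffix rule applies to an irregular plural.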

Word Segmentation is the process of dividing large pieces of text down into small units, which can be words or stemmed/lemmatized units.
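Sentence breaking and word segmentation can be sketched together with two regular expressions. The patterns below are simplifying assumptions that work for plain English text; they will wrongly split after abbreviations such as “Dr.” and do not handle languages written without spaces.

```python
import re


def split_sentences(text):
    """Break text after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_words(sentence):
    """Return runs of letters, digits, and apostrophes as word tokens."""
    return re.findall(r"[A-Za-z0-9']+", sentence)


TEXT = "NLP is fun. Isn't it? Let's try it!"
SENTENCES = split_sentences(TEXT)
WORDS = [split_words(s) for s in SENTENCES]
print(SENTENCES)
print(WORDS)
```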

Semantic NLP Techniques

Semantic NLP techniques include:

  • Named Entity Recognition
  • Natural Language Generation
  • Word-Sense Disambiguation

Named entity recognition involves tagging certain text portions that can be placed into one of a number of different preset groups. Pre-defined categories include things like dates, cities, places, companies, and individuals.
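A minimal way to picture named entity recognition is a lookup list of known names (a gazetteer) combined with a pattern for dates. The sketch below assumes a three-entry gazetteer and ISO-style dates purely for illustration; production systems learn these decisions from annotated data instead.

```python
import re

# Hypothetical gazetteer: known name -> category.
GAZETTEER = {"Amazon": "COMPANY", "Google": "COMPANY", "Paris": "CITY"}

# Dates written as YYYY-MM-DD.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def tag_entities(text):
    """Return (text span, category) pairs found in the input."""
    entities = [(match.group(), "DATE") for match in DATE_PATTERN.finditer(text)]
    for name, label in GAZETTEER.items():
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            entities.append((name, label))
    return entities


ENTITIES = tag_entities("Google opened an office in Paris on 2019-11-05.")
print(ENTITIES)
```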

Natural language generation is the process of transforming structured data, such as records in a database, into natural language. For instance, statistics about the weather, like temperature and wind speed, could be summarized in natural language.
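The weather example can be sketched as a template that turns a structured record into a sentence. The field names and the thresholds for “warm”, “mild”, “cold”, and “windy” are assumptions made for this illustration.

```python
def describe_weather(record):
    """Turn a structured weather record into an English sentence."""
    temperature = record["temperature_c"]
    if temperature >= 25:
        feel = "warm"
    elif temperature >= 10:
        feel = "mild"
    else:
        feel = "cold"
    wind = "windy" if record["wind_kph"] >= 30 else "calm"
    return (
        f"It is a {feel}, {wind} day in {record['city']} at "
        f"{temperature} degrees Celsius with winds of {record['wind_kph']} km/h."
    )


REPORT = describe_weather({"city": "Oslo", "temperature_c": 4, "wind_kph": 35})
print(REPORT)
```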

Word-sense disambiguation is the process of assigning meaning to words within a text based on the context the words appear in.
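One classic way to approach this is to compare the words surrounding an ambiguous word with words associated with each of its senses, in the spirit of the Lesk algorithm. The sketch below is a simplified version under stated assumptions: the sense inventory for “bank” is hand-written and the input sentences are lowercase with no punctuation.

```python
# Hypothetical sense inventory: word -> sense -> words that signal that sense.
SENSES = {
    "bank": {
        "financial institution": {"money", "deposit", "loan", "account"},
        "river edge": {"river", "water", "shore", "fishing"},
    }
}


def disambiguate(word, sentence):
    """Pick the sense whose signal words overlap most with the sentence."""
    context = set(sentence.lower().split())
    senses = SENSES[word]
    return max(senses, key=lambda sense: len(senses[sense] & context))


SENSE_A = disambiguate("bank", "she opened an account at the bank to deposit money")
SENSE_B = disambiguate("bank", "we went fishing on the bank of the river")
print(SENSE_A)
print(SENSE_B)
```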

Deep Learning Models For Natural Language Processing

Regular multilayer perceptrons are unable to handle the interpretation of sequential data, where the order of the information is important. In order to deal with the importance of order in sequential data, a type of neural network is used that preserves information from previous timesteps.

Recurrent Neural Networks (RNNs) are neural networks that loop over sequential data, carrying information from previous timesteps forward and taking it into account when computing the output at the current timestep. Essentially, RNNs have three weight matrices that are used during the forward pass: a matrix applied to the previous hidden state, a matrix applied to the current input, and a matrix between the hidden state and the output. Because RNNs can take information from previous timesteps into account, they can extract relevant patterns from text data by taking earlier words in the sentence into account when interpreting the meaning of a word.
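The three matrices can be seen at work in a bare-bones forward pass. This is a minimal sketch with made-up weights, no biases, and a tanh activation; the names `w_xh`, `w_hh`, and `w_hy` are labels chosen for this example, and training is not shown.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rnn_forward(inputs, w_xh, w_hh, w_hy):
    """Run a simple RNN over a sequence and return all outputs and the last state."""
    hidden = [0.0] * len(w_hh)  # initial hidden state
    outputs = []
    for x in inputs:
        # Combine the current input with the previous hidden state.
        pre = [a + b for a, b in zip(matvec(w_xh, x), matvec(w_hh, hidden))]
        hidden = [math.tanh(p) for p in pre]
        # Map the hidden state to an output.
        outputs.append(matvec(w_hy, hidden))
    return outputs, hidden


# Made-up weights: 2 input features, 2 hidden units, 1 output.
W_XH = [[0.5, -0.3], [0.1, 0.8]]
W_HH = [[0.2, 0.0], [0.0, 0.2]]
W_HY = [[1.0, -1.0]]

OUT_A, _ = rnn_forward([[1, 0], [0, 1]], W_XH, W_HH, W_HY)
OUT_B, _ = rnn_forward([[0, 1], [1, 0]], W_XH, W_HH, W_HY)
print(OUT_A[-1], OUT_B[-1])
```

The two sequences contain the same items in a different order, yet the final outputs differ, because the hidden state carries the earlier input forward into the later step.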

Another type of deep learning architecture used to process text data is a Long Short-Term Memory (LSTM) network. LSTMs are a kind of RNN, but owing to the gating mechanisms in their architecture they tend to perform better than plain RNNs. They mitigate a specific problem that often occurs when training plain RNNs, called the vanishing gradient problem.
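The gradient problems of plain RNNs come from repeated multiplication: during training, the error signal is scaled by roughly the same factor at every timestep it travels back through. The sketch below reduces this to a single assumed scalar factor per step, which is a simplification of the real matrix products.

```python
def gradient_scale(factor, timesteps):
    """Overall scaling of a gradient multiplied by `factor` at every timestep."""
    return factor ** timesteps


VANISHING = gradient_scale(0.5, 20)  # factor below 1: the signal fades away
EXPLODING = gradient_scale(1.5, 20)  # factor above 1: the signal blows up
print(VANISHING, EXPLODING)
```

After only 20 steps the first signal has shrunk to less than a millionth of its size while the second has grown more than three thousand times, which is why long sequences are hard for plain RNNs to learn from.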

These deep neural networks can be either unidirectional or bi-directional. Bi-directional networks are capable of taking not just the words that come prior to the current word into account, but the words that come after it. While this leads to higher accuracy, it is more computationally expensive.

Use Cases For Natural Language Processing


Photo: mohammed_hassan via Pixabay, Pixabay License (https://pixabay.com/illustrations/chatbot-chat-application-artificial-3589528/)

Because Natural Language Processing involves the analysis and manipulation of human languages, it has an incredibly wide range of applications. Possible applications for NLP include chatbots, digital assistants, sentiment analysis, document organization, talent recruitment, and healthcare.

Chatbots and digital assistants like Amazon’s Alexa and Google Assistant are examples of voice recognition and synthesis platforms that use NLP to interpret and respond to vocal commands. These digital assistants help people with a wide variety of tasks, letting them offload some of their cognitive tasks to another device and free up some of their brainpower for other, more important things. Instead of looking up the best route to the bank on a busy morning, we can just have our digital assistant do it.

Sentiment analysis is the use of NLP techniques to study people’s reactions and feelings to a phenomenon, as communicated by their use of language. Capturing the sentiment of a statement, like interpreting whether a review of a product is good or bad, can provide companies with substantial information regarding how their product is being received.
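A very small lexicon-based scorer shows the basic idea of classifying a review as good or bad. The word lists are assumptions made up for this example, and the approach ignores negation and sarcasm, which is exactly where real systems need more than word counting.

```python
# Hypothetical sentiment lexicons.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "broken"}


def sentiment(review):
    """Label a review by counting positive and negative words."""
    words = [w.strip(".,!?").lower() for w in review.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


LABELS = [
    sentiment("I love this phone, the camera is excellent!"),
    sentiment("Terrible battery and a broken screen."),
    sentiment("It arrived on Monday."),
]
print(LABELS)
```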

Automatically organizing text documents is another application of NLP. Companies like Google and Yahoo use NLP algorithms to classify email documents, putting them in the appropriate bins such as “social” or “promotions”. They also use these techniques to identify spam and prevent it from reaching your inbox.
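Sorting messages into bins can be sketched with a small naive Bayes classifier, one common technique for text classification. Nothing here reflects how Google or Yahoo actually filter mail; the four training messages and the labels are made up for illustration.

```python
import math
from collections import Counter

# Made-up training messages.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "inbox"),
    ("lunch on monday with the team", "inbox"),
]


def train(examples):
    """Count words per label and messages per label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            # Add-one (Laplace) smoothing so unseen words do not zero out a label.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


MODEL = train(TRAINING)
print(classify("free prize", MODEL), classify("team meeting monday", MODEL))
```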

NLP techniques are also being used to identify potential job hires, finding candidates based on relevant skills. Hiring managers are also using NLP techniques to help them sort through lists of applicants.

NLP techniques are also being used to enhance healthcare. NLP can be used to improve the detection of diseases. Health records can be analyzed and symptoms extracted by NLP algorithms, which can then be used to suggest possible diagnoses. One example of this is Amazon’s Comprehend Medical platform, which analyzes health records and extracts diseases and treatments. Healthcare applications of NLP also extend to mental health. There are apps such as Woebot, which talks users through a variety of anxiety management techniques based on Cognitive Behavioral Therapy.

To Learn More

Recommended Natural Language Processing Courses

Course | Offered By | Duration | Difficulty
Introduction to Artificial Intelligence | IBM | 9 Hours | Beginner
Natural Language Processing in TensorFlow | Deep Learning AI | 9 Hours | Intermediate
An Introduction to Practical Deep Learning | Intel Software | 12 Hours | Intermediate
Natural Language Processing | Higher School of Economics | 34 Hours | Advanced

Natural Language Processing

Machine Learning Makes Inroads Into The Intricate Art of Translation


Language and writing expert Reuven Koret discussed in detail the influence and use of artificial intelligence in translation for the online publication ReadWrite. Koret points out that the use of AI-based machine translation tools in all aspects of the translation process is becoming widespread. This is not limited to the proprietary ML translation tools from Google, Microsoft, Facebook, and Amazon that are in daily use; it also extends to specialized professional tools from companies like SDL.

Still, many professional translators and agency staff, such as William Mamane, Head of Digital Marketing at Tomedes, a professional language services agency, remain skeptical about the use of AI in translation. But even skeptics like Mamane admit that machine translation has made serious advances, and as he points out, “there still is a place for AI and Machine Translation in the translation services value chain.”

To explain the challenge of machine translation, Koret notes that “at a basic level, MT uses algorithms to substitute words in one language for those in another. That proves insufficient to translate successfully. Understanding of whole phrases is necessary for both source and target languages. We can understand MT as decoding the source language and recording its meaning in the target language.”
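The gap between substituting words and understanding phrases is easy to demonstrate. The sketch below is a toy, not how any production translation system works: the French-to-English word list and the single phrase entry are hand-written assumptions.

```python
# Hypothetical word-level and phrase-level dictionaries (French -> English).
WORDS = {
    "la": "the", "pomme": "apple", "de": "of",
    "terre": "earth", "est": "is", "chaude": "hot",
}
PHRASES = {("pomme", "de", "terre"): "potato"}


def translate_word_by_word(sentence):
    """Substitute each word independently."""
    return " ".join(WORDS.get(word, word) for word in sentence.split())


def translate_with_phrases(sentence):
    """Prefer the longest known phrase at each position, else fall back to words."""
    tokens = sentence.split()
    output, i = [], 0
    while i < len(tokens):
        for length in range(len(tokens) - i, 0, -1):
            chunk = tuple(tokens[i:i + length])
            if chunk in PHRASES:
                output.append(PHRASES[chunk])
                i += length
                break
        else:
            # No phrase matched here, so translate the single word.
            output.append(WORDS.get(tokens[i], tokens[i]))
            i += 1
    return " ".join(output)


SOURCE = "la pomme de terre est chaude"
LITERAL = translate_word_by_word(SOURCE)
PHRASAL = translate_with_phrases(SOURCE)
print(LITERAL)
print(PHRASAL)
```

Word-by-word substitution yields “the apple of earth is hot”, while the phrase-aware version yields “the potato is hot”.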

Resolving this challenge is a very complex process and currently, the most developed processes are using “statistics to choose the best translation for a given phrase,” or “structured rules to select the most likely meaning.” These approaches still require the engagement of editors and proofreaders, but “that supervisory, editorial, or auditing role is less demanding and less time-consuming than translation.”

These are the methods on which most web translation apps, like Google Translate, are based. As Koret notes, Google processes translations that would fill one million books per day.

Currently, though, even bigger strides in using AI in the translation process are being made with neural machine translation (NMT), which uses deep learning and “looks at full sentences, not only just individual words.” At the same time, NMT requires “a fraction of the memory needed by statistical methods,” and it also works much faster.

NMT was first researched only in 2014, but the rapid advances of the last five years have made possible the development of translation systems built on the bidirectional recurrent neural network, or RNN. “These networks combine an encoder which formulated a source sentence for a second RNN, called a decoder. A decoder predicts the words that should appear in the target language.” Google is now using this approach in NMT to drive Google Translate. Microsoft also uses RNNs in Microsoft Translator and Skype Translator.

As Koret concludes, NMT systems can assist in translating while skilled linguists finish and polish the translation output. Future translators will more often be working with artificial intelligence rather than against it.

 
