
Natural Language Processing

Google Adds Two New Artificial Intelligence Features To Its Applications


As The Verge and CNET report, Google is adding two new AI features to its applications. The first is the Smart Compose feature, which will help Google Docs users; the second is the ability for users to buy movie tickets through its Duplex booking system.

Smart Compose

With Smart Compose, once it becomes fully available, users will be able to access “AI-powered writing suggestions outside of their inbox.” At the moment, “only domain administrators can sign up for the beta.”

The new feature will use Google’s machine learning models, which study the user’s “past writing to personalize its prompts (in Gmail you can turn this feature off in settings).” In theory, this means Smart Compose should tailor its writing suggestions to the user’s own style.
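To make the idea concrete, here is a minimal sketch of personalized next-word suggestion: a bigram model counted from a user’s past writing. This is purely illustrative; Google’s actual Smart Compose uses large neural language models, not a lookup table like this.

```python
from collections import defaultdict, Counter

# Purely illustrative: a bigram "autosuggest" counted from a user's own
# past writing. Google's real Smart Compose uses neural language models.

def train_bigrams(past_texts):
    """Count which word tends to follow which in the user's writing."""
    follows = defaultdict(Counter)
    for text in past_texts:
        words = text.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows

def suggest(follows, typed_so_far):
    """Suggest the most likely next word, given the last word typed."""
    last_word = typed_so_far.lower().split()[-1]
    candidates = follows.get(last_word)
    return candidates.most_common(1)[0][0] if candidates else None

model = train_bigrams(["thanks for the update", "thanks for your patience"])
print(suggest(model, "Thanks"))  # -> "for"
```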

The Verge suggests that bringing “Smart Compose to Google Docs could be a big step up for the tool, challenging its AI autosuggestions with a larger range of writing styles.” The new tool could be applied to any document created with the application – “from schoolwork to corporate planning documents” to first drafts of a novel.

In the beginning, Google will limit Smart Compose’s reach and target businesses only. As mentioned, Smart Compose for Docs is available only in beta, only in English, and only domain administrators can volunteer to test it. (You can sign up for it here.)

Google Duplex

Another feature, which Google announced on November 21, is Duplex on the Web, a booking tool that lets users buy movie tickets easily.

As CNET notes, the “service is available on Android phones. To use it, you’d ask the Assistant — Google’s digital helper software akin to Amazon’s Alexa and Apple’s Siri — to look up showtimes for a particular movie in your area. The software then opens up Google’s Chrome browser and finds the tickets.”

To offer the service, Google partnered with “70 movie theater and ticket companies, including AMC, Fandango and Odeon.” The company plans to expand the booking system to car rental reservations next.

The AI software included in the tool is “patterned after the human speech, using verbal tics like ‘uh’ and ‘um.’ It speaks with the cadence of a real person, pausing before responding and elongating certain words as though it’s buying time to think.” Duplex actually premiered last year, offering bookings for restaurants and hair salons. “Google later said it would build in disclosures so people would know they were talking to automated software.”

As explained, the new Duplex version for ordering movie tickets works as follows: “Once you’ve asked the Assistant for movie tickets, the software opens up a ticketing website in Chrome and starts filling in fields. The system enters information in the form by using data culled from your calendar, Gmail inbox and Chrome autofill (like your credit card and login information).

Throughout the process, you see a progress bar, like you’d see if you were downloading a file. Whenever the system needs more information, like a price or seat selection, the process pauses and prompts you to make a selection. When it’s done, you tap to confirm the booking or payment.”
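Read as a protocol, the flow CNET describes is: fill what you can from known data, pause for the user when a field needs a decision, then ask for final confirmation. The sketch below models that loop; all field names and the `ask_user` callback are hypothetical, not Google’s actual implementation.

```python
# Hypothetical model of the pause-and-prompt flow described above. The
# field names and ask_user callback are invented for illustration; this
# is not Google's Duplex implementation.

def book_tickets(form_fields, autofill, ask_user):
    """Fill a ticketing form, pausing whenever a field needs the user."""
    filled = {}
    for field in form_fields:
        if field in autofill:      # data culled from calendar/Gmail/autofill
            filled[field] = autofill[field]
        else:                      # e.g. a seat or price selection
            filled[field] = ask_user(f"Please choose a value for {field}: ")
    confirmed = ask_user("Confirm booking? (y/n) ") == "y"
    return filled if confirmed else None

autofill = {"name": "A. User", "date": "2019-11-22", "card": "****1111"}
demo_ask = lambda prompt: "12A" if "seat" in prompt else "y"
print(book_tickets(["name", "date", "card", "seat"], autofill, demo_ask))
```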


Natural Language Processing

Google’s New Meena Chatbot Can Hold Sensible, Specific Conversations About Almost Anything


As impressive and useful as virtual assistants like Siri, Alexa, and Google Assistant are, their conversational skills are typically limited to receiving certain commands and delivering pre-defined responses. Companies like Google and Amazon have been pursuing methods of AI training and development that can make AI chatbots more robust and flexible, able to carry on conversations with users in a much more natural way. As reported by DigitalTrends, Google has recently published a paper demonstrating the capabilities of its new chatbot, dubbed “Meena”. According to a blog post from the researchers, Meena can engage in conversation with its users on just about any topic.

Meena is an open-domain chatbot, meaning that it responds to the context of the conversation so far and adapts to inputs in order to deliver more natural responses. Most other chatbots are closed-domain, which means that their responses are themed around certain ideas and limited to accomplishing specific tasks.

According to Google’s report, Meena’s flexibility is the result of a massive training dataset. Meena was trained on around 40 billion words pulled from social media conversations and filtered for the most relevant and representative words. Google aimed to address some of the problems found in most voice assistants, such as the inability to handle topics and commands that unfold over multiple turns of a conversation, with the user providing additional inputs after the bot has responded. This means that many chatbots are unable to prompt the user for clarification; when a query can’t be interpreted, they often just default to web results.

In order to deal with this particular problem, Google’s researchers enabled their algorithms to keep track of the context of the conversation, meaning the model can generate specific answers. The model uses an encoder that processes what has already been said in the conversation and a decoder that creates a response based on that context. The model was trained on specific and non-specific data, where specific data means words closely related to the preceding statement. As the Google post explained:

“For example, if A says, ‘I love tennis,’ and B responds, ‘That’s nice,’ then the utterance should be marked ‘not specific’. That reply could be used in dozens of different contexts. But if B responds, ‘Me too, I can’t get enough of Roger Federer!’, then it is marked as ‘specific’ since it relates closely to what is being discussed.”

The data used to train the model consisted of conversations with seven “turns” each. The model has 2.6 billion parameters and was trained on 341 GB of text data, a dataset around 8.5 times larger than the one used to train the GPT-2 model created by OpenAI.
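For readers who want to see the encoder/decoder pattern in code, here is a toy sketch in PyTorch: the encoder condenses the conversation so far into a state, and the decoder scores possible next tokens for a reply conditioned on that state. The sizes are made up; Meena itself is a 2.6-billion-parameter seq2seq model, not this small GRU network.

```python
import torch
import torch.nn as nn

# Toy sketch of the encoder/decoder pattern described above. Sizes are
# illustrative; Meena is a 2.6B-parameter seq2seq model, not this GRU.

VOCAB, EMB, HID = 1000, 32, 64

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
    def forward(self, context_tokens):
        _, state = self.rnn(self.emb(context_tokens))
        return state                    # summary of the conversation so far

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)
    def forward(self, reply_tokens, state):
        hidden, _ = self.rnn(self.emb(reply_tokens), state)
        return self.out(hidden)         # next-token scores for the reply

context = torch.randint(0, VOCAB, (1, 12))   # tokens said so far
reply = torch.randint(0, VOCAB, (1, 5))      # reply being generated
logits = Decoder()(reply, Encoder()(context))
print(logits.shape)                          # (1, 5, 1000)
```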

Google reported how Meena performed on the Sensibleness and Specificity Average (SSA) metric. The SSA is a metric designed by Google researchers, intended to quantify the ability of a conversational agent to reply with specific, relevant responses as a conversation goes on.

SSA scores are calculated by testing a model against a fixed set of prompts and tracking the number of sensible and specific responses the model gives; generic responses are penalized. According to Google, an average person scores about 86% on the SSA, while Meena scored 79%. Another well-known agent, created by Pandorabots, won the Loebner Prize in recognition of its sophisticated human-like communication; it scored approximately 56% on the SSA test.
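As a rough sketch of the scoring arithmetic, assume each response is labeled by human raters as sensible and/or specific, and that SSA averages the two rates, as in Google’s paper. The labels below are invented:

```python
# Sketch of the SSA arithmetic: human raters label each response as
# sensible and/or specific, and SSA averages the two rates. The labels
# here are made up for illustration.

responses = [
    {"sensible": True,  "specific": True},   # "Me too, I love Federer!"
    {"sensible": True,  "specific": False},  # "That's nice." (generic)
    {"sensible": False, "specific": False},  # off-topic reply
]

sensibleness = sum(r["sensible"] for r in responses) / len(responses)
specificity = sum(r["specific"] for r in responses) / len(responses)
ssa = (sensibleness + specificity) / 2
print(f"SSA = {ssa:.0%}")  # SSA = 50%
```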

Microsoft and Amazon are also trying to make more flexible and natural chatbots. Microsoft has been attempting to create multiturn dialogue in chatbots for two years, acquiring Semantic Machines, an AI startup, to improve Cortana. Amazon recently ran the Alexa Prize challenge, which prompted participants to design a bot capable of conversing for approximately 20 minutes.


Natural Language Processing

AI Opens Up New Ways To Fight Illegal Opioid Sales And Other Cybercrime


The US Department of Health and Human Services (HHS) and the National Institute on Drug Abuse (NIDA) are investing in the use of AI to curb the illegal sale of opioids and hopefully reduce drug abuse. As Vox reported, NIDA’s AI tool will endeavor to track illegal internet pharmaceutical markets, but the approaches used by the AI could easily be applied to other forms of cybercrime.

One of the researchers responsible for the development of the tool, Timothy Mackey, recently spoke to Vox, where it was explained that the AI algorithms used to track the illegal sale of opioids could also be used to detect other forms of illegal sales, such as counterfeit products and illegal wildlife trafficking.

NIDA’s AI tool must be able to distinguish between general discussion of opioids and attempts to negotiate the sale of opioids. According to Mackey, only a relatively small percentage of tweets referencing opioids are actually related to illegal sales. Mackey explained that out of approximately 600,000 tweets referencing one of several different opioids, only about 2,000 actually marketed those drugs in any way. The AI tool must also be robust enough to keep up with changes in the language used to illegally market opioids. People who illegally sell drugs frequently use coded language and non-obvious keywords, and they quickly change strategies. Mackey explains that misspelled aliases for drug names are commonly used and that images of things other than the drugs in question are often used to create listings on websites like Instagram.

While Instagram and Facebook ban the marketing of drugs and encourage users to report instances of abuse, the illegal content can be very difficult to catch, precisely because drug sellers tend to change strategies and code words quickly. Mackey explained that these coded posts and hashtags on Instagram typically contain information about how to contact the dealer and purchase illegal drugs. Mackey also explained that some illegal sellers represent themselves as legitimate pharmaceutical companies and link to e-commerce platforms. While the FDA has often tried to crack down on these sites, they remain an issue.

In designing AI tools to detect illegal drug marketing, Mackey and the rest of the research team utilized a combination of deep learning and topic modeling. The research team designed a deep learning model that made use of a Long Short-Term Memory network trained on the text of Instagram posts, with the goal of creating a text classifier that could automatically flag posts that could be related to illegal drug sales. The research team also made use of topic modeling, letting their AI model discern keywords associated with opioids like Fentanyl and Percocet. This can make the model more robust and sophisticated, and it is able to match topics and conversations, not just single words. The topic modeling helped the research team reduce a dataset of around 30,000 tweets regarding fentanyl to just a handful of tweets that seemed to be marketing it.
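As an illustration of the classifier half of that pipeline, here is a minimal Keras sketch of an LSTM text classifier that flags posts as possible drug marketing. The layer sizes, vocabulary size, and the commented-out training call are placeholders, not the NIDA team’s actual model:

```python
import tensorflow as tf

# Minimal sketch of an LSTM text classifier of the kind described above.
# Sizes and data are placeholders, not the NIDA team's actual model.

VOCAB = 20000  # assumed vocabulary size after tokenization

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB, 64),            # token -> vector
    tf.keras.layers.LSTM(64),                        # reads the post token by token
    tf.keras.layers.Dense(1, activation="sigmoid"),  # P(post markets drugs)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# posts would be tokenized, padded Instagram captions; labels are 0/1 flags:
# model.fit(posts, labels, epochs=3)
```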

Mackey and the rest of the research team may have developed their AI application for use by NIDA, but social media companies like Facebook, Twitter, Reddit, and YouTube are also investing heavily in the use of AI to flag content that violates their policies. According to Mackey, he has been in talks with Twitter and Facebook about such applications before, but right now the focus is on creating a commercially available application based on his research for NIDA, one he hopes could be used by social media platforms, regulators, and more.

Mackey explained that the approach developed for the NIDA research could be generalized to fight other forms of cybercrime, such as the trafficking of animals or the illegal sale of firearms. Instagram has had problems with illegal animal trafficking before, banning the advertising of all animal sales in 2017 in response. The company also tries to remove any posts related to animal trafficking as soon as they pop up, but despite this there is a continued black market for exotic pets, and advertisements for them still show up in Instagram searches.

There are some ethical issues that will have to be negotiated if the NIDA tool is to be implemented. Drug policy experts warn that it could enable the over-criminalization of low-level drug sellers and that it could give the false impression that the problem is being solved, even though such AI tools may not reduce the overall demand for the substances. Nonetheless, if properly used, the AI tools could help law enforcement agencies establish links between online sellers and offline supply chains, helping them quantify the scope of the problem. In addition, techniques similar to those used by NIDA could be utilized to help combat opioid addiction, directing people toward rehabilitative resources when searches are made. As with any innovation, there are both risks and opportunities.


Big Data

Ricky Costa, CEO of Quantum Stat – Interview Series


Ricky Costa is the CEO of Quantum Stat, a company that offers business solutions for NLP and AI initiatives.

What initially got you interested in artificial intelligence?

Randomness. I was reading a book on probability when I came across a famous theorem. At the time, I naively wondered if I could apply this theorem to a natural language problem I was attempting to solve at work. As it turns out, the algorithm already existed unbeknownst to me: it was called Naïve Bayes, a very famous and simple generative model used in classical machine learning. That theorem was Bayes’ theorem. I felt this coincidence was a clue, and it planted a seed of curiosity to keep learning more.
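For readers curious what that looks like in practice, here is a toy Naïve Bayes text classifier using scikit-learn. The data is invented; this is just an illustration of the algorithm Costa mentions, not the problem he was solving:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy illustration of Naive Bayes on a natural language problem;
# the training data below is invented.

texts = ["great product, works well", "terrible, broke immediately",
         "love it, highly recommend", "awful quality, do not buy"]
labels = ["pos", "neg", "pos", "neg"]

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(texts, labels)  # applies Bayes' theorem over word counts
print(clf.predict(["works great, recommend it"]))  # -> ['pos']
```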

 

You’re the CEO of Quantum Stat, a company which offers solutions for Natural Language Processing. How did you find yourself in this position?

When there’s a revolution in a new technology, some companies are more hesitant than others when facing the unknown. I started my company because pursuing the unknown is fun to me. I also felt it was the right time to venture into the field of NLP, given all of the amazing research that has arrived in the past two years. The NLP community can now achieve a lot more with a lot less, thanks to the advent of new NLP techniques that require less data to scale performance.

 

For readers who may not be familiar with this field, could you share with us what Natural Language Processing does?

NLP is a subfield of AI and analytics that attempts to understand natural language in text, speech, or multi-modal form (text plus images/video), computing it to the point where you are driving insight and/or providing a valuable service. Value can arrive from several angles: information retrieval in a company’s internal file system, classifying sentiment in the news, or a GPT-2 Twitter bot that helps with your social media marketing (like the one we built a couple of weeks ago).

 

You have a Bachelor of Arts from Hunter College in Experimental Psychology. Do you feel that understanding the human brain and human psychology is an asset when it comes to understanding and expanding the field of Natural Language Processing?

This is contrarian, but unfortunately, no. The analogy between neurons and deep neural networks is simply for illustration and instilling intuition. One can probably learn a lot more from complexity science and engineering. The difficulty with understanding how the brain works is that we are dealing with a complex system. “Intelligence” is an emergent phenomenon of the brain’s complexity interacting with its environment, and it is very difficult to pin down. Psychology and other social sciences, which depend on “reductionism” (top-down), don’t work under this complex paradigm. Here’s the intuition: imagine someone attempting to reduce the Beatles’ song “Let It Be” to the C major scale. There’s nothing about that scale that predicts “Let It Be” will emerge from it. The same follows for someone attempting to reduce behavior to neural activity in the brain.

 

Could you share with us why Big Data is so important when it comes to Deep Learning and more specifically Natural Language Processing?

As it stands, because deep learning models interpolate data, the more data you feed into a model, the fewer edge cases it will see when making an inference in the wild. This architecture “incentivizes” feeding large datasets into models in order to increase the accuracy of the output. However, if we want AI models to achieve more intelligent behavior, we need to look beyond how much data we have and more toward how we can improve a model’s ability to reason efficiently, which, intuitively, shouldn’t require lots of data. From a complexity perspective, the cellular automata experiments conducted in the past century by John von Neumann and Stephen Wolfram show that complexity can emerge from simple initial conditions and rules. What those conditions and rules should be with regard to AI is what everyone’s hunting for.
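Wolfram’s Rule 30 is the classic demonstration of that point: a one-line update rule and a single live cell produce a complex, seemingly random pattern. A minimal sketch:

```python
# Rule 30: each cell's next state is looked up from its (left, center,
# right) neighborhood in the binary expansion of the number 30. A single
# live cell yields a complex, seemingly random triangle of activity.

RULE = 30
cells = [0] * 31 + [1] + [0] * 31  # simple initial condition: one live cell

for _ in range(16):
    print("".join("#" if c else " " for c in cells))
    cells = [
        (RULE >> (4 * l + 2 * c + r)) & 1  # look up new state in rule 30
        for l, c, r in zip([0] + cells[:-1], cells, cells[1:] + [0])
    ]
```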

 

You recently launched the ‘Big Bad NLP Database’. What is this database and why does it matter to those in the AI industry?

This database was created to give NLP developers seamless access to all the pertinent datasets in the industry. It indexes datasets, which has the nice secondary effect of making them queryable by users. Preprocessing data takes the majority of the time in the deployment pipeline, and this database attempts to mitigate that problem as much as possible. In addition, it’s a free platform for anyone, whether you are an academic researcher, a practitioner, or an independent AI guru who wants to get up to speed with NLP data.

 

Quantum Stat currently offers end-to-end solutions. What are some of these solutions?

We help companies facilitate their NLP modeling pipeline by offering development at any stage. We can cover a wide range of services, from data cleaning in the preprocessing stage all the way up to model server deployment in production (these services are also highlighted on our homepage). Not all AI projects come to fruition, due to the uncertainty of how your specific data and project architecture will work with a state-of-the-art model. Given this uncertainty, our services give companies a chance to iterate on their project at a fraction of the cost of hiring a full-time ML engineer.

 

What recent advancement in AI do you find the most interesting?

The most important advancement of late is the transformer model; you may have heard of it: BERT, RoBERTa, ALBERT, T5, and so on. These transformer models are very appealing because they allow researchers to achieve state-of-the-art performance with smaller datasets. Prior to transformers, a developer would need a very large dataset to train a model from scratch. Since transformers come pretrained on billions of words, they allow for faster iteration of AI projects, and they are what we are mostly involved with at the moment.
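As a quick sketch of what pretraining buys you in practice, the snippet below (assuming the Hugging Face transformers library) loads a BERT-family model already fine-tuned for sentiment and uses it with no training data of your own:

```python
from transformers import pipeline

# Assumes the Hugging Face `transformers` library; this downloads a small
# default sentiment model (a distilled BERT) on first run.
classifier = pipeline("sentiment-analysis")

print(classifier("Transformer models make NLP projects faster to iterate on."))
# -> [{'label': 'POSITIVE', 'score': ...}]
```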

 

Is there anything else that you would like to share about Quantum Stat?

We are working on a new project dealing with financial market sentiment analysis that will be released soon. We have leveraged multiple transformers to give unprecedented insight into how financial news unfolds in real time. Stay tuned!

To learn more visit Quantum Stat or read our article on the Big Bad NLP Database.
