Xinhua, the Chinese state news agency, has released its latest artificial intelligence (AI) 3D news anchor. The AI anchor joins a growing list of virtual presenters being developed by the agency.
The AI news anchor is named Xin Xiaowei, and it is modeled after Zhao Wanwei, who is one of the agency’s human news presenters.
According to the search engine Sogou, which co-developed the technology, the AI anchor utilizes “multi-modal recognition and synthesis, facial recognition and animation and transfer learning.”
Here comes Xin Xiaowei, the world's first 3D #AINewsAnchor.
Jointly developed by Sogou and Xinhua News Agency, she will report for Xinhua News Agency on the #TwoSessions, creating a new and dynamic viewing experience. pic.twitter.com/5Tok2Mm3Pl
— Sogou Inc. (@Sogou_Inc) May 21, 2020
The video released by Sogou shows Xin Xiaowei speaking on set about how the anchor can “intelligently imitate human voices, facial expressions, lip movements and mannerisms.”
Previous Virtual Presenters
Xin Xiaowei is not the only virtual presenter developed by Xinhua and the Beijing-based Sogou. It joins a growing list that includes their 2018 digital anchor Qiu Hao and a 2019 Russian-speaking version.
In 2018, the pair debuted two different AI news anchors, identical to each other in appearance, at the World Internet Conference. The two versions’ biggest difference was language, with one speaking English and the other Mandarin.
Both of the 2018 models were based on Zhang Zhao, who, like Zhao Wanwei, is one of the agency’s human anchors.
To develop these first models, hours of video footage were used to replicate the movements, expressions, and other features of the real-life anchors.
According to a report released by Xinhua in 2018, “AI anchors have officially become members of Xinhua‘s reporting team. Together with other anchors, they will bring you authoritative, timely and accurate news information in Chinese and English.”
The 2018 AI news anchors were used on various distribution channels including WeChat, the TV webpage, Weibo, and Xinhua’s English and Chinese Apps.
The Russian-speaking anchor was released at the St. Petersburg International Economic Forum 2019. It was developed through a different partnership than the other two versions, with Xinhua working with Russia’s leading news agency, ITAR-TASS.
The announcement came as the two nations celebrated their 70th year of diplomatic relations.
ITAR-TASS is one of the largest news organizations in the world, consisting of a network of businesses, media organizations, diplomatic missions, and financial and research institutions. It has over 1,500 reporters present in more than 63 countries.
“We are very excited to launch the world’s first Russian-speaking AI News Anchor,” said Xiaochuan Wang, CEO of Sogou, at the time. “The development of the Russian-speaking AI News Anchor allows us to share the benefits of Sogou’s leading AI technologies with more diverse audiences around the world. As one of the world’s largest news organizations, ITAR-TASS is an ideal partner for Sogou, and we look forward to introducing this new AI News Anchor to Russian-speaking audiences.”
The Spread of AI Personalities
The newest AI anchor from Xinhua and Sogou highlights the increasing presence of AI personalities, especially in the realm of media. The technology is improving so rapidly that AI presenters may soon be nearly indistinguishable from real-life human presenters.
The use of these AI anchors could dramatically alter the media landscape, but it is really just one part of AI’s broader expansion into the industry. Whether it is AI writers, news anchors, or some other use of the technology, it is going to become increasingly difficult to differentiate between what is human-made and what is AI-made.
Dor Skuler, the CEO and Co-Founder of Intuition Robotics – Interview Series
Dor Skuler is the co-founder and CEO of Intuition Robotics, a company redefining the relationship between humans and machines. They build digital companions, including ElliQ, a sidekick for happier aging that improves the lives of older adults.
Intuition Robotics is your fifth venture. What inspired you to launch this company?
Throughout my career, I’ve enjoyed finding brand new challenges that are in need of the latest technology innovations. As technology around us became more sophisticated, we believed that there was a need to redefine the relationship between humans and machines, through digital companion agents. We decided to start with helping older adults stay active and engaged with a social companion. We felt this was an important space to start with, as we could create a solution to help older adults avoid loneliness and social isolation. We’re doing this by focusing on celebrating aging and the joys of this specific period in life, rather than focusing on disabilities.
Intuition Robotics’ first product is ElliQ, a digital assistant for the elderly. How does ElliQ help older adults combat loneliness, dementia, and related challenges?
90% of older adults prefer to age at home, and we’re seeing a positive trend of “aging in place” at home and within their own communities, as opposed to moving to a senior care facility. We’re also seeing a strong willingness to adopt non-medical approaches to improve quality of life for older adults, including technologies that allow them to thrive and continue living independently, rather than offerings that only treat issues.
Many home assistants on the market today are reactive and command-based; they only respond to questions and do tasks when prompted. This does little to create a relationship and combat loneliness as you feel like you’re just talking to a machine. ElliQ is different in that she intuitively learns users’ interests and proactively makes suggestions. Instead of waiting for someone to ask her to play music, for example, ElliQ will suggest digital content like TED talks, trivia games, or music. She’ll learn her user’s routines and preferences and will prompt them to engage in an activity after ElliQ notices inactivity. ElliQ creates an emotional bond and helps users feel like they aren’t alone.
You’ve stated that proactive, AI-initiated interactions are very important. In one product demo, one of the interesting functions is that ElliQ will spontaneously introduce a piece of interesting information. Is this simply a way of connecting with the user? What are some of the other advantages of doing this?
Proactivity helps to create a bi-directional relationship with the user. Not only is the user prompting the device, but since the device is a goal-based AI agent that wants to motivate the user to be more connected and engaged with the world around them, she’ll proactively initiate interactions that promote the agent’s goals. Proactivity also helps the user relate better to the device and feel as if it is a lifelike entity rather than a piece of hardware.
Being proactive is important, but one of the challenges for a digital assistant is not annoying the user. How do you tackle this challenge?
We have designed our digital companions to encompass a “do not disturb the user” goal. This goal is part of the decision-making algorithm the agent uses to choose what to proactively initiate. It competes with the agent’s other goals, such as keeping the user entertained or connected to family members. Based on reinforcement learning, one of these goals “wins.”
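The goal-arbitration idea described here can be sketched as a simple scoring function, where each candidate goal is scored against the current context and the highest scorer wins. This is a hypothetical illustration; the goal names, features, and weights below are mine, not Intuition Robotics’ actual design.

```python
# Hypothetical sketch of goal arbitration: each competing goal is scored
# against the current context, and the highest-scoring goal "wins" the
# right to decide whether the agent initiates an interaction.

def choose_goal(context, weights):
    """Score each competing goal and return the winner."""
    scores = {
        # "Do not disturb" dominates while the user appears busy or resting.
        "do_not_disturb": weights["dnd"] * context["user_busy"],
        # Entertainment grows more attractive the longer the user is idle.
        "entertain": weights["entertain"] * context["idle_minutes"] / 60,
        # Social connection grows when family contact has been sparse.
        "connect_family": weights["connect"] * context["days_since_family_call"] / 7,
    }
    return max(scores, key=scores.get)

context = {"user_busy": 0.9, "idle_minutes": 10, "days_since_family_call": 1}
weights = {"dnd": 1.0, "entertain": 0.5, "connect": 0.5}
print(choose_goal(context, weights))  # do_not_disturb wins while the user is busy
```

In a learned system, the weights would be tuned by reinforcement learning from user feedback rather than fixed by hand.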
Can you discuss designing personality or character in order to enable the human to bond with the machine?
A distinct, character-based personality makes an AI agent more fun, intriguing, and approachable, so the user feels much more comfortable opening up and engaging in a two-way exchange of information. The agent’s personality also provides the unique opportunity to reflect and personify the brand and product that it’s embedded into – we like to think of it as a character in a movie or play. The agent is like an actor that was selected to play a specific role in a movie, serving its unique purpose in its environment (or “scene”) accordingly.
As such, the agent for a car would have a completely different personality and way of communicating than an agent designed for work or home. Nevertheless, its personality should be as distinct and recognizable as possible. We like to describe ElliQ’s personality as a combination of a Labrador and Sam from Lord of the Rings – highly knowledgeable, yet loyal and playful. Discovering the agent’s personality over time helps the user open up and get to know the agent, and the enticement keeps the user coming back for more.
Sometimes an AI may interrupt a conversation or some other event. Is ElliQ programmed to ask for forgiveness? If yes, how is this achieved without further annoying the end user?
ElliQ’s multimodality allows her to express her mistakes. For example, she can bow her head to signal that she’s apologetic. Overall, in designing an AI agent, it is very important to create fail-and-repair mechanisms that allow the agent to gracefully apologize for disturbing the user or not understanding.
One of the interesting things you stated is that ‘users don’t want to anticipate what she (ElliQ) will do next’. Do you believe that people yearn for the type of unpredictability that is normally associated with humans?
We think that users yearn for many elements of human interaction, including quirks like unpredictability, spontaneity, and fun. ElliQ achieves this with unprompted questions, suggestions, and recommendations for activities in both the digital and physical worlds. To avoid feeling like a machine, ElliQ is designed not to repeat herself but to surprise users with her interactions. This all helps evoke the feeling of being lifelike and allows a real bond to form.
Users will learn to expect ElliQ to anticipate their needs. Do you believe that some type of resentment towards the AI can begin to brew if this anticipation remains unmet?
Yes, I think users will be disappointed if AI around them doesn’t act on their behalf or expectations are unmet. This is why transparency is important when designing such agents – so the user really understands the boundaries of what is possible.
You also stated that users do not see ElliQ as something that is alive, instead they see it as an in-between, something not alive or a machine, but something closer to a presence or a companion. What does this tell us about the human condition, and how should people who design AI systems take this into consideration?
This tells us that as humans, we need interaction and relationships in order to feel connected. ElliQ won’t replace other humans, but she can help evoke similar feelings of companionship and help users not feel so lonely, or like they’re just talking to a box. She’s much more than an assistant or a machine; she’s a companion with a personality. She’s an emotive entity that users feel is lifelike, even though they fully understand she is actually a device.
Intuition Robotics also has a second product which is an in-car digital companion. Could you give us some details about this product?
In 2019, Toyota Research Institute (TRI) selected Intuition Robotics to collaborate on an in-car AI agent. Through this collaboration, we’re helping TRI create an experience in which the car engages drivers and passengers in a proactive and personalized way. The experience is powered by our cognitive AI platform, Q. It’s an in-cabin digital companion that creates a unique, personalized experience and aims to accelerate consumers’ trust in vehicle autonomy while making the in-car experience much more engaging and seamless.
Thank you for the interview, I really enjoyed learning more about your company and how ElliQ can be such a powerful solution for an elderly population that technology often ignores. Readers who wish to learn more should visit Intuition Robotics, or visit ElliQ.
Startups Creating AI Apps To Alert Pet Owners To Abnormal Behaviors
Sometime in the next few months, the startup Furbo aims to enable its streaming cameras to distinguish between the different types of barks made by dogs, alerting their owners if anything might be wrong. Furbo’s AI-driven system will be able to tell if a dog is whining, howling, or otherwise in alarm or distress, alerting pet owners to a potential problem at home when they are out and about.
Furbo is a company that provides streaming cameras to pet owners. The camera systems are capable of dispensing treats for pets if the owner desires, and will even alert owners that their dog is barking, enabling the owner to log in and view a live feed of their home. As reported by the New York Times, Furbo plans on expanding these services over the coming months with a tool that “translates” the various noises a dog makes into data that the system uses to alert owners to a possible crisis.
AI is being used in more and more pet-related products. AI-enabled devices let pet owners dispense treats for their dogs and play with their cats remotely. According to Andrew Bleiman, general manager for Tomofun (Furbo’s producer), the company wants to be able to tell its users when an emergency might be occurring at home, based on the behavior of dogs. Bleiman says this is a natural next step in the use of AI for pet-related products, pointing to how dogs have a long history of being owned and trained to alert their owners to potential danger. Bleiman explained that the AI-enabled version of Furbo was trained on video data gathered from thousands of the device’s users. Ten-second clips of dogs making various noises and communicating were used to train the machine learning algorithms within the device, after the users of the device provided relevant feedback on the clips.
According to Bleiman, the company has gathered a massive amount of data on dogs, which was analyzed using computer vision and bioacoustical techniques. The large amount of data enables the company’s researchers and engineers to create extremely precise and sensitive models, taking into account differences in individual dog breeds.
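The pipeline described above, labeled clips of vocalizations used to train a classifier, can be sketched in miniature. This toy is my own illustration, not Furbo’s actual system: a real pipeline would use spectrogram features and a trained neural network rather than the hand-written thresholds below.

```python
# Illustrative sketch (not Furbo's actual pipeline): label short audio clips
# of dog vocalizations using simple acoustic features.

def features(clip):
    """Extract toy bioacoustic features: mean energy and zero-crossing rate."""
    energy = sum(x * x for x in clip) / len(clip)
    zcr = sum(1 for a, b in zip(clip, clip[1:]) if a * b < 0) / (len(clip) - 1)
    return energy, zcr

def classify(clip):
    """Hypothetical rule: loud, rapid oscillation => bark; quiet => whine."""
    energy, zcr = features(clip)
    if energy > 0.25 and zcr > 0.3:
        return "bark"
    if energy < 0.1:
        return "whine"
    return "howl"

# Synthetic clips: a loud, fast-oscillating "bark" and a quiet "whine".
bark = [(-1) ** i * 0.8 for i in range(100)]
whine = [0.05 * ((i % 10) - 5) / 5 for i in range(100)]
print(classify(bark), classify(whine))  # bark whine
```

The user-provided feedback the article mentions would, in a trained system, supply the labels for clips like these.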
Tomofun isn’t the only company working on AI devices and models to help owners interact with and monitor their pets. The device known as Petcube comes equipped with Alexa and a camera, and the developers of the device are working on AI techniques that can identify “unusual behaviors”. The device compares a dog’s behavior to a baseline of healthy behavior, perhaps recommending the owner check in with a vet if their dog is unusually lethargic.
Meanwhile, the toy called Felik was designed by Yuri Brigance, and it utilizes AI along with computer vision techniques like semantic segmentation to entertain cats throughout the day. The device is equipped with a camera and a laser pointer, and it moves the laser in an intelligent fashion. The device collects data on what the cat has been doing and where it has been, trying to recreate the feeling of chasing live prey for the cat. Cats need a stimulus that resembles hunting in order to be happy and healthy, so in this respect, the device is an animal wellness tool.
Lionel P. Robert Jr., an associate professor at the University of Michigan’s School of Information, predicts that the future of AI pet technology will come to focus more on the welfare of pets. Until now, most AI-supported pet technologies have centered on letting owners confirm that their pet is okay while they are out of the house. Robert hypothesizes that all the data being collected could be processed and fed to veterinarians in real time, allowing automated systems to track a pet’s health by monitoring variables like movement and weight.
Marc Sloan, Co-Founder & CEO of Scout – Interview Series
Marc Sloan is the Co-Founder & CEO of Scout, the world’s first web browser chatbot, a digital assistant for getting anything done online. Scout suggests useful things it can do for you based on what you’re doing online.
What initially attracted you to AI?
My first experience of working on AI was during a gap year I spent working in the natural language processing research team at GCHQ during my Bachelor’s degree. I got to see first-hand the impact machine learning could have on real world problems and the difference it makes.
It flipped a switch in my mind about how computers can be used to solve problems: software engineering teaches you to create programs that take data and produce results, but machine learning lets you take data and a description of the results you want, and produce a program. This means you can use the same framework to solve thousands of different problems. To me, this felt far more impactful than having to write a program for each problem.
I was already studying optimisation problems in mathematics alongside computer science, so once I got back to university I focused on AI and completed my dissertation on speech processing before applying for a PhD in Information Retrieval at UCL.
You researched reinforcement learning in web search under the supervision of David Silver, the lead researcher behind AlphaGo. Could you discuss some of this research?
My PhD was on the topic of applying reinforcement learning to learning to rank problems in information retrieval, a field I helped create called Dynamic Information Retrieval. I was supervised by Prof Jun Wang and Prof David Silver, both experts in agent-based reinforcement learning.
Our research looked at how search engines could learn from user behaviour to improve search results autonomously over time. Using a Multi-Armed Bandit approach, our system would attempt different search rankings and collect click behaviour to determine if they were effective or not. It could also adapt to individual users over time and was particularly effective in handling ambiguous search queries. At the time, David was focusing deeply on the Go problem and he helped me determine the appropriate reinforcement learning setup of states and value function for this particular problem.
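The Multi-Armed Bandit approach described here can be illustrated with a minimal epsilon-greedy sketch over candidate rankings, where click feedback gradually identifies the ordering users prefer. This toy is my own illustration in the spirit of the research, not the actual Dynamic Information Retrieval algorithm.

```python
import random

# Minimal epsilon-greedy bandit over candidate search-result rankings.
# Each "arm" is one ordering of results; clicks are the reward signal.

class RankingBandit:
    def __init__(self, rankings, epsilon=0.1):
        self.rankings = rankings              # candidate result orderings
        self.epsilon = epsilon                # exploration rate
        self.clicks = [0] * len(rankings)     # observed clicks per ranking
        self.shown = [0] * len(rankings)      # times each ranking was shown

    def select(self):
        """Explore a random ranking with prob. epsilon, else exploit best CTR."""
        if random.random() < self.epsilon:
            return random.randrange(len(self.rankings))
        ctr = [c / s if s else 1.0 for c, s in zip(self.clicks, self.shown)]
        return max(range(len(ctr)), key=ctr.__getitem__)

    def update(self, arm, clicked):
        """Record click feedback for the ranking that was shown."""
        self.shown[arm] += 1
        self.clicks[arm] += int(clicked)

# Simulate users who click ranking 1 twice as often as ranking 0.
random.seed(0)
bandit = RankingBandit(rankings=[["a", "b"], ["b", "a"]], epsilon=0.2)
true_ctr = [0.3, 0.6]
for _ in range(2000):
    arm = bandit.select()
    bandit.update(arm, random.random() < true_ctr[arm])

ctrs = [c / s for c, s in zip(bandit.clicks, bandit.shown)]
print(max(range(2), key=ctrs.__getitem__))  # index of the ranking users click more
```

The real research problem adds the states and value function mentioned above, so the system can also handle ambiguous queries and per-user adaptation rather than a single static preference.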
What are some of the entrepreneur lessons that you learned from working with David Silver?
Research at UCL is often entrepreneurial. David had previously founded Elixir Studios with Demis Hassabis and then, of course, joined DeepMind to work on AlphaGo. Other members of our Media Futures research group also ended up spinning out a range of startups: Jun founded Mediagamma (applying RL to online ad spend), Simon Chan started PredictionIO (acquired by Salesforce), and Jagadeesh Gorla started Jaggu (a recommendation service for e-commerce). Our team often discussed the commercial impact our research could have; I think perhaps because UCL’s base in London makes it a natural starting point for creating a business.
You recently launched Scout, the world’s first web browser chatbot. What was the inspiration behind launching Scout?
The idea naturally evolved from my PhD research. I went straight from finishing my PhD to joining Entrepreneur First where I started to think about how I could turn my research into a product.
Before I started this, I completed an internship at Microsoft Research where I applied my research to Bing. At the time, the main thing I learned from my research was that information finding could be predicted based on online user behaviour. But I became frustrated that the only real way to surface these predictions in a search engine was by making auto-suggest better. So I started to think about how the user’s entire online experience could be improved using these predictions, not just the search experience.
It was this thinking that led me and my new co-founder on Entrepreneur First to create a browser add-on that observes user behaviour, predicts what information the user is likely to need next online, and fetches it for them. After a few years of experiments and prototypes, this evolved into a chatbot interface where the browser ‘chats’ to you about what you’re up to online and tries to help you along the way.
Which web browsers will Scout be compatible with?
We’re focusing on Chrome at the moment due to it being the most popular web browser and having a mature add-on architecture, but we have prototypes working on Firefox and Safari and even a mobile app.
The Scout shopping assistant functionality sounds like it could save users both time and money. Assuming someone is researching a product on Amazon, what happens in the backend, and how does Scout interact with the user?
The idea is that once you have Scout installed, you just continue using the web as normal. If you’re shopping, you may visit Amazon to look at products. At this point, Scout recognises that you’re shopping on Amazon, and the product you’re looking at, and it will say “Hello”. It pops up as a chat widget on the webpage, kind of like how Intercom works, except Scout can appear on potentially any webpage. You can see what it looks like on my website.
Because you’re shopping, it’ll start to suggest ways it can help. It’ll ask you if you want to see reviews online, other prices, YouTube videos of the product and more. You interact by pressing buttons and the chatbot tailors the experience to what you want it to do. Whenever it finds information (like a YouTube video), it will embed it within the chat thread, just like how a friend might share media with you on WhatsApp. Over time, you end up having a dialogue with the browser about what you are doing online, with the browser helping you along the way.
The webpage processing happens within the browser itself. The only information our backend sees is the chat thread, meaning that the privacy implications are minimal.
We have a bespoke architecture for understanding online browsing behaviour and managing dialogues with the user. We use machine learning to identify what tasks we can help with online and how we should help. Originally, we used reinforcement learning to adapt to user preferences over time. However, one of the biggest lessons I’ve learned from running an AI startup is to keep processes simple and to try to only use machine learning to optimise an existing process. So instead, we now have a sophisticated rules engine for handling tasks over time that can be managed by reinforcement learning once we need to scale.
What are some examples of how Scout can assist with event planning?
We realised that event planning (and travel booking) are not so different from shopping online. You’re still looking at products, reading reviews and committing to purchase/attend. So a lot of what we’ve built for shopping also applies here.
The biggest difference is that time and location are now important. So for instance, if you’re looking at concert tickets on Ticketmaster, Scout can identify the address of the venue and suggest finding you directions from your current location to it, or find the price of an Uber, or suggest what time you should leave. If you’ve connected Scout into your calendar, then Scout can check to see if you’re available at the time of the event and add it to your calendar for you.
In the future, we foresee Scout users being able to communicate to their friends through the platform to discuss the things they’re doing online such as event planning, shopping, work etc.
Dialogue triggers will be used for Scout to initiate communications. What are some of these triggers?
By default, Scout won’t disturb you unless it encounters a trigger that tells it you may need help. There are several types of trigger:
- Visiting a specific website.
- Visiting a type of website (such as news, shopping etc.).
- Visiting a website containing a certain type of information (i.e. an address, a video etc.).
- Clicking links or buttons on webpages.
- Interacting with Scout by pressing buttons.
- Scout retrieving certain types of media such as videos, music, tweets etc.
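The trigger types above might be represented as predicates over browsing events, with Scout speaking up only when one matches. This is a hypothetical sketch; the names and event structure are my own illustration, not Scout’s actual code.

```python
# Hypothetical trigger table: each trigger is a named predicate over a
# browsing event. An empty match list means Scout stays quiet.

TRIGGERS = [
    ("specific_site", lambda e: e.get("url", "").startswith("https://www.amazon.")),
    ("site_type",     lambda e: e.get("category") in {"news", "shopping"}),
    ("has_info",      lambda e: "address" in e.get("entities", [])),
    ("ui_click",      lambda e: e.get("type") == "click"),
]

def matched_triggers(event):
    """Return the names of all triggers the event fires."""
    return [name for name, pred in TRIGGERS if pred(event)]

event = {"url": "https://www.amazon.com/dp/B000", "category": "shopping",
         "type": "pageview", "entities": []}
print(matched_triggers(event))  # ['specific_site', 'site_type']
```

User fine-tuning, as described below the list, would amount to enabling or disabling rows of a table like this, with learned preferences adjusting it automatically over time.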
We plan to allow users to fine-tune what type of triggers they want Scout to respond to, and eventually, learn their preference automatically.
Can you discuss some of the difficulties behind ensuring that Scout is genuinely helpful when it decides to interact with a user without becoming annoying?
We take user engagement very seriously and try to measure whether interactions led to positive or negative outcomes. We try to maintain a good ratio for how often Scout tries to start a conversation and how often it’s used. However, it’s a tricky balance to get right and we’re always trying to improve.
Because of the intrusive nature of this product, getting the interface and UX right is critical. We’ve spent a lot of time trying completely different interfaces and user interaction methods. This work has led us to the current, chatbot style interface, which we find gives us the greatest flexibility in the help we can provide, coupled with user familiarity and minimal user effort for interactions.
Can you provide other scenarios of how Scout can assist end users?
Our focus at the moment is in market-testing specific applications for Scout. Shopping and event planning have already been mentioned, but we’re also looking at how Scout can help academics (with finding research papers, author details and reference networks) and even guitarists (finding guitar sheet music, playing music and videos alongside sheet music online and helping to tune a guitar). We’ve also spent some time exploring professional scenarios such as online recruitment, financial analysis and law.
Ultimately, Scout can potentially work on any website and help in any scenario, which is what makes the technology incredibly exciting, but also makes it difficult to get started.
Is there anything else that you would like to share about Scout?
If you’d like to see what it’s like if your browser could talk to you, you can read more on Scout’s blog.