Reinforcement Learning

DeepMind Creates AI That Replays Memories Like The Hippocampus




The human brain often recalls past memories seemingly unprompted. As we go about our day, we have spontaneous flashes of memory from our lives. While this spontaneous conjuration of memories has long been of interest to neuroscientists, AI research company DeepMind recently published a paper detailing how one of its AI systems replicated this strange pattern of recall.

The conjuration of memories in the brain, neural replay, is tightly linked with the hippocampus. The hippocampus is a seahorse-shaped formation in the brain that belongs to the limbic system, and it is associated with the formation of new memories, as well as the emotions that memories spark. Current theories on the role of the hippocampi (there is one in each hemisphere of the brain) state that different regions of the hippocampus are responsible for handling different types of memories. For instance, spatial memory is believed to be handled in the rear region of the hippocampus.

As reported by Jesus Rodriguez, Dr. John O’Keefe is responsible for many contributions to our understanding of the hippocampus, including the discovery of hippocampal “place” cells. The place cells in the hippocampus are triggered by stimuli in a specific environment. As an example, experiments on rats showed that specific neurons would fire when the rats ran through certain portions of a track. Researchers continued to monitor the rats while they were resting, and they found that the same patterns of neurons denoting a portion of the maze would fire, although at an accelerated speed. The rats seemed to be replaying the memories of the maze in their minds.

In humans, recalling memories is an important part of the learning process, but when trying to enable AI to learn, it is difficult to recreate the phenomenon.

The DeepMind team set about trying to recreate the phenomenon of recall using reinforcement learning. Reinforcement learning algorithms work by getting feedback from their interactions with the environment around them, getting rewarded whenever they take actions that bring them closer to the desired goal. In this context, the reinforcement learning agent records events and then plays them back at later times, with the system being reinforced to improve how efficiently it ends up recalling past experiences.

DeepMind added the replaying of experiences to a reinforcement learning algorithm using a replay buffer that would play back memories (recorded experiences) to the system at specific times. Some versions of the system had the experiences played back in random order, while other models had pre-selected playback orders. While the researchers experimented with the order of playback for the reinforcement agents, they also experimented with different methods of replaying the experiences themselves.
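To make the idea of a replay buffer concrete, here is a minimal sketch in Python. It is not DeepMind's implementation; the class name, method names, and the shape of each stored experience are assumptions chosen for illustration. It shows the two playback styles mentioned above: random order and the order in which experiences were recorded.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) tuples.

    Hypothetical illustration; not taken from the DeepMind paper.
    """

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest experience when full.
        self.memory = deque(maxlen=capacity)

    def record(self, state, action, reward, next_state):
        """Store one experience at the end of the buffer."""
        self.memory.append((state, action, reward, next_state))

    def replay_random(self, batch_size, rng=random):
        """Play back up to batch_size experiences in random order."""
        k = min(batch_size, len(self.memory))
        return rng.sample(list(self.memory), k)

    def replay_in_order(self, batch_size):
        """Play back the most recent experiences in the order recorded."""
        if batch_size <= 0:
            return []
        return list(self.memory)[-batch_size:]


buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.record(step, "move", 0.0, step + 1)

# Only the three newest experiences remain; the two oldest were evicted.
print(buffer.replay_in_order(2))
print(buffer.replay_random(2))
```

In a full agent, the batches returned by either method would be fed back into the learning update, so the agent learns from old experiences as well as new ones.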

There are two primary methods that are used to provide reinforcement algorithms with recalled experiences. These methods are the imagination replay method and the movie replay method. The DeepMind paper uses an analogy to describe both of the strategies:

“Suppose you come home and, to your surprise and dismay, discover water pooling on your beautiful wooden floors. Stepping into the dining room, you find a broken vase. Then you hear a whimper, and you glance out the patio door to see your dog looking very guilty.”

As reported by Rodriguez, the imagination replay method doesn’t record the events in the order that they were experienced. Rather, a probable cause between the events is inferred. The events are inferred based on the agent’s understanding of the world. Meanwhile, the movie replay method stores memories in the order in which the events occurred, and replays the sequence of stimuli – “spilled water, broken vase, dog”. The chronological ordering of events is preserved.
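The difference between the two strategies can be sketched with the vase example. The code below is only an illustration under stated assumptions: the `assumed_causes` mapping is a hand-written stand-in for the agent's learned understanding of the world, and the function names are hypothetical. Movie replay returns the events as experienced, while imagination replay reorders them so that every inferred cause comes before its effect.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Events in the order they were experienced.
observed = ["spilled water", "broken vase", "dog"]

# Hypothetical world model: each event maps to the events assumed to cause it.
assumed_causes = {
    "broken vase": {"dog"},
    "spilled water": {"broken vase"},
}


def movie_replay(events):
    """Replay events in the chronological order they were observed."""
    return list(events)


def imagination_replay(events, causes):
    """Replay events in an inferred causal order, causes before effects."""
    graph = {event: causes.get(event, set()) for event in events}
    return list(TopologicalSorter(graph).static_order())


print(movie_replay(observed))
print(imagination_replay(observed, assumed_causes))
```

Here the causal chain is hard-coded; in a learning agent, that ordering would have to be inferred from experience.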

Research from the field of neuroscience implies that the movie replay method is integral to the creation of associations between concepts and the connection of neurons between events. Yet the imagination replay method could help the agent create new sequences when it reasons by analogy. For instance, the agent could reason that if a barrel is to oil as a vase is to water, a barrel could be spilled by a factory robot instead of a dog. Indeed, when DeepMind probed further into the possibilities of the imagination replay method, they found that their learning agent was able to create impressive, innovative sequences by taking previous experiences into account.

Most of the current progress in the area of reinforcement learning memory is being made with the movie strategy, although researchers have recently begun to make progress with the imagination strategy. Research into both methods of AI memory can not only enable better performance from reinforcement learning agents, but it can also help us gain new insight into how the human mind might function.

Reinforcement Learning

AI Model Might Let Game Developers Generate Lifelike Animations




A team of researchers at Electronic Arts has recently experimented with various artificial intelligence algorithms, including reinforcement learning models, to automate aspects of video game creation. The researchers hope that the AI models can save their developers and animators time on repetitive tasks like coding character movement.

Designing a video game, particularly the large, triple-A video games designed by large game companies, requires thousands of hours of work. As video game consoles, computers, and mobile devices become more powerful, video games themselves become more and more complex. Game developers are searching for ways to produce more game content with less effort. For example, they often use procedural generation algorithms to produce landscapes and environments. Similarly, artificial intelligence algorithms can be used to generate video game levels, automate game testing, and even animate character movements.

Character animations for video games are often completed with the assistance of motion capture systems, which track the movements of real actors to ensure more lifelike animations. However, this approach has limitations. Not only does the code that drives the animations still need to be written, but animators are also limited to the actions that have been captured.

As Wired reported, researchers from EA set out to automate this process and save both time and money on these animations. The team of researchers demonstrated that a reinforcement learning algorithm could be used to create a human model that moves in realistic fashions, without the need to manually record and code the movements. The research team used “Motion Variational Autoencoders” (Motion VAEs) to identify relevant patterns of movement from motion-capture data. After the autoencoders extracted the movement patterns, a reinforcement learning system was trained with the data, with the goal of creating realistic animations based on certain objectives (such as running after a ball in a soccer game). The planning and control algorithms used by the research team were able to generate the desired motions, even producing motions that weren’t in the original set of motion-capture data. This means that after learning how a subject walks, the reinforcement learning model can determine what running looks like.

Julian Togelius, an NYU professor and co-founder of an AI tools company, was quoted by Wired as saying that the technology could be quite useful in the future and is likely to change how content for games is created.

“Procedural animation will be a huge thing. It basically automates a lot of the work that goes into building game content,” Togelius said to Wired.

According to professor Michiel van de Panne from UBC, who was involved with the reinforcement learning project, the research team is looking to take the concept further by animating non-human avatars with the same process. Van de Panne said to Wired that although the process of creating new animations can be quite difficult, he is confident the technology will be able to render appealing animations someday.

Other applications of AI in the development of video games include the generation of basic games. For instance, researchers at the University of Toronto managed to design a generative adversarial network that could recreate the game Pac-Man without access to any of the code used to design the game. Elsewhere, researchers from the University of Alberta used AI models to generate video game levels based on the rules of different games like Super Mario Bros. and Mega Man.



Dor Skuler, the CEO and Co-Founder of Intuition Robotics – Interview Series




Dor Skuler is the co-founder and CEO of Intuition Robotics, a company redefining the relationship between humans and machines. The company builds digital companions, including ElliQ, “the sidekick for happier aging,” which is designed to improve the lives of older adults.

Intuition Robotics is your fifth venture. What inspired you to launch this company?

Throughout my career, I’ve enjoyed finding brand new challenges that are in need of the latest technology innovations. As technology around us became more sophisticated, we believed that there was a need to redefine the relationship between humans and machines, through digital companion agents. We decided to start with helping older adults stay active and engaged with a social companion. We felt this was an important space to start with, as we could create a solution to help older adults avoid loneliness and social isolation. We’re doing this by focusing on celebrating aging and the joys of this specific period in life, rather than focusing on disabilities.


Intuition Robotics’ first product is ElliQ, a digital assistant for the elderly. How does ElliQ help older adults fight loneliness, dementia, etc.?

90% of older adults prefer to age at home, and we’re seeing a positive trend of “aging in place” at home and within their own communities, as opposed to moving to a senior care facility. We’re also seeing a strong willingness to adopt non-medical approaches to improve quality of life for older adults, including technologies that allow them to thrive and continue living independently, rather than offerings that only treat issues.

Many home assistants on the market today are reactive and command-based; they only respond to questions and do tasks when prompted. This does little to create a relationship and combat loneliness as you feel like you’re just talking to a machine. ElliQ is different in that she intuitively learns users’ interests and proactively makes suggestions. Instead of waiting for someone to ask her to play music, for example, ElliQ will suggest digital content like TED talks, trivia games, or music. She’ll learn her user’s routines and preferences and will prompt them to engage in an activity after ElliQ notices inactivity. ElliQ creates an emotional bond and helps users feel like they aren’t alone.


You’ve stated that proactive, AI-initiated interactions are very important. In one product demo, one of the interesting functions is that ElliQ will randomly introduce a piece of interesting information. Is this simply a way of connecting with the user? What are some of the other advantages of doing this?

Proactivity helps to create a bi-directional relationship with the user. Not only is the user prompting the device, but since the device is a goal-based AI agent wanting to motivate the user to be more connected and engaged in the world around him, she’ll proactively initiate interactions that will promote the agent’s goals. Proactivity also helps the user in relating better to the device and feeling as if this is a lifelike entity and not a piece of hardware.


Being proactive is important, but one of the challenges of a digital assistant is not to annoy the user. How do you tackle this challenge?

We have been designing our digital companions to encompass a “do not disturb the user” goal. This goal is part of our decision-making algorithm, which the agent uses to decide what to proactively initiate. This goal competes with the agent’s other goals, such as keeping the user entertained or connected to family members. Based on reinforcement learning, one of these goals “wins”.
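One way to picture competing goals being settled by reinforcement learning is the sketch below. It is not Intuition Robotics' algorithm, which the interview does not describe; the class name, goal names, and parameters are all assumptions for illustration. Each goal keeps a learned value estimate, the goal with the highest estimate usually “wins,” and feedback from the user nudges the estimate of the goal that was acted on.

```python
import random


class GoalArbiter:
    """Epsilon-greedy arbitration between competing goals (hypothetical)."""

    def __init__(self, goals, learning_rate=0.1, epsilon=0.1, seed=None):
        # Every goal starts with the same neutral value estimate.
        self.values = {goal: 0.0 for goal in goals}
        self.learning_rate = learning_rate
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def choose(self):
        """Return the winning goal, exploring at random with probability epsilon."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def update(self, goal, reward):
        """Move the goal's value estimate a small step toward the observed reward."""
        self.values[goal] += self.learning_rate * (reward - self.values[goal])


arbiter = GoalArbiter(["do_not_disturb", "entertain", "connect_family"], epsilon=0.0)
arbiter.update("entertain", 1.0)   # the user enjoyed a suggestion
print(arbiter.choose())
arbiter.update("entertain", -1.0)  # the user was annoyed by the next one
print(arbiter.choose())
```

With this setup, negative feedback on proactive suggestions lets the “do not disturb” goal win more often, while positive feedback has the opposite effect.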


Can you discuss designing personality or character in order to enable the human to bond with the machine?

A distinct, character-based personality makes an AI agent more fun, intriguing, and approachable, so the user feels much more comfortable opening up and engaging in a two-way exchange of information. The agent’s personality also provides the unique opportunity to reflect and personify the brand and product that it’s embedded into – we like to think of it as a character in a movie or play. The agent is like an actor that was selected to play a specific role in a movie, serving its unique purpose in its environment (or “scene”) accordingly.

As such, the agent for a car would have a completely different personality and way of communicating than an agent designed for work or home. Nevertheless, its personality should be as distinct and recognizable as possible. We like to describe ElliQ’s personality as a combination of a Labrador and Sam from Lord of the Rings – highly knowledgeable, yet loyal and playful. Discovering the agent’s personality over time helps the user open up and get to know the agent, and the enticement keeps the user coming back for more.


Sometimes an AI may interrupt a conversation or some other event. Is ElliQ programmed to ask for forgiveness? If yes, how is this achieved without further annoying the end user?

ElliQ’s multimodality allows her to acknowledge her mistakes. For example, she can bow her head to signal that she’s apologetic. Overall, in designing an AI agent, it is very important to create fail-and-repair mechanisms that allow the agent to apologize gracefully for disturbing the user or for not understanding them.


One of the interesting things you stated is that ‘users don’t want to anticipate what she (ElliQ) will do next’. Do you believe that people yearn for the type of unpredictability that is normally associated with humans?

We think that users yearn for many elements of human interaction, including quirks like unpredictability, spontaneity, and fun. ElliQ achieves this with unprompted questions, suggestions, and recommendations for activities in the digital world and the physical world. To reduce the feeling of dealing with a machine, ElliQ is designed not to repeat herself but to surprise users with her interactions. This is all to evoke the feeling of being lifelike and to allow the creation of a real bond.


Users will learn to expect ElliQ to anticipate their needs. Do you believe that some type of resentment towards the AI can begin to brew if this anticipation remains unmet?

Yes, I think users will be disappointed if AI around them doesn’t act on their behalf or expectations are unmet. This is why transparency is important when designing such agents – so the user really understands the boundaries of what is possible.


You also stated that users do not see ElliQ as something that is alive, instead they see it as an in-between, something not alive or a machine, but something closer to a presence or a companion. What does this tell us about the human condition, and how should people who design AI systems take this into consideration?

This tells us that as humans, we need interaction and relationships in order to feel connected. ElliQ won’t replace other humans, but she can help evoke similar feelings of companionship and help users not feel so lonely or like they’re just talking to a box. She’s much more than an assistant or a machine; she’s a companion with a personality. She’s an emotive entity that users feel is lifelike, even though they fully understand that she is actually a device.


Intuition Robotics also has a second product which is an in-car digital companion. Could you give us some details about this product?

In 2019, Toyota Research Institute (TRI) selected Intuition Robotics to collaborate on an in-car AI agent. Through this collaboration, we’re helping TRI create an experience in which the car will engage drivers and passengers in a proactive and personalized way. The experience is powered by our cognitive AI platform, Q. It’s an in-cabin digital companion that creates a unique, personalized experience and aims to accelerate consumers’ trust in vehicle autonomy and create a much more engaging and seamless in-car experience.

Thank you for the interview. I really enjoyed learning more about your company and how ElliQ can be such a powerful solution for an elderly population that technology often ignores. Readers who wish to learn more should visit Intuition Robotics, or visit ElliQ.


Reinforcement Learning

Google’s AI teaches robots how to move by watching dogs




Even some of the most advanced robots today still move in somewhat clunky, jerky ways. In order to get robots to move in more lifelike, fluid ways, researchers at Google have developed an AI system that is capable of learning from the motions of real animals. The Google research team published a preprint paper that detailed their approach late last week. In the paper and an accompanying blog post, the research team describes the rationale behind the system. The authors of the paper believe that endowing robots with more natural movement could help them accomplish real-world tasks that require precise movement, such as delivering items between different levels of a building.

As VentureBeat reported, the research team utilized reinforcement learning to train their robots. The researchers began by collecting clips of real animals moving and then used reinforcement learning (RL) techniques to push the robots towards imitating the movements of the animals in the clips. In this case, the researchers trained on clips of a dog, working in a physics simulator and instructing a four-legged Unitree Laikago robot to imitate the dog’s movements. After the robot was trained, it was capable of accomplishing complex motions like hopping, turning, and walking swiftly, at a speed of around 2.6 miles per hour.
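A reward that pushes a robot towards imitating a reference motion can be sketched as follows. This is a commonly used form for motion imitation, given here only as an assumption; the article does not state the reward the Google team actually used. The function name, the pose representation (a list of joint angles in radians), and the `scale` value are hypothetical.

```python
import math


def imitation_reward(robot_pose, reference_pose, scale=5.0):
    """Return a reward in (0, 1] that is highest when the poses match.

    Both poses are sequences of joint angles in radians (an assumed format).
    """
    if len(robot_pose) != len(reference_pose):
        raise ValueError("poses must have the same number of joints")
    # Sum of squared differences between matching joints.
    squared_error = sum((r - t) ** 2 for r, t in zip(robot_pose, reference_pose))
    # The exponential maps zero error to 1.0 and large errors towards 0.0.
    return math.exp(-scale * squared_error)


dog_pose = [0.10, -0.40, 0.80]
print(imitation_reward([0.10, -0.40, 0.80], dog_pose))  # exact match
print(imitation_reward([0.30, -0.10, 0.50], dog_pose))  # poor match, lower reward
```

An RL algorithm maximizing this kind of reward at every time step is driven to track the reference motion frame by frame.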

The training data consisted of approximately 200 million samples of dogs in motion, tracked in a physics simulation. The different motions were then paired with reward functions, from which the agents learned their policies. After the policies were created in the simulation, they were transferred to the real world using a technique called latent space adaptation. Because the physics simulators used to train the robots could only approximate certain aspects of real-world motion, the researchers randomly applied various perturbations to the simulation, intended to simulate operation under different conditions.
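Randomly perturbing a simulation can be sketched in a few lines. The parameter names and ranges below are invented for illustration and are not taken from the paper; the point is only that each training episode runs under slightly different physics, so the learned policy cannot rely on any single setting.

```python
import random

# Hypothetical perturbation ranges; the values are illustrative assumptions.
PERTURBATION_RANGES = {
    "mass_scale": (0.8, 1.2),        # multiplier on the robot's nominal mass
    "friction": (0.5, 1.25),         # ground friction coefficient
    "motor_latency_s": (0.0, 0.04),  # delay before motor commands take effect
}


def sample_perturbation(rng):
    """Draw one random set of simulation parameters for a training episode."""
    return {
        name: rng.uniform(low, high)
        for name, (low, high) in PERTURBATION_RANGES.items()
    }


rng = random.Random(0)  # seeded so runs are repeatable
for episode in range(3):
    params = sample_perturbation(rng)
    # A real training loop would configure the simulator with params here.
    print(episode, params)
```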

According to the research team, they were able to adapt the simulation policies to the real-world robots utilizing just eight minutes of data gathered from across 50 different trials. The researchers managed to demonstrate that the real-world robots were able to imitate a variety of different, specific motions like trotting, turning around, hopping, and pacing. They were even able to imitate animations created by animation artists, such as a combination hop and turn.

The researchers summarize the findings in the paper:

“We show that by leveraging reference motion data, a single learning-based approach is able to automatically synthesize controllers for a diverse repertoire [of] behaviors for legged robots. By incorporating sample efficient domain adaptation techniques into the training process, our system is able to learn adaptive policies in simulation that can then be quickly adapted for real-world deployment.”

The control policies used during the reinforcement learning process had their limitations. Because of constraints imposed by the hardware and algorithms, there were a few things the robots simply couldn’t do. They weren’t able to run or make large jumps, for instance. The learned policies also weren’t as stable as movements that were manually designed. The research team wants to take the work further by making the controllers more robust and capable of learning from different types of data. Ideally, future versions of the framework will be able to learn from video data.
