James Kaplan, CEO & Co-Founder of MeetKai Metaverse – Interview Series

Published

January 21, 2022

James Kaplan is the CEO & Co-Founder of MeetKai, an Artificial Intelligence, VR, and Conversational Search company based in Los Angeles, California, that describes itself as leading the AI speech race with never-before-seen features. Its conversational AI can understand more complex speech and give personalized results in a natural conversation about many subjects, in different realities. MeetKai’s technology is deployed globally through iOS, Google Play, and AppGallery.

You had a passion for AI at the early age of 6. How did you first get introduced to this technology?

My introduction to AI came from video games. First, it was from trying to understand how the AI worked in the game Oregon Trail, which was not that intelligent but still a form of AI nonetheless. From there my interest in AI grew further as I got into MMORPGs. I really liked playing online games, but I hated grinding for items. Hence, I got into writing bots.

What were some of the first AI applications that you coded?

Writing bots for MMOs was really the first foray I had into developing a specific form of AI. In the beginning, my bots were pretty simple and closer to macros than artificial intelligence. But as I got older and as bot detection got better in many games, the bots had to look more and more like a real player. I have always enjoyed writing bots. I ended up writing a bot to win a Taylor Swift contest while I was in school (and she actually came to perform!). I also wrote the first Pokémon Go bot and regretfully got many people banned when I lost interest in evading detection.

You launched MeetKai in 2018 after being frustrated with current AI voice assistants. Why do most AI assistants offer a lackluster experience?

The crux of the issue is that most AI assistants depend far too much on external APIs for fulfillment. Even when they control the fulfillment, such as Alexa for e-commerce search, they suffer from the same problems. Simply put, how can you expect a voice assistant to be smart when all it does is turn speech to text and put that text into a text-based search engine? We started MeetKai with the idea that we could provide a “leapfrog” AI assistant by controlling the entire end-to-end processing pipeline that makes up a voice assistant. We developed a conversational search engine rather than a keyword-based one to support more complicated queries and conversations. Other assistants are stuck with lackluster experiences because they can't build multi-turn conversation support on top of such limiting factors. While our goal is to get there, we are still at a very early stage of scaling out our technology to cover the same number of domains as existing players.

What are some of the natural language understanding and natural language processing challenges behind building a state-of-the-art voice assistant experience?

One of the primary challenges with next-gen NLU is to move beyond intents and entities. Most NLU takes a very traditional approach to language understanding: each input utterance is classified into an intent, and then the tokens within it are labeled as entities using a sequence labeling model. I could enumerate dozens of problems with this standard approach. However, the most critical ones are:

An intent classification that is context-free fails to handle a multi-turn conversation. Most approaches only care about the raw text that was transcribed. They don’t care about context: not who the user is, not what the user likes, only what they just asked about. This is particularly important when the user says something very terse. For example, if someone says “cosmopolitan,” it might mean the drink or the magazine, and which one is highly dependent on the person.
Entity recognition models do a poor job of anything that is not a categorical value. Large language models are not able to adapt quickly enough to new entities that are in the wild because they are not in the dataset. AI needs to have a much more sophisticated way to recognize entities by considering a much deeper context. For example, a user’s location should heavily influence if something is a restaurant name versus something else.
Entity relationships are not well considered. My favorite example is how often most search engines fail when it comes to negation. Try searching for a movie without romance on other voice assistants, and you'll see what I mean.
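
As an editorial illustration of the first point, here is a minimal Python sketch of how a context-free classifier differs from one that consults user context. It is not MeetKai's implementation: the `UserContext` class, the `AMBIGUOUS_TERMS` lexicon, and the interest labels are all hypothetical, and the scoring is a deliberately naive overlap count.

```python
from dataclasses import dataclass, field


@dataclass
class UserContext:
    """Hypothetical per-user signals; a real system would hold far richer ones."""
    interests: set[str] = field(default_factory=set)


# Toy lexicon (assumption): each ambiguous term maps its candidate senses
# to a set of interests that make that sense more likely.
AMBIGUOUS_TERMS = {
    "cosmopolitan": {
        "drink": {"cocktails", "bars"},
        "magazine": {"fashion", "reading"},
    },
}


def classify_context_free(utterance: str) -> str:
    """Context-free baseline: always returns the first listed sense."""
    senses = AMBIGUOUS_TERMS[utterance.lower()]
    return next(iter(senses))


def classify_with_context(utterance: str, user: UserContext) -> str:
    """Pick the sense whose related interests overlap most with the user's."""
    senses = AMBIGUOUS_TERMS[utterance.lower()]
    return max(senses, key=lambda sense: len(senses[sense] & user.interests))


reader = UserContext(interests={"fashion"})
bartender = UserContext(interests={"cocktails"})

print(classify_context_free("cosmopolitan"))             # same answer for every user
print(classify_with_context("cosmopolitan", reader))     # depends on the user
print(classify_with_context("cosmopolitan", bartender))
```

The context-free function has no way to tell the two users apart, which is the failure mode described above for terse utterances.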

Currently most voice assistants simply translate voice to text and conduct a Google search. How does MeetKai AI operate differently than this?

The primary difference between MeetKai and Google when it comes to search is that we utilize a much richer language understanding model to search for items themselves rather than just web pages. When you search “Tom Cruise movies with no action,” Google is looking for pages on which that set of tokens appears (Tom Cruise, movies, action). At MeetKai, we correctly understand that Tom Cruise is an actor, that movies are the class of media the user is looking for, and that action is the undesired genre. With this, we can conduct much more intelligent searches.
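
To make the contrast concrete, the following Python sketch compares naive token matching with a search over a structured query that carries an explicit exclusion. It is an editorial illustration only, not MeetKai's or Google's actual system: the `Movie` and `StructuredQuery` types, the tiny catalog, and both search functions are assumptions, and the query is constructed by hand rather than parsed from speech.

```python
from dataclasses import dataclass, field


@dataclass
class Movie:
    title: str
    actors: set[str]
    genres: set[str]


@dataclass
class StructuredQuery:
    """Hypothetical parsed form of 'Tom Cruise movies with no action'."""
    actor: str
    excluded_genres: set[str] = field(default_factory=set)


# Toy catalog for illustration only.
CATALOG = [
    Movie("Top Gun", {"Tom Cruise"}, {"action", "drama"}),
    Movie("Rain Man", {"Tom Cruise", "Dustin Hoffman"}, {"drama"}),
    Movie("Jerry Maguire", {"Tom Cruise"}, {"comedy", "drama", "romance"}),
    Movie("Die Hard", {"Bruce Willis"}, {"action", "thriller"}),
]


def keyword_search(tokens: list[str], catalog: list[Movie]) -> list[str]:
    """Naive matching: keep a movie if every token appears in its metadata."""
    results = []
    for movie in catalog:
        text = " ".join([movie.title, *movie.actors, *movie.genres]).lower()
        if all(token in text for token in tokens):
            results.append(movie.title)
    return results


def structured_search(query: StructuredQuery, catalog: list[Movie]) -> list[str]:
    """Keep movies with the actor and none of the excluded genres."""
    return [
        movie.title
        for movie in catalog
        if query.actor in movie.actors
        and not (movie.genres & query.excluded_genres)
    ]


# Token matching treats "action" as something to find, not something to avoid.
print(keyword_search(["tom cruise", "action"], CATALOG))

# The structured query encodes the negation explicitly.
query = StructuredQuery(actor="Tom Cruise", excluded_genres={"action"})
print(structured_search(query, CATALOG))
```

In this toy setup the keyword approach returns exactly the action title the user wanted to avoid, while the structured query filters it out.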

MeetKai recently launched its first lifestyle VR world: MeetKai Metaverse. Could you discuss what this application is specifically?

Most companies in the metaverse space are working on person<>person interaction. Beyond that, the content is also largely either cartoonish or is just a 360° video. Our goal with the MeetKai Metaverse is to focus on an entirely different angle — person<>AI. We are developing a metaverse where the characters you're interacting with are all powered by our cutting-edge Conversational AI. Furthermore, we are working towards performing procedural generation of the environment to make it much more realistic looking and immersive when compared to other companies in the space. The two initial worlds available to explore in our metaverse are for two initial use cases: meditation and museums. In the former, we have digitized a Wing Chun expert, and for the first time, we created an AI character that is able to instruct users on how to use revolutionary meditation techniques to enter a state of relaxation. In the latter, we have created an ever-growing art museum and provided an AI-powered curator capable of answering questions about the art in the space and providing tours.

What are some examples of how AI is used in this Metaverse?

We utilize AI in three places:

To power the conversational capabilities of each character in our metaverse.
To dynamically create the content that is made available to the user through voice guidance. Examples of this include meditation sessions and art gallery tours in our initial two experiences.
To create the 3D space procedurally rather than requiring a hand layout.

What’s your vision for the future of voice assistants?

For voice assistants to have a future, they need to evolve into something much more than a command-based system. This means getting deep expertise and capabilities in many specific domains. I think that assembling different domain-specific voice assistants will be the key to building out an all-intelligent meta assistant. This is in stark contrast to the attempts to “do it all at once” that we have seen since voice assistants first entered the space.

Is there anything else that you would like to share about MeetKai or the MeetKai Metaverse?

We are still at the very beginning of our metaverse roadmap. Our eventual goal is to be able to replicate any experience you have in the real world within the metaverse, and then go beyond it. This means that we want to eliminate the cost and time-prohibitive factors that limit those same experiences in reality. The metaverse can allow us to live much richer lives, not replace them. We have several technical challenges that still need to be solved. However, we have a clear set of milestones that are achievable assuming hardware continues to improve. We are working closely with hardware partners to ensure that the VR space moves forward quickly. Beyond just VR, we want to make our metaverse experience possible outside of VR. We will be announcing more information about this in the coming months.

Thank you for the great interview. I look forward to following your progress on your version of the metaverse. Readers who wish to learn more should visit MeetKai.

Antoine Tardif

A founding partner of unite.AI & a member of the Forbes Technology Council, Antoine is a futurist who is passionate about the future of AI & robotics.

He is also the Founder of Securities.io, a website that focuses on investing in disruptive technology.