Richard Boyd is an entrepreneur, author and speaker on a range of topics from education to healthcare to virtual worlds, computer gaming, machine learning and human / computer interfaces. Over three decades, Richard has led or helped create some of the most innovative technology companies and services across several industries, including starting and serving as chief executive officer of four companies in North Carolina’s Research Triangle Park region. Richard sold his last company to Lockheed Martin, and stayed on as Director of the Virtual World Labs.
Richard is co-founder and CEO of Tanjo Inc., an artificial intelligence and machine learning company based in North Carolina’s Research Triangle Park region.
You’ve been working on VR since the 90s, and in 2001 you co-founded 3Dsolve. What were some of the initial projects that 3Dsolve worked on?
The most impactful first project for 3Dsolve was helping the US Army Training and Doctrine Command (TRADOC) learn how to harness simulation learning for small unit tactical operations. We created the first “Level 4 interactive multimedia instruction” (IMI) course for the Army that passed TRADOC validation. Essentially this was harnessing the value of “safe practice in a fully simulated 3D environment for ground units.” The first course was over 100 hours of instruction in a collaborative simulation 3D game-based world for the 25B10 MOS (military occupational specialty) for communications.
We were sending soldiers into Afghanistan and Iraq at the time and training them to work in DTOCs (digital tactical operations centers) when we had no DTOCs in the US in which to train them. 3Dsolve traveled to Fort Hood, Fort Gordon and various other facilities in order to find equipment, meet with subject matter experts and build the virtual DTOCs where soldiers could train. The validation results determined that soldiers using this method were trained in a shorter amount of time with a higher degree of knowledge (and class pass rates) than in any classroom methods used before. I think of this as the beginning of the serious games industry.
I also served on the ADL (Advanced Distributed Learning) Colab advisory board where the standards for reusable learning content were created. I worked with Phillip Dodds there on 3D standards for SCORM (Sharable Content Object Reference Model). Side note: Phillip was the guy who played the organ in the Spielberg movie Close Encounters of the Third Kind.
I also served on another international standards body called 3DIF, chaired by Intel and Boeing, where we created an international ECMA standard by the same name for 3D interchange formats. The idea was to finally capture all of the value in the 3D CAD models built for everything made in the world and translate it for use in serious games and 3D technical documents. It lives on in Adobe Acrobat and other platforms.
We continued to work with headsets and various VR peripherals, collaborating with industry pioneers Warren Robinett, Dr. Fred Brooks, Alan Kay and others. My co-founder David Smith created an entire open source platform together with Alan Kay called OpenCroquet, that still lives today.
How did life change for you after Lockheed Martin acquired 3Dsolve in 2007?
One of the other pioneering projects we worked on at 3Dsolve that caused Lockheed to buy us was a simulation of an entire Los Angeles-class submarine. At the time the Navy was still setting entire ships aside for training. We pioneered the idea of “Total Ship Simulation”, replicating the entire submarine in a multiplayer game environment. We used the Epic Games Unreal engine and really transformed the training for these vessels. At Lockheed we created sims of destroyers, the Littoral Combat Ship as well as all of the subsystems.
It was a challenge at first adapting from a game technology company to the extra layers of oversight and reporting necessary in a 100-year-old defense contractor. We learned how to create our own reality. I formed an informal organization called Virtual World Labs and modeled it after the famous LM Skunkworks in California. In fact, the Skunkworks became a member of VWL. We learned in the first year that any time you come up with a patent proposal you get a check, and another one when it is granted. So, this became our incentive program. We sat around inventing stuff in AR, VR and AI. By the end of my 5-and-a-half-year stint there we had accumulated over 100 patent applications in a small group of about 40 people.
One of the most fun and relevant programs was the creation of the DOD Virtual World Framework. We had been part of several large-scale joint war game exercises and watched the frustrations over lack of interoperability among proprietary systems that needed to work together in these big live, constructive and virtual simulation exercises. Our first reaction was that this was a solved problem…It’s called the Internet! If the acquisition community could just get everyone to adhere to web services, we could build better integrated training systems. And WebGL had just passed muster with the World Wide Web Consortium. The time had come for a shakeup of business as usual. The Pentagon put out a request for proposal seeking a “common virtual world framework” for integrated simulation. The head of that program at the Pentagon was a creative former Air Force pilot named Frank Digiovanni. We called him D9. He reminds me a lot of the stories of Col. John Boyd, who pushed for creative destruction of our fighter jet programs and thinking in the Air Force.
The problem was, D9 explicitly told his acquisition team that he “didn’t want any Lockmart like companies” to build that new framework. But David and I went in there along with the usual list of about 17 vendors bidding on the program and we won it. We learned afterwards that everyone else was showing up, predictably, with some proprietary solution and trying to get the government to adopt it. We showed up for our verbal presentation at the Reagan building in DC with nothing, but said that the answer to this deep problem lies in the architecture of the Internet. We said we could design it in a few months and have a working prototype in six. We also said it should be open source. We won “hands down slam dunk” according to D9, because our approach was so fresh and different and “outside the box”.
When I returned to Orlando to explain this new win to the leadership of our Lockheed Martin division I was kind of called out on the carpet. I was congratulated, but then they asked about the open source part. “What will this mean to our existing constructive sim business?” they asked. My reply: “Oh, it will completely disrupt it.” There was a pause, then the inevitable question: “So, how will we make money? What is the business model?” My reply: “The business models will be legion.” I still savor the confused frowns that reply evoked. I went through all of the ways Red Hat managed to build a multibillion-dollar business on the back of free software, but I don’t think they ever got comfortable with the disruption.
My title at the time was Director of Emerging and Disruptive Technology, along with my title of Chief Architect of Virtual World Labs. I worked for the next year trying to get Lockheed to embrace more self-disruption and Schumpeterian creative destruction. I described innovation in big organizations like Lockheed as being similar to childbirth. People love the idea of having children. It is good for society and very rewarding. Children are our future. But through the wrong lens, children can also be considered parasites. Beginning when they first take purchase in the womb, they begin taking resources away. If it were not for the sheltering conditions of the womb, the mother’s antibodies would come out to destroy the baby. Innovation in Lockheed was like that. Everyone wants and talks about innovation, but no one wants to sacrifice their resources for it when the payoff is so far away. (See my whiteboard animation about how innovation is like childbirth.)
During your time at Lockheed Martin one of the patents you co-authored sounds like something out of a science fiction blockbuster, called the holodeck. What exactly is the holodeck?
In 2009, I was invited out to Los Angeles by James Cameron during production of his movie Avatar. We had worked with Jim before (on “The Abyss”) and he wanted to show me the new 3D camera he had invented with Vince Pace (whom we also knew from The Abyss). But the thing that really captivated me was the virtual set inside the huge Hughes aircraft hangar. I spent a lot of time in there with a little flat panel screen just wandering around the virtual world of Pandora. I wrote about this for Armed Forces Journal and conceived with David Smith the idea of building a big virtual training battleground the size of a football field. At the time we were working on a program called Future Immersive Training Environment (FITE) for the Marines. In that program the Marine would wear a laptop on their back and the head-mounted display. All of this extra gear really caused some concern about negative training. I will never forget the first sergeant who strapped it on and said, “We have to train like we fight, right?” and then dove on the ground and rolled, smashing all of the electronics to chunky useless bits. The Holodeck concept was more like James Cameron’s Volume for filming, where the actors have light tracking suits and all of the instrumentation is around them. The head-mounted display was still necessary but was wireless and lightweight, more like today’s Oculus Quest. We even figured out a way to do it outside in sunlight.
In 2015 you wrote that we should not worry so much about machines taking over, but instead we should figure out how to achieve the right balance between humans and automation to optimize outcomes. Do you still feel society is overly concerned about AGI or machines taking over?
I think when really smart people with an abundance of expertise in this space (like Ray Kurzweil, Stephen Hawking, Elon Musk, James Cameron and Bill Gates) express concerns, we should all pay attention and track the progress towards artificial general intelligence and the implications for society. If we have learned anything recently it is that domain expertise matters, and we should always heed warnings from people with deeper expertise than our own.
Having said that, for the foreseeable future we are seeing more incremental disruption that merits immediate attention and action. Achieving the right balance between humans and automation to optimize outcomes, which I call the 21st-century imperative, is a critical issue right now. I really think anyone who does not get that right is doomed to be irrelevant soon, not just uncompetitive. When JP Morgan replaced 320,000 hours a year of legal review of loan agreements with a machine learning system called COIN, they disrupted themselves and immediately created a $300 million benefit to their bottom line. And that benefit is now an annuity. Any of their competitors who still have that cost cannot hope to compete.
I believe this is true and imperative for companies, governments, even individuals. I am on the board of a community college in North Carolina with 70,000 students. I am constantly trying to guide students and our curriculum towards those jobs that will still be performed by humans five years from now. When I find students who want to go into radiology, I explain to them that machines are already better than humans at reading X-rays. I tell them to consider a new field, or how that field is likely to change with that reality. This isn’t futurism. It’s Nowism.
You’ve stated that humans think linearly and that machines think exponentially. Clearly you are an exponential thinker. Why is it so difficult for humans to think exponentially?
70% of Americans cannot read and comprehend the science section of the New York Times (Michigan State study). Authors like Dan Ariely, in his best seller Predictably Irrational, and others talk about how we humans just are not good at statistical thinking. Exponential and logarithmic thinking are also not very universal. My mentor and hero Alan Kay has a great TED Talk on universals and non-universals in education. I wrote about this in an article on the Getting Smart website about Rethinking Education from First Principles. Essentially, abstract and deductive reasoning are difficult unless taught. We absolutely have an education issue that is hampering our ability to grasp the progress of Moore’s Law or the likely spread of a pandemic.
The current pandemic shines another bright light on the implications of having leadership who cannot think exponentially (or heed expertise).
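To make the gap between linear and exponential intuition concrete, here is a minimal sketch. The starting value, the step size and the doubling factor are illustrative assumptions, not figures from the interview.

```python
def linear_growth(start, step, periods):
    """Add a fixed amount every period."""
    return start + step * periods


def exponential_growth(start, factor, periods):
    """Multiply by a fixed factor every period."""
    return start * factor ** periods


# Compare "add one per day" with "double every day" from the same start.
for day in (0, 10, 20, 30):
    print(day, linear_growth(1, 1, day), exponential_growth(1, 2, day))
```

After 30 periods the linear series has reached 31, while the doubling series has passed one billion, which is the kind of jump that is hard to anticipate without being taught to expect it.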
You’ve worked in VR since the 90s. How do you feel about some of the current VR consumer applications such as the Oculus Quest?
Every time I see the roller coaster of VR hype heading back up the slope of the track I start reminding everyone of the three main constraints that are preventing widespread adoption.
- Some humans will forever be physically unable to enjoy stereoscopic 3D VR.
- The friction of setup and connection makes it an experience that not many will find delightful.
- The brittleness of the systems means that only expert hobbyists will want to tinker with it and troubleshoot failed connections.
Humans haven’t had an upgrade in a long time (since the Pleistocene by my reckoning) and some of us have a very difficult time adapting to stereoscopic 3D displays. A significant portion of the population is never going to get comfortable with immersive VR because of how they are physically wired. So, setting them aside for a moment, we are left with the second big issue: the horrible friction of tethering to these devices. Way too much cabling and tuning before one can get into an experience. And third is the fragility and brittleness of all of these extra dongles and connectors.
The Oculus Quest vastly exceeded my expectations by completely blowing out the second and third constraints. In my family we spend time in Oculus Quest content almost every day. This is, in my view, the big breakthrough that VR needed. Now we just need to go the last mile and see how we can adapt the tech to meet those who have physical constraints that prevent enjoyment of VR.
What was your inspiration behind launching Tanjo?
I discovered machine learning in 2009 while I was running Virtual World Labs at Lockheed Martin. Machine learning already existed, of course, but that was the year I fully understood how far it had come, and how it was fundamentally different from the “Artificial intelligence” we had been using in computer games and DOD simulations before then.
I now think of AI as advancing in three stages. In the first stage, which lasted from around 1958 until 2009 (my arbitrary marker), we wouldn’t ask a computer to compute something until humans understood it completely and could break it down into brittle little logic gates and if/then statements. We would then feed that into the computers as finite state machines or hierarchical behavior trees and run the programs. In the end, it was all just code. Nothing mystical about it.
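As a rough illustration of that first stage, here is a minimal sketch of a hand-written finite state machine of the kind used for game characters. The guard character, its states and its events are hypothetical and chosen only for the example; they are not taken from any 3Dsolve or Lockheed system.

```python
# Every behavior is spelled out by a human ahead of time as a
# (current state, event) -> next state rule. Nothing is learned from data.
TRANSITIONS = {
    ("patrol", "sees_player"): "chase",
    ("chase", "loses_player"): "patrol",
    ("chase", "in_range"): "attack",
    ("attack", "out_of_range"): "chase",
}


def next_state(state, event):
    """Look up the rule; a pair with no rule leaves the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="patrol"):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state


print(run(["sees_player", "in_range"]))
```

The brittleness is visible in the table itself: any situation the author did not anticipate simply has no rule, so the character does nothing new.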
The next phase is machine learning, where a human doesn’t necessarily even understand how to tell a machine to drive a car. Now we just feed a massive set of training data to a group of well-designed machine learning libraries that then infer their own understanding. Today a machine learning system can just watch 100 hours of video and go out and drive an autonomous vehicle flawlessly anywhere. (I usually make the joke “anywhere but in Rome.”)
At Tanjo we are using machine learning in short burst projects to give banks, institutions of higher education and Fortune 2000 companies the intelligence amplification and automation that is transforming how they work. We routinely see returns on investment of 10x from our implementations. And that return is usually an annuity. How many technology investments have we seen before this that create those kinds of productivity gains? We have had validated ROI measurements of as much as 600x, and one embarrassing result of 1600x. We don’t even use the last one as a case study because it feels too hyperbolic.
Could you discuss the Tanjo Animated Personas (TAP), and how they work?
Our big breakthrough came when we realized that these amazing, weird machine intelligence systems we were building looked at people the same way that they looked at information objects. We ran an early experiment with a set of training data from a popular dating app. Our little mini machine learning brain created interest graphs and sentiment maps of each person from their data exhaust that emerged looking something like a Myers-Briggs profile. We thought briefly about making a machine learning dating app in 2014. It was a very brief consideration, because it didn’t meet the lofty goals we had for doing meaningful work.
Instead, we called it the “Empathy Engine” and built what we called “Tanjo Animated Personas” from these machine learning patterns of human behavior.
The analyst firm Gartner gave us a “Cool Vendor Award” in 2018 for this breakthrough. We are helping market researchers model and understand their customers (and hopefully forge deeper, more meaningful conversations with them), as well as using it to model populations of people to study health and wellness. For example, we can create a synthetic population model of a zip code or a county and simulate which interventions and messages encourage better behavior to reduce the spread of a virus, or to lower obesity, smoking, etc.
Do you use supervised learning to educate the Tanjo?
The balance struck between humans and machines is as important in the input as it is in the output of these systems. Human supervision absolutely helps with training the “Brain” of one of our machine learning systems faster. When we created the NC brain that will tie all 58 community colleges in North Carolina together, we worked with faculty and administrators at some of the top colleges here to ensure that its ranking of different areas of knowledge and how it sorted the content was valid.
One of the Tanjo products is the ContractBot for contract analysis. What is the ContractBot and what types of enterprises is it primarily designed for?
We created ContractBot initially for the accounting industry. In 2017 FASBI (Financial Accounting Standards Board Interpretations) was releasing new rules around revenue recognition and lease recognition for business. Accounting firms were holding conferences around the country trying to prepare themselves and their clients for these changes. With our machine learning lens, we realized this was a perfect opportunity for a narrowly focused machine learning system to work alongside accountants to dramatically increase speed of analysis as well as increase accuracy. We trained a system on over four million contracts: everything from a one-page, scanned-in, hand-scrawled purchase order to contracts with a hundred pages of warranties and disclaimers and milestone payment descriptions. It learned very quickly to understand the language, sort the documents or sections of documents, and apply the business rules to almost instantly perform an analysis that would take a human accountant all day.
This project and others are case studies we provide to encourage anyone in business today to take the new machine/human balance lens and take a close look at every activity to determine what is the right mix of human and machine effort to optimize their business.
When JP Morgan used this approach to eliminate 320,000 hours annually of loan analysis, they not only realized a 100x return on their investment that year, but will receive that annuity payout every year going forward. Any competitor of theirs who is still doing “business as usual” and still has that cost won’t just be uncompetitive, but irrelevant.
One of the most exciting products that Tanjo offers is the Tanjo Enterprise Brain. What type of machine learning is behind this and what are its use cases?
When we used machine learning to help the US Department of Education create the Learning Registry, we saw the power of machine learning to organize and analyze knowledge. When I talk about this, I usually show a slide with an image from the last scene of “Raiders of the Lost Ark”, where a clerk is wheeling a drab-looking crate through an immense warehouse to file away this incredibly potent artifact that can save or destroy the planet, and it has a little tag on it that says ARC.
What we learned from the Learning Registry project and others is that enterprise search is broken. Companies have untagged and hidden information squirreled away in little data lakes and ponds that are either inaccessible or inscrutable and therefore not transparent to inquiry. In this accelerating Information Age, we are losing knowledge gained every day because of poor storage and retrieval methods.
The Tanjo Enterprise Brain lives inside your firewall, with full source code, connects to everything and has nothing to do but read and scan and organize everything it has access to and wait for the exciting moment when it detects a human trying to do something that could make use of the vast information map at its fingertips. Because it has so much time and power and intimacy with your organizational knowledge, it does not settle for reducing its reading of “War and Peace” to #Russiannovel #Tolstoy #warstory #lovestory. Instead it will map it with what we call a “hyperdimensional fingerprint” of up to 4,000 weighted concepts. This seemingly overpowered effort pays great dividends for the research institutions, banks and colleges with Tanjo Enterprise Brains. It is customary for them to realize value far in excess of the license fee just in the organizational knowledge mapping step that is part of training your Enterprise Brain, when leadership learns how much of the knowledge they invest in and depend upon is actually there and what it all means. When the Enterprise Brain is implemented, the organization has a lens with which to see how information entered its systems, who championed it, who challenged it and ultimately how decisions got made. It is becoming a need that is retrospectively obvious. And, like the machine learning system JP Morgan implemented, it pays dividends forever.
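One generic way to picture a fingerprint of weighted concepts is as a sparse vector that can be compared with other vectors. The sketch below is an assumption-laden illustration only: the function names, the example concepts and weights, and the use of cosine similarity are all hypothetical and do not describe how the Tanjo Enterprise Brain actually works.

```python
import math


def fingerprint(concept_weights, max_concepts=4000):
    """Keep only the highest-weighted concepts (hypothetical cap of 4,000)."""
    ranked = sorted(concept_weights.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:max_concepts])


def cosine_similarity(a, b):
    """Compare two sparse concept->weight maps; 1.0 means the same direction."""
    dot = sum(weight * b.get(concept, 0.0) for concept, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up weights for illustration; a real system would infer them from text.
novel = fingerprint({"war": 0.9, "russia": 0.8, "love": 0.5, "aristocracy": 0.4})
memo = fingerprint({"war": 0.7, "strategy": 0.6, "logistics": 0.3})
print(round(cosine_similarity(novel, memo), 3))
```

The point of the richer representation is that two documents sharing no hashtags can still score as related because they overlap on many lower-weighted concepts.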
Is there anything else that you would like to share about Tanjo?
Tanjo is working hard right now on a Covid-19 brain. In keeping with the thesis that drives our company, we are determining how to achieve the man and machine balance to make sure the right information and best resources are available to people making important decisions during this crisis. The Tanjo Animated Personas capability will be used to model human population data to track viral spread, but also to determine which measures, communication methods and actual words will get the behavior we need to help us successfully navigate out of this crisis to a healthier ecosystem for all of us.
This has been a fascinating conversation. Readers who wish to learn more should visit Tanjo.
Akilesh Bapu, Founder & CEO of DeepScribe – Interview Series
Akilesh Bapu is the Founder & CEO of DeepScribe, which uses natural language processing (NLP) and advanced deep learning to generate accurate, compliant, and secure notes of doctor-patient conversations.
What was it that introduced and attracted you to AI and natural language processing?
If I remember correctly, Jarvis from “Iron Man” was the first thing that really attracted me to the world of natural language processing and AI. Particularly, I found it fascinating how much faster a human was able to not only go through tasks but also go into an incredible level of depth into certain tasks and unveil certain information that they wouldn’t have even known about if it weren’t for this AI.
It was this concept of “AI by itself won’t be as good as humans at most tasks but put a human and AI together and that combination will dominate.” Natural language processing is the most efficient way for this human/AI combination to happen.
From then on, I was obsessed with Siri, Google Now, Alexa, and the others. While they didn’t work as seamlessly as Jarvis, I so badly wanted to make them work as Jarvis did. Particularly, what became apparent was, commands such as “Alexa do this,” “Alexa do that,” were pretty easy and accurate to do with the current state of technology. But when it comes to something like Jarvis, where it can actually learn and understand, filter, and pick up on important topics during another conversational exchange—that hadn’t really been done before. This actually directly relates to one of my core motivations in founding DeepScribe. While we are solving the issue of documentation for physicians, we’re attempting a whole new wave of intelligence while doing it: ambient intelligence. AI that can dig through your day-to-day utterances, find useful information, and use that information to help you out.
You previously did some research using deep learning and NLP at UC Berkeley College of Engineering. What was your research on?
Back at the Berkeley AI Research Lab, I was working on a gene ontology annotator project where we were summarizing PubMed articles with specific output parameters.
The high-level overview: Take a task like the CNN news article summarization. In that task you’re taking news articles and summarizing them into roughly a few sentences. In your favor you have data and the ability to train these models on over a million articles. However, the problem space is enormous since you have limited structure to the summaries. In addition, there is hardly any structure to the actual articles. While there have been quite a few improvements since 2.5 years ago when I was working on this project, this is still an unsolved problem.
In our research project, however, we were developing structured summaries of articles. A structured summary in this case is similar to a typical summary except we know the exact structure of the output summary. This is helpful since it dramatically reduces the output options for our machine learning model. The challenge was that there was not enough annotated training data to run a data-hungry deep learning model and get usable results.
The core of the work I did on this project was to leverage the knowledge we have around the input data and develop an ensemble of shallow ML models to support it, a technique we invented called the 2-step annotator. The 2-step annotator benchmarked at roughly 15x the accuracy of the previous best (54 percent vs 3.6 percent).
While side by side, this project and DeepScribe may sound entirely different, they were highly similar in how they used the 2-step annotation method to vastly improve results on a limited dataset.
What was the inspiration behind launching DeepScribe?
It all started with my father, who was a medical oncologist. Before electronic health record systems took over health care, physicians would jot down things on paper and spend very little time on notes. However, once EHRs started becoming popular as part of the HITECH Act of 2009, I started noticing that my dad spent more and more time at the computer. He’d start coming home later. On the weekends, he’d be sitting on the couch dictating notes. Simple things like him picking me up from school or basketball practice became a thing of the past as he’d be spending most of his evening hours catching up on documentation.
As a nerdy kid growing up, I would try to find solutions for him by searching the web and having him try them out. Sadly, nothing worked well enough to save him from the long hours of documentation.
Fast forward several years to the summer of 2017—I’m a researcher working at the Berkeley AI Research Lab, working on projects in document summarization. One summer when I’m back at home, I notice that my dad is still spending copious amounts of time documenting. I ask, “What’s new in the world of documentation? Alexa is everywhere, Google Assistant is so good now. Tell me, what’s the latest in the medical space?” And his answer was, “Nothing has changed.” I thought that it was just him but when I went and surveyed several of his colleagues, it was the same issue: not what the latest is in cancer treatment or the novel problems their patients were having—it was documentation. “How can I get rid of documentation? How can I save time on documentation? It’s taking so much of my time.”
I also noticed several companies that had emerged to try to solve documentation. However, either they were too expensive (thousands of dollars per month) or they were too minimal in terms of technology. The physicians at that time had very few options. That was when the opportunity opened up that if we could create an artificially intelligent medical scribe, a technology that could follow physicians’ patient visits and summarize them, and offer it at a cost that could make it accessible for everyone, it could truly bring the joy of care back to medicine.
You were only 22 years old when you launched DeepScribe. Can you describe your journey as an entrepreneur?
At Berkeley, I continued to delve into the world of entrepreneurship as much as possible, primarily with their wide array of classes. My favorites were:
- The Newton Lecture Series: people like Jessica Mah from InDinero or Diane Greene from VMware who were Cal alums gave highly relatable talks about their time at Berkeley and how they started their own companies.
- Challenge Lab: I actually met my co-founder Matt Ko through this class. We were placed in groups and went through a semester-long journey of creating a product and being mentored on what it takes during the early stages to get an idea going.
- Lean Launchpad: By far my favorite of the three, this was a grueling and rigorous process where we were guided by Steve Blank (acclaimed billionaire and the man behind the lean startup movement) to take an idea, validate it through 100 customer interviews, build a financial model, and more. This was the type of class where we pitched our “startup” only to get stopped on slide 1 or 2 and get grilled. If that wasn’t hard enough, we were also expected to interview 10 customers a week. Our idea at the time was to create a patent search that would give similar results to an expensive prior art search, which meant we were pitching to 10 enterprise customers a week. It was great because it taught us to think fast on our feet and be extra resourceful.
DeepScribe started when an investor group called The House Fund was writing checks for students who would turn down their summer internships and spend their summer building their company. We had just shut down Delphi (the patent search engine) and Matt and I had been constantly talking about medical documentation and everything fell in place since it was the perfect time to give it a shot.
With DeepScribe, we were lucky to have just come fresh out of Lean Launchpad since one of the most important factors in building a product for physicians was to iterate and refine the product around customer feedback. A historical issue with the medical industry has been that software has rarely had physicians in the design loop, therefore resulting in software that wasn’t optimized for the end user.
Since DeepScribe was happening at the same time as my final year at Berkeley, it was a heavy balancing act. I’d show up to class in a suit so I could be on time for a customer demo right after. I’d use all the EE facilities and professors not for anything to do with class but 100 percent for DeepScribe. My meetings with my research mentor even turned into DeepScribe brainstorming sessions.
Looking back, if I had to change one thing about my journey, it would’ve been to put college on hold so I could spend 150 percent of my time on DeepScribe.
Can you describe for a medical professional what the advantages of using DeepScribe are versus the more traditional method of voice dictation or even taking notes?
Using DeepScribe is meant to be very similar to using an actual human scribe. As you talk naturally to your patient, DeepScribe will listen in, pick up on the medically relevant speech that usually goes in your notes, and put it in there for you, using the same medical language that you yourself use. We like to think of it as a new AI-powered member of your medical staff that you can train as you’d like to help with documentation in your electronic health record system. It’s very different from using a voice dictation service, as it eliminates the entire step of having to go back and document. While typical dictation services turn 10 minutes of documentation into 7-8 minutes, DeepScribe turns it into a few seconds. Our physicians report anywhere from 1.5 to 3 hours of time saved per day depending on how many patients they see.
DeepScribe is device-agnostic, operable from an iPhone, Apple Watch, browser (for telemedicine), or hardware device.
What are some of the speech recognition or NLP challenges that DeepScribe may encounter due to complex medical terminology?
Contrary to popular opinion, complex medical terminology is actually the easiest part for DeepScribe to pick up. The trickiest part for DeepScribe is to pick up on unique contextual statements a patient may give a physician. The more they stray from a typical conversation, the more we see the AI stumble. But as we collect more conversational data, we see it improve on this dramatically every day.
What are the other machine learning technologies that are used with DeepScribe?
The large umbrellas of speech recognition and NLP tend to cover most of the machine learning we’re doing at DeepScribe.
Can you name some of the hospitals, nonprofits, or academic institutions that are using DeepScribe?
DeepScribe started out through a pilot program with the UC Berkeley Health Center. Hartford Healthcare, Texas Medical Center, and Cedar Valley Medical Specialists are a handful of the larger systems DeepScribe is working with.
However, the larger percentage of DeepScribe users are 50 private practices from Alaska to Florida. Our most popular specialties are primary care, orthopedics, gastroenterology, cardiology, psychiatry, and oncology, but we do support a handful of other specialties.
DeepScribe has recently launched a program to assist with COVID-19. Could you walk us through this program?
COVID-19 has hit our doctors hard. Practices are only seeing 30-40 percent of their patient load, scribe staffing is being cut, and providers are being forced to rapidly switch all their patients on to telemedicine. All this ends up leading to more clerical work for providers—we at DeepScribe firmly believe that in order for this pandemic to come to a halt, physicians must devote 100 percent of their attention and time to taking care of their patients.
To help aid this cause, we are proud to launch a free telemedicine solution to health care professionals fighting this pandemic. Our telemedicine solution is fully integrated with our AI-powered medical scribe solution, eliminating the need for clinical documentation for encounters made on our platform.
We’re also offering our scribe service for free during the pandemic. This means that any physician can get access to a scribe for free to handle their documentation. Our hopes are that by doing this, physicians will be able to focus more of their attention on their patients and spend less time thinking about documentation, leading to a faster halting of the COVID-19 outbreak.
Thank you for the great interview, I really enjoyed learning about DeepScribe and your entrepreneurial journey. Anyone who wishes to learn more should visit DeepScribe.
Stefano Pacifico, and David Heeger, Co-Founders of Epistemic AI – Interview Series
Epistemic AI employs state-of-the-art Natural Language Processing (NLP), machine learning and deep learning algorithms to map relations among a growing body of biomedical knowledge, from multiple public and private sources, including text documents and databases. Through a process of Knowledge Mapping, users work interactively with the platform to map and understand subsets of biomedical knowledge, which reveals concepts and relationships that are otherwise missed with traditional search.
We interviewed both Co-Founders of Epistemic AI to discuss these latest advances.
Stefano Pacifico comes from 10+ years in applied AI and NLP development. He was formerly at Bloomberg, where he spent 7 years, and was at Elemental Cognition before starting Epistemic.
David Heeger is a Silver Professor of data science and neuroscience at NYU, and has spent his career bridging computer science, AI and bioscience. He is a member of the National Academy of Sciences. As founders they bring together the expertise of building applied large-scale AI and NLP systems for understanding large collections of knowledge, with expertise in computational biology and biomedical science from years of research in the area.
What is it that introduced and attracted you to AI and Natural Language Processing (NLP)?
Stefano Pacifico: When I was in college in Rome, and AI was not popular at all (in fact it was very fringe), I asked my then advisor what specialization I should take among those available. He said: “If you want to make money, Software Engineering and Databases, but if you want to be weird but very advanced, then choose Artificial Intelligence”. I was sold at “weird”. I then started working on knowledge representation and reasoning to study how autonomous agents could play soccer or rescue people. Then two realizations made me fall in love with NLP: first, autonomous agents might have to communicate with natural language among themselves! Second, building formal knowledge bases by hand is hard, while natural language (in text) already provides the largest knowledge base of all. I know today these might seem obvious observations, but they were not as mainstream before.
What was the inspiration behind launching Epistemic AI?
Stefano Pacifico: I am going to make a bold claim. Nobody today has adequate tooling to understand and connect the knowledge present in large, ever-growing collections of documents and data. I had previously worked on that problem in the world of finance. Think of news, financial statements, pricing data, corporate actions, filings etc. I found that problem intoxicating. And of course, it’s a difficult problem; and an important one! When I met my co-founder, Dr. David Heeger, we spent quite a bit of time evaluating startup opportunities in the biomedical industry. When we realized the sheer volume of information generated in this field, it’s as if everything fell in its right place. Biomedical researchers struggle with information overload, while attempting to grapple with the vast and rapidly expanding base of biomedical knowledge, including documents (e.g., papers, patents, clinical trials) and databases (e.g., genes, proteins, pathways, drugs, diseases, medical terms). This is a major pain point for researchers and, with no appropriate solution available, they are forced to use basic search tools (PubMed and Google Scholar) and explore manually-curated databases. These tools are suitable for finding documents matching keywords (e.g., a single gene or a published journal paper), but not for acquiring comprehensive knowledge about a topic area or subdomain (e.g., COVID-19), or for interpreting the results of high throughput biology experiments, such as gene sequencing, protein expression, or screening chemical compounds. We started Epistemic AI with the idea to address this problem with a platform that allows them to iteratively:
- Shorten the time to gather information and build comprehensive knowledge maps;
- Surface cross-disciplinary information that can be otherwise difficult to find (real discoveries often come from looking into the white space between disciplines);
- Identify causal hypotheses by finding paths and missing links in their knowledge map.
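As a rough illustration of the last point, finding paths and missing links in a knowledge map can be sketched as a small graph problem. The snippet below is only a toy under stated assumptions, not Epistemic AI's implementation: the node names, the edge list, and the helper functions `build_graph`, `shortest_path`, and `candidate_missing_links` are all hypothetical, and "missing link" is taken here to mean two concepts that share a neighbour but have no direct edge.

```python
from collections import deque
from itertools import combinations

# Toy knowledge map: every node and edge is a made-up placeholder,
# not a real biomedical relation.
EDGES = [
    ("gene_A", "protein_B"),
    ("protein_B", "pathway_C"),
    ("pathway_C", "disease_D"),
    ("drug_E", "protein_B"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest path as a list, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in sorted(graph[node]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def candidate_missing_links(graph):
    """Pairs of nodes that share a neighbour but are not directly linked."""
    candidates = set()
    for neighbours in graph.values():
        for a, b in combinations(sorted(neighbours), 2):
            if b not in graph[a]:
                candidates.add((a, b))
    return sorted(candidates)


GRAPH = build_graph(EDGES)
PATH = shortest_path(GRAPH, "gene_A", "disease_D")
MISSING = candidate_missing_links(GRAPH)
print(" -> ".join(PATH))
print(MISSING)
```

In this toy, a path from a gene to a disease reads as a chain of intermediate concepts worth checking, and each unlinked pair is a candidate hypothesis for a researcher to examine rather than an established relation.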
What are some of both the public and private sources that are used to map these relations?
Stefano Pacifico: At this time, we are ingesting all the publicly available sources that we can get our hands on, including PubMed and clinicaltrials.gov. We ingest databases of genes, drugs, diseases and their interactions. We also include private data sources for select clients, but we are not at liberty to disclose any details yet.
What type of machine learning technologies are used for the knowledge mapping?
Stefano Pacifico: One of the deeply held beliefs at Epistemic AI is that zealotry is not helpful for building products. Building an architecture integrating several machine learning techniques was a decision made early on, and those range from Knowledge Representation to Transformer models, through graph embeddings, but also include simpler models like regressions and random forests. Each component is as simple as it needs to be, but no simpler. While we believe we have already built NLP components that are state-of-the-art for certain tasks, we don’t shy away from simpler baseline models when possible.
Can you name some of the companies, non-profits, or academic institutions that are using the Epistemic platform?
Stefano Pacifico: While I’d love to, we have not agreed with our users to do so. I can say that we had people signing up from very high-profile institutions in all three segments (companies, non-profits, and academic institutions). Additionally, we intend to keep the platform free for academic/non-profit purposes.
How does Epistemic assist researchers in identifying central nervous system (CNS) and other disease-specific biomarkers?
Dr. David Heeger: Neuroscience is a very highly interdisciplinary field including molecular and cellular biology and genomics, but also psychology, chemistry, and principles of physics, engineering, and mathematics. It’s so broad that nobody can be an expert at all of it. Researchers at academic institutions and pharma/biotech companies are forced to specialize. But we know that the important insights are interdisciplinary, combining knowledge from the sub-specialties. The AI-powered software platform that we’re building enables everyone to be much more interdisciplinary, to see the connections between their individual subarea of expertise and other topics, and to identify new hypotheses. This is especially important in neuroscience because it is such a highly interdisciplinary field to begin with. The function and dysfunction of the human brain is the most difficult problem that science has ever faced. We are on a mission to change the way that biomedical scientists work and even how they think.
Epistemic also enables the discovery of genetic mechanisms of CNS disorders. Can you walk us through how this works?
Dr. David Heeger: Most neurological diseases, psychiatric illnesses, and developmental disorders do not have a simple explanation in terms of genetic differences. There are a handful of syndromic disorders for which a specific mutation is known to cause the disorder. But that’s not typically the case. There are hundreds of genetic differences, for example, that have been associated with autism spectrum disorders (ASD). There is some understanding for some of these genes about the functions they serve in terms of basic biology. For example, some of the genes associated with ASD hold synapses together in the brain (note, however, that the same genes typically perform different functions in other organ systems in the body). But there’s very little understanding about how these genetic differences can explain the complex suite of behavioral differences exhibited by individuals with ASD. To make matters worse, two individuals with the same genetic difference may have completely different outcomes, one diagnosed with ASD and the other, not. And two individuals with completely different genetic profiles may have the same outcome with very similar behavioral deficits. To understand all this requires making the connection from genomics and molecular biology to cellular neuroscience (how do the genetic differences cause individual neurons to function differently) and then to systems neuroscience (how do those differences in cellular function cause networks of large numbers of interconnected neurons to function differently) and then to psychology (how do those differences in neural network function cause differences in cognition, emotion, and behavior). And all of this needs to be understood from a developmental perspective. A genetic difference may cause a deficit in a particular aspect of neural function. But the brain doesn’t just sit there and take it. Brains are highly adaptive. 
If there’s a missing or broken mechanism then the brain will develop differently to compensate as much as possible. This compensation might be molecular, for example, upregulating another synaptic receptor to replace the function of a broken synaptic receptor. Or the compensation might be behavioral. The end result depends not only on the initial genetic difference but also on the various attempts to compensate relying on other molecular, cellular, circuit, systems, and behavioral mechanisms.
No individual has the knowledge to understand all this. We all need help. The AI-powered software platform that we’re building enables everyone to collect and link all the relevant biomedical knowledge, to see the connections and to identify new hypotheses.
How are biopharma and academic institutions using Epistemic to tackle the COVID-19 challenge?
Stefano Pacifico: We have released a public version of our platform that includes COVID specific datasets and is freely accessible to anyone doing research on COVID-19. It is available at https://covid.epistemic.ai
What are some of the other diseases or genetic issues that Epistemic has been used for?
Stefano Pacifico: We have collaborated with autism researchers and are most recently putting together a new research effort for Cystic Fibrosis. But we are happy to collaborate with any other researchers or institutions that might need help with their research.
Is there anything else that you would like to share about Epistemic?
Stefano Pacifico: We are building a movement of people that want to change the way biomedical researchers work and think. We sincerely hope that many of your readers will want to join us!
Thank you both for taking the time to answer our questions. Readers who wish to learn more should visit Epistemic AI.
Emrah Gultekin, CEO and Co-founder of Chooch AI – Interview Series
Emrah is the co-founder and CEO of Chooch, an end-to-end visual AI solution. Chooch provides fast, accurate facial authentication and object recognition for the media, advertising, banking, medical and security industries. Chooch offers an easy-to-use and deployable API, a dashboard and mobile app SDK.
What was your inspiration for launching Chooch AI?
In our previous entrepreneurial experiences, my co-founder and I saw that there were a multitude of data-driven challenges that needed to be solved in a wide variety of verticals, so I decided to dive in and solve the ones that I could. I had started companies before, but this was my first true “deep tech” company.
With our broader team, we’ve worked to develop a visual AI product that is sustainable, scalable, robust, and usable for an array of enterprises. The product is now being utilized by companies in the healthcare, public safety, industrial, media and geospatial industries, with uses that range from fraud prevention and decreasing medical errors to deepening the understanding of our world.
Can you share with us what Chooch AI does?
Chooch copies human visual intelligence into machines. We train and deploy visual AI for customers in the cloud and on the edge and deliver fast and accurate computer vision for any visual process.
We can do that because Chooch AI is a platform for every step of the visual AI process from data collection, annotation and labeling, to AI training, model deployment, and integration. Because of the broad range of problems we’ve solved, our team now has deep expertise in scoping and developing computer vision projects that are ready for global scale. This can be everything from cell identification to geospatial image analysis and public safety.
What type of imagery can be processed by the computer vision system?
What the human eye can do, Chooch can do better and at scale. For example, the human eye is limited to the visible spectrum, but Chooch can work with imagery ranging from visible light to CT scans: it can detect fevers with IR sensors and process x-rays to detect lung damage. We can do this for video or still imagery, both faster and more accurately than the human eye, and have deployed over 2,400 models for a variety of applications.
Chooch AI connects to the cloud but is also able to run on a local machine, can you elaborate on how this works?
Yes, this is one of our breakthroughs. We launched with the Chooch AI API, which allows companies to use our cloud server to process their images, but our customers wanted to deploy AIoT on the edge in places with no connectivity. So, we created Chooch Edge AI, which is basically a standalone AI container that is generated by our Chooch Cloud AI. For instance, we are able to remotely deploy that AI software on NVIDIA Jetson devices, which are amazing by the way, and we can then remotely update the edge AI as needed from the Chooch Dashboard. Technically, that AI software on the edge is called an inference engine. Chooch is able to connect up to four cameras to the edge devices and the AI can recognize thousands of classes on the edge. We are able to iterate on models, remove models and train new models on the Edge. This is always improving, because as chip and hardware providers release more powerful devices, we are generating more and more powerful AIoT deployments. We can now run multiple models on the edge with multiple layers of dense classification at very low latency.
Is facial recognition technology used?
We don’t do facial recognition as a company policy. We only do facial authentication with liveness detection with the caveat that it will always be consent-based, like providing permission to check in to a location or for a flight with your face instead of a ticket. Chooch AI can be trained with as few as a couple of images. Facial authentication files are not stored as pictures of faces. And we do liveness detection to make sure people are not able to spoof the system.
Training AI models can be a steep learning curve for the uninitiated, what assistance do you provide for data labelling and annotating?
For the uninitiated, we offer end-to-end training assistance. When companies come to Chooch with a visual problem to solve, our team works in partnership with them to train and deploy AI models. It’s as simple as that. We do labelling and annotation as a service, and generally speaking users supply the data, but we help them organize that. Our training platform can use still images, but with videos, we can generate over 1,000 annotated images per minute. That’s another breakthrough, by the way. We take on the whole process from planning and consulting on data collection to model creation and testing and support. Our customer relationships become ongoing partnerships.
Chooch AI can assist enterprises with COVID-19. Can you detail how it can be of assistance?
Essentially, Chooch AI is supporting public safety with several visual AI models, all while working with partners to deploy complete solutions. One such solution detects the presence or absence of masks and another detects fevers with IR cameras; the two can be deployed together as a complete solution. Of note, these AI models do not include any facial recognition features. Additionally, we have a research model that we are providing to researchers for detecting the signs of COVID-19 related pneumonia that looks at x-rays and detects lung injury.
Is there anything else that you would like to share about Chooch AI?
As a proof point for our technology, our system is live and is being utilized by numerous clients. Our customers are driving real ROI because we can automate literally any visual process at scale, reducing costs and human error.
Thank you for the interview. Readers who wish to learn more should visit Chooch AI.