Ittai Dayan, MD, Co-founder & CEO of Rhino Health – Interview Series
Ittai Dayan, MD is the co-founder and CEO of Rhino Health. His background is in developing artificial intelligence and diagnostics, as well as clinical medicine and research. He is a former core member of BCG’s healthcare practice and hospital executive. He is currently focused on contributing to the development of safe, equitable and impactful Artificial Intelligence in the healthcare and life sciences industry. At Rhino Health, they are using distributed compute and Federated Learning as a means for maintaining patient privacy and fostering collaboration across the fragmented healthcare landscape.
He served in the IDF special forces and led the largest academic-medical-center-based translational AI center in the world. He is an expert in AI development and commercialization, and a long-distance runner.
Could you share the genesis story behind Rhino Health?
My journey into AI started when I was a clinician and researcher, using an early form of a ‘digital biomarker’ to measure treatment response in mental disorders. Later, I went on to lead the Center for Clinical Data Science (CCDS) at Mass General Brigham. There, I oversaw the development of dozens of clinical AI applications, and witnessed firsthand the underlying challenges associated with accessing and ‘activating’ the data necessary to develop and train regulatory-grade AI products.
Despite the many advancements in healthcare AI, the road from development to launching a product in the market is long and often bumpy. Solutions crash (or just disappoint) once deployed clinically, and supporting the full AI lifecycle is nearly impossible without ongoing access to a swath of clinical data. The challenge has shifted from creating models to maintaining them. To answer this challenge, I convinced the Mass General Brigham system of the value of having their own ‘specialized CRO for AI’ (CRO = Clinical Research Org), to test algorithms from multiple commercial developers.
However, the problem remained – health data is still very siloed, and even large amounts of data from one network aren’t enough to combat the ever-more-narrow targets of medical AI. In the summer of 2020, I initiated and led (together with Dr. Mona Flores from NVIDIA) the world’s largest healthcare Federated Learning (FL) study at that time, EXAM. We used FL to create a COVID outcome predictive model, leveraging data from around the world, without sharing any data. Subsequently published in Nature Medicine, this study demonstrated the positive impact of leveraging diverse and disparate datasets and underscored the potential for more widespread usage of federated learning in healthcare.
This experience, however, elucidated a number of challenges. These included orchestrating data across collaborating sites, ensuring data traceability and proper characterization, as well as the burden placed on the IT departments from each institution, who had to learn cutting-edge technologies that they weren’t used to. This called for a new platform that would support these novel ‘distributed data’ collaborations. I decided to team up with my co-founder, Yuval Baror, to create an end-to-end platform for supporting privacy-preserving collaborations. That platform is the ‘Rhino Health Platform’, leveraging FL and edge-compute.
Why do you believe that AI models often fail to deliver expected results in a healthcare setting?
Medical AI is often trained on small, narrow datasets, such as datasets from a single institution or geographic region, which leads to the resulting model performing well only on the types of data it has seen. Once the algorithm is applied to patients or scenarios that differ from the narrow training dataset, performance is severely impacted.
Andrew Ng captured the notion well when he stated, “It turns out that when we collect data from Stanford Hospital…we can publish papers showing [the algorithms] are comparable to human radiologists in spotting certain conditions. … [When] you take that same model, that same AI system, to an older hospital down the street, with an older machine, and the technician uses a slightly different imaging protocol, that data drifts to cause the performance of AI system to degrade significantly.”3
Simply put, most AI models are not trained on data that is sufficiently diverse and of high quality, resulting in poor ‘real world’ performance. This issue has been well documented in both scientific and mainstream circles, such as in Science and Politico.
How important is testing on diverse patient groups?
Testing on diverse patient groups is crucial to ensuring the resulting AI product is not only effective and performant, but safe. Algorithms not trained or tested on sufficiently diverse patient groups may suffer from algorithmic bias, a serious issue in healthcare and healthcare technology. Not only will such algorithms reflect the bias that was present in the training data, but they may also exacerbate that bias and compound existing racial, ethnic, religious, gender and other inequities in healthcare. Failure to test on diverse patient groups may result in dangerous products.
A recently published study5, leveraging the Rhino Health Platform, investigated how an AI algorithm for detecting brain aneurysms, developed at one site, performed at four different sites with a variety of scanner types. The results demonstrated substantial performance variability across sites with various scanner types, stressing the importance of training and testing on diverse datasets.
How do you identify if a subpopulation is not represented?
A common approach is to analyze the distributions of variables in different data sets, individually and combined. That can inform developers both when preparing ‘training’ data sets and validation data sets. The Rhino Health Platform allows you to do that, and furthermore, users may see how the model performs on various cohorts to ensure generalizability and sustainable performance across subpopulations.
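As a rough illustration of the distribution analysis described above, the following Python sketch computes how a single variable is distributed at each site and in the combined data, then flags categories that fall below a chosen share. The site names, the `age_band` variable, the 10% threshold and both function names are assumptions made for this example; none of it represents the Rhino Health Platform's actual interface.

```python
from collections import Counter


def category_shares(records, key):
    """Return the fraction of records falling into each category of `key`."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()}


def underrepresented(datasets, key, min_share=0.10):
    """List categories whose share of the combined data is below `min_share`.

    `datasets` maps a site name to a list of record dicts. A category that
    is absent from every site cannot be detected this way; the expected
    categories would have to be supplied separately.
    """
    combined = [r for records in datasets.values() for r in records]
    shares = category_shares(combined, key)
    return sorted(cat for cat, share in shares.items() if share < min_share)


# Hypothetical toy data: two sites, one variable ("age_band").
datasets = {
    "site_a": [{"age_band": "18-40"}] * 6 + [{"age_band": "41-65"}] * 4,
    "site_b": [{"age_band": "41-65"}] * 9 + [{"age_band": "65+"}] * 1,
}

# Distribution of the variable at each site individually.
per_site = {site: category_shares(recs, "age_band") for site, recs in datasets.items()}

# Categories that are thin in the combined data.
flagged = underrepresented(datasets, "age_band", min_share=0.10)
```

In this toy data the oldest age band appears in only one of twenty combined records, so it is the one category flagged. A real analysis would look at many variables jointly and would compare them against the population the model is meant to serve.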
Could you describe what Federated Learning is and how it solves some of these issues?
Federated Learning (FL) can be broadly defined as the process in which AI models are trained and then continue to improve over time, using disparate data, without any need for sharing or centralizing data. This is a huge leap forward in AI development. Historically, any user looking to collaborate with multiple sites had to pool that data together, triggering a myriad of onerous, costly and time-consuming legal, risk and compliance processes.
Today, with software such as the Rhino Health Platform, FL is becoming a day-to-day reality in healthcare and life sciences. Federated learning allows users to explore, curate, and validate data while that data remains on collaborators’ local servers. Containerized code, such as an AI/ML algorithm or an analytic application, is dispatched to the local server where execution of that code, such as the training or validation of an AI/ML algorithm, is performed ‘locally’. Data thus remains with the ‘data custodian’ at all times.
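To make the idea concrete, here is a minimal Python sketch of federated averaging, one common FL scheme, applied to a toy one-parameter model. It is an assumption chosen for illustration, not a description of how the Rhino Health Platform or the EXAM study was implemented; the hospital names, data points, learning rate and function names are all hypothetical. Each site trains on its own data and returns only a model weight and a sample count, so the raw records never leave the site.

```python
def local_update(weight, data, lr=0.01, epochs=5):
    """Run a few gradient-descent steps for y = weight * x on one site's data.

    Only the updated weight is returned; `data` never leaves this function.
    """
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to the weight.
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight


def federated_average(updates):
    """Average the site weights, weighted by each site's sample count."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


def train(sites, rounds=50):
    """Alternate local training at every site with central averaging."""
    global_weight = 0.0
    for _ in range(rounds):
        updates = [
            (local_update(global_weight, data), len(data))
            for data in sites.values()
        ]
        global_weight = federated_average(updates)
    return global_weight


# Hypothetical toy data: (x, y) pairs following y = 2 * x at two sites.
sites = {
    "hospital_a": [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
    "hospital_b": [(4.0, 8.0), (5.0, 10.0)],
}

weight = train(sites)
```

After training, `weight` settles very close to 2.0, the slope shared by both sites' data. Production systems add much more on top of this loop, such as secure transport, containerized execution and auditing, which the sketch leaves out.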
Hospitals, in particular, are concerned about the risks associated with aggregating sensitive patient data. This has already led to embarrassing situations, where it has become clear that healthcare organizations collaborated with industry without exactly understanding the usage of their data. In turn, they limit the amount of collaboration that both industry and academic researchers can do, slowing R&D and impacting product quality across the healthcare industry. FL can mitigate that, and enable data collaborations like never before, while controlling the risk associated with these collaborations.
Could you share Rhino Health’s vision for enabling rapid model creation by using more diverse data?
We envision an ecosystem of AI developers and users, collaborating without fear or constraint, while respecting the boundaries of regulations. Collaborators are able to rapidly identify necessary training and testing data from across geographies, access and interact with that data, and iterate on model development in order to ensure sufficient generalizability, performance and safety.
At the crux of this is the Rhino Health Platform, providing a ‘one-stop-shop’ for AI developers to construct massive and diverse datasets, train and validate AI algorithms, and continually monitor and maintain deployed AI products.
How does the Rhino Health platform prevent AI bias and offer AI explainability?
By unlocking and streamlining data collaborations, AI developers are able to leverage larger, more diverse datasets in the training and testing of their applications. The result of more robust datasets is a more generalizable product that is not burdened by the biases of a single institution or narrow dataset. In support of AI explainability, our platform provides a clear view into the data leveraged throughout the development process, with the ability to analyze data origins, distributions of values and other key metrics to ensure adequate data diversity and quality. In addition, our platform enables functionality that is not possible if data is simply pooled together, including allowing users to further enhance their datasets with additional variables, such as those computed from existing data points, in order to investigate causal inference and mitigate confounders.
How do you respond to physicians who are worried that an overreliance on AI could lead to biased results that aren’t independently validated?
We empathize with this concern and recognize that a number of the applications in the market today may in fact be biased. Our response is that we must come together as an industry, as a healthcare community that is first and foremost concerned with patient safety, in order to define policies and procedures to prevent such biases and ensure safe, effective AI applications. AI developers have the responsibility to ensure their marketed AI products are independently validated in order to achieve the trust of both healthcare professionals and patients. Rhino Health is dedicated to supporting safe, trustworthy AI products and is working with partners to enable and streamline independent validation of AI applications ahead of deployment in clinical settings by removing the barriers to the necessary validation data.
What’s your vision for the future of AI in healthcare?
Rhino Health’s vision is of a world where AI has achieved its full potential in healthcare. We are diligently working towards creating transparency and fostering collaboration by asserting privacy in order to enable this world. We envision healthcare AI that is not limited by firewalls, geographies or regulatory restrictions. AI developers will have controlled access to all of the data they need to build powerful, generalizable models – and to continuously monitor and improve them with a flow of data in real time. Providers and patients will have the confidence of knowing they do not lose control over their data, and can ensure it’s being used for good. Regulators will be able to monitor the efficacy of models used in pharmaceutical & device development in real time. Public health organizations will benefit from these advances in AI while patients and providers rest easy knowing that privacy is protected.
Thank you for the great interview. Readers who wish to learn more should visit Rhino Health.