

COVID-19 Open AI Consortium – Interview with Dr. Stephen Weng, Principal Investigator


The Covid-19 Open AI Consortium (COAI) intends to bring breakthrough medical discoveries and actionable findings to the fight against the Covid-19 pandemic.

COAI aims to increase collaborative research, to accelerate clinical development of effective treatments for Covid-19, and to share all of its findings with the global medical and scientific community. COAI will unite collaborators: academic institutions, researchers, data scientists and industrial partners, to fight the Covid-19 pandemic.

This is the second of three interviews with principal leaders behind COAI. The first interview was with Owkin’s Sanjay Budhdeo, MD, Business Development.

Stephen Weng is an Assistant Professor of Integrated Epidemiology and Data Science at the University of Nottingham, where he leads the data science research within the Primary Care Stratified Medicine Research Group.

He integrates traditional epidemiological methods and study design with new informatics-based approaches, harnessing and interrogating “big health care data” from electronic medical records for risk prediction modeling, phenotyping of chronic diseases, data science methods research, and the translation of stratified medicine into primary care.

You recently joined the COVID-19 Open AI Consortium (COAI) as a lead principal investigator. Can you discuss what moved you to join this project?

I have been collaborating with Owkin and European partners for the past year on projects aimed at improving secondary prevention for acute coronary syndrome. When Owkin launched the COVID-19 Open AI Consortium to leverage their technology and expertise alongside our infrastructure to contribute to the global fight against COVID-19, joining was an obvious choice and a natural fit. Our investigator group includes excellent partners from our previous consortia who are leading cardiologists across Europe. Using these resources and expertise, we could move quickly, launching the consortium within a matter of weeks, and ultimately improve our understanding of disease progression, the underlying aetiology and the risk factors in our populations.

A percentage of the population afflicted with COVID-19 shows signs of cardiovascular damage. What types of heart-related problems are being seen?

There is emerging evidence that cardiovascular risk factors and cardiovascular disease are major contributors to the severity of the disease. A recent analysis of 17,000 COVID-19 cases requiring hospitalisation in the UK identified that heart disease was present in 29% of all hospitalised cases. Underlying cardiovascular risk factors, including increasing age, hypertension (high blood pressure), obesity and type 2 diabetes, contribute significantly to disease severity.

Do you believe that we currently have any type of understanding as to why COVID-19 causes this type of heart damage?

There are still many questions to be answered around the epidemiology of the progression and severity of COVID-19, in particular regarding patients with heart disease. Patients with heart disease are at increased risk of experiencing severe illness, which may require cardiorespiratory support in an intensive care unit. The severity of COVID-19 and progression towards severe outcomes is likely driven by some direct injury to the cardiovascular system, which may be acute. The exact type of cardiac injury in COVID-19 patients requires further investigation.

What will be your role with COAI?

I am an epidemiologist and data scientist with a research focus on the prognosis of cardiovascular outcomes. Much of my work is a deep dive into very large datasets to answer these clinical questions. In my role, as well as directly trying to answer some of these important research questions by leveraging my ability to access large population datasets, I am also trying to help other academics and colleagues contribute to our consortium.

What type of people do we need to join the COAI project in order to maximize its efficacy?

Not only is it important to have larger numbers of scientists and clinical colleagues contributing data, but we also need to increase the diversity of our data resources. We know COVID-19 has a wide spectrum of severity, from asymptomatic individuals to very severe disease that results in death. Different types of data from across the spectrum of health care settings, from primary to secondary care, are needed to answer these questions about disease progression and severity.

You are currently an Assistant Professor of Integrated Epidemiology and Data Science who leads the data science research within the University of Nottingham’s Primary Care Stratified Medicine Research Group. Can you discuss possible ways big data can be used to target COVID-19 with the current information that we have?

We have some major datasets we can leverage. The major win has been that recent investments in data linkage have been put into action, and we are starting to see these initiatives bear major fruit. In fact, we are embarking on obtaining access to large population cohorts which have now been linked to primary care, hospital records, death registries, and COVID-19 testing data. Moreover, these data offer opportunities to investigate genetic influences on COVID-19 outcomes. Such linkages are only made possible by the rise of big data and large population biobanks. Given the number of variables collated, the AI models that Owkin has developed and refined are very useful for analysing data efficiently and at speed to derive meaningful insights.
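To make the linkage idea concrete, here is a minimal sketch in Python of joining primary care, hospital and testing records on a shared patient identifier. The column names and records are hypothetical; real linkage is performed on pseudonymised identifiers under strict information governance, not on raw data like this.

```python
# A minimal sketch of record linkage across health data sources:
# join primary care, hospital, and COVID-19 testing data on a
# shared patient identifier. All columns and values are invented.
import pandas as pd

primary_care = pd.DataFrame({
    "patient_id": [1, 2, 3],
    "hypertension": [1, 0, 1],
    "type2_diabetes": [0, 0, 1],
})
hospital = pd.DataFrame({
    "patient_id": [2, 3],
    "admitted": [1, 1],
})
testing = pd.DataFrame({
    "patient_id": [1, 2, 3],
    "covid_test_positive": [0, 1, 1],
})

# Left-join each source onto the population spine (primary care).
linked = (primary_care
          .merge(hospital, on="patient_id", how="left")
          .merge(testing, on="patient_id", how="left"))
linked["admitted"] = linked["admitted"].fillna(0).astype(int)
print(linked)
```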

What information do we need to gather to make precision medicine an effective tool in treating COVID-19 patients?

A more diverse array of data types is needed, including imaging, genetics and biomarkers, alongside clinical features and patient demographics.

In a perfect world, what type of data should be collected from COVID-19 patients?

With a disease as novel as COVID-19, I don’t think there is, or should be, a ceiling on the amount of data needed. There is a saying, “we don’t know what we don’t know yet”, so the more types of data and information we can collate now, the more useful they may prove in the future. For instance, how many advances in genomic knowledge have we seen because we were able to sequence data and keep it accessible for researchers in biobanks? I see the same occurring with COVID-19. If we create a diverse and large data resource now, I have no doubt new findings will emerge to help our understanding in the future.

Should we also be collecting data from the segment of the population that is immune to COVID-19, in order to better understand what makes them immune?

In epidemiology, the choice of the comparator group is extremely important. Risk, in many senses, is relative. If our baseline starts with patients admitted to hospital, then we are only understanding disease aetiology in those who present with more severe symptoms. I think a better understanding of asymptomatic individuals, and of what makes them asymptomatic, is absolutely necessary. How many therapeutics have been developed by investigating gain-of-function or loss-of-function mutations that naturally occur in populations?

Thank you for the fantastic interview. Readers who wish to learn more may read our article describing the COAI project.

The first interview in this series was with Owkin’s Sanjay Budhdeo, MD, Business Development.

The third interview in this series was with Folkert W. Asselbergs, Principal Investigator.

You may also visit the Covid-19 Open AI Consortium website.


Antoine Tardif is a futurist who is passionate about the future of AI and robotics. He is the CEO of BlockVentures.com and has invested in over 50 AI and blockchain projects. He is also the co-founder of Securities.io, a news website focusing on digital securities, and a founding partner of unite.ai.


Stefano Pacifico, and David Heeger, Co-Founders of Epistemic AI – Interview Series


Epistemic AI employs state-of-the-art Natural Language Processing (NLP), machine learning and deep learning algorithms to map relations among a growing body of biomedical knowledge drawn from multiple public and private sources, including text documents and databases. Through a process of Knowledge Mapping, users work interactively with the platform to map and understand subsets of biomedical knowledge, revealing concepts and relationships that are otherwise missed with traditional search.

We interviewed both Co-Founders of Epistemic AI to discuss these latest advances.

Stefano Pacifico brings 10+ years in applied AI and NLP development. He was formerly at Bloomberg, where he spent seven years, and at Elemental Cognition before starting Epistemic.

David Heeger is a Silver Professor of data science and neuroscience at NYU, and has spent his career bridging computer science, AI and bioscience. He is a member of the National Academy of Sciences. As founders, they bring together expertise in building applied large-scale AI and NLP systems for understanding large collections of knowledge with expertise in computational biology and biomedical science from years of research in the area.

What is it that introduced and attracted you to AI and Natural Language Processing (NLP)?

Stefano Pacifico: When I was in college in Rome, AI was not popular at all (in fact, it was very fringe). I asked my then advisor which specialization I should choose among those available. He said: “If you want to make money, Software Engineering and Databases, but if you want to be weird but very advanced, then choose Artificial Intelligence”. I was sold at “weird”. I then started working on knowledge representation and reasoning to study how autonomous agents could play soccer or rescue people. Then two realizations made me fall in love with NLP: first, autonomous agents might have to communicate with natural language among themselves! Second, building formal knowledge bases by hand is hard, while natural language (in text) already provides the largest knowledge base of all. I know these might seem obvious observations today, but they were not as mainstream back then.

What was the inspiration behind launching Epistemic AI?

Stefano Pacifico: I am going to make a bold claim. Nobody today has adequate tooling to understand and connect the knowledge present in large, ever-growing collections of documents and data. I had previously worked on that problem in the world of finance. Think of news, financial statements, pricing data, corporate actions, filings, etc. I found that problem intoxicating. And of course, it’s a difficult problem, and an important one! When I met my co-founder, Dr. David Heeger, we spent quite a bit of time evaluating startup opportunities in the biomedical industry. When we realized the sheer volume of information generated in this field, it was as if everything fell into place. Biomedical researchers struggle with information overload while attempting to grapple with the vast and rapidly expanding base of biomedical knowledge, including documents (e.g., papers, patents, clinical trials) and databases (e.g., genes, proteins, pathways, drugs, diseases, medical terms). This is a major pain point for researchers and, with no appropriate solution available, they are forced to use basic search tools (PubMed and Google Scholar) and explore manually curated databases. These tools are suitable for finding documents matching keywords (e.g., a single gene or a published journal paper), but not for acquiring comprehensive knowledge about a topic area or subdomain (e.g., COVID-19), or for interpreting the results of high-throughput biology experiments, such as gene sequencing, protein expression, or screening chemical compounds. We started Epistemic AI with the idea of addressing this problem with a platform that allows researchers to iteratively:

  1. Shorten the time to gather information and build comprehensive knowledge maps;
  2. Surface cross-disciplinary information that can otherwise be difficult to find (real discoveries often come from looking into the white space between disciplines);
  3. Identify causal hypotheses by finding paths and missing links in your knowledge map (see the sketch below).
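As a rough illustration of the path-finding idea in point 3, the sketch below builds a toy knowledge graph and surfaces a path between two concepts. The entities, relations and edges are invented for illustration and are not Epistemic AI’s actual data model or API.

```python
# A minimal sketch of "paths in a knowledge map" using networkx.
# Nodes are biomedical concepts; edges carry the linking relation.
import networkx as nx

kg = nx.Graph()
kg.add_edge("gene:ACE2", "protein:ACE2", relation="encodes")
kg.add_edge("protein:ACE2", "virus:SARS-CoV-2", relation="receptor_for")
kg.add_edge("virus:SARS-CoV-2", "disease:COVID-19", relation="causes")
kg.add_edge("drug:candidate_X", "protein:ACE2", relation="binds")  # hypothetical compound

# A path between two concepts can surface a candidate hypothesis:
# how might drug:candidate_X relate to COVID-19?
path = nx.shortest_path(kg, "drug:candidate_X", "disease:COVID-19")
print(" -> ".join(path))
# drug:candidate_X -> protein:ACE2 -> virus:SARS-CoV-2 -> disease:COVID-19
```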

What are some of both the public and private sources that are used to map these relations?

Stefano Pacifico: At this time, we are ingesting all the publicly available sources that we can get our hands on, including PubMed and clinicaltrials.gov. We ingest databases of genes, drugs, diseases and their interactions. We also include private data sources for select clients, but we are not at liberty to disclose any details yet.

What type of machine learning technologies are used for the knowledge mapping?

Stefano Pacifico: One of the deeply held beliefs at Epistemic AI is that zealotry is not helpful for building products. We decided early on to build an architecture integrating several machine learning techniques, ranging from knowledge representation to Transformer models and graph embeddings, but also including simpler models like regressions and random forests. Each component is as simple as it needs to be, but no simpler. While we believe we have already built NLP components that are state-of-the-art for certain tasks, we don’t shy away from simpler baseline models when possible.
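The “as simple as it needs to be” philosophy can be made concrete with a small benchmark: measure a plain baseline against a heavier model before committing to the heavier one. The dataset and models below are toy stand-ins, not Epistemic AI’s pipeline.

```python
# A minimal sketch: benchmark a simple baseline against a more
# complex model on the same task before reaching for heavier machinery.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a real classification task.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

baseline = LogisticRegression(max_iter=1000)
forest = RandomForestClassifier(n_estimators=200, random_state=0)

for name, model in [("logistic regression", baseline), ("random forest", forest)]:
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {score:.3f}")

# Only escalate to Transformer models or graph embeddings if the
# simpler baselines leave accuracy on the table for the task at hand.
```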

Can you name some of the companies, non-profits, or academic institutions that are using the Epistemic platform?

Stefano Pacifico: While I’d love to, we have not agreed with our users to do so. I can say that we had people signing up from very high-profile institutions in all three segments (companies, non-profits, and academic institutions). Additionally, we intend to keep the platform free for academic/non-profit purposes.

How does Epistemic assist researchers in identifying central nervous system (CNS) and other disease-specific biomarkers?

Dr. David Heeger: Neuroscience is a very highly interdisciplinary field including molecular and cellular biology and genomics, but also psychology, chemistry, and principles of physics, engineering, and mathematics. It’s so broad that nobody can be an expert at all of it. Researchers at academic institutions and pharma/biotech companies are forced to specialize. But we know that the important insights are interdisciplinary, combining knowledge from the sub-specialties. The AI-powered software platform that we’re building enables everyone to be much more interdisciplinary, to see the connections between their individual subarea of expertise and other topics, and to identify new hypotheses. This is especially important in neuroscience because it is such a highly interdisciplinary field to begin with. The function and dysfunction of the human brain is the most difficult problem that science has ever faced. We are on a mission to change the way that biomedical scientists work and even how they think.

Epistemic also enables the discovery of genetic mechanisms of CNS disorders. Can you walk us through how this works?

Dr. David Heeger: Most neurological diseases, psychiatric illnesses, and developmental disorders do not have a simple explanation in terms of genetic differences. There are a handful of syndromic disorders for which a specific mutation is known to cause the disorder. But that’s not typically the case. There are hundreds of genetic differences, for example, that have been associated with autism spectrum disorders (ASD). There is some understanding for some of these genes about the functions they serve in terms of basic biology. For example, some of the genes associated with ASD hold synapses together in the brain (note, however, that the same genes typically perform different functions in other organ systems in the body). But there’s very little understanding about how these genetic differences can explain the complex suite of behavioral differences exhibited by individuals with ASD. To make matters worse, two individuals with the same genetic difference may have completely different outcomes, one diagnosed with ASD and the other, not. And two individuals with completely different genetic profiles may have the same outcome with very similar behavioral deficits.

To understand all this requires making the connection from genomics and molecular biology to cellular neuroscience (how do the genetic differences cause individual neurons to function differently) and then to systems neuroscience (how do those differences in cellular function cause networks of large numbers of interconnected neurons to function differently) and then to psychology (how do those differences in neural network function cause differences in cognition, emotion, and behavior).

And all of this needs to be understood from a developmental perspective. A genetic difference may cause a deficit in a particular aspect of neural function. But the brain doesn’t just sit there and take it. Brains are highly adaptive. If there’s a missing or broken mechanism then the brain will develop differently to compensate as much as possible. This compensation might be molecular, for example, upregulating another synaptic receptor to replace the function of a broken synaptic receptor. Or the compensation might be behavioral. The end result depends not only on the initial genetic difference but also on the various attempts to compensate relying on other molecular, cellular, circuit, systems, and behavioral mechanisms.

No individual has the knowledge to understand all this. We all need help. The AI-powered software platform that we’re building enables everyone to collect and link all the relevant biomedical knowledge, to see the connections and to identify new hypotheses.

How are biopharma and academic institutions using Epistemic to tackle the COVID-19 challenge?

Stefano Pacifico: We have released a public version of our platform that includes COVID specific datasets and is freely accessible to anyone doing research on COVID-19. It is available at https://covid.epistemic.ai

What are some of the other diseases or genetic issues that Epistemic has been used for?

Stefano Pacifico: We have collaborated with autism researchers and are most recently putting together a new research effort for Cystic Fibrosis. But we are happy to collaborate with any other researchers or institutions that might need help with their research.

Is there anything else that you would like to share about Epistemic?

Stefano Pacifico: We are building a movement of people that want to change the way biomedical researchers work and think. We sincerely hope that many of your readers will want to join us!

Thank you both for taking the time to answer our questions. Readers who wish to learn more should visit Epistemic AI.



AI Models Struggle To Predict People’s Irregular Behavior During Covid-19 Pandemic


Retail and service companies around the world make use of AI algorithms to predict customer behaviors, take stock of inventory, estimate marketing impacts, and detect possible instances of fraud. The machine learning models used to make these predictions are trained on patterns derived from the normal, everyday activity of people. Unfortunately, our day-to-day activity has changed during the coronavirus pandemic, and as MIT Technology Review reported, current machine learning models are being thrown off as a result. The severity of the problem differs from company to company, but many models have been negatively impacted by the sudden change in people’s behavior over the course of the past few weeks.

When the coronavirus pandemic arrived, the purchasing habits of people shifted dramatically. Prior to the onset of the pandemic, the most commonly purchased items were things like phone cases, phone chargers, headphones, and kitchenware. After the start of the pandemic, Amazon’s top 10 search terms became things like Clorox wipes, Lysol spray, paper towels, hand sanitizer, face masks, and toilet paper. Over the course of the last week of February, the top Amazon searches all became related to products people required to shelter themselves from Covid-19. The correlation between Covid-19-related product searches/purchases and the spread of the disease is so reliable that it can be used to track the spread of the pandemic across different geographical regions. Yet machine learning models break down when a model’s input data is too different from the data used to train it.
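One simple way to make that failure mode concrete is a drift check: compare the distribution of live inputs against the training distribution and flag when they diverge. The sketch below uses a two-sample Kolmogorov-Smirnov test on a single hypothetical feature; the feature name and numbers are invented for illustration.

```python
# A minimal sketch of input-drift detection: compare live feature
# values against the training distribution with a two-sample KS test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Pre-pandemic training data vs. live data after behavior shifted.
train_basket_size = rng.normal(loc=50.0, scale=10.0, size=5000)
live_basket_size = rng.normal(loc=85.0, scale=25.0, size=500)  # panic buying

stat, p_value = ks_2samp(train_basket_size, live_basket_size)
if p_value < 0.01:
    print(f"Input drift detected (KS={stat:.2f}); predictions may be unreliable.")
```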

The volatility of the situation has made automation of supply chains and inventories difficult. Rael Cline, the CEO of London-based consultancy Nozzle, explained that companies are trying to optimize for the demand for toilet paper one week, while “this week everyone wants to buy puzzles or gym equipment.”

Other companies have their own share of problems. One company provides investment recommendations based on the sentiment of various news articles, but because the sentiment of news articles at the moment is often more pessimistic than usual, the investing advice could be heavily skewed toward the negative. Meanwhile, a streaming video company that utilizes recommendation algorithms to suggest content to viewers found that, as many people suddenly subscribed to the service, its recommendations started to miss the mark. Yet another company, responsible for supplying retailers in India with condiments and sauces, discovered that bulk orders broke its predictive models.

Different companies are handling the problems caused by pandemic behavior patterns in different ways. Some companies are simply revising their estimates downward. People still continue to subscribe to Netflix and purchase products on Amazon, but they have cut back on luxury spending, postponing purchases on big-ticket items. In a sense, people’s spending behaviors can be conceived of as a contraction of their usual behavior.

Other companies have had to get more hands-on with their models and have engineers make important tweaks to the model and its training data. For example, Phrasee is an AI firm that utilizes natural language processing and generation models to create copy and advertisements for a variety of clients. Phrasee always has engineers check what text the model generates, and the company has begun manually filtering out certain phrases in its copy. Phrasee has decided to ban the generation of phrases that might encourage dangerous activities during a time of social distancing, such as “party wear”. It has also decided to restrict terms that could provoke anxiety, like “brace yourself”, “buckle up”, or “stock up”.
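A minimal sketch of that kind of phrase filtering is shown below. The blocklist entries come from the article, but the code itself is illustrative, not Phrasee’s actual system.

```python
# A minimal sketch of blocklist filtering for model-generated copy:
# drop any candidate text that contains a banned phrase.
BLOCKED_PHRASES = {"party wear", "brace yourself", "buckle up", "stock up"}

def is_safe(copy: str) -> bool:
    """Return True if the generated copy contains no blocked phrase."""
    text = copy.lower()
    return not any(phrase in text for phrase in BLOCKED_PHRASES)

candidates = [
    "Stock up on our spring collection today!",
    "New arrivals to brighten your week at home.",
]
approved = [c for c in candidates if is_safe(c)]
print(approved)  # only the second line survives the filter
```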

The Covid-19 crisis has demonstrated that freak events can throw off even highly-trained models that are typically reliable, as things can get much worse than the worst-case scenarios that are typically included within training data. Rajeev Sharma, CEO of AI consultancy Pactera Edge, explained to MIT Technology Review that machine learning models could be made more reliable by being trained on freak events like the Covid-19 pandemic and the Great Depression, in addition to the usual upwards and downwards fluctuations.



COVID-19 Open AI Consortium – Interview with Principal Investigator Prof. Folkert Asselbergs


The Covid-19 Open AI Consortium (COAI) intends to bring breakthrough medical discoveries and actionable findings to the fight against the Covid-19 pandemic.

COAI aims to increase collaborative research, to accelerate clinical development of effective treatments for Covid-19, and to share all of its findings with the global medical and scientific community. COAI will unite collaborators: academic institutions, researchers, data scientists and industrial partners, to fight the Covid-19 pandemic.

This is the third of three interviews with principal leaders behind COAI.

Folkert W. Asselbergs is professor of precision medicine in cardiovascular disease at the Institute of Cardiovascular Science, UCL; director of the NIHR BRC Clinical Research Informatics Unit at UCLH; professor of cardiovascular genetics and consultant cardiologist at the department of Cardiology, University Medical Center Utrecht; and chief scientific officer of the Durrer Center for Cardiovascular Research, Netherlands Heart Institute. Prof. Asselbergs has published more than 275 scientific papers and has obtained funding from the Leducq Foundation, the British and Dutch Heart Foundations, the EU (FP7, ERA-CVD, IMI, BBMRI), and the National Institutes of Health (R01).

A percentage of the population afflicted with COVID-19 shows signs of cardiovascular damage. What types of heart-related problems are being seen?

From the studies that have been published thus far, acute cardiac injury is observed in up to 27.8% of patients. In addition, a number of case reports have been published of patients who have developed myocarditis and myocardial infarction in the context of COVID-19. There is also an unexpectedly high incidence of pulmonary embolisms in this patient population.

Do you believe that we currently have any type of understanding as to why COVID-19 causes this type of heart damage?

Our current understanding is still very limited. The release of troponin in critically ill patients is common, and also frequently seen in other patient groups (trauma/surgical/sepsis etc.). Troponin release is thus a non-specific finding and the mechanisms explaining myocardial injury in COVID-19 are not fully understood.

What type of people do we need to join the COAI project in order to maximize its efficacy?

To maximize the efficacy of this project, I believe we must strive towards a multidisciplinary approach between data scientists, statisticians, epidemiologists and clinicians.

You are currently a Professor of Precision medicine at the Institute of Cardiovascular Science and Institute of Health Informatics at UCL. Can you discuss possible ways precision medicine can be used to target COVID-19 with the current information that we have?

It is still unclear which people will develop severe symptoms due to COVID-19. Novel predictive models are needed to identify those patients at high risk. Those patients should be monitored more intensively and prioritized for novel treatments.

What information do we need to gather to make precision medicine an effective tool in treating COVID-19 patients?

Easy-to-obtain, routinely collected data, such as demographics, medical history and drug use, are needed to develop a risk calculator that identifies those at high risk. Of course, collaboration across sites and countries is needed to validate any developed risk model and ensure external validity.
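As a rough sketch of the kind of risk calculator described here, the snippet below fits a logistic regression over hypothetical routinely collected features. The features, data and labels are entirely synthetic; a real model would be developed on linked clinical data and externally validated across sites and countries.

```python
# A minimal sketch of a risk calculator: logistic regression over
# routinely collected features (demographics, medical history, drug use).
# All features and labels below are synthetic and for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
X = np.column_stack([
    rng.integers(20, 90, 500),   # age
    rng.integers(0, 2, 500),     # hypertension (0/1)
    rng.integers(0, 2, 500),     # type 2 diabetes (0/1)
    rng.integers(0, 2, 500),     # relevant drug use (0/1)
])
y = rng.integers(0, 2, 500)      # severe outcome (synthetic labels)

model = LogisticRegression(max_iter=1000).fit(X, y)

# Score a hypothetical 72-year-old with hypertension and diabetes.
patient = np.array([[72, 1, 1, 0]])
print(f"Predicted risk of severe outcome: {model.predict_proba(patient)[0, 1]:.2f}")
```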

One of the current projects you are associated with is the Capacity Covid Registry. Can you discuss what this project is and why it is so important?

COVID-19 patients with cardiovascular disease are a vulnerable population. To give these patients the best possible care, and to be prepared for future outbreaks, we need to know more about them and about the best practices for treating them. To this end, the CAPACITY COVID Registry was launched. It is an extension of the registry released by the International Severe Acute Respiratory and Emerging Infection Consortium (ISARIC) and the WHO in response to the emerging outbreak of COVID-19. CAPACITY registers data regarding the cardiovascular history, diagnostic information and occurrence of cardiovascular complications in patients with COVID-19. By collecting this information in a standardised manner, CAPACITY aims to provide more insight into:

  • the vulnerability and clinical course of COVID-19 in patients with underlying cardiovascular disease;
  • the incidence of cardiovascular complications in patients diagnosed with COVID-19.

In a perfect world, what type of data should be collected from COVID-19 patients?

In a perfect world, data would be collected at an early stage, in the home setting, to detect those at high risk of admission. Once a patient is admitted, as much data as possible should be extracted from clinical systems such as Electronic Health Records, including laboratory measurements, physical measurements and complaints over time, so that those at risk can be identified as early as possible.

Should we also be collecting data from the segment of the population that is immune to COVID-19, in order to better understand what makes them immune?

We should also focus on those people who tested positive for COVID-19 but have only mild symptoms, to learn who is less vulnerable to severe symptoms.

Is there anything else that you would like to share about either the COVID-19 Open AI Consortium or the CAPACITY COVID Registry?

Since the launch of the registry, 88 centres across 17 countries have registered to join CAPACITY. We would hereby like to invite other centres to participate in CAPACITY-COVID. To allow a quick set-up of the project for centres that want to participate, we have developed a portfolio of resources, including the study protocol, patient information form and standard operating procedures, all of which are freely available. For more information, visit our website: www.capacity-covid.eu

Thank you for the fantastic interview. Readers who wish to learn more may read our article describing the COAI project.

The first interview in this series was with Owkin’s Sanjay Budhdeo, MD, Business Development.

The second interview in this series was with Dr. Stephen Weng, Principal Investigator.

You may also visit the Covid-19 Open AI Consortium website.
