In biomedical research, hundreds of papers are published every day, yet it is difficult to predict which findings will make it out of the lab and lead to clinical applications. Recently, a machine learning model developed by the Office of Portfolio Analysis (OPA) at the National Institutes of Health (NIH) was able to estimate the likelihood that a biomedical research paper will be cited in clinical trials or guidelines. According to the OPA, the citation of a research article in a clinical trial is an early indicator of translational progress, meaning the use of research findings as a potential treatment for disease.
As reported by AI Trends, the researchers at the OPA created a new metric for their machine learning model to use, dubbed Approximate Potential to Translate, or APT. According to the OPA Director, George Santangelo, biomedical translation can be predicted from the scientific community's reaction to the research papers that a project is based on. Santangelo said that there are distinct trajectories for the flow of knowledge, and that these can predict whether a paper will succeed or fail in influencing clinical research.
The creation of the APT metric coincides with the release of the NIH’s second version of the iCite tool. iCite is a browser-based application that provides information about journal publications based on their specific field of analysis. Moving forward, the iCite tool will return the APT values for queries.
The process of adapting laboratory research into clinical applications is a complex task that often takes years. Attempts have been made to expedite this process, but because of the many variables involved, it can be difficult to assess translational progress. As explained by Santangelo, machine learning algorithms are a powerful tool that could enable clinicians to better understand which research papers are likely to prove useful in the clinic. As the team of researchers experimented with and refined their APT metric, useful predictive patterns began to materialize.
“I think the most important one that we focus on is the diversity of interest from across the fundamental to clinical research axis. When people across that axis — from fundamental scientists often in the same field as the work that’s being published, all the way to people in the clinic — show an interest in the form of citations in those papers, then the likelihood of eventual citation by a clinical trial or guideline is quite high.”
According to Santangelo, the selected features show genuine promise in predicting whether a research paper will translate into clinical use. Data on a publication collected over at least two years from the date of publication often give accurate predictions about a paper's eventual citation in a clinical article.
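To make the idea of "diversity of interest" concrete, here is a minimal sketch of how such a signal could be computed. It is not the OPA's actual feature set or model: the three category labels, the function name `citation_diversity`, and the choice of normalized Shannon entropy are all assumptions made for illustration. Only the two-year window comes from the description above.

```python
import math
from collections import Counter

# Hypothetical labels along the fundamental-to-clinical axis (an assumption,
# not the categories used by the OPA).
AXIS = ("fundamental", "translational", "clinical")


def citation_diversity(citations, pub_year, window_years=2):
    """Return a 0-to-1 diversity score for the papers citing a publication.

    `citations` is an iterable of (year, category) pairs. Only citations
    made within `window_years` of `pub_year` are counted. The score is the
    Shannon entropy of the category mix, divided by its maximum possible
    value, so 0 means one category only and 1 means an even spread.
    """
    counts = Counter(
        category
        for year, category in citations
        if pub_year <= year <= pub_year + window_years and category in AXIS
    )
    total = sum(counts.values())
    # No citations, or citations from a single category, carry no diversity.
    if total == 0 or len(counts) == 1:
        return 0.0
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    return entropy / math.log(len(AXIS))


# Toy data: one paper cited only by fundamental work, another cited
# across the whole axis. The 2019 citation falls outside the window.
narrow = [(2015, "fundamental"), (2016, "fundamental")]
broad = [
    (2015, "fundamental"),
    (2016, "translational"),
    (2017, "clinical"),
    (2019, "clinical"),
]
print(citation_diversity(narrow, pub_year=2015))
print(citation_diversity(broad, pub_year=2015))
```

In this toy setup, the paper cited from across the axis scores far higher than the one cited only by fundamental work. A real model would presumably combine many such signals rather than rely on a single score.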
Santangelo explained that, thanks to the new metric and machine learning algorithms, the researchers can gain a more complete picture of what is going on in the literature, which gives better insight into the research areas most likely to appeal to clinical scientists.
Santangelo also explained that the algorithm's integration into the iCite tool is intended to leverage the free, open nature of the NIH’s Open Citation Collection database.
The NIH Open Citation Collection database currently comprises over 420 million citation links and is still growing. The Santangelo team’s algorithm will present the APT values for these citations when iCite 2.0 launches.
Many databases are restrictive and proprietary, and according to Santangelo, these barriers inhibit collaborative research. Santangelo argues that there is no strong justification for keeping the data behind a paywall, and that because the algorithm is meant to let others see the calculated APT values, relying on proprietary data sources would defeat the purpose.