To win the battle against COVID-19, studies to develop vaccines, drugs, devices, and repurposed drugs are urgently needed. Randomized clinical trials are used to provide evidence of safety and efficacy as well as to better understand this novel and evolving virus. As of July 15, more than 6,180 COVID-19 clinical trials had been registered through ClinicalTrials.gov. Knowing which ones are likely to succeed is imperative.
Researchers from Florida Atlantic University’s College of Engineering and Computer Science are the first to model completion versus cessation of COVID-19 clinical trials using machine learning algorithms and ensemble learning. The study, published in PLOS ONE, provides the most extensive set of features for clinical trial reports, including features that describe trial administration, study information and design, eligibility, keywords, and drugs.
The researchers hope the new approach will help in designing computational methods that predict whether a COVID-19 clinical trial will be completed, so that stakeholders can use the predictions to plan resources, reduce costs, and shorten the time a clinical study takes. The study was funded by the National Science Foundation.
This research shows that computational methods can deliver effective models for understanding the difference between completed and ceased COVID-19 trials. In addition, these models can predict COVID-19 trial status with satisfactory accuracy.
Because COVID-19 is a relatively novel disease, very few trials have been formally terminated. Therefore, for the study, researchers considered three types of trials as cessation trials: terminated, withdrawn, and suspended. These trials represent research efforts that were halted for particular reasons, and resources that did not lead to a successful outcome.
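The labeling rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the status strings, the function name `label_trial`, and the choice to return `None` for trials that are neither completed nor ceased (for example, trials still recruiting) are all assumptions made for the example.

```python
# Assumed status strings, modeled on the three cessation types named in the
# article; the exact spellings used in the study's data are not given.
CESSATION_STATUSES = {"terminated", "withdrawn", "suspended"}
COMPLETED_STATUS = "completed"


def label_trial(status):
    """Map a trial status string to a binary label.

    Returns 1 for a cessation trial, 0 for a completed trial, and None
    for any other status (an assumption: such trials are left unlabeled).
    """
    normalized = status.strip().lower()
    if normalized in CESSATION_STATUSES:
        return 1
    if normalized == COMPLETED_STATUS:
        return 0
    return None


# Hypothetical statuses, for illustration only.
example_statuses = ["Completed", "Withdrawn", "Recruiting", "Suspended"]
example_labels = [label_trial(s) for s in example_statuses]
```

Grouping the three cessation types into one class gives the model more positive examples to learn from than "terminated" alone would provide.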
“The main purpose of our research was to predict whether a COVID-19 clinical trial will be completed or terminated, withdrawn, or suspended. Clinical trials involve a great deal of resources and time including planning and recruiting human subjects,” said Xingquan “Hill” Zhu, PhD, senior author and a professor in the Department of Computer and Electrical Engineering and Computer Science. Zhu conducted the research with first author Magdalyn “Maggie” Elkin, a second-year PhD student in computer science who also works full-time. “If we can predict the likelihood of whether a trial might be terminated or not down the road, it will help stakeholders better plan their resources and procedures. Eventually, such computational approaches may help our society save time and resources to combat the global COVID-19 pandemic.”
For the study, Zhu and Elkin collected 4,441 COVID-19 trials from ClinicalTrials.gov to build a testbed. They designed four categories of features (statistics features, keyword features, drug features, and embedding features) to characterize the trials, so that each trial is represented by a 693-dimensional feature vector. Feature selection and ranking showed that keyword features derived from Medical Subject Headings (MeSH) terms in the clinical trial reports were the most informative for COVID-19 trial prediction, followed by drug features, statistics features, and embedding features.
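To make the idea of combining feature categories and ranking them concrete, here is a small Python sketch. It is not the study's pipeline: the function names, the tiny made-up feature values, and the ranking criterion (absolute difference between class means) are assumptions chosen to keep the example short. The article does not say which feature selection method the authors used or how the 693 dimensions are split across categories.

```python
from statistics import mean

# Fixed category order, following the four categories named in the article.
CATEGORIES = ("statistics", "keyword", "drug", "embedding")


def build_feature_vector(groups):
    """Concatenate per-category feature lists into one flat vector.

    Also returns a parallel list naming the category each position came from.
    """
    vector, owners = [], []
    for name in CATEGORIES:
        values = groups[name]
        vector.extend(values)
        owners.extend([name] * len(values))
    return vector, owners


def rank_features(X, y):
    """Rank feature indices by |mean(ceased) - mean(completed)|, largest first.

    A toy criterion assumed for illustration, not the study's method.
    """
    ceased = [row for row, label in zip(X, y) if label == 1]
    completed = [row for row, label in zip(X, y) if label == 0]
    scored = []
    for j in range(len(X[0])):
        gap = abs(mean(r[j] for r in ceased) - mean(r[j] for r in completed))
        scored.append((gap, j))
    # Sort by descending gap; break ties by feature index.
    return [j for gap, j in sorted(scored, key=lambda t: (-t[0], t[1]))]


# Made-up toy trials (5 features each, not 693); labels: 1 = ceased, 0 = completed.
toy_trials = [
    {"statistics": [3.0], "keyword": [1.0, 0.0], "drug": [1.0], "embedding": [0.5]},
    {"statistics": [2.0], "keyword": [1.0, 1.0], "drug": [0.0], "embedding": [0.4]},
    {"statistics": [3.0], "keyword": [0.0, 1.0], "drug": [0.0], "embedding": [0.5]},
    {"statistics": [2.0], "keyword": [0.0, 0.0], "drug": [0.0], "embedding": [0.3]},
]
toy_labels = [1, 1, 0, 0]

built = [build_feature_vector(t) for t in toy_trials]
X = [vector for vector, _ in built]
owners = built[0][1]
ranking = rank_features(X, toy_labels)
top_categories = [owners[j] for j in ranking]
```

Keeping the `owners` list alongside the vector is what lets a ranking over individual features be read back as a ranking over categories, which is the kind of comparison the article reports.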
Edited by Gary Cramer