Does Risk-Based Quality Management (RBQM) Actually Improve Quality?

Clinical Researcher—June 2024 (Volume 38, Issue 3)

GOOD MANAGEMENT PRACTICE

Steve Young, MA


Over the past 10 years, we have seen substantial progress in the adoption of risk-based quality management (RBQM) across the industry. This has rightly led to questions about its impact.

Is RBQM working? Is it supporting the primary mission of improving quality in clinical trials?

To answer these questions, we need to understand the limitations of traditional approaches to quality and explore the latest evidence demonstrating how the components of centralized monitoring are helping to find the errors that matter.

Traditional Approaches to Quality Management

We know traditional approaches to quality management have been largely ineffective and inefficient. A 2014 analysis of clinical data from 1,168 trials found that only 1.1% of all data entered into electronic data capture (EDC) systems by sites was corrected as a result of 100% source data verification (SDV).{1} All quality reviews combined resulted in corrections to just 3.7% of EDC data. This included EDC auto-queries (1.4%) and all other reviews—data management, medical/safety monitoring, biostatistical reviews, etc. (1.2%).

This does not mean these approaches do not add value; however, it does raise questions as to whether we have been finding the errors that really matter.

In the past, we relied on unguided, manual visual reviews. Today, RBQM allows us to leverage the advanced analytics of centralized monitoring, which is far more effective at spotting potential issues quickly.

The Key Components of Centralized Monitoring

Centralized monitoring ideally consists of three key components—statistical data monitoring (SDM), key risk indicators (KRIs), and quality tolerance limits (QTLs). SDM is an unsupervised analysis running statistical tests across all of the clinical data to expose systemic risk patterns your study team may not even have thought about in a pre-study risk assessment. KRIs monitor pre-specified (anticipated) risks at a site or country level while QTLs monitor pre-specified critical risks at the trial level.

Crucially, experience over hundreds of studies has shown that SDM generally finds issues that are more impactful to the potential reliability of study results because they are systemic in nature. Statistical tests can be, and have been, designed to run across all of the clinical data in a study. Those tests generate p-values—a measure of how different a given site is on each parameter from all other sites in the trial. The lower the p-value, the more atypical the site's observed value for that parameter.

Hundreds of p-values can be generated for each site on a suitably designed platform. Each p-value is converted into a score—simply the negative log of the p-value—so that the higher the score, the more unusual or at-risk the site is for the given test and variable. Those scores are then combined into a single site-level score, referred to as the site's data inconsistency score (DIS), which ranges between 0 and 10 and allows sponsors to quickly identify any sites they should be concerned about.
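To make the scoring arithmetic concrete, here is a minimal sketch in Python. The conversion of a p-value to a score is just its negative log; how the platform combines hundreds of such scores into the 0-to-10 DIS is not described here, so the mean-and-cap aggregation below is purely an illustrative assumption. Note that -log10(0.05) is roughly 1.3, which lines up with the DIS significance threshold of 1.3 used in the analysis that follows.

    import math

    def test_score(p_value: float) -> float:
        # Per-test inconsistency score: the negative log of the p-value.
        # Smaller p-values (more atypical observations) yield higher scores.
        return -math.log10(p_value)

    def site_dis(p_values: list[float]) -> float:
        # Illustrative site-level DIS: average the per-test scores, capped
        # at 10. ASSUMPTION: the platform's actual aggregation method is
        # not described in this article.
        scores = [test_score(p) for p in p_values]
        return min(10.0, sum(scores) / len(scores))

    print(round(test_score(0.05), 2))             # 1.3
    print(round(site_dis([0.05, 0.2, 0.001]), 2)) # 1.67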

Site DIS Progression as a Measure of Quality Improvement

To explore whether SDM is improving quality, we used site DIS progression as a measure of quality improvement. Once you decide to follow up on a site with a high DIS (i.e., an at-risk site), the DIS becomes a real marker of quality rather than just an indicator of risk. In particular, you would generally expect follow-up remedial actions to result in a lower DIS because the site has corrected its behaviors.

For this analysis, we looked at sites that went above the risk threshold and had risk signals that were followed up by the study team. This could include multiple issues requiring investigation and possibly action. We then compared the DIS when risk signals were opened to the DIS when they were all closed.

We measured two outcomes for this analysis. The first was the proportion of sites with improved quality—that is, the finishing DIS was lower than it was at the start of the study team's follow-up. The second was how much the DIS improved. For example, if Site A had a DIS of 1.57 when risk signals were opened and 1.04 when all signals were closed, that is a 34% improvement in its DIS (i.e., 1.04 is 34% closer to a perfect score of 0 than 1.57).
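In code, that improvement metric is simply the relative reduction in DIS toward zero; here is a quick check of the Site A arithmetic in Python:

    def dis_improvement(dis_open: float, dis_close: float) -> float:
        # Percent improvement toward a perfect DIS of 0.
        return (dis_open - dis_close) / dis_open * 100

    print(round(dis_improvement(1.57, 1.04)))  # 34 -- Site A's improvement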

The analysis incorporated 159 studies across a wide range of therapeutic areas and phases, including 1,111 sites with a significant DIS (> 1.3), investigation of 3,651 risk signals, and 7,576 significant observed values. We were also given access to two large studies that had never used SDM—pre-RBQM studies—to use as a comparison.

We found 83% of the sites using SDM had a lower DIS at the point when all risk signals were closed, compared with only 56% for all sites in the two comparator studies. Data quality improved by 46% in sites using SDM compared to only 17% in those not using SDM. Among sites using SDM, these positive results held across all therapeutic areas and phases. This provides quantitative evidence that SDM is improving quality.

What is interesting about the two comparator studies is that, if you are not doing anything to address the risk signals found via SDM, you would expect around a 50/50 chance of the DIS improving vs. not improving over time (i.e., random drift), and that is essentially what we observed in this sample.

To give an example from one site among the hundreds we analyzed, the DIS was above the threshold when risk signals were first opened. Two individual risks of interest were surfaced, both of which were real issues that required correction. The first was that half of the study participants had a very high disease response score. Investigation revealed an error in data entry—this was corrected and no additional atypical scores were observed.

The second risk signal was a low volume of drug dispensed for the first two patients at the site. A clinical research associate checked the weighing technique and scale calibration and identified an issue due to a misunderstanding of reporting requirements. This was resolved and there were no additional erroneous results. These corrections meant the individual test scores improved, along with the overall DIS for the site.

Of course, not all risk signals represent an actual issue once they are investigated by the study team. Some are simply statistical anomalies, and in those cases you are not necessarily going to see an improvement—we have plenty of examples where that has occurred. That is why we would not expect to see 100% of sites with a lower DIS when risk signals are closed, and why 83% is so significant.

Analyzing Key Risk Indicator Outcomes

SDM is not the only component of centralized monitoring helping to drive quality improvement. A similar analysis was conducted on the impact of KRIs using data from the same platform as described earlier. The analysis focused on nine commonly used KRIs, representing categories including safety, compliance, data quality, and enrollment and retention. More than 20 organizations contributed data, allowing analysis of 212 studies and 1,676 sites with KRI risk signals.

Two quality improvement metrics were used for assessment—one based on p-value, and one based on a KRI's observed value. Both metrics showed improvement in the vast majority of sites (82.9% for p-value, 81.1% for observed KRI value). Additionally, there was a 72.4% improvement toward the expected KRI value on average.{2}
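The article behind these figures does not spell out the formula for "improvement toward expected KRI value," but a minimal sketch in Python, assuming it means the relative reduction in the gap between a site's observed KRI value and the study-level expected value, might look like this (all numbers below are hypothetical):

    def kri_gap_improvement(observed_open: float, observed_close: float,
                            expected: float) -> float:
        # ASSUMED metric: percent reduction in the gap between the site's
        # observed KRI value and the expected (study-level) value.
        gap_open = abs(observed_open - expected)
        gap_close = abs(observed_close - expected)
        return (gap_open - gap_close) / gap_open * 100

    # Hypothetical cycle-time KRI: a site's eCRF-entry time falls from
    # 32 days to 5 days against an assumed study-level expectation of 7.
    print(round(kri_gap_improvement(32.0, 5.0, 7.0)))  # 92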

To illustrate what these improvements mean in practice, I have two specific examples.

At one site, the standard KRI of visit-to-electronic case report form (eCRF) entry cycle time had increased to more than 30 days. The study team opened a risk signal, followed up with the site, and the site subsequently improved its data entry practices. When the signal was closed, the site's average eCRF entry cycle time had dropped to less than five days.

Another site was not reporting any adverse events when the signal was opened. After the risk signal was followed up, the site's adverse event reporting improved dramatically, bringing it close to the study-wide trend.

Centralized Monitoring Works

The examples shared provide clear, compelling evidence that the use of central monitoring—both KRIs and SDM—is leading to improved behaviors and improved quality outcomes. They contribute to the growing body of evidence demonstrating how RBQM approaches which harness centralized monitoring can improve quality.

Risks detected via SDM and KRIs are being successfully remediated, improving quality: site metric values move toward expected behavior, and sites become statistically less atypical and less at risk.

Central monitoring is a critical component of effective quality oversight. It allows us to detect systemic issues in study conduct—the errors that matter—and identify issues typically missed by traditional methods like SDV or transactional data reviews.

All of this demonstrates how RBQM is an important new weapon in quality oversight.

References

  1. Evaluating Source Data Verification as a Quality Control Measure in Clinical Trials. 2014. Therapeutic Innovation & Regulatory Science 48(6):671–680.
  2. Does Central Monitoring Lead to Higher Quality? An Analysis of Key Risk Indicator Outcomes. 2023. Therapeutic Innovation & Regulatory Science 57(2):295–303.


Steve Young, MA,
is Chief Scientific Officer at CluePoints, where he oversees the research and development of advanced methods for data analytics, data surveillance, and risk management, along with guiding customers in RBQM methodology and best practices. He has led a pivotal RBM-related analysis in collaboration with TransCelerate and is currently leading RBQM best practice initiatives for several industry RBM consortia.