Clinical Researcher—June 2024 (Volume 38, Issue 3)
SPECIAL FEATURE
Samuel Salvaggio, PhD; Emilie Barré, MSc; Sébastien Coppe, PhD; Marc Buyse, ScD
Clinical trials are critical for assessing new treatments in terms of their potential benefits for patients’ health and for their likelihood of financial success if marketed. Yet, the traditional approach focusing on a single primary endpoint may mean lengthy trials that often fail to capture the full spectrum of the treatment effects that are crucial to patient well-being. Generalized pairwise comparisons (GPC) is an innovative statistical methodology that offers a revolutionary approach to clinical trial analysis, addressing these shortcomings by allowing the integration of multiple clinically relevant outcomes into a single assessment. This methodology better leverages the large amount of collected data and improves clinical trial efficiency by reducing the required sample size. Multiple case studies have demonstrated the successful application of GPC in regulatory submissions, with notable U.S. Food and Drug Administration (FDA) approvals in cardiovascular disease.
This paper discusses the benefits of GPC, providing a compelling argument for its broader adoption in clinical research to meet modern healthcare challenges, design patient-centric protocols, and approve patient-centered treatments at a faster pace.
Background
Clinical trials stand as the cornerstone of medical advancements, offering critical insights before the commercialization of any new treatment (e.g., a new drug, vaccine, or medical device) intended to improve patients’ health. Efficacy endpoints in clinical trials are crucial measures designed to reflect the intended effects of a new treatment, encompassing a wide array of assessments from clinical events like stroke and mortality to symptoms such as pain and measures of function. As diseases can affect patients in multiple ways, leading to various clinical events, symptoms, and functional impairments, many trials strive to examine the effects of a treatment on multiple aspects of a disease.
Yet, treatments are often approved based on the results tied to a single primary criterion. This primary endpoint, while selected for its clinical relevance by medical professionals, may not always resonate with the day-to-day realities and preferences of patients and can overlook broader effects that a treatment might have (such as variations in symptom alleviation, functional improvements, or side effects), all of which are critical to patient well-being. This narrow focus can restrict the understanding of a treatment’s comprehensive benefits and risks, potentially sidelining important factors that influence patient quality of life and overall treatment satisfaction.
An approach involving a single primary endpoint can be too narrow in scenarios where multiple outcomes are of interest. It does not fully address the complexity of patient needs, especially when evaluating treatments for diseases with multiple symptomatic expressions or functional impacts. This issue is further compounded by the inherent limitations of conventional statistical tests commonly used in clinical trials. These traditional methods are limited to the analysis of a single variable at a time. As a result, clinical outcomes collected during trials are analyzed separately, leading to two significant issues. First, the interactions between clinical outcomes are not considered. Second, many analyses are considered exploratory and discounted in decision-making processes due to concerns about Type I errors, a statistical term meaning that a treatment effect is detected when there is none (i.e., a false positive).{1} This fragmented approach prevents a comprehensive understanding of how different outcomes of interest collectively influence the efficacy and safety of a treatment.
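The Type I error concern can be made concrete with a short calculation. The sketch below assumes that each outcome is tested separately at the same significance level and that the tests are independent; the independence assumption and the function name are illustrative and do not come from any specific trial.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n_tests
    independent tests, each run at significance level alpha.

    Independence is an illustrative assumption: outcomes measured on
    the same patients are usually correlated, which changes the value.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Show how the chance of a spurious finding grows with the number
    # of separately analyzed outcomes.
    for k in (1, 3, 5, 10):
        rate = familywise_error_rate(0.05, k)
        print(f"{k} separate tests at alpha = 0.05: {rate:.3f}")
```

Under these assumptions, the chance of at least one false positive rises quickly as more outcomes are tested one by one, which is why separate analyses are often treated as exploratory.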
Another problem related to this way of defining primary analysis is that it may not reflect patient preferences. While patients invest a large amount of time in performing multiple tests at the investigational site or answering different questionnaires, much of the collected data remain unexploited as contributors to decisions about the trial’s success. At worst, this single endpoint approach might lead to the wrong “market access” decision.
Imagine a randomized clinical trial in oncology, where an innovative drug is compared to the standard of care. Let’s take a simple example, where one patient in each treatment arm has the same overall survival time (i.e., similar values for the primary analysis). These two patients will not play an important role in differentiating the two treatments. Still, one of the two patients may have a much worse quality of life than the other patient. Not being able to leverage this information may prevent some pharmaceutical companies from bringing to market innovative treatments that answer important patient needs without being less effective than the standard of care.
This idea of answering patients’ needs in a more holistic way is not an easy one. In some situations, non-inferiority trials have tried to achieve this objective by showing that a new treatment, while not much better in terms of efficacy, may bring value to patients when other facets of the treatment’s effects are considered.{2}
Non-inferiority trials aim to show that a new treatment is not unacceptably worse than an established one. Picture this situation: a new treatment has similar efficacy to the standard of care while reducing treatment burden or side effects. Typically, regulators demand evidence of “non-inferiority.” However, non-inferiority trials come with their own set of challenges, among them the choice of an arbitrary “non-inferiority margin,” the requirement for large sample sizes, and the high probability of the trial ending up inconclusive. Moreover, these trials may not always focus on the treatment effects that matter most to patients.
When it comes to helping patients make informed decisions, we need to focus on the big picture—evaluating all the evidence to figure out the net benefit of different treatment options.
Generalized Pairwise Comparisons
As clinical research continues to expand with hundreds of new randomized clinical trials added weekly to public registries like ClinicalTrials.gov, the need for robust, flexible statistical tools becomes ever more apparent. GPC marks a significant evolution in the statistical analysis of clinical trials. GPC, an extension of the well-known Wilcoxon-Mann-Whitney test, offers a robust statistical method to analyze multiple outcomes simultaneously.{3} Forty peer-reviewed scientific papers have been published on the topic over the past 15 years, and the GPC methodology has also gained traction in the biopharma industry, with a growing number of protocols approved by the regulatory agencies.
GPC is a statistical methodology that addresses these issues by comparing every possible pair of individuals within a trial to assess the likelihood of one treatment being more effective than another, from a comprehensive standpoint. The GPC methodology stands out as a pragmatic approach that offers unparalleled flexibility, especially in scenarios where multiple outcomes are in play, each with its own priority. Consider the landscape of oncology, where primary outcomes typically revolve around overall survival (OS) and progression-free survival (PFS). However, the traditional focus on just one of these metrics often leads to confusion and inconsistent results. Unlike conventional methods that focus on “time to first outcome,” GPC allows one to focus on the “time to worst outcome.” This nuanced approach addresses the complex trade-offs between OS and PFS, smoothing out the wrinkles of uncertainty. But GPC’s utility extends far beyond oncology. In cardiology, for instance, it has already gained traction, offering a versatile framework for comparing treatment groups across a spectrum of outcomes—be they continuous, time-to-event, binary, or categorical.{4}
This approach not only accommodates the complexity of modern medical treatments, but also aligns with the growing emphasis on patient-centric research. By allowing outcomes to be analyzed simultaneously and hierarchized based on clinical relevance and patient preferences, GPC facilitates a holistic evaluation of treatment effects, culminating in the calculation of the net treatment benefit (NTB)—the cornerstone outcome of GPC analysis. The NTB serves as a comprehensive measure of treatment’s effects, capturing the disparity between two treatment cohorts: one receiving the experimental treatment and the other the control treatment. Conceptually, the NTB represents the net probability of observing a superior outcome in the experimental group compared to the control group.
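To make the definition of the NTB concrete, here is a minimal sketch for a single outcome. It assumes a numeric outcome where a higher value is better and counts tied pairs as neutral; the function name and the data are hypothetical and are not taken from any trial.

```python
from itertools import product
from typing import Sequence


def net_treatment_benefit(experimental: Sequence[float],
                          control: Sequence[float]) -> float:
    """Net treatment benefit for one outcome where higher is better.

    Every experimental patient is compared with every control patient.
    A pair is a win if the experimental patient has the better value,
    a loss if the control patient does, and neutral if they are tied.
    """
    wins = 0
    losses = 0
    for e, c in product(experimental, control):
        if e > c:
            wins += 1
        elif e < c:
            losses += 1
    total_pairs = len(experimental) * len(control)
    return (wins - losses) / total_pairs


if __name__ == "__main__":
    # Hypothetical outcome values, for illustration only.
    experimental_arm = [12, 15, 9, 20]
    control_arm = [10, 8, 14, 11]
    print(net_treatment_benefit(experimental_arm, control_arm))
```

With these made-up values, 12 of the 16 pairs favor the experimental arm and 4 favor the control arm, so the NTB is (12 − 4) / 16 = 0.5.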
This absolute metric relates directly to the concept of “number needed to treat” (NNT): the reciprocal of the NTB yields the NNT value. For instance, if the NTB equals 20%, the NNT is 1/0.20 = 5, meaning that, on average, one in every five patients experiences a superior outcome with the experimental treatment over the control treatment. Such clarity in quantifying treatment efficacy empowers clinicians and researchers alike in making informed decisions regarding patient care and trial design. It also helps patients easily understand the outcome of the statistical analysis.
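The relationship between the NTB and the NNT is a one-line calculation. The sketch below is only an illustration; the function name is hypothetical, and the guard against a zero or negative NTB is an added assumption, since the reciprocal has no "number needed to treat" reading in that case.

```python
def number_needed_to_treat(ntb: float) -> float:
    """Return the NNT as the reciprocal of a positive NTB.

    ntb is expressed as a proportion (0.20 for 20%).
    """
    if ntb <= 0:
        raise ValueError("NNT is only defined here for a positive NTB")
    return 1.0 / ntb


if __name__ == "__main__":
    # The 20% example from the text.
    print(number_needed_to_treat(0.20))
```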
Moreover, the GPC analysis provides a clear breakdown of the contribution of each outcome to the final NTB value. If three patient-relevant outcomes were chosen, one may see the individual contribution of each. For instance, the first outcome may bring a treatment benefit of 10% in favor of the experimental treatment, the second outcome a benefit of 15% in favor of the experimental treatment, and the third outcome a benefit of 5% in favor of the control treatment, for an overall NTB of 10% + 15% − 5% = 20%. Clinicians and patients can therefore easily see that the experimental treatment has a positive impact on the first two outcomes but not on the third.
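The per-outcome breakdown follows from how prioritized comparisons work: each pair is decided by the first outcome, in priority order, on which the two patients differ. The sketch below illustrates this under simplifying assumptions (every outcome is numeric, higher is better, no censoring, and no threshold of clinical relevance). The function name and the data are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def prioritized_gpc(experimental: Sequence[Tuple[float, ...]],
                    control: Sequence[Tuple[float, ...]]
                    ) -> Tuple[List[float], float]:
    """Return (per-outcome contributions, overall NTB).

    Each patient is a tuple of outcome values listed in priority
    order, with higher values being better. A pair is decided by the
    first outcome on which the two patients differ; pairs tied on
    every outcome stay neutral.
    """
    n_outcomes = len(experimental[0])
    wins = [0] * n_outcomes
    losses = [0] * n_outcomes
    for e, c in product(experimental, control):
        for k in range(n_outcomes):
            if e[k] > c[k]:
                wins[k] += 1
                break
            if e[k] < c[k]:
                losses[k] += 1
                break
    total_pairs = len(experimental) * len(control)
    contributions = [(w - l) / total_pairs for w, l in zip(wins, losses)]
    return contributions, sum(contributions)


if __name__ == "__main__":
    # Hypothetical patients: (survived: 1 or 0, quality-of-life score).
    experimental_arm = [(1, 5), (1, 3), (0, 4)]
    control_arm = [(1, 4), (0, 4), (0, 2)]
    per_outcome, ntb = prioritized_gpc(experimental_arm, control_arm)
    print(per_outcome, ntb)
```

In this toy data set, the first outcome contributes 3/9 and the second 1/9, so the overall NTB is 4/9. The contributions always add up to the overall NTB because every pair is counted at exactly one priority level or left neutral.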
The GPC methodology not only incorporates outcomes typically used in standard primary endpoints, it can also integrate additional measures such as quality of life and adverse effects, providing a holistic view of a treatment’s impact. This approach allows for the utilization of a broader spectrum of collected data in clinical trials.
One significant advantage of GPC is its capacity to substantially reduce sample sizes. While the extent of sample size reduction may vary depending on factors such as the number of outcomes considered and their relationship, it often leads to notable decreases in sample size requirements. This ability to minimize the number of patients needed for a well-powered clinical trial design is particularly crucial in therapeutic areas such as rare diseases, where patient recruitment is challenging. Achieving reductions in sample sizes without compromising the clinical relevance of analyses is one of the top opportunities for boosting biopharmaceutical research and development productivity.{5}
Leveraging GPC methodology represents a paradigm shift toward more holistic, patient-centered approaches in clinical research. On top of reducing patients’ recruitment timelines and study budgets, it provides an efficient approach to listen to a patient’s voice and incorporate the most patient-relevant outcomes in the primary endpoint, mirroring the broader transitions in healthcare toward personalized medicine.
Case Studies
Multiple clinical trials have successfully used GPC as their primary statistical analysis for registration clinical trials, mainly for cardiovascular disease. Two of those drug submissions have since received FDA approval for market access, both drugs aiming to treat cardiac amyloidosis. For the first drug, tafamidis meglumine, the primary analysis that led to its regulatory approval used GPC to assess a multivariate hierarchized endpoint. The analysis combined all-cause mortality and the frequency of cardiovascular-related hospitalizations.{6} For the second drug, acoramidis hydrochloride, the GPC analysis was expanded to use four hierarchical outcomes.{7} Alongside all-cause mortality and hospitalization frequency, the analysis also assessed changes in a protein biomarker (NT-proBNP) and the results from a six-minute walk test (to measure one aspect of the patients’ quality of life {8}), therefore providing even more layers of understanding of the drug’s effects.
To illustrate the application of GPC, let’s examine a real case study from an oncology FDA submission. In this case, the investigational drug was compared to a placebo in a randomized trial, on top of standard of care, to assess its effectiveness in preventing severe toxicity associated with cancer treatment. The primary analysis initially used a traditional approach, focusing solely on the occurrence of severe adverse effects—a mere binary variable. This method underutilized much of the available data, omitting crucial aspects of the toxicity that are significant both to the patient and the overall treatment strategy, such as the severity grade of the toxicity (e.g., whether hospitalization was required) and the duration of the toxicity.
GPC methodology was used to address the shortcomings of the classical univariate statistical analyses by capturing a broader range of outcomes into the evaluation. In the GPC analysis of this oncology trial, multiple dimensions of toxicity were considered: initially, the occurrence of toxicity at its most severe grade, followed by occurrences at less severe grades, and finally, the duration of the toxicity. To illustrate the analysis process, pairs of patients from the experimental (E in Figure 1) and placebo (P in Figure 1) groups were compared, as depicted below. Each patient from the experimental group was compared to each patient from the placebo group, offering a large number of pairs to be evaluated following this process. This approach allowed for a more comprehensive assessment by comparing treatment effect across different levels of severity and duration of toxicity, providing a clearer picture of the drug’s impacts on patient health.
While the univariate statistical analysis narrowly achieved significance with a p-value just below the conventional threshold of 5%, incorporating additional clinically relevant information through the GPC analysis increased statistical power. This led to a more comprehensive and convincing demonstration of the drug’s superiority over placebo.
Figure 1: Flowchart depicting the decision process behind a pair classification for a GPC analysis
Note: To compute the NTB, the number of pairs favoring the placebo arm is subtracted from the number of pairs favoring the experimental arm, then the result is divided by the total number of pairs.
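The pair classification described for this trial can be sketched in a few lines. The example below is an illustration only: the `Patient` fields, the ordering of the checks, and the data are assumptions based on the description above (most severe grade first, then less severe grades, then duration), not the actual decision rules or data of the submission. For every field, a lower value is better for the patient.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Patient:
    severe_toxicity: bool   # toxicity at its most severe grade occurred
    milder_toxicity: bool   # toxicity at a less severe grade occurred
    duration_days: float    # total duration of toxicity


def classify_pair(e: Patient, p: Patient) -> str:
    """Classify one (experimental, placebo) pair.

    Outcomes are checked in priority order; the first one on which
    the two patients differ decides the pair. Less toxicity is better.
    """
    checks = (
        (e.severe_toxicity, p.severe_toxicity),
        (e.milder_toxicity, p.milder_toxicity),
        (e.duration_days, p.duration_days),
    )
    for e_value, p_value in checks:
        if e_value < p_value:
            return "favorable"
        if e_value > p_value:
            return "unfavorable"
    return "neutral"


def toxicity_ntb(experimental: Sequence[Patient],
                 placebo: Sequence[Patient]) -> float:
    """(favorable pairs - unfavorable pairs) / total pairs."""
    counts = {"favorable": 0, "unfavorable": 0, "neutral": 0}
    for e in experimental:
        for p in placebo:
            counts[classify_pair(e, p)] += 1
    total_pairs = len(experimental) * len(placebo)
    return (counts["favorable"] - counts["unfavorable"]) / total_pairs


if __name__ == "__main__":
    # Hypothetical patients, for illustration only.
    experimental_arm = [Patient(False, False, 0), Patient(False, True, 3),
                        Patient(True, True, 10)]
    placebo_arm = [Patient(False, True, 5), Patient(True, True, 10),
                   Patient(True, True, 14)]
    print(toxicity_ntb(experimental_arm, placebo_arm))
```

In this made-up example, seven of the nine pairs favor the experimental arm, one favors placebo, and one is neutral, giving an NTB of (7 − 1) / 9, or about 0.67.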
Conclusion
In recent years, the clinical research industry has taken significant strides in engaging patients, caregivers, and advocacy groups in the trial design process, gathering invaluable insights into protocols, endpoints, and the overall trial experience. However, despite these commendable efforts, a pressing need remains for a robust methodology capable of synthesizing multiple outcomes into a single, clinically meaningful statistical analysis.
This is precisely where GPC emerges as a transformative solution, seamlessly integrating patients’ and clinicians’ insights with trial data to create a comprehensive assessment reflective of patient needs. The NTB offers both patients and clinicians an easily understandable assessment of a treatment. By facilitating the combination of all key patient-relevant outcomes, GPC enhances the patient-centricity of clinical research while also optimizing its efficiency and effectiveness. Through its capacity to harness vast amounts of collected data, GPC not only streamlines treatment decisions, but also holds the potential to reduce sample sizes and study timelines, thereby amplifying biopharmaceutical research and development productivity.
As the industry continues to evolve, embracing methodologies like GPC will be pivotal in driving innovation and advancing the pursuit of improved patient outcomes.
References
1. U.S. Food and Drug Administration. 2022. Multiple Endpoints in Clinical Trials—Guidance for Industry. https://www.fda.gov/media/162416/download
2. Ranganathan P, Pramesh CS, Aggarwal R. 2022. Non-inferiority trials. Perspect Clin Res 13(1):54–7. doi:10.4103/picr.picr_245_21. PMID:35198430; PMCID:PMC8815668.
3. Buyse M. 2010. Generalized pairwise comparisons of prioritized outcomes in the two-sample problem. Stat Med 29(30):3245–57. doi:10.1002/sim.3923. PMID:21170918.
4. Verbeeck J, De Backer M, Verwerft J, et al. 2023. Generalized Pairwise Comparisons to Assess Treatment Effects. J Amer Coll Cardiol 82:1360–72.
5. McKinsey & Company. 2024. Accelerating clinical trials to improve biopharma R&D productivity. https://www.mckinsey.com/industries/life-sciences/our-insights/accelerating-clinical-trials-to-improve-biopharma-r-and-d-productivity
6. Maurer MS, Schwartz JH, Gundapaneni B, Elliott PM, Merlini G, Waddington-Cruz M, … Rapezzi C. 2018. Tafamidis treatment for patients with transthyretin amyloid cardiomyopathy. New England Journal of Medicine 379(11):1007–16. https://www.nejm.org/doi/full/10.1056/NEJMoa1805689
7. Gillmore JD, Judge DP, Cappelli F, Fontana M, Garcia-Pavia P, Gibbs S, … Fox JC. 2024. Efficacy and Safety of Acoramidis in Transthyretin Amyloid Cardiomyopathy. New England Journal of Medicine 390(2):132–42. https://www.nejm.org/doi/full/10.1056/NEJMoa2305434
8. Serra AJ, de Carvalho Pde T, Lanza F, de Amorim Flandes C, Silva SC, Suzuki FS, Bocalini DS, Andrade E, Casarin C, Silva JA Jr. 2015. Correlation of six-minute walking performance with quality of life is domain- and gender-specific in healthy older adults. PLoS One 10(2):e0117359. doi:10.1371/journal.pone.0117359. PMID:25695668; PMCID:PMC4335060.
Samuel Salvaggio, PhD, is a Senior Trial Design Lead with One2Treat in Mont Saint Guibert, Belgium, and teaches applied biostatistics at Université libre de Bruxelles.
Emilie Barré, MSc, is a Senior Trial Design Lead with One2Treat.
Sébastien Coppe, PhD, formerly with N-SIDE, is CEO of One2Treat.
Marc Buyse, ScD, is Founder of One2Treat, IDDI, and CluePoints, and an Associate Professor of Biostatistics with I-BioStat at Hasselt University in Hasselt, Belgium.