Clinical Researcher—April 2024 (Volume 38, Issue 2)
PEER REVIEWED
Robert Jeanfreau, MD, CPI; Robyn Harrell, MS; Paul Pati, MSN, FNP-C
All rational behavior, driven by an incentive to obtain a desired goal, is tempered by an evaluation of the attendant risk of harm. In common parlance, safety is sometimes misunderstood to be the condition of being free from the risk of any harm; however, even a superficial examination of this definition reveals its inadequacy. Every human activity, including medical therapeutics, involves the possibility of harm. Safety, in an absolute sense, simply does not exist. Every evaluation of safety must involve an estimation of the potential for benefit as well as for harm.
The weighing of benefit and risk is how stakeholders—the pharmaceutical industry, regulatory agencies (including the U.S. Food and Drug Administration [FDA]), institutional review boards (IRBs), study participants, practitioners, and patients—evaluate, in one way or another, safety.
In the past, evaluations have involved, to a greater or lesser extent, a qualitative, intuitive component performed by a stakeholder. For example, in clinical research, a protocol must be approved by an IRB before it is initiated. A scientific reviewer, one of the panel members, has the primary responsibility of determining whether the benefits justify the risks. Each IRB must perform its own, individual benefit-risk assessment de novo. Although the reviewer carefully evaluates the scientific data, there is no standardized framework available for deliberations.
More recently, there has been a shift from qualitative assessments toward a more quantitative approach. A quantitative framework is a method for arranging numerical data in a standardized format to assist in the decision-making process.
Formalized evaluation of benefit and risk of harm is referred to as the benefit-risk framework (BRF) (the term we will use hereafter), or the benefit-risk assessment, and has been structured in a variety of different ways. The lack of wide acceptance of a particular BRF underscores the significant challenges.
There is, however, a general opinion regarding the desired features of a BRF{1–5}. It should:
- be as quantitative as possible;
- incorporate the patient’s perspective;
- be transparent; and
- be applicable throughout the lifecycle of the drug.
Quantitative
The shift from qualitative assessments is based upon certain deficiencies inherent in that approach, making the last three features of a desirable BRF difficult to achieve. For example, one of the encumbrances for some formal, qualitative assessments is the requirement for convening an expert panel. If a BRF is to be utilized throughout the lifecycle of a drug, it will be necessary to perform serial assessments as new risks and benefits become apparent. Routinely utilizing expert panels is excessively onerous and time-consuming. A structured, quantitative approach, however, allows for repetitive and timely determinations.
Another major advantage of a fully quantitative BRF is the opportunity for mathematical analysis, enabling its consistent application. This advantage is, of course, based on the premise that the components of the framework, specifically benefits and risks, are, in fact, measurable and, therefore, quantifiable.
The Patient’s Perspective
If benefits must outweigh risks throughout the lifecycle of a drug, inclusion of a participant’s perspective is not just desirable but imperative. In some early-phase studies, there are no recognized benefits from the study drug to the enrolled participants. Since there are no objective benefits, any question about the benefits justifying the risks is meaningless.
In these situations, the widely promulgated concept that the benefits must outweigh the risks is abandoned in favor of the assertion that the risks must be minimal. What is being overlooked is the fact that, although there may not be recognized benefits to receiving an investigational product (IP), there are study benefits to the participant, albeit entirely subjective. For example, the desire to help find a cure for a disease affecting a participant’s loved one would be a powerful motivator.
Transparency
Transparency implies that the “inner workings” of a BRF are readily apparent and understandable. In the case of a qualitative assessment, transparency dictates that the intuitive reasoning of an expert panel is available and clearly described.
Transparency of quantitative assessments requires that the framework is also understandable to the stakeholders. A quantitative framework cannot be deemed to be truly transparent if its computations are so complex as to defy the understanding of the most important stakeholder—the patient. An approach based upon basic algebra satisfies this requirement.
The Drug’s Lifecycle
If a BRF is to be utilized throughout the entire lifecycle of a drug, it is reasonable to examine how these concepts are addressed near the very beginning of drug development (i.e., in clinical research).
Ensuring the effectiveness of investigational drugs and devices and the safety of human participants are two primary pillars of the FDA. Effectiveness is the likelihood that, under specified conditions, an IP will result in a desired therapeutic benefit. Therefore, the benefits of drugs are initially verified in clinical trials. Harms are determined by the evaluation of adverse events collected during clinical studies. The evaluation of benefits and the risk of harm is at the very core of clinical trials. The BRF would also need to address any additional, longitudinal risks and benefits recognized after drug approval.
A Possible Approach to a Benefit-Risk Framework
Every BRF seeks to compare benefits and risks. Do the potential benefits outweigh the potential risks? A common opinion, as discussed in the quotation below, is that these two factors cannot be directly compared, although that is precisely what a quantitative BRF strives to do.
Consider this perspective: “Risk–benefit ratio: The most common expression for the comparison of harms and benefits. It is a technical term that assumes that a ratio can indeed be calculated. Because the benefits and harms of an intervention are often so different in character or are measured on different scales, the term ‘risk–benefit ratio’ has no literal meaning. In addition, there may be several distinct benefits and harms. We advocate using ‘balance of benefits and harms’ rather than ‘risk–benefit ratio.’”{6}
If the above statement is, however, true (i.e., if there is no common ground for comparing risk of harms and benefits), then there can be no basis for quantification, rendering a quantitative BRF completely untenable.
Benefits and risks can be expressed as probabilities on comparable scales. The common ground is, of course, health. Harms diminish health, whereas benefits promote it. Clinical research provides guideposts as to how these two entities might be compared as a ratio.
$$\frac{\text{Benefit}}{\text{Risk}}$$
An example follows. A hypothetical, investigational arthritis drug relieves joint pain in 99 of 100 patients with only one reported adverse reaction (AR).
The equation can then be written as:
$$\frac{\text{Frequency of Benefit}}{\text{Frequency of AR}}$$
Initially, the benefit-risk ratio looks acceptable—until it is disclosed that the AR was a death. Clearly, more than just the frequency of the AR must be considered.
Severity of the AR is a critical factor. The equation is modified as noted below.
$$\frac{\text{Frequency of Benefit}}{\text{Frequency of AR} \times \text{Severity of AR}}$$
Another example follows, with a different drug used in a terminal setting. With this drug, there is again a single AR among 100 patients, and again that AR is a death. However, the drug was completely effective in the other 99 patients. Consider that all 99 of those patients had a diagnosis of terminal lung cancer and were subsequently cured. Now the benefit-risk ratio is remarkably positive. Clearly, the severity of the underlying disease must be considered along with the frequency of the benefit and the frequency and severity of the AR. The equation is modified to include four specific factors.
$$\frac{\text{Frequency of Benefit} \times \text{Severity of Disease}}{\text{Frequency of AR} \times \text{Severity of AR}}$$
At first glance, this appears to be a simple, usable equation. When the numerator is larger than the denominator, the benefits “outweigh” the risks.
Important questions remain. How is “benefit” determined? How is the severity of an AR determined? How is the severity of a disease determined? Clinical research may provide answers to these questions.
The FDA defines disease as:
“…damage to an organ, part, structure, or system of the body such that it does not function properly (e.g., cardiovascular disease), or a state of health leading to such dysfunctioning (e.g., hypertension); except that diseases resulting from essential nutrient deficiencies (e.g., scurvy, pellagra) are not included in this definition.”{7} In this definition, the emphasis is on normal functioning.
The FDA defines an adverse event as any untoward medical occurrence associated with the use of a drug in humans, whether or not considered drug related. Therefore, an adverse event could be a symptom, an abnormal lab finding, a physical finding, or a clearly defined disorder or disease. If it should be determined that the adverse event is, in fact, caused by the drug, then the adverse event is termed an AR.
Clinical research provides grading scales for the evaluation of ARs. A grading scale is a type of rank order and is the first step toward assigning weights.
A common grading scale for symptoms, physical findings, and diseases is based upon the effect that the AR has on activities of daily living (ADLs). Grading scales for the evaluation of abnormal lab results are also found in clinical research.
Frequency of Adverse Reactions
The collection and characterization of ARs, an integral part of clinical research, continues even after drug approval. Therefore, the frequency of ARs, often contained within the package insert or in systematic reviews, is subject to change as additional, longitudinal data become available. As more long-term evidence accrues, a dynamic framework would make updating benefit-risk assessments throughout the drug’s lifecycle a much less daunting task.
Severity of Diseases and Adverse Reactions
One of the biggest challenges in formulating a workable equation is in defining severity of ARs and severity of diseases. For ease of interpretation, these terms will be defined such that they are “like-terms.” The commonality of diseases and ARs is that they both affect health. Therefore, the challenge becomes to define both in terms of health. Health can be defined as the “ability to function normally.”
The severity of diseases and ARs can thus be operationally described in terms of the impact on a person’s ability to function normally, taken here to mean the ability to carry out ADLs.
An excellent example of how these concepts are utilized in clinical research is the Common Terminology Criteria for Adverse Events v5.0 (CTCAE). The CTCAE, originally formulated for oncology trials, provides a grading system for all categories of ARs, including symptoms, lab abnormalities, physical findings, and disease states. The grading system is described in the introduction of the document:
“Grades: Grade refers to the severity of the [adverse event]. The CTCAE displays Grades 1 through 5 with unique clinical descriptions of severity for each [adverse event] based on this general guideline:
Grade 1 Mild; asymptomatic or mild symptoms; clinical or diagnostic observations only; intervention not indicated.
Grade 2 Moderate; minimal, local, or noninvasive intervention indicated; limiting age-appropriate instrumental ADL.*
Grade 3 Severe or medically significant but not immediately life-threatening; hospitalization or prolongation of hospitalization indicated; disabling; limiting self-care ADL.**
Grade 4 Life-threatening consequences; urgent intervention indicated.
Grade 5 Death related to [adverse event].
Activities of Daily Living (ADL)
*Instrumental ADL [refers] to preparing meals, shopping for groceries or clothes, using the telephone, managing money, etc.
**Self-care ADL [refers] to bathing, dressing and undressing, feeding self, using the toilet, taking medications, and not bedridden.”{8}
This approach has become standard practice in oncology trials, which routinely rank subjects according to the Eastern Cooperative Oncology Group (ECOG) Performance Status Scale.
The CTCAE grading system can be implemented in this approach to grade the severity of disease states and ARs. Since there can be little argument (i.e., the effect of subjectivity is minimal) that the worst possible harm is death, death will be assigned the highest weight for ARs. The numeral, 1, will be arbitrarily chosen as the highest weight. The other categories of ARs, therefore, will be assigned values less than 1. In this way, the weights for ARs are defined as “constants” as opposed to variables.
Although severity of disease can also be expressed by its impact on health, it cannot be assigned as a constant in clinical practice. The reason is that the severity of a particular disease can vary widely from patient to patient (e.g., the manifestations of multiple sclerosis can range from minimal to life-threatening). Another example is COVID-19, which has gradually mutated into a less virulent disease. For a specific patient, the particular weight for the severity of disease would be assigned by the treating clinician. In the clinical trial setting, however, the weight for the severity of disease could be assigned by the sponsor in concert with the FDA.
Frequency of Benefit
For the purpose of defining this variable, benefit will be understood to mean how well the drug does what it is purported to do. It is obvious that some types of benefits have a more profound effect on health than others. What drugs do can be broadly categorized in rank order of increasing importance as follows: 1) alleviates symptoms, 2) ameliorates (or slows down) disease, 3) halts disease progression, 4) cures disease, and 5) prevents disease. Therefore, the frequency of benefit is the frequency with which the drug achieves its primary goal. Ranking of these five categories will later serve as the basis for assigning weights to benefits.
The benefit variable can now be further defined as the frequency of the benefit x the weight of the benefit.
$$\frac{(\text{Frequency of Benefit} \times \text{Weight of Benefit}) \times \text{Severity of Disease}}{\text{Frequency of AR} \times \text{Severity of AR}}$$
Because the weight of the benefit is defined as a constant, specific weights have to be assigned. Since there can be little argument (i.e., the effect of subjectivity is minimal) that the best possible category of benefits is prevention, it will be assigned the highest weight for benefits. Again, the numeral, 1, will be chosen as the highest weight. The weights for the other categories of benefits will be assigned values less than 1.
Similarly, because the weight of ARs is defined as a constant, specific weights have to be assigned. Since there can be little argument (i.e., the effect of subjectivity is minimal) that the worst possible AR is death, death will be assigned the highest weight for ARs. The weights for the other categories of ARs will be assigned values less than 1.
The remaining weights for the other benefits and ARs would be assigned by a panel of experts. However, for demonstration purposes only, the following weights will be assigned for benefits and ARs:
| Weight | Benefits | ARs |
| --- | --- | --- |
| 0.0032 | Alleviates symptoms | Mild |
| 0.252 | Ameliorates disease | Moderate |
| 0.501 | Halts disease progression | Severe |
| 0.75 | Cures disease | Life-threatening |
| 1.0 | Prevents disease | Death |
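For readers who wish to experiment, the demonstration weights above can be encoded as simple lookup tables. The sketch below is illustrative only: the names `BENEFIT_WEIGHTS`, `AR_WEIGHTS`, and `ar_weight_from_ctcae_grade` are hypothetical, and the one-to-one pairing of CTCAE Grades 1 through 5 with the five AR severity labels is an assumption based on the general CTCAE guideline quoted earlier.

```python
# Demonstration weights copied from the table above. Per the text, these are
# placeholders; real weights would be assigned by a panel of experts.
BENEFIT_WEIGHTS = {
    "alleviates symptoms": 0.0032,
    "ameliorates disease": 0.252,
    "halts disease progression": 0.501,
    "cures disease": 0.75,
    "prevents disease": 1.0,
}

AR_WEIGHTS = {
    "mild": 0.0032,
    "moderate": 0.252,
    "severe": 0.501,
    "life-threatening": 0.75,
    "death": 1.0,
}

# Assumption: CTCAE Grades 1-5 map one-to-one onto the five severity labels.
CTCAE_GRADE_TO_SEVERITY = {
    1: "mild",
    2: "moderate",
    3: "severe",
    4: "life-threatening",
    5: "death",
}


def ar_weight_from_ctcae_grade(grade: int) -> float:
    """Return the demonstration weight for a CTCAE grade (1 through 5)."""
    if grade not in CTCAE_GRADE_TO_SEVERITY:
        raise ValueError(f"CTCAE grade must be 1-5, got {grade!r}")
    return AR_WEIGHTS[CTCAE_GRADE_TO_SEVERITY[grade]]
```

Keeping the weights in one place means that a revised set of expert-assigned values could be swapped in without touching any downstream calculation.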
Examples follow. Any value > 1.0 will be seen as a positive benefit-risk ratio for the drug. Consider an antibiotic used to treat pneumonia whose most common AR is diarrhea. This AR occurs in 10% of patients, and its severity is considered moderate (i.e., a weight of 0.252). We will further postulate that the antibiotic cures pneumonia in about 90% of patients (i.e., a benefit weight of 0.75) and that the pneumonia is graded as severe (i.e., a weight of 0.501).
$$\frac{(\text{Frequency of Benefit} \times \text{Weight of Benefit}) \times \text{Severity of Disease}}{\text{Frequency of AR} \times \text{Severity of AR}}$$
Substitute the numerical values for the variables:
$$\frac{(0.9 \times 0.75) \times 0.501}{0.1 \times 0.252} \approx 13.4$$
As another example, alter the above equation as follows. The AR is colitis due to Clostridium difficile, occurring in 30% of patients and graded as severe. The numerator will be the same.
$$\frac{(0.9 \times 0.75) \times 0.501}{0.3 \times 0.501} = 2.25$$
The result is a positive benefit-risk profile, although not as pronounced as in the preceding example.
Alter the variables once more. Instead of pneumonia, the disease will be pharyngitis, with a designated severity of mild (i.e., a weight of 0.0032); the other variables remain the same.
$$\frac{(0.9 \times 0.75) \times 0.0032}{0.3 \times 0.501} \approx 0.0144$$
Now the benefit-risk ratio becomes approximately 0.0144, far below 1.0 and therefore an unacceptable treatment option. Any framework is, of course, dependent upon reliable data drawn from well-conducted clinical trials with sound statistical analysis (e.g., appropriate sample size). If a study is subsequently found to be flawed, its data would simply be expunged from the calculations.
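The three worked examples above can be reproduced with a few lines of code. The following is a minimal sketch, not a validated tool: the function name `benefit_risk_ratio` and its parameter names are hypothetical, and the weights are the demonstration values from the table, entered exactly as tabulated.

```python
def benefit_risk_ratio(freq_benefit, weight_benefit, severity_disease,
                       freq_ar, severity_ar):
    """(frequency of benefit x weight of benefit x severity of disease)
    divided by (frequency of AR x severity of AR)."""
    denominator = freq_ar * severity_ar
    if denominator <= 0:
        raise ValueError("frequency and severity of the AR must be positive")
    return (freq_benefit * weight_benefit * severity_disease) / denominator


# Demonstration weights from the table (placeholders, not expert-assigned).
CURES_DISEASE = 0.75
MILD, MODERATE, SEVERE = 0.0032, 0.252, 0.501

# (freq_benefit, weight_benefit, severity_disease, freq_ar, severity_ar)
scenarios = {
    "pneumonia, diarrhea AR": (0.9, CURES_DISEASE, SEVERE, 0.1, MODERATE),
    "pneumonia, C. difficile colitis AR": (0.9, CURES_DISEASE, SEVERE, 0.3, SEVERE),
    "pharyngitis, C. difficile colitis AR": (0.9, CURES_DISEASE, MILD, 0.3, SEVERE),
}

results = {label: benefit_risk_ratio(*args) for label, args in scenarios.items()}

for label, ratio in results.items():
    verdict = "positive" if ratio > 1.0 else "not positive"
    print(f"{label}: {ratio:.4f} ({verdict})")
```

With the weights as tabulated, the three ratios evaluate to about 13.4, 2.25, and 0.014, respectively; only the first two exceed the 1.0 threshold.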
Discussion
In addition to the previously mentioned advantages (i.e., it is quantitative, incorporates the patient’s perspective, is transparent, and is applicable throughout the lifecycle of the drug), this approach is also versatile.
For example, for drugs having multiple ARs and/or multiple benefits, the computations can be carried out in a single equation, yielding a composite ratio.
$$\frac{(\text{Frequency of Benefit} \times \text{Weight of Benefit}) \times \text{Severity of Disease}}{(\text{Frequency of AR}_1 \times \text{Severity of AR}_1) + (\text{Frequency of AR}_2 \times \text{Severity of AR}_2)}$$
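A composite ratio of this kind might be computed as sketched below. The function name `composite_benefit_risk_ratio` is hypothetical. Summing several benefit terms in the numerator is an assumption that mirrors how the AR terms are summed in the denominator; the equation above shows only the multiple-AR case. The example drug, which combines the two ARs from the earlier antibiotic examples, is likewise hypothetical.

```python
from typing import Iterable, Tuple

Term = Tuple[float, float]  # (frequency, weight or severity)


def composite_benefit_risk_ratio(benefits: Iterable[Term],
                                 severity_disease: float,
                                 adverse_reactions: Iterable[Term]) -> float:
    """Composite ratio with summed benefit terms and summed AR terms.

    Assumption: multiple benefits are combined additively, by analogy with
    the additive AR terms in the denominator.
    """
    numerator = sum(freq * weight for freq, weight in benefits) * severity_disease
    denominator = sum(freq * severity for freq, severity in adverse_reactions)
    if denominator <= 0:
        raise ValueError("at least one AR with positive frequency and severity is required")
    return numerator / denominator


# Hypothetical antibiotic: cures severe pneumonia in 90% of patients, with
# moderate diarrhea in 10% and severe colitis in 30% (demonstration weights).
ratio = composite_benefit_risk_ratio(
    benefits=[(0.9, 0.75)],
    severity_disease=0.501,
    adverse_reactions=[(0.1, 0.252), (0.3, 0.501)],
)
print(f"composite benefit-risk ratio: {ratio:.2f}")
```

In this hypothetical case the composite ratio is about 1.93, lower than either single-AR ratio because both ARs now contribute to the denominator.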
Additionally, there are some situations in which the drug has an AR that some participants may actually view as a benefit. For example, a drug for migraines may be found to have mild weight loss as an AR. However, some participants might consider mild weight loss to be a benefit rather than an AR. This approach is versatile enough to account for this type of participant subjectivity.
To be sure, there are certain limitations to this approach; among them, in its current format, it does not address the issue of uncertainty. For example, a Phase I study with a sample size of 30 subjects yields data on risks and benefits of an IP for migraine. Three of these subjects (10%) developed nausea felt to be secondary to the IP. A larger Phase III study with 500 subjects also yielded a 10% rate for nausea along with the same benefit profile. Although the results of the benefit-risk ratio will be exactly the same, the certainty associated with each study is markedly different. Uncertainty decreases as the amount of reliable data increases.
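One way to make this difference visible is to attach a confidence interval to each observed AR frequency. The sketch below uses the Wilson score interval, a standard method for binomial proportions. It is not part of the framework described here, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(events: int, n: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    p = events / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Same observed nausea rate (10%) in two studies of very different size.
for label, events, n in [("Phase I", 3, 30), ("Phase III", 50, 500)]:
    low, high = wilson_interval(events, n)
    print(f"{label}: {events}/{n} = {events / n:.0%}, "
          f"95% interval {low:.1%} to {high:.1%}")
```

For the 30-subject study the interval spans roughly 3% to 26%, whereas for the 500-subject study it narrows to roughly 8% to 13%, even though both point estimates are 10%.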
Additionally, this approach assumes that drugs are only used for disease states. There are, however, drugs (primarily those used for cosmetic purposes) that are not used to treat a disease but to improve the quality of life. The approach would have to be modified to evaluate these medications.
This approach utilizing well-established concepts in clinical research is not a definitive solution, but is intended to stimulate discussion regarding a viewpoint that has been prematurely dismissed.
References
1. Benefit-Risk Assessment in Drug Regulatory Decision-Making. PDUFA VI Plan (FY 2018–2022). Draft of March 2018. https://www.fda.gov/files/about%20fda/published/Benefit-Risk-Assessment-in-Drug-Regulatory-Decision-Making.pdf
2. Benefit-Risk Assessment for New Drug and Biological Products. Guidance for Industry, October 2023. https://www.fda.gov/media/152544/download
3. Waschbusch M, Rodriguez L, Brueckner A, Lee KJ, Li X, Mokliatchouk O, Tremmel L, Yuan SS. 2022. Global Landscape of Benefit-Risk Considerations for Medicinal Products: Current State and Future Directions. Pharmaceutical Medicine 36:201–13.
4. Walker S, McAuslane N, Liberti L, Leong J, Salek S. 2015. A Universal Framework for the Benefit-Risk Assessment of Medicines: Is This the Way Forward? Therapeutic Innovation & Regulatory Science 49(1):17–25.
5. Kurzinger M-L, Douarin L, Uzun I, El-Haddad C, Hurst W, Juhaeri J, Tcherny-Lessenot S. 2020. Structured benefit-risk evaluation for medicinal products: review of quantitative benefit-risk assessment findings. Therapeutic Advances in Drug Safety 11:2042098620976951.
6. Ioannidis JPA, Evans SJW, Gøtzsche PC, O’Neill RT, Altman DG, Schulz K, Moher D. 2004. Better Reporting of Harms in Randomized Trials: An Extension of the CONSORT Statement. Annals of Internal Medicine. http://annals.org/aim/fullarticle/717961/better-reporting-harms-randomized-trialsextension-consort-statement
7. U.S. Food and Drug Administration. Code of Federal Regulations Title 21. https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfcfr/CFRsearch.cfm?fr=101.93#:~:text=343(r)(6)%2C,nutrient%20deficiencies%20(e.g.%2C%20scurvy%2C
8. National Institutes of Health; National Cancer Institute; Division of Cancer Treatment and Diagnosis. Common Terminology Criteria for Adverse Events (CTCAE) v5.0. https://ctep.cancer.gov/protocoldevelopment/electronic_applications/docs/CTCAE_v5_Quick_Reference_8.5x11.pdf
Robert Jeanfreau, MD, CPI, (jeanfreaurobert228@gmail.com) is a Principal Investigator with Velocity Clinical Research.
Robyn Harrell, MS, is a Senior Data Scientist with Ontada.
Paul Pati, MSN, FNP-C, is a Family Nurse Practitioner and Sub-Investigator with Velocity Clinical Research.