How and Why Bayesian Statistics Are Revolutionizing Pharmaceutical Decision Making

Clinical Researcher—December 2021 (Volume 35, Issue 9)


Bruno Boulanger, PhD; Bradley P. Carlin, PhD


Over the past 20 years, there has been growing interest within the pharmaceutical industry in Bayesian statistics and how to apply this methodology toward reaching goals in the arenas of research, development, manufacturing, and health economics.

The Bayesian approach to pharmaceutical decision making started to gather greater momentum after the first Applied Bayesian Biostatistics conference in 2010, which brought together academicians, industry representatives, and regulatory authorities to discuss the practical implementation of Bayesian statistics in speeding up drug discovery, development, and approvals. Increasingly, pharmaceutical companies have been turning to Bayesian biostatisticians to apply probabilities to statistical problems to determine likely outcomes—in clinical trials, in product development, in manufacturing, in post-market surveillance, and in market access.

While regulatory authorities have been slower to adopt Bayesian methodologies, that is starting to change. The U.S. Food and Drug Administration (FDA), in particular, has embraced Bayesian statistics as a method for supporting clinical trials in medical devices, in adaptive clinical designs, and in rare diseases.

This paper explores the growing significance of Bayesian statistics in supporting decision making across the development and regulatory processes, and its potential to improve outcomes for the biopharmaceutical industry.

How an 18th-Century Methodology Is Gaining Traction Today

For more than half a century, traditional frequentist statistical methodology (where predictions are based on a fixed target of estimation) has been entrenched in clinical development and regulatory statistics. Yet all too often drugs fail late, even in confirmatory clinical trials, at enormous cost to companies and with ethical consequences for patients, suggesting some shortcomings in these traditional methods.{1} As pressure to reduce costs and improve regulatory decision making early in the process intensifies, companies have sought more efficient ways to analyze data and assess the safety and efficacy of drugs.

It turns out that one of the most effective tools for synthesizing clinical trial data is far older than even the clinical trial process itself: Bayesian statistics.

Bayes’ theorem was formulated by the Rev. Thomas Bayes, an 18th-century English mathematician, philosopher, and Nonconformist minister. However, it wasn’t until the 1990s, when advances in computing technology emerged, that its techniques could be usefully applied.

Interest within the pharmaceutical industry in applying Bayesian methods at various stages of research, development, manufacturing, and health economics has been growing for the past 20 years because these methods apply the logic of probability to statistical problems, based on observed data.{2}

Comparing Statistical Methods

Mathematical methods have long been used to assist with decision making in clinical research, with researchers often depending on the p-value, or observed significance level, to test whether a result is statistically significant. The point is to assess the results of a study against the null hypothesis, which states that there is no difference between the groups being compared. If the sample size is big enough, the distribution of the test statistic is roughly normal (bell-shaped) and can be used to compute the p-value. However, when the sample is small, that approximation breaks down and it becomes very difficult to produce a reliable inference.
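
As a rough illustration of that large-sample calculation, the sketch below computes a two-sided p-value for a difference in response rates using the normal approximation. The function name and the trial counts are hypothetical, chosen only to show how the same observed rates can look significant in a large sample and inconclusive in a small one.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value for H0: equal response rates (normal approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common rate assumed under the null
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts: 60% vs. 45% responders at two different sample sizes.
p_large = two_proportion_p_value(60, 100, 45, 100)
p_small = two_proportion_p_value(12, 20, 9, 20)
print(f"n=100 per arm: p = {p_large:.3f}")
print(f"n=20 per arm:  p = {p_small:.3f}")
```

With only 20 patients per arm the normal approximation itself is questionable, which is the limitation the paragraph above describes.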

For example, if researchers are gathering disease rates by county or state, it’s relatively easy to gather good estimates in urban areas, but in sparsely populated rural areas, the estimates are not necessarily reliable. Two breast cancer cases in one small area in a year may raise suggestions of a cluster because, based on traditional statistical methods, the resulting rate is far higher than expected; seeing zero cases a year later would be just as uninformative. To make sense of the data, statisticians have to smooth the spatial maps and produce a more accurate picture of those cancer rates.

Bayesian methods help to achieve this by borrowing strength from observations across similar but not identical bits of information; for example, cancer rates across the map in question. In Bayesian statistics, previous and related information is relevant. Past information—whether from previous trials, scientific literature, or real-world data—is considered as part of an ongoing stream of data, “in which inferences are being updated each time new data become available.”{3} This allows researchers to achieve direct probability statements about unknown information, rather than settling for approximations.
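
One simple way to picture this borrowing of strength is a Gamma-Poisson model in which each area's rate is pulled toward the overall rate, more strongly when the area has little data of its own. The sketch below is a deliberate simplification (real disease-mapping models are hierarchical and spatial); the counts, the function name, and the prior weight of 10,000 person-years are all assumptions made for illustration.

```python
def smoothed_rate(cases, population, overall_rate, prior_weight):
    """Posterior mean of a Poisson rate under a Gamma prior centered on overall_rate.

    The prior is Gamma(shape=overall_rate * prior_weight, rate=prior_weight),
    so prior_weight acts like a number of "pseudo" person-years of prior data.
    """
    return (overall_rate * prior_weight + cases) / (prior_weight + population)


# Hypothetical one-year counts: (cases, population at risk).
areas = {"urban": (130, 100_000), "rural": (2, 500)}
total_cases = sum(c for c, _ in areas.values())
total_pop = sum(n for _, n in areas.values())
overall = total_cases / total_pop
PRIOR_WEIGHT = 10_000  # assumed strength of borrowing, not an estimated value

for name, (cases, pop) in areas.items():
    raw = cases / pop
    smooth = smoothed_rate(cases, pop, overall, PRIOR_WEIGHT)
    print(f"{name}: raw {raw * 1e5:.0f}, smoothed {smooth * 1e5:.0f} per 100,000")
```

The urban estimate barely moves because its own data dominate, while the rural estimate is drawn most of the way back toward the overall rate. A rural count of zero the following year would be smoothed upward in the same way rather than reported as a rate of zero.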

Why Bayesian Makes Sense in the Pharmaceutical World

What, though, does all this mean for clinical trials and drug development? As everyone in the industry knows all too well, the drug approval process is costly, complex, and time-consuming. During clinical trials, companies need to know whether a drug under development is safe and effective, as well as its likelihood of success in the marketplace. This is where Bayesian methodology comes to the fore. It addresses questions of probability directly: What’s the probability that this new drug is safe and effective? What is the probability our current drug development program will be successful?

In most development programs, companies already have some information about a molecule or therapy from previous studies, either conducted by that company or by others. Rather than start from scratch, Bayesian statistics allow researchers to leverage this pre-existing information—including from scientific literature—to help determine the probability of success.

When a trial is conducted using Bayesian principles, initial estimates of probabilities are attributed to unknown quantities (the likelihood of a serious event, the likelihood the product will be effective for a given set of patients, etc.) using existing information (e.g., previous clinical trials) or expert opinion. These probabilities together constitute the prior distribution for the quantities of interest.{3}
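
For a single response rate, the textbook version of this idea is the Beta-binomial model: a Beta prior summarizes earlier evidence, and new trial data update it to a Beta posterior. The sketch below assumes that simple model; the prior (equivalent to 8 responders among 20 earlier patients), the new trial counts, and the 30% threshold are hypothetical.

```python
import random


def update_beta(prior_a, prior_b, responders, n):
    """Conjugate update: Beta prior + binomial data -> Beta posterior."""
    return prior_a + responders, prior_b + (n - responders)


def prob_rate_exceeds(a, b, threshold, draws=100_000, seed=1):
    """Monte Carlo estimate of P(rate > threshold) under a Beta(a, b) distribution."""
    rng = random.Random(seed)
    hits = sum(rng.betavariate(a, b) > threshold for _ in range(draws))
    return hits / draws


# Hypothetical prior worth about 20 patients (8 responders, 12 non-responders).
prior_a, prior_b = 8, 12
# Hypothetical new trial: 18 responders among 40 patients.
post_a, post_b = update_beta(prior_a, prior_b, responders=18, n=40)

posterior_mean = post_a / (post_a + post_b)
p_above = prob_rate_exceeds(post_a, post_b, threshold=0.30)
print(f"Posterior: Beta({post_a}, {post_b}), mean {posterior_mean:.3f}")
print(f"P(response rate > 30% | data) is roughly {p_above:.2f}")
```

The last line is the kind of direct probability statement about the treatment that a p-value does not provide.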

As long as those conducting the study construct the prior distribution in an unbiased way (i.e., incorporating all existing knowledge, not merely that which is favorable to the company’s position), leveraging this information to support a study can dramatically improve study accuracy and efficiency. It is also economically and ethically preferable to limit the number of in-human studies conducted whenever possible.

Regulators can sometimes be somewhat more rigorous when it comes to Bayesian analyses, because they are less familiar with them than with the traditional p-value approach. However, this tends to encourage careful, less automatic analyses that are typically very robust and that more formally consider the impact of multiple different models and assumptions.

That is not to suggest that Bayesian methodologies are a replacement for p-values, which answer a fundamentally different question than Bayesian probabilities. “The p-value quantifies the discrepancy between the data and a null hypothesis of interest, usually the assumption of no difference or no effect. A Bayesian approach allows the calibration of p-values by transforming them to direct measures of the evidence against the null hypothesis, so-called Bayes factors.”{4}
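
One well-known calibration of this kind is the bound -e p ln(p), valid for p below 1/e, which gives the smallest Bayes factor in favor of the null hypothesis that a given p-value can correspond to under fairly general conditions. The sketch below applies that bound; it is offered as an illustration and is not necessarily the specific calibration the quoted authors have in mind. The function names and the 50:50 prior odds are assumptions.

```python
from math import e, log


def min_bayes_factor(p):
    """Lower bound -e * p * ln(p) on the Bayes factor for H0 versus H1 (p < 1/e)."""
    if not 0 < p < 1 / e:
        raise ValueError("the bound applies only for 0 < p < 1/e")
    return -e * p * log(p)


def min_posterior_prob_null(p, prior_prob_null=0.5):
    """Smallest posterior probability of H0 implied by the bound."""
    prior_odds = prior_prob_null / (1 - prior_prob_null)
    post_odds = prior_odds * min_bayes_factor(p)
    return post_odds / (1 + post_odds)


for p in (0.05, 0.01, 0.001):
    print(p, round(min_bayes_factor(p), 3), round(min_posterior_prob_null(p), 3))
```

Under this bound, a p-value of 0.05 combined with 50:50 prior odds still leaves the null hypothesis with a posterior probability of at least about 0.29, which is far weaker evidence than "1 in 20" suggests.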

For example, in a genomics experiment, researchers will put some of the drug or molecule into cells to assess the expression of different genes. The question in this case will be, is the inhibition or excitation of the genes likely linked to the treatment? Here, researchers may legitimately be interested in the p-value; they want to know if the data they see are inconsistent with the hypothesis of no differential inhibition or excitation across genes. In such cases, p-values provide fairly straightforward yes-no answers because the very question is about the observed data.

However, p-values have a number of problems that limit their effectiveness even when used correctly. This was emphasized a few years ago by the American Statistical Association (ASA), which released an official “Statement on Statistical Significance and P-Values,”{5} and later held two conferences devoted to an investigation of their problems and potential remedies—many of them Bayesian. The ASA statement emphasizes various misconceptions about p-values, including the facts that they are not the probability that the null hypothesis is true, or the probability that the data were obtained “by chance alone” (two very common though falsely held beliefs). P-values do not measure the size or importance of an observed effect and can only provide evidence against a hypothesis of no difference, not evidence for it. As such, p-values are not useful in proving the equivalence of two treatments.

When conducting a clinical trial or animal study to evaluate the efficacy of a treatment, the question is not about the data itself, but rather about the treatment: Is the treatment effective? Is it safe? How likely is it that a trend emerging in the data will continue in the future? This is where Bayesian methodology has even greater usefulness. To predict a future situation, Bayesian statistics enable researchers to determine the probability of something occurring by first quantifying current uncertainty, and then propagating that into the future to get predictive probabilities. The question then is about the benefit for future patients in future trials or in the real world (i.e., not for the patients included in the past trials).
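
To make the idea of propagating current uncertainty into the future concrete, the sketch below computes a predictive probability for a future trial under a simple Beta-binomial model. The Beta(26, 34) posterior, the future trial size of 100, and the success criterion of at least 40 responders are hypothetical, and the function names are illustrative.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def predictive_prob_at_least(k, n, a, b):
    """P(at least k responders among n future patients) when the rate ~ Beta(a, b)."""
    total = 0.0
    for y in range(k, n + 1):
        # Beta-binomial probability of exactly y responders
        total += comb(n, y) * exp(log_beta(a + y, b + n - y) - log_beta(a, b))
    return total


def plug_in_prob_at_least(k, n, rate):
    """Same probability if the rate were known exactly (ordinary binomial)."""
    return sum(comb(n, y) * rate**y * (1 - rate) ** (n - y) for y in range(k, n + 1))


a, b = 26, 34  # hypothetical current posterior for the response rate
predictive = predictive_prob_at_least(40, 100, a, b)
plug_in = plug_in_prob_at_least(40, 100, a / (a + b))
print(f"Predictive probability of >= 40 responders: {predictive:.2f}")
print(f"Plug-in probability ignoring uncertainty:   {plug_in:.2f}")
```

The predictive probability comes out lower than the plug-in figure because it accounts for what is still unknown about the true response rate, not just for sampling variation in the future trial.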

This current and future uncertainty is common in chemistry, manufacturing, and controls (CMC) applications where companies need to be able to quantify what they know now about product characteristics or manufacturing processes and combine that with additional uncertainty about what will happen in the future. It allows researchers to address the real questions of interest: Is a process comparable to a previous one? What is the probability that a development approach is on target given the observed data?

Bayesian statistics combine all of these complicated, high-dimensional data and, with the help of 21st-century computing power and expertise in mathematical probability theory, produce models that predict a likely future outcome.

Supporting Decision Making in a Competitive Market

The enormous cost of bringing drug products to market, combined with the shift away from blockbuster product development and toward personalized medicines, often targeting rare diseases, means the paradigm for product development is changing. The past practice of using trial and error to make decisions about clinical trials, manufacturing processes, regulatory practices, or any other part of the pharmaceutical value chain is proving to be highly ineffective.

Regulatory leaders also recognize the need for new methodologies to support clinical trial design. For example, the FDA has issued guidance for industry on Complex Innovative Trial Designs (CIDs) for Drugs and Biological Products, providing advice on interacting with the agency in the development and regulatory review of such products.{6} As the guidance notes: “Bayesian approaches may be well-suited for some CIDs intended to provide substantial evidence of effectiveness because they can provide flexibility in the design and analysis of a trial, particularly when complex adaptations and predictive models are used. In addition, Bayesian inference may be appropriate in settings where it is advantageous to systematically combine multiple sources of evidence, such as extrapolation of adult data to pediatric populations, or to borrow control data from Phase II trials to augment a Phase III trial.”
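
One common way to borrow historical control data without letting it dominate is to discount it, so that each historical patient counts as only a fraction of a current one (the idea behind the so-called power prior). The sketch below assumes a binary endpoint and a flat Beta(1, 1) starting prior; the Phase II and Phase III counts and the discount values are hypothetical, and in practice the degree of borrowing would need to be justified and agreed with regulators.

```python
def borrow_controls(hist_events, hist_n, cur_events, cur_n, discount):
    """Beta posterior for a control event rate with discounted historical data.

    Starts from a flat Beta(1, 1) prior; each historical patient counts as
    `discount` (between 0 and 1) of a current patient.
    """
    if not 0 <= discount <= 1:
        raise ValueError("discount must be between 0 and 1")
    a = 1 + discount * hist_events + cur_events
    b = 1 + discount * (hist_n - hist_events) + (cur_n - cur_events)
    return a, b


# Hypothetical data: Phase II controls 30/100, Phase III controls 20/60.
HIST_EVENTS, HIST_N = 30, 100
CUR_EVENTS, CUR_N = 20, 60

for discount in (0.0, 0.5, 1.0):
    a, b = borrow_controls(HIST_EVENTS, HIST_N, CUR_EVENTS, CUR_N, discount)
    borrowed = discount * HIST_N  # effective number of historical patients used
    print(f"discount={discount}: Beta({a:g}, {b:g}), "
          f"mean {a / (a + b):.3f}, borrowed patients {borrowed:g}")
```

A discount of 0 ignores the Phase II controls entirely, a discount of 1 pools them fully, and intermediate values trade precision against the risk that the historical controls differ from the current ones.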

Sometimes a drug might work for one patient population but fail with another, and there may be multiple reasons for that, including some tied to patients’ behaviors. As an example, at the outset of the AIDS epidemic in the 1980s, the majority of clinical trials were conducted on predominantly gay men from San Francisco and other diverse, urban areas. These men were largely compliant in their trial behavior: they stayed on their assigned treatments and dramatically reduced their risky behaviors. The result was that these early trials were able to show that the drugs worked. Later in the epidemic, however, when different populations started getting HIV (for example, IV drug users from economically disadvantaged neighborhoods), these groups were sometimes less able to comply with rigid trial protocols. The result was that drugs already approved by regulators for treatment of HIV did not work in the “real worlds” of these later patients. An effective statistical approach must adjust for these differences.

What Bayesian inference allows researchers to do, rather than keep conducting randomized trials, is adjust for individual characteristics such as where a patient lives, how old they are, their gender, their doctor, their socioeconomic status, and their drug use. By adjusting for those real-world confounding variables, Bayesian inference enables an innovative approach to data analysis with a focus on solutions.

Most important is that by leveraging prior knowledge (from previous clinical trials, scientific literature, or real-world data), Bayesian statistics allow researchers to reduce the number and size of clinical trials and help to determine the probability of success before entering Phase III trials. They do this by injecting flexibility into the way the trial is designed, to ensure projections aren’t overly optimistic, thereby accounting for the probabilities of unknown issues occurring.

Changing the paradigm of clinical trials is not only more practical and financially beneficial, but also potentially more ethical, particularly when conducting studies into treatments for rare or pediatric diseases. Not only are researchers working with a much smaller sample size of patients, but they are also working with very vulnerable patients. In some diseases, for example, life expectancy of the patient may be very short, and including a randomized parallel control study arm would strike many as unethical. Instead, by leveraging information from past studies and the literature, researchers can eliminate or at least dramatically reduce the need for a control group and ensure new treatments are tested on the patients who really need them. Bayesian statistics support that cumulative learning process by connecting the dots across different studies to support decision making in a formal way.

Bayesian methodology can also help companies make economic decisions, such as whether to build a manufacturing line for a drug in development. This is a difficult decision: If the company decides to invest in building its facility early and a drug fails in clinical trials, that investment is wasted. On the other hand, building a plant can take several years, and if the company waits for regulatory approval to begin building the facility, it will be years before that company is able to sell the approved product. Using Bayesian statistics, it is possible to compute the future probability of success during the Phase III trial and make a risk-based economic decision from that assessment.
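
A stylized version of that trade-off can be written as an expected-value comparison once a probability of Phase III success is available. Everything in the sketch below is hypothetical: the cost and value figures (in arbitrary monetary units), the function names, and the simplifying assumption that the only choices are to build now or to build after approval.

```python
def expected_values(p_success, plant_cost, value_if_ready, value_if_delayed):
    """Expected net value of building the plant now versus waiting for approval."""
    build_now = p_success * value_if_ready - plant_cost  # cost is sunk even on failure
    wait = p_success * (value_if_delayed - plant_cost)   # build only after success
    return build_now, wait


def break_even_probability(plant_cost, value_if_ready, value_if_delayed):
    """Probability of success above which building early has the higher expected value."""
    return plant_cost / (value_if_ready - value_if_delayed + plant_cost)


# Hypothetical figures: plant costs 200; the product is worth 1,000 if the
# plant is ready at approval and 700 if launch is delayed by construction.
COST, READY, DELAYED = 200, 1000, 700

threshold = break_even_probability(COST, READY, DELAYED)
print(f"Break-even probability of success: {threshold:.2f}")
for p in (0.30, 0.68):
    now, wait = expected_values(p, COST, READY, DELAYED)
    choice = "build now" if now > wait else "wait"
    print(f"p={p:.2f}: build now {now:.0f}, wait {wait:.0f} -> {choice}")
```

With these made-up figures the break-even probability is 0.4, so a predictive probability of success above that level would favor building early. A real analysis would also weigh risk tolerance, timing, and discounting.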

Similarly, in portfolio management, Bayesian methodology can help companies to compute the probability of success of each of their compounds and thus decide where to invest future resources. The point is that the methodology can assist companies with making smart investment decisions through its ability to estimate probabilities of future success.

Into the Future with Bayes

The pressing needs of both companies and patients to improve the framework for making decisions have led many biopharmaceutical companies to seek out statistical experts specializing in Bayesian methodology. Many recognize the potential Bayesian statistics present to address complex problems that arise across the product lifecycle, from estimating the probability of success of clinical trials, to managing CMC and supply chains, to determining the best course of action with the product portfolio.

More recently, the Bayesian momentum has gathered pace. In 2010, the first Applied Bayesian Biostatistics conference was held with a goal of stimulating the practical implementation of Bayesian statistics for the purpose of accelerating the discovery and delivery of new cures to patients.{2} That conference and others brought together a wealth of insights and knowledge that formed the basis for an award-winning book offering an overview of Bayesian methods applied to nearly all stages of drug research and development. The book, entitled Bayesian Methods in Pharmaceutical Research, was announced as the 2021 winner of Best New Bayesian Statistics Book, Best New Biostatistics Books, and one of the Best Statistics eBooks of all time by BookAuthority.

Insights from the book and from experts in the field of Bayesian statistics and their applications in the pharmaceutical industry will play an important role in improving understanding of ways to apply statistical methods to pharmaceutical problem solving.


  1. Ruberg SJ, et al. 2019. Inference and Decision Making for 21st-Century Drug Development and Approval. The American Statistician 73:319–27.
  2. Lesaffre E, Baio G, Boulanger B. 2020. Bayesian Methods in Pharmaceutical Research, CRC Press/Taylor and Francis Group.
  3. U.S. Food and Drug Administration. 2010. Guidance for the Use of Bayesian Statistics in Medical Device Clinical Trials.
  4. Held L, Ott M. 2017. On p-Values and Bayes Factors. Annual Review of Statistics and Its Application 5:393–419.
  5. American Statistical Association releases statement on statistical significance and p-values. 2016. ASA News.
  6. U.S. Food and Drug Administration. 2020. Interacting with the FDA on Complex Innovative Clinical Trial Designs for Drugs and Biological Products, Guidance for Industry.

Bruno Boulanger, PhD, is Global Head of Statistics and Data Science at PharmaLex and founder of Arlenda, which merged with PharmaLex in 2018.

Bradley P. Carlin, PhD, is Senior Advisor for Data Science and Statistics at PharmaLex and former Head of the Division of Biostatistics at the University of Minnesota School of Public Health. He is also Founder and President of Counterpoint Statistical Consulting, LLC.