P-value pitfalls in clinical trials


By Kim Hacquoil, Exploristics CDSO

It is not possible for clinical trials to test investigational treatments on all potential patients, so trials are conducted on a sample of participants drawn from the population of interest. The results of a trial are therefore estimates of what might happen if the treatment were given to the entire population. The inferences and statistics from the sample describe what might happen in the full population and are key to making decisions about possible treatments.


What is a p-value?

There are lots of different definitions out there to describe what a p-value is. Many of them are quite technical and talk about null hypotheses, test statistics, probabilities and other statistical jargon. A layman’s definition I like is that a p-value describes how likely it is that the results seen could have happened anyway, if the effect you are testing for does not exist.
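One way to make the “could the results have happened anyway” idea concrete is a permutation test: shuffle the treatment labels many times and count how often a difference at least as large as the observed one arises by chance alone. The sketch below uses only the Python standard library, and the outcome scores are made up purely for illustration:

```python
import random
import statistics

def permutation_p_value(treatment, control, n_perms=10_000, seed=1):
    """Two-sided permutation test for a difference in means.

    The p-value is the fraction of label shufflings whose mean
    difference is at least as extreme as the one observed, i.e.
    how often results like ours arise when no effect exists.
    """
    rng = random.Random(seed)
    observed = statistics.mean(treatment) - statistics.mean(control)
    pooled = list(treatment) + list(control)
    n_t = len(treatment)
    extreme = 0
    for _ in range(n_perms):
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[:n_t]) - statistics.mean(pooled[n_t:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return extreme / n_perms

# Illustrative (made-up) outcome scores from a small two-arm trial:
treatment = [5.1, 6.2, 5.8, 7.0, 6.5, 5.9, 6.8, 6.1]
control   = [4.8, 5.2, 5.5, 4.9, 5.7, 5.0, 5.4, 5.3]
p = permutation_p_value(treatment, control)
```

A small p here means very few random relabellings reproduce a gap as large as the one seen, which is exactly the layman’s reading above.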


What does a p-value tell us?

A p-value is used to help us decide whether or not to reject the null hypothesis*. The smaller the p-value, the stronger the evidence against the null hypothesis and the more likely you are to reject it. In clinical trials the null hypothesis is usually the assumption that there is no difference between the investigational and control treatments. So, if you can reject the null hypothesis then you have confidence in accepting the alternative hypothesis (that there is a difference between the groups).

* Null Hypothesis – the hypothesis that there is no difference between the specified populations of interest, with any observed difference due to sampling or experimental error.


What does a p-value not tell us?

A p-value does not tell us the clinical relevance or importance of the observed treatment effects. Specifically, a p-value does not provide details about the magnitude of effect. P-values are one component to consider when interpreting study results, with much deeper appreciation of results being available when the treatment effects and associated confidence intervals are also taken into consideration.
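As a minimal sketch of why the effect estimate and confidence interval matter, the function below reports the mean difference between two groups together with an approximate 95% confidence interval (normal approximation; the data are the same made-up illustrative scores, not real trial results):

```python
import math
import statistics

def diff_and_ci(a, b, z=1.96):
    """Mean difference between groups a and b with an approximate
    95% confidence interval (normal approximation using sample
    variances; assumes groups large enough for it to hold)."""
    diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(statistics.variance(a) / len(a) +
                   statistics.variance(b) / len(b))
    return diff, (diff - z * se, diff + z * se)

# Illustrative (made-up) outcome scores:
treatment = [5.1, 6.2, 5.8, 7.0, 6.5, 5.9, 6.8, 6.1]
control   = [4.8, 5.2, 5.5, 4.9, 5.7, 5.0, 5.4, 5.3]
diff, (lo, hi) = diff_and_ci(treatment, control)
```

Unlike a bare p-value, the interval conveys both the magnitude of the effect and the precision with which it has been estimated, which is what clinical relevance judgements need.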


Common misconceptions about p-values

  • A p-value is not the probability that a treatment doesn’t work (i.e., not the probability that the null hypothesis is true).
  • The commonly used 5% threshold to define “significance” is not a magic “all-or-nothing” level – it’s an arbitrary one and should be used with care.
  • Testing multiple comparisons within a study, each at the 5% level, does not control the overall false positive rate – in fact it leads to a >5% chance of incorrectly finding at least one “significant” result.
  • A low p-value does not necessarily mean the results are “positive”. With large sample sizes small effect sizes can be statistically significant but not clinically relevant.
  • P-values are reliant on the validity of the statistical test performed – if your assumptions are not relevant or valid then the resulting p-value may not be a good measure to use.
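The multiplicity point above is easy to quantify: with k independent comparisons each tested at the 5% level, and no true effects at all, the chance of at least one false positive is 1 − (1 − 0.05)^k. A quick calculation:

```python
def familywise_error(k, alpha=0.05):
    """Probability of at least one false positive across k
    independent tests, each at level alpha, when no true
    effects exist."""
    return 1 - (1 - alpha) ** k

# With 1, 5, 10 and 20 comparisons the chance of a spurious
# "significant" finding climbs from 5% to roughly 23%, 40% and 64%:
rates = {k: familywise_error(k) for k in (1, 5, 10, 20)}
```

This is why multiplicity adjustments (or pre-specified primary endpoints) matter whenever several comparisons share one study.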


Are p-values useful for decision-making in clinical trials?

Traditionally p-values <0.05 get people excited and get data and results published, but there is so much more to decision-making in clinical trials. Attempting to inform clinical practice patterns through interpretation of p-values alone is overly simplistic, and fraught with the potential for misleading conclusions.

Particularly in the hypothesis-generating stages of clinical development (Phase 2 and earlier), it’s key that project teams think more broadly. At this stage, understanding the potential for making incorrect decisions is crucial when designing studies. Understanding the operating characteristics of the decision rules and study designs across different assumed “truths” can reveal where the best balance of trade-offs lies.

Even when looking at Phase 3 pivotal studies, where the spotlight is more firmly on statistical significance, a recent review showed that between 2018 and 2021 approximately 10% of FDA approvals were based on pivotal studies that did not reach statistical significance in their primary endpoint. This indicates that regulators view results as a whole and do not base approval solely on the p-value.


Simulation can help in understanding the trade-offs involved in designing a study

Simulation is an efficient and effective approach to evaluating different “truths” or assumptions and should be used to fully explore clinical trial design options before starting a study. It can show decision-makers the trade-offs between correct and incorrect go and no-go decisions. These trade-offs will always exist and need to be understood before starting a study. Simulation gives project teams a powerful tool with which to quantify and assess their impact early, reaching beyond simple p-values to achieve a more holistic and successful approach to clinical trial design.
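A minimal sketch of this idea is below. It simulates many trials under an assumed true treatment effect and estimates how often a simple decision rule (“go” if the observed mean difference exceeds a threshold) fires. All numbers – sample size, standard deviation, threshold and effect sizes – are illustrative assumptions, not a recommended design:

```python
import random
import statistics

def simulate_go_probability(true_effect, n_per_arm=50, sd=1.0,
                            go_threshold=0.3, n_trials=5000, seed=7):
    """Estimate the probability that a 'go' decision is made under
    a given assumed true effect, for a rule that declares 'go' when
    the observed mean difference exceeds go_threshold."""
    rng = random.Random(seed)
    gos = 0
    for _ in range(n_trials):
        t = [rng.gauss(true_effect, sd) for _ in range(n_per_arm)]
        c = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
        if statistics.mean(t) - statistics.mean(c) > go_threshold:
            gos += 1
    return gos / n_trials

# Operating characteristics across two assumed "truths":
false_go   = simulate_go_probability(true_effect=0.0)  # go when no effect
correct_go = simulate_go_probability(true_effect=0.5)  # go when effect real
```

Running the rule under several assumed truths like this exposes the trade-off directly: tightening the threshold lowers the false-go rate but also lowers the correct-go rate, and simulation quantifies both before the study starts.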


Read more:

Lightening the load for patients in clinical trials

Plague, problems and profiting from failure

One size does not fit all

I’ve got the power or have I?

Statistical Consulting Services

Hear more:

The power of simulations for designing clinical studies and beyond

People always, patients sometimes

Simulations your most powerful study design tool

Watch more:

I just need a simple sample size

I’ve got the power or have I?

Can pharma learn from development in other industries?