
Type I error rates of multi-arm multi-stage clinical trials: strong control and impact of intermediate outcomes

Overview of attention for article published in Trials, July 2016

About this Attention Score

  • In the top 25% of all research outputs scored by Altmetric
  • Good Attention Score compared to outputs of the same age (76th percentile)

Mentioned by

12 tweeters


13 Dimensions

Readers on

25 Mendeley
Published in
Trials, July 2016
DOI 10.1186/s13063-016-1382-5

Daniel J. Bratton, Mahesh K. B. Parmar, Patrick P. J. Phillips, Babak Choodari-Oskooei


The multi-arm multi-stage (MAMS) design described by Royston et al. [Stat Med. 2003;22(14):2239-56 and Trials. 2011;12:81] can accelerate treatment evaluation by comparing multiple treatments with a control in a single trial and stopping recruitment to arms not showing sufficient promise during the course of the study. To increase efficiency further, interim assessments can be based on an intermediate outcome (I) that is observed earlier than the definitive outcome (D) of the study. Two measures of type I error rate are often of interest in a MAMS trial. Pairwise type I error rate (PWER) is the probability of recommending an ineffective treatment at the end of the study regardless of other experimental arms in the trial. Familywise type I error rate (FWER) is the probability of recommending at least one ineffective treatment and is often of greater interest in a study with more than one experimental arm.

We demonstrate how to calculate the PWER and FWER when the I and D outcomes in a MAMS design differ. We explore how each measure varies with respect to the underlying treatment effect on I and show how to control the type I error rate under any scenario. We conclude by applying the methods to estimate the maximum type I error rate of an ongoing MAMS study and show how the design might have looked had it controlled the FWER under any scenario.

The PWER and FWER converge to their maximum values as the effectiveness of the experimental arms on I increases. We show that both measures can be controlled under any scenario by setting the pairwise significance level in the final stage of the study to the target level. In an example, controlling the FWER is shown to increase considerably the size of the trial although it remains substantially more efficient than evaluating each new treatment in separate trials.

The proposed methods allow the PWER and FWER to be controlled in various MAMS designs, potentially increasing the uptake of the MAMS design in practice. The methods are also applicable in cases where the I and D outcomes are identical.
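
The behaviour of the PWER and FWER described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' calculation method. It is a simplified illustration under assumptions that are not in the source: two stages only, normally distributed test statistics where larger values favour the experimental arm, a fixed correlation `rho` between the I and D statistics within an arm, equal allocation with a shared control arm, no early stopping for efficacy, and every experimental arm being ineffective on D. The function name `simulate_error_rates` and all of its parameters are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_error_rates(n_arms, alpha_stage1, alpha_final, effect_on_i,
                         rho=0.5, n_sims=20000, seed=1):
    """Monte Carlo estimate of (PWER, FWER) for a simplified two-stage design.

    Assumptions (illustrative only): every experimental arm is null on D,
    effect_on_i is the standardized effect on I shared by all arms, and
    rho is the within-arm correlation between the I and D statistics.
    """
    rng = random.Random(seed)
    crit_stage1 = NormalDist().inv_cdf(1 - alpha_stage1)  # interim threshold on I
    crit_final = NormalDist().inv_cdf(1 - alpha_final)    # final threshold on D
    resid = (1 - rho ** 2) ** 0.5
    root2 = 2 ** 0.5
    pairwise_hits = 0
    familywise_hits = 0
    for _ in range(n_sims):
        # Index 0 is the shared control arm; 1..n_arms are experimental arms.
        i_vals = [rng.gauss(0, 1) for _ in range(n_arms + 1)]
        d_vals = [rho * i + resid * rng.gauss(0, 1) for i in i_vals]
        any_recommended = False
        for k in range(1, n_arms + 1):
            # Comparisons with a shared control are correlated across arms.
            z_i = (i_vals[k] - i_vals[0]) / root2 + effect_on_i
            z_d = (d_vals[k] - d_vals[0]) / root2  # null effect on D
            # An arm is recommended only if it passes the interim look on I
            # and is then significant on D at the final stage.
            recommended = z_i > crit_stage1 and z_d > crit_final
            if k == 1 and recommended:
                pairwise_hits += 1
            any_recommended = any_recommended or recommended
        familywise_hits += any_recommended
    return pairwise_hits / n_sims, familywise_hits / n_sims


if __name__ == "__main__":
    for effect in (0.0, 1.0, 3.0, 10.0):
        pwer, fwer = simulate_error_rates(3, 0.2, 0.025, effect)
        print(f"effect on I = {effect:4.1f}  PWER ~ {pwer:.4f}  FWER ~ {fwer:.4f}")
```

In this sketch, as `effect_on_i` grows, nearly every arm passes the interim look, so the estimated PWER approaches the final-stage level `alpha_final`. That is the sense in which setting the final-stage pairwise significance level to the target caps the PWER. The FWER is at least as large as the PWER because it counts a recommendation of any of the ineffective arms.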

Twitter Demographics

The data shown below were collected from the profiles of 12 tweeters who shared this research output.

Mendeley readers

The data shown below were compiled from readership statistics for 25 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
Unknown 25 100%

Demographic breakdown

Readers by professional status Count As %
Researcher 7 28%
Student > Ph. D. Student 6 24%
Other 3 12%
Student > Bachelor 2 8%
Student > Master 2 8%
Other 2 8%
Unknown 3 12%
Readers by discipline Count As %
Medicine and Dentistry 7 28%
Mathematics 4 16%
Nursing and Health Professions 2 8%
Agricultural and Biological Sciences 2 8%
Biochemistry, Genetics and Molecular Biology 1 4%
Other 5 20%
Unknown 4 16%

Attention Score in Context

This research output has an Altmetric Attention Score of 7. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 18 August 2020.
Altmetric has tracked 18,687,462 research outputs across all sources so far. Compared to these, this one has done well and is in the 80th percentile: it's in the top 25% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 4,834 research outputs from this source. They typically receive more attention than average, with a mean Attention Score of 7.8. This one has gotten more attention than average, scoring higher than 70% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 269,380 tracked outputs that were published within six weeks on either side of this one in any source. This one has done well, scoring higher than 76% of its contemporaries.
We're also able to compare this research output to 1 other from the same source that was published within six weeks on either side of this one. This one has scored higher than it.