Examples in this article were generated with R 4.0.5 by the package PowerTOST.^{1}
More examples are given in the respective vignette.^{2} See also the README on GitHub for an overview and the online manual^{3} for details and a collection of other articles.
What is the purpose of a power analysis?
Since it is unlikely that we will observe exactly our assumed values in the planned study, in a power analysis we explore the impact of potential deviations on power.
A basic knowledge of R is required. To run the scripts, at least version 1.4.3 (2016-11-01) of PowerTOST is suggested. Any version of R would likely do, though the current release of PowerTOST was only tested with R version 3.6.3 (2020-02-29) and later.
Note that in all functions of PowerTOST the arguments (say, the assumed T/R-ratio theta0, the BE-limits (theta1, theta2), the assumed coefficient of variation CV, etc.) have to be given as ratios and not in percent.
Sample sizes are given for equally sized groups (parallel design) and for balanced sequences (crossovers, replicate designs). Furthermore, the estimated sample size is the total number of subjects, which is always an even number (parallel design) or a multiple of the number of sequences (crossovers, replicate designs) – not subjects per group or sequence like in some other software packages.
In order to get prospective power (and hence, a sample size), we need five values:
1. the nominal level of the test alpha,
2. the BE-limits (theta1, theta2),
3. the target (desired) power,
4. the assumed T/R-ratio (theta0), and
5. the assumed CV.
1 – 2 are fixed by the agency, 3 is set by the sponsor (commonly to 0.80 – 0.90), and 4 – 5 are just (uncertain!) assumptions.
Since it is extremely unlikely that all assumptions will be exactly realized in a particular study, it is worthwhile to prospectively explore the impact of potential deviations from assumptions on power.
We don’t have to be concerned about values which are ‘better’ (i.e., a lower CV, and/or a T/R-ratio closer to unity, and/or a lower than anticipated dropout-rate) because we will gain power. On the other hand, any change towards the ‘worse’ will negatively impact power.
The sample size cannot be directly estimated, only power calculated for an already given sample size.
What we can do: Estimate the sample size based on the conditions given above and vary our assumptions (4 – 5) accordingly. Furthermore, we can decrease the sample size to explore the impact of dropouts on power.
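The dropout part can be sketched directly. A minimal sketch (assuming the 2×2×2 conditions used later in this article, a CV of 0.25 and a theta0 of 0.95): estimate the sample size for the assumed values, then recalculate power for progressively fewer subjects.

```r
library(PowerTOST)
# estimate the sample size for the assumed values
n <- sampleN.TOST(CV = 0.25, theta0 = 0.95, targetpower = 0.80,
                  design = "2x2", print = FALSE)[["Sample size"]]
# recalculate power for fewer subjects to mimic dropouts
sapply(n - seq(0, 6, 2), function(n.)
       power.TOST(CV = 0.25, theta0 = 0.95, design = "2x2", n = n.))
```

Power erodes only slowly with a few dropouts, which is why they are generally the least concern.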
Let’s start with PowerTOST.
library(PowerTOST) # attach it to run the examples
Although we could do that ‘by hand’, there are three functions supporting power analyses: pa.ABE() for conventional Average Bioequivalence (ABE), pa.scABE() for Scaled Average Bioequivalence (ABEL, RSABE), and pa.NTIDFDA() for the FDA’s RSABE for NTIDs.
Note that the functions support only homoscedasticity (CV_{wT} = CV_{wR}).
The defaults common to all functions are:
Argument     Default  Meaning
alpha        0.05     Nominal level of the test.
targetpower  0.80     Target (desired) power.
minpower     0.70     Minimum acceptable power.
theta1       0.80     Lower BE-limit in ABE and lower PE-constraint in Scaled Average Bioequivalence.
theta2       1.25     Upper BE-limit in ABE and upper PE-constraint in Scaled Average Bioequivalence.
The default of the assumed T/R-ratio theta0 is 0.95 in pa.ABE(), 0.90 in pa.scABE(), and 0.975 in pa.NTIDFDA(). In pa.ABE() any design can be specified (defaults to "2x2"). In pa.scABE() the default design is "2x3x3" and regulator = "EMA". In pa.NTIDFDA() only a 2-sequence 4-period full replicate design is implemented in accordance with the FDA’s guidance.^{4}
Let’s explore in the following the examples given in other articles.
We assume a CV of 0.40, a T/R-ratio of 0.95, target a power of 0.80 (see also the example in the corresponding article about the parallel design), and keep the default minimum power 0.70.
CV <- 0.40
x  <- pa.ABE(CV = CV, design = "parallel") # assign to an object
dev.new(width = 6.16, height = 6.7)
op <- par(no.readonly = TRUE)
plot(x, pct = FALSE, ratiolabel = "theta0")
par(op)
Such patterns are typical for ABE. Note that the x-axes of the second and third panels are reversed (decreasing).
The most sensitive parameter is the T/R-ratio \(\small{\theta_0}\), which can decrease from the assumed 0.95 to 0.9276 (maximum relative deviation –2.36%) before we reach our minimum acceptable power of 0.70. The CV is much less sensitive; it can increase from the assumed 0.40 to 0.4552 (relative +13.8%). The least sensitive is the sample size itself, which can decrease from the planned 130 subjects to 103 (relative –20.8%). Hence, dropouts are in general of the least concern.
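These boundaries can be spot-checked with power.TOST() (a sketch of hypothetical cross-checks; at each quoted boundary, power should be close to the minimum acceptable 0.70):

```r
library(PowerTOST)
# vary one value at a time, keeping the others at their assumed levels
power.TOST(CV = 0.40,   theta0 = 0.9276, n = 130, design = "parallel")
power.TOST(CV = 0.4552, theta0 = 0.95,   n = 130, design = "parallel")
power.TOST(CV = 0.40,   theta0 = 0.95,   n = c(52, 51), # 103 in total
           design = "parallel")
```

Since 103 is an odd number, the last call gives n as a vector of unequal group sizes rather than a total.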
We assume a CV of 0.25, a T/R-ratio of 0.95, target a power of 0.80 (see also the example in the corresponding article about the 2×2×2 design), and keep the default minimum power 0.70.
CV <- 0.25
x  <- pa.ABE(CV = CV)
dev.new(width = 6.16, height = 6.7)
op <- par(no.readonly = TRUE)
plot(x, pct = FALSE, ratiolabel = "theta0")
par(op)
The order of parameters influencing power is as common in ABE: \(\small{\theta_0}\) \(\small{\gg}\) CV \(\small{>}\) n.
We assume a CV of 0.125, a T/R-ratio of 0.975, target a power of 0.80 (see also the example in the corresponding article about the 2×2×2 design), and keep the default minimum power 0.70.
CV     <- 0.125
theta0 <- 0.975
theta1 <- 0.90
x      <- pa.ABE(CV = CV, theta0 = theta0, theta1 = theta1)
dev.new(width = 6.16, height = 6.7)
op     <- par(no.readonly = TRUE)
plot(x, pct = FALSE, ratiolabel = "theta0")
par(op)
Business as usual.
We assume a CV of 0.45, a T/R-ratio of 0.90, target a power of 0.80 (see also the example in the corresponding article), keep the default minimum power 0.70, and want to perform the study in a 2-sequence 4-period full replicate design (TRTR|RTRT or TRRT|RTTR or TTRR|RRTT).
CV     <- 0.45
design <- "2x2x4"
x      <- pa.scABE(CV = CV, design = design)
dev.new(width = 6.16, height = 6.7)
op     <- par(no.readonly = TRUE)
plot(x, pct = FALSE, ratiolabel = "theta0")
par(op)
Here the pattern is different from ABE. The idea of reference-scaling is to maintain power irrespective of the CV. Therefore, the CV is the least sensitive parameter: it can increase from the assumed 0.45 to 0.6629 (relative +47.3%). If the CV decreases, power decreases as well because we can expand the limits less.
Let’s dive deeper. x is an S3 object.^{5} Contained in its list are the data frames paCV, paGMR, and paN. With the function power.scABEL() we can assess the components of power at specific values of the CV.
n       <- x$paN$N[1]
max.pwr <- which(x$paCV$pwr == max(x$paCV$pwr))
CVs     <- c(head(x$paCV$CV, 1),
             x$paCV$CV[x$paCV$pwr == min(x$paCV$pwr[1:max.pwr])],
             CV, x$paCV$CV[max.pwr], tail(x$paCV$CV, 1))
res     <- data.frame(CV = CVs, V1 = NA, V2 = NA, V3 = NA, V4 = NA)
for (i in 1:nrow(res)) {
  res[i, 2:5] <- suppressMessages(
                   power.scABEL(CV = CVs[i], design = design,
                                n = n, details = TRUE))
}
names(res)[2:5] <- c("p(BE)", "p(BE-ABEL)",
                     "p(BE-PE)", "p(BE-ABE)")
print(signif(res, 4), row.names = FALSE)
R>     CV  p(BE) p(BE-ABEL) p(BE-PE) p(BE-ABE)
R> 0.3000 0.7383     0.7383   0.9832    0.6786
R> 0.3167 0.7351     0.7351   0.9784    0.6403
R> 0.4500 0.8112     0.8116   0.9266    0.4107
R> 0.4766 0.8164     0.8165   0.9161    0.3764
R> 0.6629 0.7000     0.7000   0.8467    0.1546
p(BE) is the overall power, p(BE-ABEL) is the power of ‘pure’ ABEL (without the PE-restriction), and p(BE-PE) is the power of the criterion ‘point estimate within the acceptance range’ alone. p(BE-ABE) is the power of the conventional ABE test for comparative purposes.
We see that close to the upper cap of scaling (at CV 50%) power starts to decrease because we cannot expand the limits any more (maximum expansion 69.84 – 143.19%). Furthermore, the PE-restriction becomes more important.
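The expansion cap can be inspected with scABEL(), which returns the expanded limits for a given CV (a sketch for the EMA’s conditions):

```r
library(PowerTOST)
scABEL(CV = 0.45)   # expanded limits at the assumed CV
scABEL(CV = 0.50)   # upper cap: maximum expansion 0.6984...1.4319
scABEL(CV = 0.6629) # beyond the cap the limits stay at the maximum
```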
As above but according to the rules of Health Canada (upper cap of scaling ~57.4% instead of 50%).^{6}
CV     <- 0.45
design <- "2x2x4"
x      <- pa.scABE(CV = CV, design = design, regulator = "HC")
dev.new(width = 6.16, height = 6.7)
op     <- par(no.readonly = TRUE)
plot(x, pct = FALSE, ratiolabel = "theta0")
par(op)
Similar to the EMA but more relaxed due to the higher upper cap.
That’s a special case because for any CV ≥ 30% the limits are directly widened to 75.00 – 133.33% and there is no upper cap of scaling.^{7}
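The discontinuity at CV 30% can be shown with scABEL() as well (a sketch, assuming the GCC regulator constant is supported in the installed version of PowerTOST):

```r
library(PowerTOST)
scABEL(CV = 0.2999, regulator = "GCC") # conventional limits 0.80...1.25
scABEL(CV = 0.3000, regulator = "GCC") # directly widened to 0.75...1.3333
```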
CV     <- 0.45
design <- "2x2x4"
x      <- pa.scABE(CV = CV, design = design, regulator = "GCC")
dev.new(width = 6.16, height = 6.7)
op     <- par(no.readonly = TRUE)
plot(x, pct = FALSE, ratiolabel = "theta0")
par(op)
As above but for the U.S. FDA and China’s CDE (see also the example in the corresponding article).
CV     <- 0.45
design <- "2x2x4"
x      <- pa.scABE(CV = CV, design = design, regulator = "FDA")
dev.new(width = 6.16, height = 6.7)
op     <- par(no.readonly = TRUE)
plot(x, pct = FALSE, ratiolabel = "theta0")
par(op)
Due to unlimited scaling the CV is even less important than in the other methods.
We assume a CV of 0.125, a T/R-ratio of 0.975 (the function’s default), target a power of 0.80, and want to perform the study in a 2-sequence 4-period full replicate design (TRTR|RTRT or TRRT|RTTR or TTRR|RRTT). Note that such a design is mandatory for the FDA.
CV     <- 0.125
design <- "2x2x4"
x      <- pa.NTIDFDA(CV = CV, design = design)
dev.new(width = 6.16, height = 6.9)
op     <- par(no.readonly = TRUE)
plot(x, pct = FALSE, ratiolabel = "theta0")
par(op)
Here the CV shows a different behavior from both RSABE for HVDs / HVDPs and ABE (see above). Maximum power is seen at a CV of ~20.4% and we observe two minima.
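This shape can be explored with power.NTIDFDA() at specific CVs (a sketch; the sample size n = 18 is only illustrative and not taken from the example above):

```r
library(PowerTOST)
# simulated power of the FDA's RSABE for NTIDs at a range of CVs
CVs <- c(0.08, 0.125, 0.204, 0.25, 0.35)
data.frame(CV = CVs,
           power = sapply(CVs, function(CV.)
                          power.NTIDFDA(CV = CV., n = 18,
                                        design = "2x2x4")))
```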
A power analysis is not a substitute for the ‘Sensitivity Analysis’ recommended by the ICH^{8} because in a real study a combination of all effects occurs simultaneously. It is up to you to decide on reasonable combinations and analyze their respective power.
How to explore deviations simultaneously will be elaborated in another article.
License
Helmut Schütz 2021
1^{st} version April 11, 2021.
Rendered 2021-04-18 21:16:15 CEST by rmarkdown in 0.58 seconds.
Footnotes and References
Labes D, Schütz H, Lang B. PowerTOST: Power and Sample Size for (Bio)Equivalence Studies. 2021-01-18. CRAN.↩︎
Labes D, Schütz H, Lang B. Package ‘PowerTOST’. January 18, 2021. CRAN.↩︎
FDA. Office of Generic Drugs. Draft Guidance on Warfarin Sodium. Recommended Dec 2012.↩︎
Wickham H. Advanced R. 2019-08-08. The S3 object system.↩︎
Health Canada. Guidance Document – Comparative Bioavailability Standards: Formulations Used for Systemic Effects. Ottawa, 2018/06/08.↩︎
Executive Board of the Health Ministers’ Council for GCC States. The GCC Guidelines for Bioequivalence. Version 2.4. 3/02/2011.↩︎
International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use. ICH Harmonised Tripartite Guideline. Statistical Principles for Clinical Trials. 5 February 1998. E9 Step 4.↩︎