Consider allowing JavaScript. Otherwise, you have to be proficient in reading raw LaTeX, since formulas will not be rendered. Furthermore, the table of contents in the left column will not be available for navigation, and code-folding will not be supported. Sorry for the inconvenience.
Examples in this article were generated with R 4.0.5 by the packages TeachingDemos,^{1} magicaxis,^{2} and PowerTOST.^{3}
For applications see also a collection of other articles.
Abbreviation | Meaning |
---|---|
ABE | Average Bioequivalence |
ABEL | Average Bioequivalence with Expanding Limits |
CV (CV_{wT}, CV_{wR}) | Within-subject Coefficient of Variation (of the Test and Reference treatment) |
GMR | Geometric Mean Ratio |
HVD(P) | Highly Variable Drug (Product) |
RSABE | Reference-scaled Average Bioequivalence |
TOST | Two One-Sided Tests |
What are simulations and why may we need them?
In statistics (or better, in science in general) we prefer to obtain results by an analytical calculation. Remember your classes in algebra and calculus? That’s the stuff.
However, sometimes the desire for an ‘exact’ solution of a problem resembles the hunt for unobtainium.
If you know the movie Hidden Figures, maybe you recall the scene where Katherine Johnson solved the transition from the elliptic orbit of the Mercury capsule to the parabolic re-entry trajectory by Euler’s approximations. Granted, not a simulation but a numerical method, since an algebraic solution simply did not – and will not – exist.
Least-squares and maximum-likelihood optimizations are part of our bread-and-butter jobs. Nobody complains that the result is not ‘exact’.
“Statistics. A sort of elementary form of mathematics which consists of adding things together and occasionally squaring them.”
However, sometimes numerical methods are not feasible if the problem gets too large. Large in this context means many parameters, a high number of data points to fit, or both.
Two examples:
Approximate trajectories of three identical bodies located at the vertices of a scalene triangle and having zero initial velocities. It is seen that the center of mass, in accordance with the law of conservation of momentum, remains in place.
© Dnttllthmmnm @ wikimedia commons
When it comes to (bio)equivalence we need simulations in various areas. One is the power calculation / sample size estimation for Reference-scaled Average Bioequivalence and the other for adaptive sequential Two-Stage Designs.
“Whenever a theory appears to you as the only possible one, take this as a sign that you have neither understood the theory nor the problem which it was intended to solve.”
As we will see in the examples, in the Monte Carlo method values approach the exact one asymptotically.
# attach libraries to run the examples
library(TeachingDemos)
library(magicaxis)
library(PowerTOST)
Here it’s easy to get the average sum of pips. If you remember probability and combinatorics, you will get the answer 7. If you are a gambler, you simply know it. If you have two dice handy, roll them. Of course, you will get anything between the possible extremes (2 and 12) in a single roll. But you will need many rolls to come close to the theoretical average 7. Since this is a boring exercise (unless you find an idiot betting against 7), let R do the job (10,000 rolls):
set.seed(123)
nsims <- ceiling(10^(seq(0, 4, 0.01)))
df    <- data.frame(nsims = nsims, sum.pips = NA)
for (i in seq_along(nsims)) {
  df[i, 2] <- sum(dice(rolls = nsims[i],
                       ndice = 2, sides = 6))/nsims[i]
}
ylim   <- c(2, 12)
j      <- 10^(seq(0, 4, 1))
mean.j <- numeric()
for (i in seq_along(j)) {
  mean.j[i] <- mean(df[df$nsims == j[i], 2])
}
dev.new(width = 4.5, height = 4.5)
op <- par(no.readonly = TRUE)
par(mar = c(3, 3, 3, 0.1), cex.axis = 0.9,
    mgp = c(2, 0.5, 0))
plot(df$nsims, df[, 1], type = "n",
     ylim = ylim, axes = FALSE, log = "x",
     xlab = "Number of simulated rolls",
     ylab = "Average sum of pips")
mtext("Average at 1, 10, \u2026", 3, line = 2)
grid(); box()
abline(h = 7)
magaxis(1)
magaxis(2, minorn = 2, las = 1)
axis(3, labels = sprintf("%.2f", mean.j), at = j)
points(df$nsims, df[, 2], pch = 21,
       col = "blue", bg = "#9090FF80", cex = 1.15)
par(op)
We generate a set of uniformly distributed coordinates \(\small{\left\{x|y\right\}}\) in the range \(\small{\left\{-1|+1\right\}}\). According to Mr Pythagoras, all \(\small{x^2+y^2\leq1}\) are at or within the unit circle and coded in blue. Since \(\small{A=\frac{\pi\,d^2}{4}}\) and here \(\small{d=2}\), the circle’s area is \(\small{\pi}\), whilst the enclosing square’s is 4; hence, we get an estimate of \(\small{\pi}\) by 4 times the number of coded points (the ‘area’) divided by the number of simulations. Here for 50,000:
set.seed(12345678)
nsims <- 5e4
x     <- runif(nsims, -1, 1)
y     <- runif(nsims, -1, 1)
is.inside   <- (x^2 + y^2) <= 1
pi.estimate <- 4 * sum(is.inside) / nsims
dev.new(width = 5.7, height = 5.7)
op <- par(no.readonly = TRUE)
par(pty = "s", cex.axis = 0.9, mgp = c(2, 0.5, 0))
main <- bquote(italic(pi)*"-estimate" == .(pi.estimate))
plot(c(-1, 1), c(-1, 1), type = "n", axes = FALSE,
     xlim = c(-1, 1), ylim = c(-1, 1), font.main = 1,
     xlab = expression(italic(x)),
     ylab = expression(italic(y)), cex.main = 1.1,
     main = main)
points(x[is.inside], y[is.inside], pch = '.',
       col = "blue")
points(x[!is.inside], y[!is.inside], pch = '.',
       col = "lightgrey")
magaxis(1:2, las = 1)
abline(h = 0, v = 0)
box()
par(op)
This was a lucky punch since the convergence is lousy. To get a stable estimate with just four correct significant digits, we need at least 100 million (‼) simulations.
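How lousy? Each simulated point is a Bernoulli trial with \(\small{p=\pi/4}\), so the standard error of the estimate is \(\small{4\sqrt{p\,(1-p)/n}}\). A quick sketch (the requirement of roughly three standard errors within half of the fourth significant digit is my assumption) of why about 100 million simulations are needed:

```r
p  <- pi / 4                        # probability of a point falling inside
se <- function(n) 4 * sqrt(p * (1 - p) / n)
se(5e4)   # ~0.0073: barely three significant digits
se(1e8)   # ~0.00016
# n such that 3 SE <= 5e-4 (half of the 4th significant digit of pi):
p * (1 - p) * (3 * 4 / 5e-4)^2      # ~9.7e7, i.e., about 100 million
```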
Another slowly converging approach is simulating Buffon’s needle.^{4}
# Cave: Extremely long runtime
set.seed(123456)
buffon_needles <- function(l, n = 1) {
  H <- rep(0, n)
  for (i in 1:n) {
    M <- runif(2)                    # midpoint of the needle
    D <- c(1, 1)
    while (norm(D, type = '2') > 1) {# rejection sampling of the direction
      D <- runif(2)
    }
    DD <- D/norm(D, type = '2')
    E1 <- M + l / 2 * DD             # endpoints of the needle
    E2 <- M - l / 2 * DD
    H[i] <- (E1[1] < 0) |
            (E1[1] > 1) |
            (E2[1] < 0) |
            (E2[1] > 1)
  }
  return(H)
}
l     <- 0.4
nsims <- as.integer(10^(seq(3, 7, 0.05)))
df    <- data.frame(nsims = nsims, pi.estimate = NA)
pb    <- txtProgressBar(0, 1, 0, char = "\u2588",
                        width = NA, style = 3)
for (i in 1:nrow(df)) {
  HHH <- buffon_needles(l, n = df$nsims[i])
  df$pi.estimate[i] <- 2 * l / mean(HHH)
  setTxtProgressBar(pb, i/nrow(df))
}
close(pb)
tail(df, 1) # show last
dev.new(width = 4.5, height = 4.5)
op <- par(no.readonly = TRUE)
par(mar = c(3, 3, 0, 0), cex.axis = 0.9,
    mgp = c(2, 0.5, 0))
plot(df$nsims, df$pi.estimate, type = "n",
     log = "x", axes = FALSE,
     xlab = "Number of simulations",
     ylab = expression(italic(pi)*"-estimate"))
grid(); box()
abline(h = pi, col = "red")
magaxis(1:2, las = 1)
points(df$nsims, df$pi.estimate, pch = 21,
       col = "blue", bg = "#9090FF80", cex = 1.15)
par(op)
Of course, there are extremely efficient methods like the Chudnovsky algorithm (as of January 2020 the record holder with 50 trillion digits).
q      <- 0:1L
pi.rep <- 0
for (i in seq_along(q)) {
  pi.rep <- pi.rep - 12/640320^(3/2) *
            (factorial(6 * q[i]) * (13591409 + 545140134 * q[i])) /
            (factorial(3 * q[i]) * factorial(q[i])^3 *
             -640320^(3 * q[i]))
}
pi.approx <- 1/pi.rep
comp <- data.frame(q = as.integer(max(q)),
                   pi = sprintf("%.15f", pi),
                   pi.approx = sprintf("%.15f", pi.approx),
                   RE = sprintf("%-.15f%%",
                                100 * (pi.approx - pi) / pi))
print(comp, row.names = FALSE)
R> q pi pi.approx RE
R> 1 3.141592653589793 3.141592653589675 -0.000000000003746%
Note that increasing the number of iterations q does not improve the result. Since this is an extremely efficient algorithm, we are already at the numeric resolution of a 64 bit double, which is ~2.220446·10^{-16}.
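This resolution can be queried directly in R:

```r
.Machine$double.eps                 # 2.220446e-16 on IEEE-754 hardware
(1 + .Machine$double.eps)     > 1   # TRUE: the smallest distinguishable step
(1 + .Machine$double.eps / 2) > 1   # FALSE: lost in rounding
```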
The wonderland of representing numbers on digital computers is elaborated in another article.
Over the top since exact (analytical) solutions are available. However, if you don’t trust them, simulations are possible with the function power.TOST.sim()
, which employs the distributional properties (\(\small{\sigma^2}\) follows a \(\small{\chi^2}\)-distribution with \(\small{n-2}\) degrees of freedom and \(\small{\log_{e}(\theta_0)}\) follows a normal distribution).^{5} Convergence takes a while. Empiric power, its standard error, its relative error compared to the exact power, and execution times on a Xeon E3-1245v3 3.4 GHz, 8 MB cache, 16 GB RAM, 64 bit R 4.0.5 on Windows 7:
CV    <- 0.25
n     <- sampleN.TOST(CV = CV,
                      print = FALSE)[["Sample size"]]
nsims <- c(1e5, 5e5, 1e6, 5e6, 1e7, 5e7, 1e8)
exact <- power.TOST(CV = CV, n = n)
df    <- data.frame(simulations = nsims,
                    exact = rep(exact, length(nsims)),
                    simulated = NA, SE = NA, RE = NA,
                    secs = NA)
for (i in 1:nrow(df)) {
  start.time      <- proc.time()[[3]]
  df$simulated[i] <- power.TOST.sim(CV = CV, n = n,
                                    nsims = nsims[i])
  df$secs[i]      <- proc.time()[[3]] - start.time
}
df$RE          <- 100 * (df$simulated - exact) / exact
df$exact       <- signif(df$exact, 5)
df$SE          <- signif(df$RE, 4)
df$RE          <- sprintf("%+.4f%%", df$RE)
df$simulated   <- signif(df$simulated, 5)
df$simulations <- formatC(df$simulations, format = "f",
                          digits = 0, big.mark = ",")
names(df)[c(1, 3)] <- c("sim\u2019s", "sim\u2019d")
print(df, row.names = FALSE)
R> sim’s exact sim’d SE RE secs
R> 100,000 0.80744 0.80814 0.0867600 +0.0868% 0.18
R> 500,000 0.80744 0.80699 -0.0551700 -0.0552% 0.79
R> 1,000,000 0.80744 0.80766 0.0279300 +0.0279% 1.70
R> 5,000,000 0.80744 0.80751 0.0090080 +0.0090% 8.27
R> 10,000,000 0.80744 0.80733 -0.0136200 -0.0136% 16.32
R> 50,000,000 0.80744 0.80743 -0.0009096 -0.0009% 81.16
R> 100,000,000 0.80744 0.80744 0.0001147 +0.0001% 161.95
We know that \(\small{\alpha}\) (the probability of the Type I Error) never exceeds the nominal level of the test (generally 0.05). In e.g., a 2×2×2 crossover design with CV 25% and 28 subjects it is 0.04999963… We can simulate studies with a T/R-ratio at one of the borders of the acceptance range and assess them for ABE as well. We expect ≤5% to pass. The number of passing studies divided by the number of simulations gives the empiric Type I Error.
# Cave: Long run-time!
runs  <- 5
nsims <- as.integer(10^(seq(3, 7, 0.1)))
df    <- data.frame(run = 1:runs,
                    nsims = rep(nsims, each = runs),
                    binomial.lower = NA,
                    binomial.upper = NA,
                    TIE = NA)
CV    <- 0.25
n     <- sampleN.TOST(CV = CV,
                      print = FALSE)[["Sample size"]]
exact <- power.TOST(theta0 = 1.25, CV = CV, n = n)
pb    <- txtProgressBar(0, 1, 0, char = "\u2588",
                        width = NA, style = 3)
for (i in 1:nrow(df)) {
  x   <- ceiling(0.05 * df$nsims[i])
  b.t <- binom.test(x, df$nsims[i],
                    alternative = "two.sided")[["conf.int"]]
  df[i, 3:4] <- as.numeric(b.t)
  setseed    <- df$run[i] == 1
  df$TIE[i]  <- power.TOST.sim(theta0 = 1.25, CV = CV,
                               n = n, nsims = df$nsims[i],
                               setseed = setseed)
  setTxtProgressBar(pb, i/nrow(df))
}
close(pb)
ylim <- range(df[, 3:5])
dev.new(width = 4.5, height = 4.5)
op <- par(no.readonly = TRUE)
par(mar = c(3, 3.5, 0, 0), cex.axis = 0.9,
    mgp = c(2, 0.5, 0))
plot(df$nsims, df[, 2], type = "n", log = "x",
     ylim = ylim, axes = FALSE,
     xlab = "Number of simulations", ylab = "")
mtext("Empiric Type I Error", 2, line = 2.5)
grid(); box()
abline(h = exact, col = "red")
lines(df$nsims, df[, 3])
lines(df$nsims, df[, 4])
magaxis(1:2, las = 1)
for (i in 1:nrow(df)) {
  pch <- ifelse(df$run[i] == 1, 19, 1)
  points(df$nsims[i], df$TIE[i], pch = pch,
         col = "blue", cex = 1.15)
}
par(op)
Five runs. The filled circles are runs with a fixed seed of 123456 and the open circles runs with random seeds. The lines show the upper and lower significance limits of the binomial test. We see that convergence of this problem is poor.
Therefore, e.g., in Two-Stage Designs and in assessing a potentially inflated Type I Error in reference-scaling 10^{6} simulations are common.
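The choice of 10^{6} can be motivated by the binomial test used above: its significance limits narrow only with \(\small{1/\sqrt{n}}\). A quick check of the width of the two-sided 95% CI around the nominal 0.05 (a sketch with arbitrary numbers of simulations):

```r
# width of the two-sided binomial CI around the nominal alpha 0.05
widths <- sapply(c(1e4, 1e5, 1e6), function(n) {
  diff(binom.test(ceiling(0.05 * n), n,
                  alternative = "two.sided")[["conf.int"]])
})
signif(widths, 2)   # narrows roughly with 1/sqrt(n)
```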
Table A2 gives for a 4-period full replicate study (EMA method), GMR 0.85, CV_{wR} 50%, and ≥80% power a sample size of 54 (10,000 simulations),^{6} whilst the function sampleN.scABEL() of PowerTOST gives only 52 (with its default of 100,000 simulations).
runs  <- 5
nsims <- as.integer(10^(seq(3, 6, 0.1)))
df    <- data.frame(run = 1:runs,
                    nsims = rep(nsims, each = runs),
                    n = NA)
pb    <- txtProgressBar(0, 1, 0, char = "\u2588",
                        width = NA, style = 3)
for (i in 1:nrow(df)) {
  setseed <- df$run[i] == 1
  df$n[i] <- sampleN.scABEL(theta0 = 0.85, CV = 0.5,
                            nsims = df$nsims[i],
                            setseed = setseed,
                            design = "2x2x4",
                            targetpower = 0.8,
                            print = FALSE,
                            details = FALSE)[["Sample size"]]
  setTxtProgressBar(pb, i/nrow(df))
}
close(pb)
ylim <- range(df$n)
dev.new(width = 4.5, height = 4.5)
op <- par(no.readonly = TRUE)
par(mar = c(3, 3, 0, 0), cex.axis = 0.9,
    mgp = c(2, 0.5, 0))
plot(df$nsims, df$n, type = "n", log = "x", ylim = ylim,
     axes = FALSE, xlab = "Number of simulations",
     ylab = "Sample size")
grid(); box()
magaxis(1)
magaxis(2, minorn = 2, las = 1)
for (i in 1:nrow(df)) {
  pch <- ifelse(df$run[i] == 1, 19, 1)
  points(df$nsims[i], df$n[i], pch = pch,
         col = "blue", cex = 1.15)
}
points(1e5, 52, pch = 16, col = "#00AA00", cex = 1.15)
points(1e4, 54, pch = 16, col = "red", cex = 1.15)
par(op)
Five runs. As before, filled circles are runs with a fixed seed of 123456 and open circles runs with random seeds. With only 10,000 simulations I got^{7} four times 50, once 52, and once 54 (like ‘The Two Lászlós’). If you perform more runs, some of them will give more extreme values (in 1,000 runs of 10,000 I got a median of 52 and a range of 48–56). This wide range demonstrates that the result is not stable. Hence, we need at least 50,000 simulations to end up with 52 as a stable result. Although you are free to opt for more than the default of 100,000, it’s a waste of time.
Table A4 of the same reference gives for a 4-period full replicate study (FDA method), GMR 0.90, CV_{wR} 70%, and ≥90% power a sample size of only 46 (10,000 simulations), whilst the function sampleN.RSABE() of PowerTOST gives 48 (with its default of 100,000 simulations).
runs  <- 5
nsims <- as.integer(10^(seq(3, 6, 0.1)))
df    <- data.frame(run = 1:runs,
                    nsims = rep(nsims, each = runs),
                    n = NA)
pb    <- txtProgressBar(0, 1, 0, char = "\u2588",
                        width = NA, style = 3)
for (i in 1:nrow(df)) {
  setseed <- df$run[i] == 1
  df$n[i] <- sampleN.RSABE(theta0 = 0.9, CV = 0.7,
                           nsims = df$nsims[i],
                           setseed = setseed,
                           design = "2x2x4",
                           targetpower = 0.9,
                           print = FALSE,
                           details = FALSE)[["Sample size"]]
  setTxtProgressBar(pb, i/nrow(df))
}
close(pb)
ylim <- range(df$n)
dev.new(width = 4.5, height = 4.5)
op <- par(no.readonly = TRUE)
par(mar = c(3, 3, 0, 0), cex.axis = 0.9,
    mgp = c(2, 0.5, 0))
plot(df$nsims, df$n, type = "n", log = "x", ylim = ylim,
     axes = FALSE, xlab = "Number of simulations",
     ylab = "Sample size")
grid(); box()
magaxis(1)
magaxis(2, minorn = 2, las = 1)
for (i in 1:nrow(df)) {
  pch <- ifelse(df$run[i] == 1, 19, 1)
  points(df$nsims[i], df$n[i], pch = pch,
         col = "blue", cex = 1.15)
}
points(1e5, 48, pch = 16, col = "#00AA00", cex = 1.15)
points(1e4, 46, pch = 16, col = "red", cex = 1.15)
par(op)
Five runs. As before, filled circles are runs with a fixed seed of 123456 and open circles runs with random seeds. With only 10,000 simulations I didn’t get 46 (like ‘The Two Lászlós’) but once 48, twice 50, once 52, and once 54. In 1,000 runs of 10,000 I got a median of 48 and a range of 44–52. With more simulations we end up with 48 as a stable result.
One regulator once told me »I don’t believe in simulations«. A strong view. I doubt that he avoids any kind of modern transportation – a lot of simulations are involved in its design and testing. However, there’s a grain of truth in his statement.
“Garbage in, garbage out. It’s very very simple.”
If an analytical solution exists or more efficient methods are available, it’s easy to check whether the result of a simulation is reliable.
If an analytical solution exists and is reasonably fast, use it. In such a case performing simulations is a waste of time. If an exact method does not exist, it gets tricky, i.e., rigorous testing is of utmost importance.
For example, rnorm(n, mean = 0, sd = 1) in R will give values of the standard normal distribution. As said above, in the simulation functions of PowerTOST these ‘key’ statistics are implemented: \(\small{\sigma^2}\) follows a \(\small{\chi^2}\)-distribution with \(\small{n-2}\) degrees of freedom and \(\small{\log_{e}(\theta_0)}\) follows a normal distribution. This approach is less computationally intensive than simulating all subjects of a study.
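The principle can be sketched for conventional ABE in a simple 2×2×2 crossover. Note that this is only a sketch of the approach – not PowerTOST’s actual code – with the design constant of the 2×2×2 crossover and the defaults of sampleN.TOST() (θ0 0.95, target power 0.80, hence n 28 for CV 0.25) assumed:

```r
# simulate the 'key' statistics instead of individual subjects:
# log(PE) ~ normal, sigma^2 ~ scaled chi^2 with n - 2 df
set.seed(123456)
CV     <- 0.25; n <- 28; theta0 <- 0.95; nsims <- 1e5
s2     <- log(CV^2 + 1)                 # within-subject variance (log-scale)
dfree  <- n - 2                         # residual degrees of freedom
se     <- function(s2) sqrt(2 * s2 / n) # SE of the treatment difference (2x2x2)
pe     <- rnorm(nsims, mean = log(theta0), sd = se(s2))
s2.sim <- s2 * rchisq(nsims, dfree) / dfree
hw     <- qt(0.95, dfree) * se(s2.sim)  # half-width of the 90% CI
BE     <- (pe - hw >= log(0.80)) & (pe + hw <= log(1.25))
mean(BE) # empiric power, close to the exact 0.80744 shown above
```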
Let’s compare the sample size, power, and empiric Type I Error for a 4-period full replicate design, homoscedasticity (CV_{wT} = CV_{wR} = 0.45), T/R-ratio 0.90, target power ≥0.80, intended for the EMA’s ABEL.
CV     <- 0.45
design <- "2x2x4"
U      <- scABEL(CV = CV)[["upper"]]
comp   <- data.frame(method = c("stats", "subj"),
                     n = NA, power = NA, TIE = NA,
                     secs = NA)
st <- proc.time()[[3]]
comp[1, 2:3] <- sampleN.scABEL(CV = CV, design = design,
                               details = FALSE,
                               print = FALSE)[8:9]
comp[1, 4]   <- power.scABEL(CV = CV, design = design,
                             n = comp$n[1], theta0 = U,
                             nsims = 1e6)
comp[1, 5]   <- proc.time()[[3]] - st
st <- proc.time()[[3]]
comp[2, 2:3] <- sampleN.scABEL.sdsims(CV = CV,
                                      design = design,
                                      details = FALSE,
                                      print = FALSE)[8:9]
comp[2, 4]   <- power.scABEL.sdsims(CV = CV,
                                    design = design,
                                    n = comp$n[1],
                                    theta0 = U,
                                    nsims = 1e6,
                                    progress = FALSE)
comp[2, 5]   <- proc.time()[[3]] - st
comp[, 1]    <- c("\u2018key\u2019 stat\u2019s",
                  "subject sim\u2019s")
print(comp, row.names = FALSE)
R> method n power TIE secs
R> ‘key’ stat’s 28 0.81116 0.048889 0.66
R> subject sim’s 28 0.81196 0.048334 27.86
That’s a sufficiently close match; in general, simulating via the ‘key’ statistics is justified for reasons of speed.
However, there is one case where subject simulations are recommended: A partial replicate design with heteroscedasticity (only if CV_{wT} > CV_{wR}).
CV <- 0.45
# split based on the variance ratio,
# first element CVwT, second element CVwR
CV     <- CVp2CV(CV, ratio = 1.5)
design <- "2x3x3"
U      <- scABEL(CV = CV[2])[["upper"]]
comp   <- data.frame(method = c("stats", "subj"),
                     CVwT = CV[1], CVwR = CV[2],
                     n = NA, power = NA, TIE = NA,
                     secs = NA)
st <- proc.time()[[3]]
comp[1, 4:5] <- sampleN.scABEL(CV = CV, design = design,
                               details = FALSE,
                               print = FALSE)[8:9]
comp[1, 6]   <- power.scABEL(CV = CV, design = design,
                             n = comp$n[1], theta0 = U,
                             nsims = 1e6)
R> Warning: Heteroscedasticity in the 2x3x3 design may lead to poor accuracy of power;
R> consider the function power.scABEL.sdsims() instead.
comp[1, 7]   <- proc.time()[[3]] - st
st <- proc.time()[[3]]
comp[2, 4:5] <- sampleN.scABEL.sdsims(CV = CV,
                                      design = design,
                                      details = FALSE,
                                      print = FALSE)[8:9]
comp[2, 6]   <- power.scABEL.sdsims(CV = CV,
                                    design = design,
                                    n = comp$n[1],
                                    theta0 = U,
                                    nsims = 1e6,
                                    progress = FALSE)
comp[2, 7]   <- proc.time()[[3]] - st
comp[, 2:3]  <- signif(comp[, 2:3], 4)
comp[, 1]    <- c("\u2018key\u2019 stat\u2019s",
                  "subject sim\u2019s")
print(comp, row.names = FALSE)
R> method CVwT CVwR n power TIE secs
R> ‘key’ stat’s 0.4977 0.3987 51 0.81558 0.054635 0.61
R> subject sim’s 0.4977 0.3987 54 0.81723 0.067935 57.94
THX for throwing a warning!
Yes, there is an inflated Type I Error. It will be discussed in another article.
Compare that with homoscedasticity.
CV     <- c(0.45, 0.45)
design <- "2x3x3"
U      <- scABEL(CV = CV[2])[["upper"]]
comp   <- data.frame(method = c("stats", "subj"),
                     CVwT = CV[1], CVwR = CV[2],
                     n = NA, power = NA, TIE = NA,
                     secs = NA)
st <- proc.time()[[3]]
comp[1, 4:5] <- sampleN.scABEL(CV = CV, design = design,
                               details = FALSE,
                               print = FALSE)[8:9]
comp[1, 6]   <- power.scABEL(CV = CV, design = design,
                             n = comp$n[1], theta0 = U,
                             nsims = 1e6)
comp[1, 7]   <- proc.time()[[3]] - st
st <- proc.time()[[3]]
comp[2, 4:5] <- sampleN.scABEL.sdsims(CV = CV,
                                      design = design,
                                      details = FALSE,
                                      print = FALSE)[8:9]
comp[2, 6]   <- power.scABEL.sdsims(CV = CV,
                                    design = design,
                                    n = comp$n[1],
                                    theta0 = U,
                                    nsims = 1e6,
                                    progress = FALSE)
comp[2, 7]   <- proc.time()[[3]] - st
comp[, 2:3]  <- signif(comp[, 2:3], 4)
comp[, 1]    <- c("\u2018key\u2019 stat\u2019s",
                  "subject sim\u2019s")
print(comp, row.names = FALSE)
No problems even in a partial replicate design.
Why do we need fewer subjects? Although with heteroscedasticity we had the same pooled CV_{w} of 0.45, here we can expand the limits more (CV_{wR} 0.45 → 72.15–138.59%) than before (CV_{wR} 0.3987 → 74.68–133.90%), and the lower CV_{wT} (0.45 instead of 0.4977) results in a narrower 90% confidence interval.
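These expanded limits follow from \(\small{\exp(\mp0.76\cdot s_{wR})}\) with \(\small{s_{wR}=\sqrt{\log_{e}(CV_{wR}^2+1)}}\): no expansion for CV_{wR} ≤ 30%, capped at CV_{wR} 50%. A small helper (a sketch of the EMA’s rules – the function scABEL() of PowerTOST does the same job):

```r
# Expanded ABEL limits (in percent) for a given CVwR (EMA: k = 0.760)
ABEL <- function(CVwR, k = 0.760) {
  if (CVwR <= 0.30) return(c(80, 125))     # no expansion below 30%
  swR <- sqrt(log(min(CVwR, 0.50)^2 + 1))  # cap at CVwR 50%
  round(100 * exp(c(-1, +1) * k * swR), 2)
}
ABEL(0.45)    # 72.15 138.59
ABEL(0.3987)  # 74.68 133.90
```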
Welcome to the nightmare. What you should do:
State which PRNG you used. Nowadays the Mersenne Twister^{9} is the default in many software packages. It has a period of ≈4.3×10^{6,001}, and hence one can be sure to get no repetitions even in extremely large simulations. It has been implemented even in M$ Excel since v2010. However, VBA still implements an LCG, which is useless in Monte Carlo simulations due to its short period.
Find a compromise between accuracy and speed. As we have seen above, 100,000 simulations are sufficient in sample size estimations for reference-scaling. Nevertheless, in simulating the empiric Type I Error (e.g., for adaptive Two-Stage Designs or reference-scaling), 10^{6} simulations are highly recommended and the standard nowadays.
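Related to the PRNG: fixing the seed makes a simulation reproducible, whilst omitting it continues the pseudo-random stream. In R:

```r
RNGkind()[1]       # "Mersenne-Twister" - R's default PRNG
set.seed(123456)   # fixed seed: the stream is reproducible
x1 <- runif(3)
set.seed(123456)
x2 <- runif(3)     # identical to x1
x3 <- runif(3)     # stream continues: different values
identical(x1, x2)  # TRUE
identical(x1, x3)  # FALSE
```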
License
Helmut Schütz 2021
1^{st} version March 13, 2021.
Rendered 2021-04-11 12:48:34 CEST by rmarkdown in 0.86 seconds.
Footnotes and References
Snow G. TeachingDemos: Demonstrations for Teaching and Learning. 2020-04-07. CRAN.↩︎
Robotham A. magicaxis: Pretty Scientific Plotting with Minor-Tick and Log Minor-Tick Support. 2020-12-04. CRAN.↩︎
Labes D, Schütz H, Lang B. PowerTOST: Power and Sample Size for (Bio)Equivalence Studies. 2021-01-18. CRAN.↩︎
Jensen R. Simulating Buffon’s Needle in R. Answer at StackExchange. 2019-12-18.↩︎
Zheng C, Wang J, Zhao L. Testing bioequivalence for multiple formulations with power and sample size calculations. Pharm Stat. 2012; 11(4): 334–41. doi:10.1002/pst.1522.↩︎
Tóthfalusi L, Endrényi L. Sample Sizes for Designing Bioequivalence Studies for Highly Variable Drugs. J Pharm Pharmaceut Sci. 2011; 15(1): 73–84. doi:10.18433/j3z88f. Open access.↩︎
If you perform five runs with 10,000 simulations, you will get different numbers. That’s the effect of random seeds and a low number of simulations.↩︎
Potvin D, DiLiberti CE, Hauck WW, Parr AF, Schuirmann DJ, Smith RA. Sequential design approaches for bioequivalence studies with crossover designs. Pharm Stat. 2007; 7(4): 245–262. doi:10.1002/pst.294.↩︎
Matsumoto M, Nishimura T. Mersenne Twister: A 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Trans Model Comput Simul. 1998; 8: 3–30. doi:10.1145/272991.272995.↩︎
Schütz H. Reference-Scaled Average Bioequivalence. Heterogenicity. 2020-12-23. CRAN.↩︎
FDA. Office of Generic Drugs. Draft Guidance on Progesterone. Step 2. Recommended Apr 2010; Revised Feb 2011.↩︎