Consider allowing JavaScript. Otherwise you have to be proficient in reading LaTeX, since formulas will not be rendered. Furthermore, the table of contents in the left column for navigation will not be available, and code folding is not supported. Sorry for the inconvenience.
The examples in this article were generated with R 4.0.5 by the packages truncnorm,^{1} coin,^{2} reshape,^{3} and lattice.^{4} A basic knowledge of R is required. Any version of R would likely be sufficient to run the examples.
See also a collection of other articles.
What is an ‘apparent’ difference in t_{max}?
A good question. It has puzzled me for years…
The European Medicines Agency states for immediate release products:
“A statistical evaluation of t_{max} is not required. However, if rapid release is claimed to be clinically relevant and of importance for onset of action or is related to adverse events, there should be no apparent difference in median t_{max} and its variability between test and reference product.”
Later the EMA states for delayed and multiphasic release formulations:
“A formal statistical evaluation of t_{max} is not required. However, there should be no apparent difference in median t_{max} and its range between test and reference product.”
‘Apparent’ is – and for good reasons – not a term contained in the statistical toolbox. Not surprising, since according to the guidelines »a [formal] statistical evaluation of t_{max} is not required«.
These definitions of ‘apparent’ are given in dictionaries:
Macmillan (British English)
Merriam-Webster (American English)
Beauty Lies in the Eye of the Beholder. Paper doesn’t blush…
Hence, the assessment of formulation differences opens the door for interpretation, which is – from a scientific perspective – extremely unsatisfactory.
Whereas the underlying distribution of t_{max} is continuous, due to the sampling schedule the observed distribution is discrete. This calls for a comparison by a robust (i.e., nonparametric) statistical method.
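To illustrate with assumed numbers (no real data): whatever the underlying continuous value, only scheduled sampling times can be observed. In this sketch each ‘true’ t_{max} is simply snapped to the nearest time point of a hypothetical schedule.

```r
# Assumed schedule (hours) and assumed 'true' continuous tmax values
sched     <- c(0, 10/60, 15/60, 20/60, 30/60, 45/60, 1, 1.25, 1.5, 2)
true_tmax <- c(0.93, 1.08, 1.19, 1.31, 1.42)
# observed tmax can only fall on a scheduled sampling time
obs_tmax  <- sapply(true_tmax, function(tt) sched[which.min(abs(sched - tt))])
obs_tmax  # a discrete set of values, with ties
```

Hence ties are the rule rather than the exception, which is exactly why a nonparametric comparison is appropriate.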
A comparison of t_{max} was never – and is not – required by the FDA.
In the past in Europe^{7} ^{8} a comparison by a nonparametric method was recommended. That was – and still is – statistically sound.
For modified release products t_{max} was not mentioned in the previous Note for Guidance at all.^{9}
A nonparametric method is recommended in other jurisdictions (e.g., Argentina,^{10} Japan,^{11} South Africa^{12}).
Alas, the EMA’s vague recommendation of 2010 was adopted in numerous other jurisdictions (e.g., Australia,^{13} the ASEAN states,^{14} Chile,^{15} the EEU,^{16} Egypt,^{17} members of the Gulf Cooperation Council,^{18} New Zealand^{19}).
The WHO leaves more scope for interpretation.
“Where t_{max} is considered clinically relevant, median and range of t_{max} should be compared between test and comparator to exclude numerical differences with clinical importance. A formal statistical comparison is rarely necessary. Generally the sample size is not calculated to have enough statistical power for t_{max}. However, if t_{max} is to be subjected to a statistical analysis, this should be based on non-parametric methods and should be applied to untransformed data.”
»A [formal] statistical evaluation of t_{max} is not required.«
Well. Does that imply that if one is performed anyhow, it is not welcome?
»[…] there should be no apparent difference in median t_{max} and its variability between test and reference product.«
Leaving the ambiguous statement about an ‘apparent’ difference aside – what is the ‘variability’ of the median? The median is a statistic (one of many estimators) whose value is a certain number (the estimate).
If we agree that t_{max} follows a discrete distribution, we can only make a statement about the variability of the sample.^{21} A robust measure is the IQR. Is that meant?
»[…] there should be no apparent difference in median t_{max} and its range between test and reference product.«
Similarly, the median does not have a range; only the data do. At least the range is a statistical term we can work with.
The median has a breakdown point of 50%. That means that up to 50% of the data can be contaminated by false observations before the median of the sample changes. Hence, it is a robust estimator.
Compare that to the mean, which has a breakdown point of one, i.e., a single observation. Imagine an infinitely large sample of identical values (\(\small{x_{i=1}=\ldots=x_{i=n}}\)). When we contaminate the data with the single (‼) highest possible value \(\small{\infty}\), the mean also becomes infinite and hence, all other values of the data are practically ignored.
The breakdown point of the IQR is not defined. It depends on the sample size, the number of contaminations, and their values compared to the ‘normal’ ones.
On the other hand, the range has – like the mean – a breakdown point of one. Therefore, it is not a robust statistic. Furthermore, if we compare two data sets whose extremes are contaminated in opposite directions (i.e., the minimum of one and the maximum of the other), both the median and the range will be identical.^{22} Without inspecting the data or performing an EDA (e.g., box plots) the range is not informative and for comparisons practically useless.^{23}
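A toy example (arbitrary numbers, in the spirit of the footnote): two samples contaminated at opposite extremes nevertheless share the same median and the same range.

```r
x <- c(rep(1, 15), 3)   # maximum contaminated upward
y <- c(-1, rep(1, 15))  # minimum contaminated downward
median(x) == median(y)            # TRUE: both 1
diff(range(x)) == diff(range(y))  # TRUE: both 2
```

Comparing only the reported medians and ranges, the two samples look identical, although their shapes clearly differ.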
library(coin)
n     <- 16
R     <- c(rep(1, n - 1), 2)
sumR  <- summary(R)
T1    <- c(rep(1, n - 1), 3)
sumT1 <- summary(T1)
T2    <- c(rep(1, n - 1), 1)
sumT2 <- summary(T2)
comp1 <- data.frame(tmax = c(R, T1),
                    treatment = factor(rep(c("R", "T1"), c(n, n))))
comp2 <- data.frame(tmax = c(R, T2),
                    treatment = factor(rep(c("R", "T2"), c(n, n))))
res   <- data.frame(treatment = c("R", "T1", "T2"),
                    Median = c(sumR[3], sumT1[3], sumT2[3]),
                    IQR = c(sumR[5] - sumR[2],
                            sumT1[5] - sumT1[2],
                            sumT2[5] - sumT2[2]),
                    Range = c(sumR[6] - sumR[1],
                              sumT1[6] - sumT1[1],
                              sumT2[6] - sumT2[1]))
wt1   <- wilcox_test(tmax ~ treatment, data = comp1,
                     distribution = "exact", conf.int = TRUE,
                     conf.level = 0.90)
wt2   <- wilcox_test(tmax ~ treatment, data = comp2,
                     distribution = "exact", conf.int = TRUE,
                     conf.level = 0.90)
# we need suppressWarnings() if data are identical
# (the CI cannot be computed)
PE    <- c(suppressWarnings(confint(wt1)[[2]]),
           suppressWarnings(confint(wt2)[[2]]))
lower <- c(suppressWarnings(confint(wt1)$conf.int[1]),
           suppressWarnings(confint(wt2)$conf.int[1]))
upper <- c(suppressWarnings(confint(wt1)$conf.int[2]),
           suppressWarnings(confint(wt2)$conf.int[2]))
wilc  <- data.frame(comparison = c("T1 vs R", "T2 vs R"),
                    PE = PE, lower.CL = lower, upper.CL = upper)
print(res, row.names = FALSE); print(wilc, row.names = FALSE)
R> treatment Median IQR Range
R> R 1 0 1
R> T1 1 0 2
R> T2 1 0 0
R> comparison PE lower.CL upper.CL
R> T1 vs R 0 0 0
R> T2 vs R 0 NA NA
library(coin)
roundClosest <- function(x, y) {
  # round x to the closest multiple of y
  return(y * round(x / y))
}
set.seed(1234567)
n  <- 48
lo <- 40/60
hi <- 100/60
y  <- 20/60
x  <- runif(n - 1, min = lo, max = hi)
x  <- roundClosest(x, y)
R  <- c(x, 2)
x  <- runif(n - 1, min = lo, max = hi)
x  <- roundClosest(x, y)
T1 <- c(x, 3)
x  <- runif(n - 1, min = lo, max = hi)
x  <- roundClosest(x, y)
T2 <- c(x, 1)
sumR  <- summary(R)
sumT1 <- summary(T1)
sumT2 <- summary(T2)
comp1 <- data.frame(tmax = c(R, T1),
                    treatment = factor(rep(c("R", "T1"), c(n, n))))
comp2 <- data.frame(tmax = c(R, T2),
                    treatment = factor(rep(c("R", "T2"), c(n, n))))
res   <- data.frame(treatment = c("R", "T1", "T2"),
                    Median = c(sumR[3], sumT1[3], sumT2[3]),
                    IQR = c(sumR[5] - sumR[2],
                            sumT1[5] - sumT1[2],
                            sumT2[5] - sumT2[2]),
                    Min = c(sumR[1], sumT1[1], sumT2[1]),
                    Max = c(sumR[6], sumT1[6], sumT2[6]),
                    Range = c(sumR[6] - sumR[1],
                              sumT1[6] - sumT1[1],
                              sumT2[6] - sumT2[1]))
res[, 2:6] <- signif(res[, 2:6], 4)
wt1   <- wilcox_test(tmax ~ treatment, data = comp1,
                     distribution = "exact", conf.int = TRUE,
                     conf.level = 0.90)
wt2   <- wilcox_test(tmax ~ treatment, data = comp2,
                     distribution = "exact", conf.int = TRUE,
                     conf.level = 0.90)
PE    <- c(confint(wt1)[[2]], confint(wt2)[[2]])
lower <- c(confint(wt1)$conf.int[1], confint(wt2)$conf.int[1])
upper <- c(confint(wt1)$conf.int[2], confint(wt2)$conf.int[2])
wilc  <- data.frame(comparison = c("T1 vs R", "T2 vs R"),
                    PE = PE, lower.CL = lower, upper.CL = upper)
wilc[, 2:4] <- signif(wilc[, 2:4], 4)
ylim <- 1.04 * c(0, max(c(R, T1, T2), na.rm = TRUE))
# box plots
dev.new(width = 4.6, height = 4.6)
op <- par(no.readonly = TRUE)
par(mar = c(2.1, 4.1, 0, 0), cex.axis = 0.9)
plot(c(0.5, 3.5), c(0, 0), type = "n", axes = FALSE,
ylim = ylim, xlab = "", ylab = expression(italic(t)[max]))
axis(1, at = 1:3, labels = c("R", "T1", "T2"), tick = FALSE)
axis(2, las = 1)
boxplot(tmax ~ treatment, data = comp1, add = TRUE, at = 1:2,
boxwex = 0.4, ylim = ylim, names = "",
main = "", axes = FALSE, col = "bisque", outcol = "red")
boxplot(tmax ~ treatment, data = comp2, add = TRUE, at = c(1, 3),
boxwex = 0.4, ylim = ylim, names = "", ylab = "",
main = "", axes = FALSE, col = "bisque", outcol = "red")
par(op)
print(res, row.names = FALSE); print(wilc, row.names = FALSE)
R> treatment Median IQR Min Max Range
R> R 1.333 0.3333 0.6667 2.000 1.333
R> T1 1.000 0.3333 0.6667 3.000 2.333
R> T2 1.333 0.3333 0.6667 1.667 1.000
R> comparison PE lower.CL upper.CL
R> T1 vs R 0 0 0.3333
R> T2 vs R 0 0 0.3333
C_{max} and t_{max} are not sensitive to even substantial changes of the rate of absorption.^{24} Since both are composite pharmacokinetic metrics, they are influenced by the extent of absorption (\(\small{f}\) as reflected in AUC). See also the interlude below.
Though stated in the WHO’s guideline, it is a misconception that a statistical comparison of t_{max} with a nonparametric method has low power. For instance, the Wilcoxon signed-rank test (for paired differences) and the Mann–Whitney U test (for independent samples) have an asymptotic relative efficiency of \(\small{3/\pi\approx95.5\%}\) compared with the t-test for normally distributed data. For small sample sizes sufficient power was confirmed in simulations.^{25}
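A quick simulation sketch (scenario assumed, not from the article: two parallel groups of 24 subjects, a location shift of half a standard deviation, α = 0.05) illustrates how little power the Mann–Whitney U test loses against the t-test for normally distributed data. For speed, `stats::wilcox.test()` with the normal approximation is used here instead of the exact test of coin.

```r
set.seed(42)
nsim <- 2000  # number of simulated trials (assumed)
n    <- 24    # subjects per group (assumed)
res  <- replicate(nsim, {
  x <- rnorm(n)
  y <- rnorm(n, mean = 0.5)  # shift of 0.5 SD
  c(w = wilcox.test(x, y, exact = FALSE)$p.value < 0.05,
    t = t.test(x, y)$p.value < 0.05)
})
pow <- rowMeans(res)  # empirical power of each test
pow["w"] / pow["t"]   # close to the asymptotic 3 / pi
```

The ratio of empirical powers approaches the asymptotic relative efficiency as the sample size grows.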
I picked an easy one. An immediate release formulation, one compartment model, PK-parameters: \(\small{D=100}\), fraction absorbed \(\small{f=0.80}\), volume of distribution \(\small{V=4}\), absorption rate constant \(\small{k_{\,01}=2\,\textrm{h}^{-1}}\), lag time \(\small{t_\textrm{lag}=5\,\textrm{min}}\), elimination rate constant \(\small{k_{\,10}=0.25\,\textrm{h}^{-1}}\).
The sampling schedule was 0, 10, 15, 20, 30, 45 minutes, 1, 1.25, 1.5, 2, 2.5, 3.25, 4.25, 5.5, 7, 9.25, and 12 hours in order to get a reliable estimate of t_{max} (the theoretical value based on the model is 1.272 hours).
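The quoted theoretical value follows directly from the one-compartment model: \(\small{t_\textrm{max}=\log(k_{01}/k_{10})/(k_{01}-k_{10})+t_\textrm{lag}}\), with the parameters given above.

```r
k01  <- 2     # absorption rate constant (1/h)
k10  <- 0.25  # elimination rate constant (1/h)
tlag <- 5/60  # lag time (h)
tmax <- log(k01 / k10) / (k01 - k10) + tlag
round(tmax, 3)  # 1.272 hours
```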
Error distributions were uniform for \(\small{f}\) (0.6–1), log-normal for \(\small{V}\) (CV 50%), \(\small{k_{\,01}}\) (CV 35%), \(\small{k_{\,10}}\) (CV 40%), and truncated normal for \(\small{t_\textrm{lag}}\) (limits 0–15 min). Analytical error was log-normal with a CV of 5% of the simulated concentration.
25 simulated studies with 48 subjects each. The LLOQ was set to 5% of the theoretical C_{max}.
I split the data of the studies in halves (mimicking a parallel design with 24 subjects / group) and compared the medians, interquartile ranges, and ranges. Furthermore, I compared the t_{max} values by the Mann–Whitney U test.^{26}
Cave: 141 LOC.
library(truncnorm)
library(coin)
library(reshape)
library(lattice)
Ct <- function(x, D = D, F = F.d, V = V.d,
               k01 = k01.d, k10 = k10.d, tlag = tlag.d) {
  # one-compartment model with a lag time
  C <- F*D/V*k01/(k01-k10)*
       (exp(-k10*(x-tlag))-exp(-k01*(x-tlag)))
  C[C < 0] <- 0
  return(C)
}
set.seed(123456)
t <- c(0, 10/60, 15/60, 20/60, 30/60, 45/60, 1, 1.25,
       1.5, 2, 2.5, 3.25, 4.25, 5.5, 7, 9.25, 12)
studies <- 25   # number of studies
n       <- 48   # number of subjects / study
D       <- 100  # dose
# Index ".d" denotes theoretical value, ".c" its CV
F.d    <- 0.8   # fraction absorbed
F.l    <- 0.6   # lower limit
F.u    <- 1     # upper limit
V.d    <- 4     # volume of distribution
V.c    <- 0.50  # CV 50%, lognormal
k01.d  <- 2     # absorption rate constant
k01.c  <- 0.35  # CV 35%, lognormal
k10.d  <- 0.25  # elimination rate constant
k10.c  <- 0.40  # CV 40%, lognormal
tlag.d <- 5/60  # lag time
tlag.l <- 0     # lower truncation
tlag.u <- 0.25  # upper truncation
tlag.c <- 0.5   # CV 50%, truncated normal
AErr   <- 0.05  # analytical error CV 5%, lognormal
LLOQ.f <- 0.05  # fraction of theoretical Cmax
# theoretical profile
x    <- seq(min(t), max(t), length.out = 2500)
C.th <- Ct(x = x, D = D, F = F.d, V = V.d,
           k01 = k01.d, k10 = k10.d, tlag = tlag.d)
tmax <- log(k01.d/k10.d)/(k01.d-k10.d)+tlag.d
Cmax <- Ct(x = tmax, D = D, F = F.d, V = V.d,
           k01 = k01.d, k10 = k10.d, tlag = tlag.d)
LLOQ <- LLOQ.f * Cmax
data <- data.frame(study = rep(1:studies, each = n*length(t)),
                   subject = rep(1:n, each = length(t)),
                   t = t, C = NA)
for (i in 1:studies) {
  # individual PK parameters
  F    <- runif(n = n, min = F.l, max = F.u)
  V    <- rlnorm(n = n, meanlog = log(V.d)-0.5*log(V.c^2+1),
                 sdlog = sqrt(log(V.c^2+1)))
  k01  <- rlnorm(n = n, meanlog = log(k01.d)-0.5*log(k01.c^2+1),
                 sdlog = sqrt(log(k01.c^2+1)))
  k10  <- rlnorm(n = n, meanlog = log(k10.d)-0.5*log(k10.c^2+1),
                 sdlog = sqrt(log(k10.c^2+1)))
  tlag <- rtruncnorm(n = n, a = tlag.l, b = tlag.u,
                     mean = tlag.d, sd = tlag.c)
  for (j in 1:n) {
    # individual profiles (parameters are indexed by subject j)
    C <- Ct(x = t, D = D, F = F[j], V = V[j],
            k01 = k01[j], k10 = k10[j],
            tlag = tlag[j])
    for (k in 1:length(t)) { # analytical error (multiplicative)
      if (k == 1) {
        AErr1 <- rnorm(n = 1, mean = 0, sd = abs(C[k]*AErr))
      } else {
        AErr1 <- c(AErr1, rnorm(n = 1, mean = 0, sd = abs(C[k]*AErr)))
      }
    }
    C <- C + AErr1    # add analytical error
    C[C < LLOQ] <- NA # assign NAs to Cs below LLOQ
    data$C[data$study == i & data$subject == j] <- C
  }
}
# simple NCA
res <- data.frame(study = rep(1:studies, each = n),
                  subject = rep(1:n, studies),
                  tlag = NA, tmax = NA, Cmax = NA)
i <- 0
for (j in 1:studies) {
  for (k in 1:n) {
    i <- i + 1
    tmp.t <- data$t[data$study == j & data$subject == k]
    tmp.C <- data$C[data$study == j & data$subject == k]
    res$tlag[i] <- tmp.t[which(tmp.t == t[!is.na(tmp.C)][1])+1]
    res$Cmax[i] <- max(tmp.C, na.rm = TRUE)
    res$tmax[i] <- tmp.t[which(tmp.C == res$Cmax[i])]
  }
}
comp <- data.frame(study = 1:studies, me.1 = NA, me.2 = NA,
                   iqr.1 = NA, iqr.2 = NA, rg.1 = NA, rg.2 = NA,
                   CL.lo = NA, CL.hi = NA)
for (i in 1:studies) {
  study <- res[res$study == i, ]
  sum1  <- summary(study$tmax[study$subject %in% 1:(n/2)])
  sum2  <- summary(study$tmax[study$subject %in% (n/2+1):n])
  comp$me.1[i]  <- sum1[3]
  comp$me.2[i]  <- sum2[3]
  comp$iqr.1[i] <- sum1[5] - sum1[2]
  comp$iqr.2[i] <- sum2[5] - sum2[2]
  comp$rg.1[i]  <- sum1[6] - sum1[1]
  comp$rg.2[i]  <- sum2[6] - sum2[1]
  stud.tmax <- data.frame(tmax = study$tmax,
                          part = factor(rep(c(1, 2), c(n/2, n/2))))
  wt <- wilcox_test(tmax ~ part, data = stud.tmax,
                    distribution = "exact", conf.int = TRUE,
                    conf.level = 0.90)
  comp$CL.lo[i] <- sprintf("%+.2f", confint(wt)$conf.int[1])
  comp$CL.hi[i] <- sprintf("%+.2f", confint(wt)$conf.int[2])
  if (comp$CL.lo[i] == "+0.00") comp$CL.lo[i] <- "\u00B10.00"
  if (comp$CL.hi[i] == "+0.00") comp$CL.hi[i] <- "\u00B10.00"
}
diff.med <- sprintf("%+.3f", unique(sort(comp$me.1-comp$me.2)))
diff.iqr <- sprintf("%+.4f", unique(sort(comp$iqr.1-comp$iqr.2)))
diff.rge <- sprintf("%+.2f", unique(sort(comp$rg.1-comp$rg.2)))
txt <- paste("Unique differences of medians:\n",
             paste(diff.med, collapse = ", "),
             "\nUnique differences of interquartile ranges:\n",
             paste(diff.iqr, collapse = ", "),
             "\nUnique differences of ranges:\n",
             paste(diff.rge, collapse = ", "))
print(comp, row.names = FALSE); cat(txt)
R> study me.1 me.2 iqr.1 iqr.2 rg.1 rg.2 CL.lo CL.hi
R> 1 2.000 2.00 0.5000 1.0000 2.00 3.25 ±0.00 +0.50
R> 2 1.250 1.25 0.2500 0.3125 1.50 1.25 ±0.00 +0.25
R> 3 1.250 1.25 0.5000 0.3125 0.75 1.25 ±0.00 +0.25
R> 4 1.250 1.25 0.5000 0.5000 1.25 1.00 ±0.00 +0.25
R> 5 1.250 1.25 0.0000 0.2500 1.25 1.50 -0.25 ±0.00
R> 6 1.500 1.50 0.7500 0.2500 1.50 1.00 ±0.00 +0.25
R> 7 1.500 1.50 0.5000 0.1875 1.25 1.50 ±0.00 +0.25
R> 8 1.500 1.50 0.3750 0.7500 1.00 1.50 -0.25 ±0.00
R> 9 1.500 1.50 0.7500 0.5625 1.50 1.50 -0.25 ±0.00
R> 10 1.375 1.50 0.2500 0.2500 1.25 1.75 -0.25 ±0.00
R> 11 2.000 2.00 0.2500 1.0000 1.00 2.25 -0.50 ±0.00
R> 12 2.000 2.00 0.6250 0.5000 1.25 1.50 ±0.00 ±0.00
R> 13 1.250 1.00 0.2500 0.2500 1.25 0.75 ±0.00 +0.25
R> 14 1.250 1.25 0.2500 0.2500 1.00 1.25 ±0.00 ±0.00
R> 15 1.250 1.25 0.3125 0.0625 1.75 0.75 ±0.00 +0.25
R> 16 1.500 1.50 0.2500 0.2500 1.50 1.00 ±0.00 +0.25
R> 17 2.000 2.00 0.5000 0.5000 2.00 1.25 ±0.00 +0.25
R> 18 1.500 1.25 0.2500 0.2500 1.50 1.75 ±0.00 +0.25
R> 19 2.000 2.00 0.6250 0.5000 1.75 1.25 ±0.00 +0.50
R> 20 1.500 1.50 0.6250 0.5000 1.25 1.50 ±0.00 +0.25
R> 21 1.500 1.50 0.7500 0.7500 1.75 1.50 -0.25 ±0.00
R> 22 1.500 1.50 0.5000 0.5000 0.75 2.00 ±0.00 ±0.00
R> 23 1.000 1.00 0.3125 0.3125 1.25 1.25 -0.25 ±0.00
R> 24 1.250 1.25 0.2500 0.3125 1.50 1.00 ±0.00 ±0.00
R> 25 1.500 1.25 0.2500 0.5000 1.00 1.00 ±0.00 +0.25
R> Unique differences of medians:
R> -0.125, +0.000, +0.250
R> Unique differences of interquartile ranges:
R> -0.7500, -0.5000, -0.3750, -0.2500, -0.0625, +0.0000, +0.1250, +0.1875, +0.2500, +0.3125, +0.5000
R> Unique differences of ranges:
R> -1.25, -0.50, -0.25, +0.00, +0.25, +0.50, +0.75, +1.00
prep <- as.data.frame(
          melt(res, id = c("study", "subject"), measure = "tmax"))
for (i in 1:nrow(prep)) {
  if (prep$subject[i] %in% 1:(n/2)) {
    prep$group[i] <- 1
  } else {
    prep$group[i] <- 2
  }
}
prep$study   <- as.factor(prep$study)
prep$subject <- as.factor(prep$subject)
prep$group   <- factor(prep$group, levels = 2:1, ordered = TRUE)
# box plots
dev.new(width = 6, height = 6)
op <- par(no.readonly = TRUE)
bwplot(value ~ study | variable*group, data = prep, group = group,
xlab = "study", ylab = "hours",
ylim = 1.04 * c(0, max(res$tmax, na.rm = TRUE)))
par(op)
In none of the simulated studies did the nonparametric ~90% confidence interval^{27} exclude zero, i.e., none showed a statistically significant difference in t_{max}.
Interlude 1
It is a misconception that the Wilcoxon signed-rank test and the Mann–Whitney U test compare medians. The Wilcoxon signed-rank test (paired samples) employs the Hodges–Lehmann estimator. The Mann–Whitney U test (independent samples) compares the median of the difference between a sample from \(\small{x}\) and a sample from \(\small{y}\).
Both are unbiased estimators of a shift in location only if distributions are identical. However, in well-controlled studies (e.g., in BE) this is likely the case.^{28}
An alternative not requiring identical distributions has been proposed.^{29} It is beyond the scope of this article.
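For two independent samples the Hodges–Lehmann shift estimator is simply the median of all pairwise differences. A minimal sketch with toy data (function name assumed):

```r
# median of all pairwise differences x_i - y_j
hl_shift <- function(x, y) median(outer(x, y, "-"))
hl_shift(c(1, 1, 1, 2), c(1, 1, 1, 1))  # 0: most pairwise differences are zero
```

This explains why the point estimates in the examples above are zero even when the extremes of the samples differ: the bulk of the pairwise differences is zero.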
In a real study – if t_{max} is a clinically relevant PK metric – one would have to pre-specify the acceptance limits and assess whether the ~90% CI lies entirely within. For an analgesic for rapid relief of pain it might be as short as ±20 minutes. If this would have been the case in the example above, all studies would pass because none of the confidence limits were more than ±15 minutes from the reference.
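Such an assessment is trivial to implement. A minimal sketch (function name assumed; the default limit is the ±20 minutes, i.e., 1/3 hour, mentioned above):

```r
# does the nonparametric CI for T - R (in hours) lie entirely
# within pre-specified acceptance limits of +/- `limit`?
ci_within <- function(lower, upper, limit = 20/60) {
  (lower >= -limit) && (upper <= limit)
}
ci_within(-0.25, 0.25)  # CI of -15 to +15 minutes: passes
```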
However, if we follow the guidelines we are left in the dark.
Is a difference in medians of –7.5 min ‘apparent’ – or a difference of +15 min? No idea.
Even worse are the interquartile ranges. Here we have values from –45 min to +30 min. Are they ‘apparently’ different?
Let’s forget the range, please. What might values from –1.25 to +1 hours mean?
Interlude 2
t_{max} is a poor predictor of differences in the rate of absorption.
Example: An immediate release formulation, one compartment model with complete absorption, no lag time. \(\small{D=100}\), \(\small{V=3}\), absorption half lives 30 minutes (Reference) and 20 minutes (Test), elimination half life 4 hours. Sigmoid effect model identical for R and T: \(\small{E_\textrm{max}=125}\), \(\small{EC_{50}=50}\), \(\small{\gamma=0.5}\).
Ct <- function(D, f, V, k01, k10, t) {
  C <- D * f * k01 / (V * (k01 - k10)) *
       (exp(-k10 * t) - exp(-k01 * t))
  return(C) # concentration
}
Et <- function(Emax, EC50, gamma, C, t) {
  E <- (Emax * C^gamma)/(C^gamma + EC50^gamma)
  return(E) # effect
}
# PK model
D     <- 100
V     <- 3
f     <- 1
k01.R <- log(2)/0.5   # t½,abs 30 minutes
k01.T <- log(2)/(1/3) # t½,abs 20 minutes
k10   <- log(2)/4     # t½,el 4 hours
# PD (sigmoidal link model)
Emax  <- 125
EC50  <- 50
gamma <- 0.5
t     <- seq(0, 24, length.out = 1001)
C.R   <- Ct(D, f, V, k01.R, k10, t)
C.T   <- Ct(D, f, V, k01.T, k10, t)
E.R   <- Et(Emax, EC50, gamma, C.R, t)
E.T   <- Et(Emax, EC50, gamma, C.T, t)
res   <- data.frame(trt = c("R", "T", "T/R", "T-R"),
                    Cmax = c(max(C.R), max(C.T),
                             max(C.T)/max(C.R), NA),
                    tmax = c(t[C.R == max(C.R)],
                             t[C.T == max(C.T)], NA,
                             t[C.T == max(C.T)] - t[C.R == max(C.R)]),
                    Emax = c(max(E.R), max(E.T),
                             max(E.T)/max(E.R), NA),
                    t.Emax = c(t[E.R == max(E.R)],
                               t[E.T == max(E.T)], NA,
                               t[E.T == max(E.T)] - t[E.R == max(E.R)]))
res[, 2:5] <- signif(res[, 2:5], 4)
dev.new(width = 4.5, height = 4.5, record = TRUE)
op <- par(no.readonly = TRUE)
invisible(split.screen(c(2, 1)))
screen(1)
par(mar = c(3, 2.9, 1.1, 0.1), cex.axis = 0.8,
xaxs = "r", yaxs = "i")
plot(t, C.R, type = "n", axes = FALSE,
ylim = 1.04*c(0, max(C.R, C.T)),
xlab = "", ylab = "")
axis(1, at = seq(0, 24, 4))
axis(2, las = 1)
mtext("concentration", 2, line = 2)
lines(t, C.R, lwd = 3, col = "#1010FF80")
lines(t, C.T, lwd = 3, col = "#FF101080")
legend("topright", legend = c("R", "T"),
col = c("#1010FF80", "#FF101080"),
box.lty = 0, cex = 0.8, lwd = 3)
box()
screen(2)
par(mar = c(4, 2.9, 0.1, 0.1), cex.axis = 0.8,
xaxs = "r", yaxs = "i")
plot(t, E.R, type = "n", axes = FALSE,
ylim = 1.04*c(0, max(E.R, E.T)),
xlab = "time", ylab = "")
axis(1, at = seq(0, 24, 4))
axis(2, las = 1)
mtext("effect", 2, line = 2)
lines(t, E.R, lwd = 3, col = "#1010FF80")
lines(t, E.T, lwd = 3, col = "#FF101080")
legend("topright", legend = c("R", "T"),
col = c("#1010FF80", "#FF101080"),
box.lty = 0, cex = 0.8, lwd = 3)
box()
close.screen(all.screens = TRUE)
par(op)
print(res, row.names = FALSE)
R> trt Cmax tmax Emax t.Emax
R> R 24.770 1.704 51.630 1.704
R> T 26.590 1.296 52.720 1.296
R> T/R 1.074 NA 1.021 NA
R> T-R NA -0.408 NA -0.408
Although absorption of the Test is 33% faster than that of the Reference, its t_{max} occurs only ~25 minutes earlier. The T/R-ratio of C_{max} is 107.4%.
It’s another common misconception that C_{max} is directly related to safety. Linking to the effect site dampens the peaks. Hence, the T/R-ratio of E_{max} is only 102.1%.^{30}
An alternative to comparing t_{max} would be to assess the ‘early exposure’ (i.e., by a partial AUC) as recommended by Health Canada,^{31} the FDA,^{32} and by the EMA for multiphasic modified release products. However, this PK metric is extremely variable and might require a replicate design (for the FDA’s RSABE or the EMA’s ABEL). Similarly to the comparison of C_{max}, Health Canada requires only that the point estimate lies within 80.0–125.0% and hence, a parallel or a simple 2×2×2 crossover design would be sufficient.
Interesting is an older guidance of the FDA.
“An early exposure measure may be informative on the basis of appropriate clinical efficacy/safety trials and/or pharmacokinetic/pharmacodynamic studies that call for better control of drug absorption into the systemic circulation (e.g., to ensure rapid onset of an analgesic effect or to avoid an excessive hypotensive action of an antihypertensive). In this setting, the guidance recommends use of partial AUC as an early exposure measure. We recommend that the partial area be truncated at the population median of Tmax values for the reference formulation.”
Sorry folks but the population median is unknown. Is the sample median meant? Or is it the value reported in the reference formulation’s label?
Regulations differ in their definitions of ‘early exposure’. Health Canada requires the cut-off time to be the subject’s t_{max} of the reference,^{34} which can be difficult to set up in some software packages. The FDA and the EMA require a pre-specified cut-off time (identical for all subjects under all treatments). Whereas the FDA recommends setting the cut-off time based on clinical grounds (i.e., PD), for the EMA it should be based on PK. If the reference product’s t_{max} is not stated in the SmPC, a justification might be difficult. To complicate things: quite often the arithmetic mean instead of the median is given.
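As a sketch of the mechanics only (function name, schedule, and concentrations are assumed, not from any guideline): a partial AUC up to a pre-specified cut-off by the linear trapezoidal rule, assuming the cut-off coincides with a sampling time.

```r
# partial AUC from zero to `cutoff` (linear trapezoidal rule);
# assumes `cutoff` is one of the sampling times
pAUC <- function(t, C, cutoff) {
  keep <- t <= cutoff
  t <- t[keep]; C <- C[keep]
  sum(diff(t) * (head(C, -1) + tail(C, -1)) / 2)
}
t <- c(0, 0.25, 0.5, 1, 2)  # assumed sampling times (h)
C <- c(0, 4, 7, 9, 6)       # assumed concentrations
pAUC(t, C, cutoff = 1)
```

If the cut-off falls between sampling times, the concentration at the cut-off would have to be interpolated first; that refinement is omitted here.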
Recall that in the simulations we had perfectly matching products (the average PK-parameters were identical). If we apply an appropriate statistical method, we could confirm that (i.e., detect no differences).
There is no guarantee that by looking (‼) at reported medians / IQRs / ranges an assessor will arrive at the same conclusions as the applicant.
Assessing t_{max} based on ‘apparent’ differences of medians, its ‘variability’ or range is bad science and thus, should be abandoned.
Nevertheless, the statement in the guidelines does not preclude performing one.
Pre-specify a clinically relevant acceptance range for t_{max} and assess the ~90% CI by an appropriate nonparametric method.
I have done so since the mid-1980s and even after the 2010 guideline never received a single deficiency letter in this respect.
“Enlightenment is man’s release from his self-incurred tutelage. Tutelage is man’s inability to make use of his understanding without direction from another. Self-incurred is this tutelage when its cause lies not in lack of reason but in lack of resolution and courage to use it without direction from another. Sapere aude! ‘Have courage to use your own reason!’ – that is the motto of enlightenment.”
License
Helmut Schütz 2021
1^{st} version April 19, 2021.
Rendered 2021-04-22 12:25:47 CEST by rmarkdown in 0.17 seconds.
Footnotes and References
Mersman O, Trautmann H, Steuer D, Bornkamp B. truncnorm: Truncated Normal Distribution. 2018-02-27. v1.0-8. CRAN.↩︎
Hothorn T, Winell H, Hornik K, van de Wiel MA, Zeileis A. Conditional Inference Procedures in a Permutation Test Framework. 2021-02-08. v1.4-1. CRAN↩︎
Wickham H. Flexibly Reshape Data. 2018-10-23. v0.8.8. CRAN.↩︎
Sarkar D, Andrews F, Wright K, Klepeis, N, Murrell P. Trellis Graphics for R. 2020-04-01. v0.20-41. CRAN.↩︎
European Medicines Agency. CHMP. London. 20 January 2010. Guideline on the Investigation of Bioequivalence. CPMP/EWP/QWP/1401/98 Rev. 1/ Corr **.↩︎
European Medicines Agency. CHMP. London. 20 November 2014. Guideline on the pharmacokinetic and clinical evaluation of modified release dosage forms. EMA/CHMP/EWP/280/96 Rev1.↩︎
Commission of the European Communities. CPMP Working Party on Efficacy of Medicinal Products. Brussels. December 1991. Note for Guidance: Investigation of Bioavailability and Bioequivalence. Appendix III: Technical Aspects of Bioequivalence Statistics.↩︎
Commission of the European Communities. CPMP Working Party on Efficacy of Medicinal Products. Brussels. June 1992. Note for Guidance: Investigation of Bioavailability and Bioequivalence.↩︎
European Agency for the Evaluation of Medicinal Products. CPMP. London. 28 July 1999. Note for Guidance on Modified Release Oral and Transdermal Dosage Forms: Section II (Pharmacokinetic and Clinical Evaluation). CPMP/EWP/280/96 Corr *.↩︎
Ministerio de Salud. Secretaría de Políticas, Regulación y Relaciones Sanitarias. Buenos Aires. 2006. Régimen de Buenas Prácticas para la Realización de Estudios de Biodisponibilidad/Bioequivalencia. Spanish.↩︎
Pharmaceuticals and Medical Devices Agency. Pharmaceutical and Food Safety Bureau. Guideline for Bioequivalence Studies of Generic Products. Tokyo. February 29, 2012. Q & A.↩︎
Medicines Control Council. Pretoria. July 2015. Registration of Medicines. Biostudies.↩︎
Department of Health. Therapeutic Goods Administration. Canberra. 16 June 2011. European Union and ICH Guidelines adopted in Australia. Guideline on the Investigation of Bioequivalence with TGA Annotations.↩︎
ASEAN States Pharmaceutical Product Working Group. Vientiane. March 2015. ASEAN Guideline for the Conduct of Bioequivalence Studies.↩︎
Departamento Agencia Nacional de Medicamentos. Instituto de Salud Pública de Chile. Santiago. December 2018. Guia para la realización de estudios de biodisponibilidad comparativa en formas farmacéuticas sólidas de administración oral y acción sistémica. Spanish.↩︎
Eurasian Economic Commission. 3 November 2016. Regulations Conducting Bioequivalence Studies within the Framework of the Eurasian Economic Union. Russian.↩︎
Ministry of Health and Population, The Specialized Scientific Committee for Evaluation of Bioavailability & Bioequivalence Studies. Cairo. February 2017. Egyptian Guideline For Conducting Bioequivalence Studies for Marketing Authorization of Generic Products.↩︎
Executive Board of the Health Ministers’ Council for GCC States. March 2016. The GCC Guidelines for Bioequivalence.↩︎
New Zealand Medicines and Medical Devices Safety Authority. Wellington. February 2018. Guideline on the Regulation of Therapeutic Products in New Zealand. Part 6: Bioequivalence of medicines.↩︎
WHO Expert Committee on Specifications for Pharmaceutical Preparations, Fifty-first Report (WHO TRS No. 992). Geneva. 2017. Annex 6. Multisource (generic) pharmaceutical products: guidelines on registration requirements to establish interchangeability.↩︎
Think about normal distributed data. The arithmetic mean is the statistic of location and the standard deviation the statistic of dispersion. Both are estimated values. Talking about the ‘variability’ of the mean would be nonsense as well.↩︎
Let \(\small{x_{i=1}=\ldots=x_{i=n-1},x_{i=n}=10\cdot x_{i=1}}\) and \(\small{y_{i=1}=10\cdot y_{i=n},y_{i=2}=\ldots=y_{i=n}}\). Then \(\small{\tilde{x}\equiv\tilde{y}}\) and \(\small{G(x)\equiv G(y)}\).↩︎
Imagine:
\(\small{R:\left\{1,\ldots,1,2\right\}}\) → \(\small{\tilde{R}={\color{Blue}1}}\) and \(\small{G(R)={\color{Blue}1}}\).
\(\small{T:\left\{1,\ldots,1,3\right\}}\) → \(\small{\tilde{T}={\color{Blue}1}}\) and \(\small{G(T)={\color{Red}2}}\).
Is the range of T ‘apparently’ different from R?
\(\small{T:\left\{1,\ldots,1,1\right\}}\) → \(\small{\tilde{T}={\color{Blue}1}}\) and \(\small{G(T)={\color{Red}0}}\).
‘Apparently’ different ranges?↩︎
Tothfálusi L, Endrényi L. Estimation of C_{max} and T_{max} in Populations After Single and Multiple Drug Administration. J Pharmacokinet Pharmacodyn. 2003; 30(5): 363–85. doi:10.1023/B:JOPA.0000008159.97748.09.↩︎
Chow S-C, Liu J-p. Design and Analysis of Bioavailability and Bioequivalence Studies. Boca Raton: Chapman & Hall/CRC; 3^{rd} ed. 2009. Table 5.3.4. p. 146.↩︎
Note that the function wilcox.test() of Base R is not exact for tied data. Hence, I used the function wilcox_test() of package coin with its default average ranking.↩︎
Due to the discrete nature of the underlying distribution, the \(\small{100(1-2\,\alpha)}\) CI is always wider than the 90% CI based on the normal distribution. In a 2×2×2 crossover design with 24 subjects the exact CI is 91.12% and with 48 subjects 90.28%. However, such a comparison is not fair since the distribution is not normal.↩︎
Hauschke D, Steinijans VW, Diletti E. A distribution-free procedure for the statistical analysis of bioequivalence studies. Int J Clin Pharm Ther Toxicol. 1990; 28(2): 72–8. PMID:2307548.↩︎
Brunner E, Munzel U. The Nonparametric Behrens‐Fisher Problem: Asymptotic Theory and a Small‐Sample Approximation. Biom. J. 2000; 42(1): 17–25. doi:10.1002/(SICI)1521-4036(200001)42:1%3C17::AID-BIMJ17%3E3.0.CO;2-U.↩︎
That explains why PD is less sensitive to differences in formulations and PK-studies are preferred.↩︎
Health Canada. Ottawa. 2018/06/08. Guidance Document. Comparative Bioavailability Standards: Formulations Used for Systemic Effects.↩︎
U.S. FDA. Center for Drug Evaluation and Research. Rockville. December 2013. Draft Guidance for Industry. Bioequivalence Studies with Pharmacokinetic Endpoints for Drugs Submitted Under an ANDA.↩︎
U.S. FDA. Center for Drug Evaluation and Research. Rockville. March 2003. Guidance for Industry. Bioavailability and Bioequivalence Studies for Orally Administered Drug Products — General Considerations. webarchive.↩︎
Will that inflate the Type I Error because it depends on the observed data? Perhaps.↩︎