Consider allowing JavaScript. Otherwise, you have to be proficient in reading LaTeX since formulas will not be rendered. Furthermore, the table of contents in the left column for navigation will not be available. Sorry for the inconvenience.


  • The right-hand badges give the respective section’s ‘level’.
    
  1. Basics requiring no or only limited statistical expertise.
    
  2. These sections are the most important ones. They are – hopefully – easily comprehensible even for novices. A basic knowledge of R does not hurt.
    
  3. A somewhat higher knowledge of statistics and/or R is required. May be skipped or reserved for a later reading.
    
  4. An advanced knowledge of statistics and/or R is required. In particular, not recommended for beginners.
  • Click to show / hide R code.
  • To copy R code to the clipboard, click on the copy icon in the top left corner.

Introduction

If this article is perceived as overly focused on statistics, I apologize. This is due to my professional background, which has led me to be less skilled at crafting engaging narratives.
I have to confess that »Short« in the title is a euphemism…

    

‘Bioavailability’ (a portmanteau of ‘biologic availability’) in its current meaning was coined in 19711 and ‘Bio­equi­va­lence’ saw the light of day in 1975.2

The MeSH term ‘Biological Availability’ was introduced in 1979.

The extent to which the active ingredient of a drug dosage form becomes available at the site of drug action or in a biological medi­um believed to reflect accessibility to a site of action.

The site of action (i.e., a receptor) is inaccessible. There should be no space for beliefs in science.
The best definition of bioequivalence (BE) is given by the International Council for Har­mo­ni­sa­tion of Techni­cal Require­ments for Pharmaceuticals for Human Use (ICH).3

Two drug products containing the same drug substance(s) are con­sidered bioequivalent if their relative bio­availability (BA) (rate and extent of drug absorption) after administration in the same molar dose lies with­in acceptable predefined limits. These limits are set to ensure com­par­able in vivo performance, i.e., si­mi­la­ri­ty in terms of safety and efficacy.
ICH (2020)3

We will use a two-treatment two-sequence two-period (2×2×2) crossover design as an example. \[\small{\begin{array}{cccc} \textsf{Table I}\phantom{0}\\ \text{subject} & \text{sequence} & \text{T} & \text{R}\\\hline \phantom{1}1 & \text{RT} & 71 & 81\\ \phantom{1}2 & \text{TR} & 61 & 65\\ \phantom{1}3 & \text{RT} & 80 & 94\\ \phantom{1}4 & \text{TR} & 66 & 74\\ \phantom{1}5 & \text{TR} & 94 & 54\\ \phantom{1}6 & \text{RT} & 97 & 63\\ \phantom{1}7 & \text{RT} & 70 & 85\\ \phantom{1}8 & \text{TR} & 76 & 90\\ \phantom{1}9 & \text{TR} & 54 & 53\\ 10 & \text{RT} & 99 & 56\\ 11 & \text{RT} & 83 & 90\\ 12 & \text{TR} & 51 & 68\\\hline \end{array}}\]

Abbreviations are given at the end.

top of section ↩︎

The 1970s

    

Problems were reported with formulations of Narrow Therapeutic Index Drugs (NTIDs) like phenytoin,4 5 6 7 digoxin,1 8 9 10 11 12 warfarin,13 theophylline,14 primidone.15 Some show nonlinear pharmacokinetics (phenytoin) or are auto-inducers (war­fa­rin).

  • Poor content uniformity1
  • Excipient changed from CaSO4 to lactose5 6
  • The API was altered (e.g., particle size,7 12 amorphous to crystalline14)
  • Variable disintegration time
  • Dissolution testing not mandatory
  • No in vivo studies were performed comparing the new to the approved formulation
  • Breakthrough-seizures4 and intoxications5 6 (phenytoin) and variable or poor effect (digoxin, theophylline)

Generic drugs in the current sense did not yet exist at that time; only the content had to meet the USP requirements.

Although in 1969 Professor John Wagner demonstrated to the Bureau of Medicine, methods for comparing areas under the serum versus time curve (AUC) to estimate bioequivalence, his approach was ignored inasmuch as the FDA hierarchy did not believe a problem existed, and therefore such studies would not be necessary. For their part the Offices of Pharmaceutical Research and Compliance in the Bureau of Medicine and the Commissioner’s Office believed that the “Bioavailability Problem” as some called it was a “Content Uniformity Problem”.16 In 1971 for example, when notified of a “Bioavailability Problem” with a generic digoxin product, FDA investigated and ascertained that one manufacturer first added all the excipients into a 55-gal drum, then added digoxin, closed the lid, and mixed it by rolling the drum across the floor a few times. The content uniformity of those tablets varied from 10% to 156%.
Jerome P. Skelly (2010)17

Following a ‘Conference on Bioavailability of Drugs’ held at the National Academy of Sciences of the United States in 1971, a guideline was published the following year.18


[…] the mean of AUC of the generic had to be within 20% of the mean AUC of the approved product. At first this was de­ter­mined by using serum versus time plots on specially weighted paper, cutting the plot out and then weighing each se­pa­rately.
Jerome P. Skelly (2010)17

Methods and procedures for in vivo testing to determine bioavailability (BA) for new drugs were proposed by the FDA on June 20, 1975. Several terms were defined:19

  1. Bioavailability
    The rate and extent to which the therapeutic moiety is absorbed and becomes available to the site of drug action, normally estimated by its concentrations in body fluids, rate of excretion, or acute pharmacologic effect.
  2. Drug Product
    A finished dosage form (e.g., tablet, capsule, solution) that contains the active drug ingredient often, but not necessarily, in association with inactive ingredients.
  3. Pharmaceutical Equivalents
    Drug products that contain the same quantities of the identical active drug ingredient (i.e., the same salt or ester of the same therapeutic moiety), but not necessarily the same inactive ingredients, in an identical dosage form and that meet the compendial or other applicable standard of identity, strength, quality, and purity, including potency and, where applicable, content uniformity, disintegration times, and/or dissolution rates.
  4. Pharmaceutical Alternatives
    Drug products that contain the identical therapeutic moiety (or its precursor), but not necessarily in the same amount or the same dosage form, or as the same salt or ester. Each such drug product meets its own compendial or other applicable standard of strength, quality, and purity, including potency and, where ap­plic­able, content uniformity, disintegration time, and dissolution rate.
  5. Bioequivalent Drug Products
    Pharmaceutical equivalents or pharmaceutical alternatives which are not significantly different with respect to rate and extent of absorption when administered at the same molar dose under similar experimental conditions (single dose or mul­ti­ple dose). Some pharmaceutical equivalents may be equivalent in the extent but not the rate of their absorption, and yet may be considered bioequivalent because the differences in rates of absorption may be considered clinically insignificant for the particular drug products studied.
  6. Bioequivalence Requirement
    A requirement, imposed by the FDA for in vitro and/or in vivo testing of specific drug products, which will be required of all manufacturers as a condition of marketing.

The term “site of drug action” was questioned but kept in the regulation of 1977 and has been used by the FDA ever since.20

[A] comment also recommended that the phrase “becomes available to the site of drug action” be de­leted since it is overly optimistic to presume that bioavailability data consisting of estimates of parent drug […] concentration in body fluids […] provides, as a general rule, an estimate of the availability of the therapeutic moiety at the site of drug action.
The Commissioner agrees that bioavailability data alone do not estimate the availability of the therapeutic moiety at the site of drug action. It is scientifically valid to assume, however, that if an active drug ingredient or therapeutic moiety reaches a reasonable extent of systemic circulation at a reasonable rate, the therapeutic moiety will also become available at the site of drug action […]. For this reason, the Commissioner concludes that reference to availability at site of drug action should not be de­leted. He also believes that omission of such a reference would incorrectly focus the definition of bio­avail­ability exclusively on absorption of the active drug ingredient or therapeutic moiety from the drug pro­duct. Even where such absorption is total, the product may not be bioavailable because an insufficient amount of the active drug in­gre­dient or therapeutic moiety reaches the systemic circulation. In cer­tain instances, e.g., high first-pass metabolism in the liver or rapid renal clearance, the active drug in­gredient or therapeutic moiety must be absorbed at a rate sufficient to overcome the metabolic or eli­mi­nation mechanism and reach the systemic circulation so that the therapeutic moiety will become avail­able at the site of drug action in sufficient amounts to elicit the intended therapeutic effect.
Sherwin Gardener (1976)20

top of section ↩︎ previous section ↩︎

80/20 Rule

    

The FDA’s 80/20 Rule or ‘Power Approach’ (at least 80% power to detect a 20% difference) of 1972 consisted of testing the hypothesis of no difference at the \(\small{\alpha=0.05}\) level of significance.17 21 \[H_0:\;\mu_\text{T}-\mu_\text{R}=0\;vs\;H_1:\;\mu_\text{T}-\mu_\text{R}\neq 0,\tag{1}\] where \(\small{H_0}\) is the null hypothesis of equivalence and \(\small{H_1}\) the alternative hypothesis of inequivalence. \(\small{\mu_\text{T}}\) and \(\small{\mu_\text{R}}\) are the (true) means of \(\small{\text{T}}\) and \(\small{\text{R}}\), respectively. In order to pass the test, the estimated (post hoc, a posteriori, retro­spec­tive) power had to be at least 80%. The power depends on the true value of \(\small{\sigma}\), which is unknown. There exists a value of \(\small{\sigma_{\,0.80}}\) such that if \(\small{\sigma\leq\sigma_{\,0.80}}\), the power of the test of no difference \(\small{H_0}\) is greater or equal to 0.80. Since \(\small{\sigma}\) is unknown, it has to be approximated by the sample standard deviation \(\small{s}\). The Power Approach in a simple 2×2×2 cross­over design then consists of rejecting \(\small{H_0}\) and concluding that \({\small{\mu_\text{T}}}\) and \({\small{\mu_\text{R}}}\) are equivalent if \[-t_{1-\alpha/2,\nu}\leq\frac{\bar{x}_\text{T}-\bar{x}_\text{R}}{s\sqrt{\tfrac{1}{2}\left(\tfrac{1}{n_1}+\tfrac{1}{n_2}\right)}}\leq t_{1-\alpha/2,\nu}\:\text{and}\:s\leq\sigma_{0.80},\tag{2}\] where \(\small{n_1,\,n_2}\) are the number of subjects in sequences 1 and 2, the degrees of freedom \(\small{\nu=n_1+n_2-2}\), and \(\small{\bar{x}_\text{T}\,,\bar{x}_\text{R}}\) are the means of \(\small{\text{T}}\) and \(\small{\text{R}}\), respectively.
Note that this procedure is based on estimated power \(\small{\widehat{\pi}}\), since the true power is a function of the unknown \(\small{\sigma}\). It was the only approach based on post hoc power and was never implemented in any other jurisdiction.

For the example we estimate a power of only 46.4% to detect a 20% difference and the study would fail.
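If you want to reproduce this number, here is a minimal sketch, assuming that the ‘20% difference’ is taken as 20% of the observed reference mean and that power is approximated by the shifted central t-distribution (a common shortcut); the residual error comes from the additive crossover model used in the next section:

example <- data.frame(subject   = factor(rep(1:12, each = 2)),
                      period    = factor(rep(1:2, 12)),
                      treatment = factor(c("R", "T", "T", "R", "R", "T", "T", "R",
                                           "T", "R", "R", "T", "R", "T", "T", "R",
                                           "T", "R", "R", "T", "R", "T", "T", "R")),
                      Y         = c(81, 71, 61, 65, 94, 80, 66, 74,
                                    94, 54, 63, 97, 85, 70, 76, 90,
                                    54, 53, 56, 99, 90, 83, 51, 68))
mod   <- lm(Y ~ subject + period + treatment, data = example) # additive model
s     <- summary(mod)$sigma                               # residual standard error
n1    <- n2 <- 6                                          # subjects per sequence
nu    <- n1 + n2 - 2                                      # degrees of freedom
se    <- s * sqrt(0.5 * (1 / n1 + 1 / n2))                # SE of the difference, Eq. (2)
delta <- 0.20 * mean(example$Y[example$treatment == "R"]) # 20% of the reference mean
power <- pt(delta / se - qt(1 - 0.05 / 2, nu), nu)        # approximate post hoc power
cat(sprintf("Estimated power to detect a 20%% difference: %.1f%%\n", 100 * power))
# roughly 46%, in line with the value quoted above; the exact figure depends on
# the approximation used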

First proposals by the biostatistical community were published.22 23 24 25

top of section ↩︎ previous section ↩︎

95% CI

    

The analysis was performed on untransformed data (i.e., by an additive model assuming normal distributed data) and bio­equi­va­lence was concluded if the 95% con­fi­dence interval (CI) of the point estimate (PE) was entirely within 80 – 120%.22 25

We get for our example in R:

example          <- data.frame(subject   = rep(1:12, each = 2),
                               sequence  = c("RT", "RT", "TR", "TR", "RT",
                                             "RT", "TR", "TR", "TR", "TR",
                                             "RT", "RT", "RT", "RT", "TR",
                                             "TR", "TR", "TR", "RT", "RT",
                                             "RT","RT",  "TR", "TR"),
                               treatment = c("R", "T", "T", "R", "R", "T", "T", "R",
                                             "T", "R", "R", "T", "R", "T", "T", "R",
                                             "T", "R", "R", "T", "R", "T", "T", "R"),
                               period    = rep(1:2, 12),
                               Y         = c(81, 71, 61, 65, 94, 80, 66, 74,
                                             94, 54, 63, 97, 85, 70, 76, 90,
                                             54, 53, 56, 99, 90, 83, 51, 68))
factors          <- c("subject", "period", "treatment")
example[factors] <- lapply(example[factors], factor) # factorize the data
# additive model (untransformed data, differences); sequence not in the model!
muddle           <- lm(Y ~ subject + period + treatment, data = example)
CI               <- as.numeric(confint(muddle, level = 0.95)["treatmentT", ])
PE               <- coef(muddle)[["treatmentT"]]
# Percentages (flawed!)
mean.T           <- mean(example$Y[example$treatment == "T"])
mean.R           <- mean(example$Y[example$treatment == "R"])
PE.pct           <- 100 * mean.T / mean.R
CI.pct           <- 100 * (CI + mean.R) / mean.R
result           <- data.frame(method = c("differences", "percentages"),
                               PE = c(sprintf("%+.3f", PE),
                                      sprintf("%6.2f%%",  PE.pct)),
                               lower = c(sprintf("%+.3f", CI[1]),
                                         sprintf("%.2f%%",  CI.pct[1])),
                               upper = c(sprintf("%+.3f", CI[2]),
                                         sprintf("%6.2f%%",  CI.pct[2])),
                               BE = c("", "fail"))
if (CI.pct[1] >= 80 & CI.pct[2] <= 120) result$BE[2] <- "pass"
names(result)[3:4] <- c("lower CL", "upper CL")
print(result, row.names = FALSE)
#       method      PE lower CL upper CL   BE
#  differences  +2.417  -12.777  +17.611     
#  percentages 103.32%   82.44%  124.21% fail

If data are analyzed by an additive model, the results are differences. It is a fundamental error to naïvely transform differences to percentages – that would require Fieller’s CI.26 27 However, this was not done back in the day. We get a 95% CI of 82.44 – 124.21%, and the study would fail because the upper confidence limit (CL) is > 120%.

top of section ↩︎ previous section ↩︎

Westlake’s CI

    

Westlake23 mused that the shortest CI – which is symmetrical about the PE – would be too difficult for non-statisticians to comprehend. He suggested splitting the t-values in such a way that the probabilities of the two tails sum to \(\small{\alpha}\) and the respective CI is symmetrical around 0 (or 100%). In the example we obtain ±21.80%, and the study would fail as well because the confidence limits exceed ±20%. As above, calculating a percentage is flawed.

However, such a result is misleading. The information about the location of the difference is lost; one can no longer tell whether the BA of \(\small{\text{T}}\) is lower or higher than that of \(\small{\text{R}}\). Therefore, the method was criticized24 and never implemented in practice. It took me years to convince Certara to remove Westlake’s CI from the results in Phoenix WinNonlin. In 2016, I was successful with version 6.4… Since then the differences are given in the additive model.
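For the record, this is one way to construct Westlake’s symmetric limits for the additive model of the previous section – a minimal sketch (muddle and example as defined there, \(\small{\alpha=0.05}\), and – again – the flawed conversion to a percentage):

alpha <- 0.05
D     <- coef(muddle)[["treatmentT"]]                   # point estimate (difference)
se    <- sqrt(vcov(muddle)["treatmentT", "treatmentT"]) # its standard error
nu    <- df.residual(muddle)
# solve for k1 (with k2 = 2 * D / se - k1) such that the two tail probabilities
# sum to alpha and the interval is symmetric about a zero difference
f     <- function(k1) pt(k1, nu) + pt(2 * D / se - k1, nu, lower.tail = FALSE) - alpha
k1    <- uniroot(f, interval = c(-10, 0))$root
delta <- D - k1 * se                                    # CI = (-delta, +delta)
c(lower = -delta, upper = +delta)
100 * delta / mean(example$Y[example$treatment == "R"]) # ~21.8% (flawed percentage)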

top of section ↩︎ previous section ↩︎

The Roaring 1980s

    

The ‘Approved Drug Products with Therapeutic Equivalence Evaluations’ (the ‘Or­ange Book’ named after its ugly cover) was first published in 1980 and is annually updated28 with monthly supplements.29 It gives information about the originator’s approval (with a ‘New Drug Application’ – NDA), as well as which originator’s pro­duct and strength (called Ref­er­ence Listed Drug – RLD) has to be used in studies of generics in an ‘Abbreviated New Drug Application’ – ANDA. Generic prescription drugs are coded as follows:

  1. Drug products that FDA considers to be therapeutically equivalent to other pharmaceutically equivalent products, i.e., drug products for which:
    1. there are no known or suspected BE problems. These are designated AA, AN, AO, AP, or AT, depending on the dosage form; or
    2. actual or potential BE problems have been resolved with adequate in vivo and/or in vitro evidence supporting BE. These are designated AB.
  2. Drug products that FDA, at this time, considers not to be therapeutically equivalent to other pharmaceutically equivalent products, i.e., drug products for which actual or potential BE problems have not been resolved by adequate evidence of BE. Often the problem is with specific dosage forms rather than with the active ingredients. These are designated BC, BD, BE, BN, BP, BR, BS, BT, BX, or B*.

See also information about the ‘Electronic Orange Book’ below.

    

The generic boom started in 1984 in the U.S. with the ‘Drug Price Competition and Patent Term Restoration Act’ (informally known as the ‘Hatch-Waxman Act’).30

The approval process was different for innovator (originator) and generic companies.

Innovators:

  • Preclinical data
  • Documentation of pharmaceutical quality
  • In clinical phase I documentation of pharmacokinetics (PK) in healthy subjects, dose finding, safety / tolerability, food effect
  • In phase II efficacy & safety in small groups of patients
  • In phase III demonstration of efficacy & safety versus placebo in well-powered studies

Generic companies:

  • Documentation of pharmaceutical quality
  • Not required:
    • Any in vivo study
    • Sometimes comparison of disintegration, rarely comparison of dissolution was performed

Regulatory concerns about generic substitution arose, leading to extensive discussions about which method could be used to compare formulations.

  • Pharmaceutical equivalence
  • Bioequivalence (BE)
  • Therapeutic equivalence

There was an early agreement that pharmaceutical equivalence is too permissive and therapeutic equivalence would require extremely large studies in patients.31 Hence, comparing BA in healthy volunteers seemed to be a reasonable com­pro­mise.32

What is the justification for studying bioequivalence in healthy volunteers?
“Variability is the enemy of therapeutics” and is also the enemy of bioequivalence. We are trying to determine if two dosage forms of the same drug behave similarly. Therefore we want to keep any other variability not due to the dosage forms at a minimum. We choose the least vari­able “test tube”, that is, a healthy vo­lun­teer.
Disease states can definitely change bioavailability, but we are test­ing for bioequivalence, not bio­avail­ability.

Whereas in PK ‘bioavailability’ refers exclusively to the area under the curve extrapolated to infinite time \(\small{(AUC_{0-\infty})}\), the FDA introduced two new terms, namely

  1. the ‘rate of bioavailability’ (peak exposure), measured by the maximum concentration \(\small{(C_\text{max})}\), and
  2. the ‘extent of bioavailability’ (total exposure), measured by the \(\small{AUC}\).

These are PK metrics, whereas PK parameters refer to modeling.

The former is understood as a surrogate for the absorption rate \(\small{k\,_\text{a}}\) in a PK model. I pre­fer – like the ICH3 and the FDA since 200333 – rate and extent of absorption, in order not to contaminate the original meaning of BA in PK. Where­as the FDA and China’s CDE require for single dose studies \(\small{AUC_{0-\text{t}}}\) and \(\small{AUC_{0-\infty}}\), in all other jurisdictions only \(\small{AUC_{0-\text{t}}}\) is required.

    

Let us consider the basic equation of pharmacokinetics \[\frac{f\cdot D}{CL}=\frac{f\cdot D}{V\cdot k_\text{el}}=AUC_{0-\infty}=\int_{0}^{\infty}C(t)\,dt,\tag{3}\] where \(\small{f}\) is the fraction absorbed (we are interested in the comparison of formulations), \(\small{D}\) is the dose, \(\small{CL}\) is the clearance, \(\small{V}\) is the apparent volume of distribution, \(\small{k\,_\text{el}}\) is the elimination rate constant, and \(\small{C(t)}\) is the plasma concentration over time. We see immediately that for identical34 doses and invariant35 \(\small{CL}\), \(\small{V}\), \(\small{k\,_\text{el}}\) (which are drug-specific), comparing the \(\small{AUC}\text{s}\) allows us to compare the fractions absorbed.

Pharmacokinetics: one of the magic arts of divination whereby needles are stuck into dum­mies in an attempt to predict profits.
Stephen Senn (2004)

It must be mentioned that \(\small{C_\text{max}}\) is not sensitive to even substantial changes in the rate of absorption \(\small{k\,_\text{a}}\), since it is a composite metric.36 In a one compartment model it depends on \(\small{k\,_\text{a}}\), \(\small{f}\) and both the elimination rate con­stant \(\small{k\,_\text{el}}\) and \(\small{V}\) (or \(\small{CL}\) if you belong to the other church). Whereas \(\small{k\,_\text{a}}\) and \(\small{f}\) are properties of the formulation – we are interested in – the others are properties of the drug.37 \[\eqalign{ t_\textrm{max}&=\frac{\log_{e}(k\,_\text{a}/k\,_\text{el})}{k\,_\text{a}-k\,_\text{el}}\\ C_\textrm{max}&=\frac{f\cdot D\cdot k\,_\text{a}}{V\cdot (k\,_\text{a}-k\,_\text{el})}\large(\small\exp(-k\,_\text{el}\cdot t_\textrm{max})-\exp(-k\,_\text{a}\cdot t_\textrm{max})\large)\tag{4}}\] Therefore, when using it as a surrogate for the absorption rate one must keep in mind that formulations with different fractions absorbed and \(\small{t_\text{max}}\) might show the same \(\small{C_\text{max}}\).
It took ten years before the alternative metric \(\small{C_\text{max}/AUC}\) (based on theoretical considerations and simulations) was proposed.38 39 40 Apart from being independent of \(\small{f}\), it is substantially less variable than \(\small{C_\text{max}}\). Regrettably, it was never implemented in any guideline.
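A small numerical illustration of \(\small{(3)}\) and \(\small{(4)}\) – with made-up constants of a hypothetical drug – shows both points: two formulations with different fractions absorbed and absorption rate constants can give practically the same \(\small{C_\text{max}}\), whereas \(\small{C_\text{max}/AUC}\) reveals the difference in \(\small{k\,_\text{a}}\).

one.comp <- function(f, D, V, ka, kel) {      # Eq. (3) and (4), one compartment
  tmax <- log(ka / kel) / (ka - kel)
  Cmax <- f * D * ka / (V * (ka - kel)) *
          (exp(-kel * tmax) - exp(-ka * tmax))
  AUC  <- f * D / (V * kel)
  c(tmax = tmax, Cmax = Cmax, AUC = AUC, Cmax.AUC = Cmax / AUC)
}
D    <- 100; V <- 50; kel <- log(2) / 8       # made-up dose and drug constants
ref  <- one.comp(f = 0.80, D = D, V = V, ka = 1.4, kel = kel) # reference
test <- one.comp(f = 0.75, D = D, V = V, ka = 2.6, kel = kel) # lower f, faster absorption
round(rbind(R = ref, T = test, `T/R` = test / ref), 4)
# Cmax of T and R agree within ~0.2%, although f differs by 6.25% and tmax by
# about one third; Cmax/AUC differs by ~7% and thus picks up the different ka.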

    

In the early 1980s originators failed in trying to falsify the concept (i.e., comparing BE in healthy volunteers to large therapeutic equivalence (TE) studies in patients): If BE passed, TE passed as well and vice versa. Had they succeeded (BE passed while TE failed), generic companies would have had to demonstrate TE in order to get products approved. Such studies would have to be much larger than the originators’ phase III studies, making them economically infeasible.31 Essentially, that would have meant an early end of the young generic industry.

However, comparative BA is also used by originators in scaling up the formulations used in phase III to the to-be-marketed formulation, in supporting post-approval changes, in line extensions of approved products, and for testing drug-drug interactions or food effects. Hence, a substantial part of BE trials is performed by originators. Had they succeeded in refuting the concept, they would have shot themselves in the foot.

In the mid 1980s a consensus was reached, i.e., that generic approval should only be acceptable after suitable in vivo equivalence. It must be mentioned that BE relies on current Good Manufacturing Practices (cGMP). If drugs are not manufactured according to cGMP, the entire concept would collapse.

The main assumption in BE was (and still is) that ‘similar’ plasma concentrations in healthy volunteers will lead to similar concentrations at the target site (i.e., a receptor) and thus, to similar effects in patients. It was an open issue whether BE should be interpreted as a surrogate of clinical efficacy/safety or a measure of pharmaceutical quality. Where­as in the 1980s the former was prevalent, since the 1990s the latter is mainstream.
A somewhat naïve interpretation of the PK metrics is that \(\small{AUC}\) directly translates to efficacy and \(\small{C_\text{max}}\) to safety. Especially the latter is not correct because any difference in \(\small{C_\text{max}}\) leads to a relatively smaller difference in the ma­xi­mum effect \(\small{E_\text{max}}\).
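A back-of-the-envelope illustration with a hypothetical \(\small{E_\text{max}}\) model (all numbers made up):

Emax <- 1                         # maximum effect
EC50 <- 1                         # concentration giving the half-maximum effect
E    <- function(C) Emax * C / (EC50 + C)
C.R  <- 1                         # reference Cmax at the EC50 (arbitrary units)
C.T  <- 1.25 * C.R                # test Cmax 25% higher
100 * (C.T / C.R - 1)             # +25% in Cmax
100 * (E(C.T) / E(C.R) - 1)       # only about +11% in effect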

There was no consensus about the definition of ‘similarity’ and the statistical methodology to compare plasma profiles. Two early methods are outlined in the following.

top of section ↩︎ previous section ↩︎

75/75 Rule

    

This was an approach employed by the FDA. Two drugs were considered bioequivalent if at least 75% of subjects showed \(\small{\text{T}/\text{R}\textsf{-}}\)ratios within 75 – 125%.17 41 42 It is not a statistic and, thus, was immediately criticized because variable formulations or studies with some extreme values may pass the criterion by pure chance.43

    

We get for our example in R:

example          <- data.frame(subject   = rep(1:12, each = 2),
                               sequence  = c("RT", "RT", "TR", "TR", "RT",
                                             "RT", "TR", "TR", "TR", "TR",
                                             "RT", "RT", "RT", "RT", "TR",
                                             "TR", "TR", "TR", "RT", "RT",
                                             "RT","RT",  "TR", "TR"),
                               treatment = c("R", "T", "T", "R", "R", "T", "T", "R",
                                             "T", "R", "R", "T", "R", "T", "T", "R",
                                             "T", "R", "R", "T", "R", "T", "T", "R"),
                               period    = rep(1:2, 12),
                               Y         = c(81, 71, 61, 65, 94, 80, 66, 74,
                                             94, 54, 63, 97, 85, 70, 76, 90,
                                             54, 53, 56, 99, 90, 83, 51, 68))
rule.75.75    <- reshape(example, idvar = "subject", timevar = "treatment",
                         drop = c("sequence", "period"), direction = "wide")
rule.75.75    <- rule.75.75[c("subject", "Y.T", "Y.R")]
names(rule.75.75)[2:3] <- c("T", "R")
rule.75.75$T.R <- 100 * (rule.75.75$T / rule.75.75$R)
for (i in 1:nrow(rule.75.75)) {
  if (rule.75.75$T.R[i] >= 75 & rule.75.75$T.R[i] <= 125) {
    rule.75.75$BE[i]     <- TRUE
    rule.75.75$within[i] <- "yes"
  } else {
    rule.75.75$BE[i]     <- FALSE
    rule.75.75$within[i] <- "no"
  }
}
names(rule.75.75)[c(4, 6)] <- c("T/R (%)", "±25%")
BE            <- "Failed BE by the"
if (sum(rule.75.75$BE) / nrow(rule.75.75) >= 0.75) BE <- "Passed BE by the"
print(rule.75.75[, c(1:4, 6)], row.names = FALSE); cat(BE, "75/75 Rule.\n")
#  subject  T  R   T/R (%) ±25%
#        1 71 81  87.65432  yes
#        2 61 65  93.84615  yes
#        3 80 94  85.10638  yes
#        4 66 74  89.18919  yes
#        5 94 54 174.07407   no
#        6 97 63 153.96825   no
#        7 70 85  82.35294  yes
#        8 76 90  84.44444  yes
#        9 54 53 101.88679  yes
#       10 99 56 176.78571   no
#       11 83 90  92.22222  yes
#       12 51 68  75.00000  yes
# Passed BE by the 75/75 Rule.

Nine of the twelve subjects (75%) have a \(\small{\text{T}/\text{R}\textsf{-}}\)ratio within 75 – 125% and the study would pass, despite the three subjects with high \(\small{\text{T}/\text{R}\textsf{-}}\)ratios.

top of section ↩︎ previous section ↩︎

t-test

    

Another suggestion was testing for a statistically significant difference at level \(\small{\alpha=0.05}\) with a t-test. The null hypothesis was that formulations are equal, i.e., \(\small{\mu_\text{T}-\mu_\text{R}=0}\).

Let’s assess our example in R again:

example          <- data.frame(subject   = rep(1:12, each = 2),
                               sequence  = c("RT", "RT", "TR", "TR", "RT",
                                             "RT", "TR", "TR", "TR", "TR",
                                             "RT", "RT", "RT", "RT", "TR",
                                             "TR", "TR", "TR", "RT", "RT",
                                             "RT","RT",  "TR", "TR"),
                               treatment = c("R", "T", "T", "R", "R", "T", "T", "R",
                                             "T", "R", "R", "T", "R", "T", "T", "R",
                                             "T", "R", "R", "T", "R", "T", "T", "R"),
                               period    = rep(1:2, 12),
                               Y         = c(81, 71, 61, 65, 94, 80, 66, 74,
                                             94, 54, 63, 97, 85, 70, 76, 90,
                                             54, 53, 56, 99, 90, 83, 51, 68))
tt             <- reshape(example, idvar = "subject", timevar = "treatment",
                          drop = c("sequence", "period"), direction = "wide")
tt             <- tt[c("subject", "Y.T", "Y.R")]
tt$T.R         <- tt[, 2] - tt[, 3]
names(tt)[2:4] <- c("T", "R", "T–R")
tt[, 4]        <- sprintf("%+0.f", tt[, 4])
p              <- t.test(x = tt$T, y = tt$R, paired = TRUE)$p.value
BE             <- "Failed BE"
if (p >= 0.05) BE <- "Passed BE"
print(tt, row.names = FALSE); cat(sprintf("%s by a paired t-test (p = %.4f).\n", BE, p))
#  subject  T  R T–R
#        1 71 81 -10
#        2 61 65  -4
#        3 80 94 -14
#        4 66 74  -8
#        5 94 54 +40
#        6 97 63 +34
#        7 70 85 -15
#        8 76 90 -14
#        9 54 53  +1
#       10 99 56 +43
#       11 83 90  -7
#       12 51 68 -17
# Passed BE by a paired t-test (p = 0.7193).

We calculate a \(\small{p}\)-value of 0.7193, which is statistically not significant and the study would pass again.

However, we face a problem similar to the one with the 75/75 Rule. If the differences show high variability, the study would pass; on the other hand, if the variability of the differences is low, the study would fail. This is counterintuitive and actually the opposite of what regulators want.

    
Interlude 1

One of my early sins44 – it was not the last…
After phenytoin intoxications in Austria45 we compared three generics (containing the free acid like the originator, the Na-, or the Ca-salt) to the reference in a crossover design. All formulations had been approved and were marketed in Austria. Although at that time I already calculated a 95% CI, the reviewers of our manuscript insisted on testing for a significant difference »because it is state of the art«.


Fig. 1 Phenytoin 3 × 100 mg equivalent, single dose fasting.

The \(\small{AUC}\)s of two generics were statistically significantly different from the reference (\(\small{\text{T}_1}\) containing the free acid like the originator and \(\small{\text{T}_3}\) containing the Ca-salt). \(\small{\text{T}_2}\) containing the Na-salt was not statistically significantly different and, thus, considered equivalent – despite its high \(\small{\text{T}/\text{R}\textsf{-}}\)ratio (Table II). \[\small{ \begin{array}{ccccc} \textsf{Table II}\phantom{00000}\\ \text{formulation} & \text{T}/\text{R (%)} & p & & \text{BE}\\\hline \text{T}_1 & 146.65 & 0.0195\phantom{6} & \text{*} & \text{fail}\\ \text{T}_2 & 133.67 & 0.151\phantom{96} & \text{n.s.} & \text{pass}\\ \text{T}_3 & \phantom{1}27.97 & 0.00596 & \text{**} & \text{fail}\\\hline \end{array}}\] If we evaluated the study according to current standards (i.e., by the 90% CI inclusion approach based on \(\small{\log_{e}\textsf{-}}\)transformed data and acceptance limits of 80.00 – 125.00%), all generics would fail. \(\small{\text{T}_3}\) would even be bioinequivalent because its upper CL is way below 80% (Table III).
\[\small{\begin{array}{ccccc} \textsf{Table III}\phantom{0000}\\ \text{formulation} & \text{PE (%)} & \text{CL}_\text{lower}\text{(%)} & \text{CL}_\text{upper}\text{ (%)} & \text{BE}\\\hline \text{T}_1 & 151.12 & 118.75 & 192.32 & \text{fail (inconclusive)}\\ \text{T}_2 & 139.39 & \phantom{1}95.91 & 202.60 & \text{fail (inconclusive)}\\ \text{T}_3 & \phantom{1}21.67 & \phantom{1}10.25 & \phantom{2}45.81 & \text{fail (inequivalent)}\\\hline \end{array}}\] Given the nonlinear PK of phenytoin,46 47 switching a patient from the originator to the generics with high \(\small{\text{T}/\text{R}\textsf{-}}\)ratios would be problematic – potentially leading to toxicity after multiple doses. Even worse would be switching from the ge­ne­ric \(\small{\text{T}_3}\) with its low \(\small{\text{T}/\text{R}\textsf{-}}\)ratio to any of the other formulations.

top of section ↩︎ previous section ↩︎

ANOVA and beyond

    

An Analysis of Variance (ANOVA) instead of a t-test allows period effects to be taken into account.48 49 50 This decade was also the heyday of Bayesian methods.51 52 53 54 Nomograms for sample size estimation were also Bayesian55 but happily misused by frequentists. New parametric56 57 as well as nonparametric methods entered the stage.57 58 PK metrics to compare controlled release formulations in steady state were proposed.59 60 61 The first software to evaluate 2×2×2 crossover studies was released in the public domain.62

    

The acceptance range in bioequivalence is based on a ‘clinically relevant difference’ \(\small{\Delta}\), i.e., for data following a lognormal dis­tri­bu­tion \[\left\{\theta_1,\theta_2\right\}=\left\{100\,(1-\Delta),100\,(1-\Delta)^{-1}\right\}\tag{5}\] It must be mentioned that the commonly applied \(\small{\Delta=20\%}\)63 leading to \(\small{\{80.00\%,}\) \(\small{125.00\%\}}\) is arbitrary (as is any other).
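In R, \(\small{(5)}\) is a one-liner – shown here for the conventional \(\small{\Delta=20\%}\) and the tighter \(\small{\Delta=10\%}\) sometimes considered for NTIDs:

Delta <- c(0.20, 0.10)
100 * (1 - Delta)                 # lower limits: 80, 90
100 / (1 - Delta)                 # upper limits: 125, 111.11…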

    

An important leap forward was the Two One-Sided Tests Procedure (TOST)21 – although it was never implemented in its original form \(\small{(6)}\) in regulatory practice. Instead, the confidence interval inclusion approach \(\small{(7)}\) made it into the guidelines. Although the two approaches are operationally identical (i.e., their outcomes [pass | fail] are the same), they are statistically different methods:

The TOST Procedure gives two \(\small{p}\)-values, namely \(\small{p(\theta_0\geq\theta_1)}\) and \(\small{p(\theta_0\leq\theta_2)}\). BE is concluded if both \(\small{p}\)-values are \(\small{\leq\alpha}\).

\[\begin{matrix}\tag{6} H_\textrm{0L}:\frac{\mu_\textrm{T}}{\mu_\textrm{R}}\leq\theta_1\:vs\:H_\textrm{1L}:\frac{\mu_\textrm{T}}{\mu_\textrm{R}}>\theta_1\\ H_\textrm{0U}:\frac{\mu_\textrm{T}}{\mu_\textrm{R}}\geq\theta_2\:vs\:H_\textrm{1U}:\frac{\mu_\textrm{T}}{\mu_\textrm{R}}<\theta_2 \end{matrix}\]

In the CI inclusion approach BE is concluded if the two-sided \(\small{1-2\,\alpha}\) CI lies entirely within the acceptance range \(\small{\left\{\theta_1,\theta_2\right\}}\).

\[H_0:\frac{\mu_\textrm{T}}{\mu_\textrm{R}}\not\subset\left\{\theta_1,\theta_2\right\}\:vs\:H_1:\theta_1<\frac{\mu_\textrm{T}}{\mu_\textrm{R}}<\theta_2\tag{7}\]

When we evaluate our example by \(\small{(6)}\), we get \(\small{p(\theta_0\geq\theta_1)=0.0160}\) and \(\small{p(\theta_0\leq\theta_2)=0.0528}\). Since one of the \(\small{p\textsf{-}}\)values is \(\small{>\alpha}\), the study would fail.
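If you want to verify these numbers, here is a minimal sketch of the additive TOST, assuming the limits are set to ±20% of the observed reference mean (muddle and example as fitted in the 95% CI section):

PE    <- coef(muddle)[["treatmentT"]]
se    <- sqrt(vcov(muddle)["treatmentT", "treatmentT"])
nu    <- df.residual(muddle)
theta <- 0.20 * mean(example$Y[example$treatment == "R"]) # +/-14.55
p.L   <- pt((PE + theta) / se, nu, lower.tail = FALSE)    # H0L: T - R <= -theta
p.U   <- pt((PE - theta) / se, nu, lower.tail = TRUE)     # H0U: T - R >= +theta
cat(sprintf("p(lower) = %.4f, p(upper) = %.4f: BE %s\n",
            p.L, p.U, ifelse(max(p.L, p.U) <= 0.05, "pass", "fail")))
# p(lower) = 0.0160, p(upper) = 0.0528: BE fail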

    
Interlude 2

It is a misconception that a certain CI of a sample (i.e., a particular study) contains the – true but unknown – population mean \(\small{\mu}\) with \(\small{1-\alpha}\) probability. Let’s simulate some studies and evaluate them by \(\small{(7)}\):

invisible(library(PowerTOST))
set.seed(123) # for reproducibility of simulations
mue      <- 1 # true population mean
CV       <- 0.25
studies  <- 100
x        <- sampleN.TOST(CV = CV, theta0 = mue, targetpower = 0.8, print = FALSE)
subjects <- x[["Sample size"]]
power    <- x[["Achieved power"]]
# simulate subjects within studies, lognormal distribution
samples  <- data.frame(study     = rep(1:studies, each = subjects * 2),
                       subject   = rep(rep(1:subjects, studies), each = 2),
                       period    = rep(rep(1:2, studies), 2),
                       sequence  = rep(c(rep(c("TR"), subjects),
                                         rep(c("RT"), subjects)), studies),
                       treatment = c(rep(c("T", "R"), subjects / 2),
                                     rep(c("R", "T"), subjects / 2)),
                       Y         = rlnorm(n = subjects * studies * 2,
                                          meanlog = log(mue) - 0.5 * log(CV^2 + 1),
                                          sdlog = sqrt(log(CV^2 + 1))))
facs     <- c("subject", "period", "treatment")
samples[facs] <- lapply(samples[facs], factor) # factorize the data
result   <- data.frame(study = 1:studies, PE = NA_real_,
                       lower = NA_real_, upper = NA_real_,
                       BE = FALSE, contain = TRUE)
grand.PE <- numeric(studies)
for (i in 1:studies) {
  temp           <- samples[samples$study == i, ]
  heretic        <- lm(log(Y) ~ period + subject + treatment, data = temp)
  result$PE[i]   <- 100 * exp(coef(heretic)[["treatmentT"]])
  result[i, 3:4] <- 100 * exp(confint(heretic, level = 0.90)["treatmentT", ])
  if (round(result[i, 3], 2) >= 80 & round(result[i, 4], 2) <= 125)
    result$BE[i] <- TRUE
  if (result$lower[i] > 100 * mue | result$upper[i] < 100 * mue) result$contain[i] <- FALSE
  grand.PE[i]    <- mean(result$PE[1:i]) # (cumulative) grand means
}
dev.new(width = 4.5, height = 4.5)
op       <- par(no.readonly = TRUE)
par(mar = c(3.05, 2.9, 1.4, 0.75), cex.axis = 0.9, mgp = c(2, 0.5, 0))
xlim     <- range(c(min(result$lower), 1e4 / min(result$lower),
                    max(result$upper), 1e4 / max(result$upper)))
plot(1:2, 100 * rep(mue, 2), type = "n", log = "x", xlab = "PE [90% CI]",
     ylab = "study  #", axes = FALSE,
     xlim = xlim, ylim = range(result$study))
abline(v = 100 * c(0.8, mue, 1.25), lty = c(2, 1, 2))
axis(1, at = c(125, pretty(xlim)),
     labels = sprintf("%.0f%%", c(125, pretty(xlim))))
axis(2, at = c(1, pretty(1:studies)[-1]), las = 1)
axis(3, at = 100 * mue, label = expression(mu))
box()
lines(grand.PE, 1:studies, lwd = 2)
for (i in 1:studies) {
  if (result$BE[i]) {       # pass
    clr <- "blue"
  } else {                  # fail
    if (result$contain[i]) {# mue within CI
      clr <- "magenta"
    } else {                # mue not in CI
      clr <- "red"
    }
  }
  lines(c(result$lower[i], result$upper[i]), rep(i, 2), col = clr)
  points(result$PE[i], i, pch = 16, cex = 0.6, col = clr)
}
par(op)


Fig. 2 2×2×2 crossover studies (\(\small{\mu}\) = 100%, \(\small{CV}\) = 25%: \(\small{n}\) = 24 for ≥80% power).

In 7% of the studies the population mean \(\small{\mu}\) is not contained in the 90% CI (red lines). In other words, given the result of a single study we can never know where \(\small{\mu}\) lies. Only the grand mean (mean of sample means \(\small{\frac{1}{n}\sum_{i=1}^{i=n}\overline{x_i}}\)) approaches \(\small{\mu}\) for a large number of samples. After the 100th study it is 99.44%, pretty close to \(\small{\mu}\) (for geeks: the convergence is poor; when simulating 25,000 studies, it is 100.23%). However, nobody would repeat a – passing – study (blue lines) for such rather uninteresting information, right?
This also explains why a particular study might fail by pure chance even if a formulation is equivalent (here 15% of the studies; red or magenta lines). Such cases are related to the producer’s risk (Type II Error = 1 – power), which for the given conditions is 16.3%. On the other hand, it is also possible that a formulation which is not equivalent passes. Such cases are related to the patient’s risk (Type I Error).
For details see the articles about hypotheses, treatment effects, post hoc power, and sample size estimation. Science is a cruel mistress.

    

At a hearing in 1986 the FDA confirmed that \(\small{(6)}\) or \(\small{(7)}\) of untransformed data should be used with \(\small{\Delta=20\%}\). If clinically relevant, tighter limits (\(\small{\Delta=10\%}\)) might be needed.64

The first German guideline was drafted by the International Association for Pharmaceutical Technology (Ar­beits­ge­mein­schaft für Phar­ma­zeu­tische Ver­fah­rens­tech­nik) in 1985.65 It was presented and discussed in 1987.66 67 68

In 1988 wider acceptance limits of 70 – 130% were proposed for \(\small{C_\text{max}}\) due to its inherently high variability69 (which, as a single-point metric, is practically always larger than that of the integrated metric \(\small{AUC}\)).

The Australian draft guideline was published in 1988.70 It was the first covering not only the design and evaluation but also validation of bioanalytical methods. The model with effects period, subject, treatment25 50 was rec­om­mend­ed and a test for se­quence-ef­fects was not considered necessary. The problematic conversion of differences to per­centages was acknowledged and Fieller’s CI26 27 discussed. Kudos to both!

In 1989 a series of loose-leaf binders was started.71 It contained raw data of generic drugs marketed in Germany, the evaluations provided by companies, as well as results recalculated by the ZL (Central Laboratory of German Pharmacists). Including the 6th supplement of 1996 it comprised more than 2,000 pages… It was an indispensable resource for planning new studies and also showed the ‘journey’ of dossiers (i.e., the same study being used by different companies).

The BioInternational conference series set milestones in the development of testing for bioequivalence. The first in Toronto 1989 dealt with the \(\small{\log_{e}\textsf{-}}\)transformation of data and the definition of highly variable drugs (HVDs).72 There was a poll among the participants about the \(\small{\log_{e}\textsf{-}}\)transformation. Out­come: ⅓ never, ⅓ always, ⅓ case by case (i.e., perform both analyses and report the one with narrower CI ‘because it fits the data better’). Let’s be silent about the last team.73 HVDs were defined as drugs with intra-subject variabilities of more than 30% but problems might be evident already at 25%.

top of section ↩︎ previous section ↩︎

The Boring (?) 1990s

    

The original acceptance range was symmetrical around 100%. In \(\small{\log_{e}\textsf{-}}\)scale it should be symmetrical around \(\small{0}\) (because \(\small{\log_{e}1=0}\)). What happens to our \(\small{\Delta}\), which should still be 20%? Due to the positive skewness of the lognormal distribution a lively discussion started after early publications proposing 80 – 125%.25 50 Keeping 80 – 120% would have been flawed because maximum power is obtained at \[\mu_\text{T}/\mu_\text{R}=\exp\left((\log_{e}\theta_1+\log_{e}\theta_2)/2\right),\tag{8}\] which equals \(\small{1}\) only if \(\small{\theta_2=\theta_1^{-1}}\). Keeping the original limits, maximum power would be obtained at \(\small{\mu_\text{T}/\mu_\text{R}=}\) \(\small{\exp((\log_{e}0.8+\log_{e}1.2)/2)}\) \(\small{\approx0.979796}\).


Fig. 3 Power for a 2×2×2 design and limits 0.80 – 1.20.
Note that with a multiplicative model the power curve is asymmetric and shifted to the left.
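A quick check of \(\small{(8)}\) for both candidate ranges:

exp((log(0.80) + log(1.20)) / 2)  # 0.9798: maximum power shifted to the left (Fig. 3)
exp((log(0.80) + log(1.25)) / 2)  # 1: symmetrical in log-scale (Fig. 4)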

There were three parties (all agreed that the acceptance range should be symmetrical in \(\small{\log_{e}\textsf{-}}\)scale and consequently asymmetrical when back-transformed). These were their arguments and suggestions:

The width of the acceptance range was 40% and we have empiric evidence that the concept of BE ‘worked’ – let’s be conservative and keep it.
\[\left\{\theta_1,\theta_2\right\}=81.98-121.98\%\tag{9}\]
Since that’s a new method, we don’t want to face safety issues with a higher limit. Furthermore, a more restrictive lower limit prevents issues with insufficient efficacy.
\[\left\{\theta_1,\theta_2\right\}=\left\{100/(1+\Delta),100\,(1+\Delta)\right\}=83.\dot{3}-120\%\tag{10}\]
80% as the lower limit served us well in the past. Hence, 125% is the way to go because it is simply the reciprocal of the lower limit and the coverage probability in the log-domain is the same as the one we had. Furthermore, these are nice numbers.

\[\left\{\theta_1,\theta_2\right\}=\left\{100\,(1-\Delta),100/(1-\Delta)\right\}=80-125\%\tag{11}\]

    

The 90% CI inclusion approach \(\small{(7)}\) based on \(\small{\log_{e}\textsf{-}}\)transformed data with acceptance limits of 80.00 – 125.00% \(\small{(5)}\) was the winner.


Fig. 4 Power for a 2×2×2 design and limits 0.80 – 1.25.
Note the symmetry: the power at any \(\small{\theta}\) equals the power at \(\small{1/\theta}\).
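This symmetry is easily confirmed with the package PowerTOST (already used in Interlude 2); the values of CV and n are arbitrary:

library(PowerTOST)
power.TOST(CV = 0.25, n = 24, theta0 = 0.90)      # power at theta0
power.TOST(CV = 0.25, n = 24, theta0 = 1 / 0.90)  # identical power at 1/theta0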

Why are we using a 90% CI – and not a 95% CI like in phase III? In the worst case the bioavailability in a particular patient can be

either too low, i.e., \(\small{p(\text{BA}<\phantom{1}80\%)\leq5\%}\),

or too high, i.e., \(\small{p(\text{BA}>125\%)\leq5\%}\),

but evidently not both at the same time. Hence, the 90% CI controls the risk for the population of patients: if a study passes, the risk for patients still does not exceed 5%. Note that at the BE limits \(\small{\left\{\theta_1,\theta_2\right\}}\) the power, i.e., the chance to pass, is 5%. Therefore, the patient’s risk (Type I Error) is controlled.

First sample size tables for the multiplicative model with the acceptance range 80 – 125% were published74 and ex­tended for narrower (90 – 111%) and wider (70 – 143%) acceptance ranges.75 The nonparametric method was improved taking period-effects into account.76 77 Drug-drug and food-in­ter­action studies should be assessed for equi­va­lence.78 The general applicability of average BE was challenged and the concept of individual and population bioequivalence outlined.79 80 81 The first textbook dealing exclusively with BA/BE was published.82

This was also the decade of updated and new guidelines. A European draft guidance was published in 1990;83 the final guideline was published in December 1991 and came into force in June 1992.84 The 90% CI inclusion approach of \(\small{\log_{e}\textsf{-}}\)transformed data with an acceptance range of 80 – 125% was recommended and for NTIDs the acceptance range may need to be tightened. Due to its inherent higher variability a wider acceptance range may be acceptable for \(\small{C_\text{max}}\). If inevitable and clinically acceptable, a wider acceptance range may also be used for \(\small{AUC}\). Only if clinically relevant, a nonparametric analysis of \(\small{t_\text{max}}\) was re­comm­end­ed.
An in vivo study was not required if the new formulation

  1. is to be administered parenterally as a solution and contains the same API(s) and excipients in the same concentrations as the reference, or
  2. is a liquid oral form in solution (elixir, syrup, etc.) containing the API(s) in the same concentration and form as the reference, not containing excipients that may significantly affect gastric passage or absorption of the active substance.

Similar statements about solutions were given in all later guidelines. The second led to the application of the Biopharmaceutics Classification System (BCS).85 More about that further down.

The almost classical 1977 FDA notice […] defined bioavailability as the rate and extent to which the active drug ingredient or therapeutic moiety is absorbed from a drug product and becomes available at the site of action.20 However, in the majority of cases substances are intended to exhibit a systemic therapeutic effect, and a more practical definition can be given, taking into account that the substance in the general circulation is in exchange with the substance at the site of action. Therefore, the European 1991 guidance on bioavailability and bioequivalence84 gave the following definition: Bioavailability is understood to be the extent and rate to which a substance or its therapeutic moiety is delivered from the pharmaceutical form into the general circulation.
Volker W. Steinijans and Dieter Hauschke (1993)86

In July 1992 a guidance of the FDA was published.87 An ANOVA of \(\small{\log_{e}\textsf{-}}\)transformed data was recommended and the nested subject(sequence) term in the statistical model entered the scene. It must be mentioned that in comparative BA studies subjects are usually uniquely coded. Hence, the term subject(sequence) is a bogus one88 and could just as well be replaced by the simple subject (see below for an example). Regrettably, this model has been part of all global guidelines ever since.

In the same year the Canadian guidance for Immediate Release (IR) formulations was published.89 At that time it was the most extensive one because it gave not only the method of evaluation, but also information about the study design, sample size, ethics, bioanalytics, etc. It differed from the others in the relaxed requirement for \(\small{C_\text{max}}\), where only the \(\small{\text{T}/\text{R}\textsf{-}}\)ratio had to lie within 80 – 125% (instead of its CI). The guidance for MR formulations followed in 1996.90

In 1998 the World Health Organization published its first guideline,91 which was similar to the European one.

Table IV shows the result of the example evaluated by various methods. \[\small{\begin{array}{lcccc} \textsf{Table IV}\phantom{0}\\ \phantom{0}\text{Method} & \text{Model} & \text{PE} & \text{power},p,\text{CI, etc.} & \text{BE?}\\\hline \text{80/20 Rule} & \text{additive} & - & 46.40<80\% & \text{fail}\\ t\text{-test} & \text{additive} & +2.417\;(103.32\%) & 0.7193\geq0.05 & \text{pass}\\ \text{TOST} & \text{additive} & +2.417\;(103.32\%) & 0.0160\leq0.05,\,0.0528>0.05 & \text{fail}\\ \text{95% CI} & \text{additive} & +2.417\;(103.32\%) & -12.777\,,+17.611\;(82.44-124.21\%) & \text{fail}\\ \text{Westlake} & \text{additive} & \pm0.000\;(100.00\%) & \pm2.944\;(\pm21.80\%) & \text{fail}\\\hline \text{80/20 Rule} & \text{multiplicative} & - & 72.90<80\% & \text{fail}\\ \text{75/75 Rule} & - & - & 9/12=75\% & \text{pass}\\ t\text{-test} & \text{multiplicative} & 103.14\% & 0.7317\geq0.05 & \text{pass}\\ \text{TOST} & \text{multiplicative} & 103.14\% & 0.0097\leq0.05,\,0.0309\leq0.05 & \text{pass}\\ {\color{Blue} {90\%\,\text{CI}}} & {\color{Blue} {\text{multiplicative}}} & {\color{Blue} {103.14\%}} & {\color{Blue} {87.40-121.73\%}} & {\color{Blue} {\text{pass}}}\\ \text{Westlake} & \text{multiplicative} & 100.00\% & \pm18.09\% & \text{pass}\\ \text{75/75 Rule} & \text{multiplicative} & - & 75\%\subset \pm25\% & \text{pass}\\\hline \end{array}}\] In the additive model the acceptance range was 80 – 120%, whereas in the multiplicative model it is 80 – 125%. Since in the former differences are assessed, the wrong percentages are given in brackets.

    

At present, only the 90% CI inclusion approach is globally accepted. Our example in R again:

example          <- data.frame(subject   = rep(1:12, each = 2),
                               sequence  = c("RT", "RT", "TR", "TR", "RT",
                                             "RT", "TR", "TR", "TR", "TR",
                                             "RT", "RT", "RT", "RT", "TR",
                                             "TR", "TR", "TR", "RT", "RT",
                                             "RT","RT",  "TR", "TR"),
                               treatment = c("R", "T", "T", "R", "R", "T", "T", "R",
                                             "T", "R", "R", "T", "R", "T", "T", "R",
                                             "T", "R", "R", "T", "R", "T", "T", "R"),
                               period    = rep(1:2, 12),
                               Y         = c(81, 71, 61, 65, 94, 80, 66, 74,
                                             94, 54, 63, 97, 85, 70, 76, 90,
                                             54, 53, 56, 99, 90, 83, 51, 68))
facs          <- c("subject", "sequence", "treatment", "period")
example[facs] <- lapply(example[facs], factor) # factorize the data
txt           <- paste("nested model : period, subject(sequence), treatment",
                       "\nsimple model : period, subject, sequence, treatment",
                       "\nheretic model: period, subject, treatment\n\n")
result        <- data.frame(model = c("nested", "simple", "heretic"),
                            PE = NA, lower = NA, upper = NA, BE = "fail", na = 0)
for (i in 1:3) {
  if (result$model[i] == "nested") { # bogus nested model (guidelines)
    nested         <- lm(log(Y) ~ period +
                                  subject %in% sequence +
                                  treatment, data = example)
    result$PE[i]   <- 100 * exp(coef(nested)[["treatmentT"]])
    result[i, 3:4] <- 100 * exp(confint(nested, level = 0.90)["treatmentT", ])
    result[i, 6]   <- sum(is.na(coef(nested)))
  }
  if (result$model[i] == "simple") { # simple model (subjects are uniquely coded)
    simple         <- lm(log(Y) ~ period +
                                  subject +
                                  sequence +
                                  treatment, data = example)
    result$PE[i]   <- 100 * exp(coef(simple)[["treatmentT"]])
    result[i, 3:4] <- 100 * exp(confint(simple, level = 0.90)["treatmentT", ])
    result[i, 6]   <- sum(is.na(coef(simple)))
  }
  if (result$model[i] == "heretic") { # heretic model (without sequence)
    heretic        <- lm(log(Y) ~ period +
                                  subject +
                                  treatment, data = example)
    result$PE[i]   <- 100 * exp(coef(heretic)[["treatmentT"]])
    result[i, 3:4] <- 100 * exp(confint(heretic, level = 0.90)["treatmentT", ])
    result[i, 6]   <- sum(is.na(coef(heretic)))
  }
  # rounding acc. to guidelines
  if (round(result[i, 3], 2) >= 80 & round(result[i, 4], 2) <= 125)
    result$BE[i] <- "pass"
}
# cosmetics
result$PE     <- sprintf("%6.2f%%", result$PE)
result$lower  <- sprintf("%6.2f%%", result$lower)
result$upper  <- sprintf("%6.2f%%", result$upper)
names(result)[c(3:4, 6)] <- c("lower CL", "upper CL", "NE")
cat(txt); print(result, row.names = FALSE)
# nested model : period, subject(sequence), treatment 
# simple model : period, subject, sequence, treatment 
# heretic model: period, subject, treatment
# 
#    model      PE lower CL upper CL   BE NE
#   nested 103.14%   87.40%  121.73% pass 13
#   simple 103.14%   87.40%  121.73% pass  1
#  heretic 103.14%   87.40%  121.73% pass  0

As already outlined above, the nested model recommended in all [sic] guidelines is over-specified because subjects are uniquely coded. In the example we get 13 not estimable (aliased) effects (in the output of R lines with NA, in SAS ., and in Phoenix WinNonlin not estimable). Correct, because we are asking for something the data cannot provide.88 In the simple model only one effect cannot be estimated. Even sequence can be removed from the model. I call it heretic because regulators will grill you if you use it. It was proposed by Westlake25 50 and I employed it in hundreds (‼) of studies.
Note that the results of all models are identical; if you don’t believe me, try it with one of your stud­ies.

    

A ‘Positive List’ was published by the German regulatory authority, i.e., for 90 drugs BE was not required.92 In order to comply with the European Note for Guid­ance of 200193 it had to be removed by the BfArM.

The FDA published guidance for ‘Scale-Up and Postapproval Changes’ (SUPAC)94 95 defining three ‘Levels’ of changes:

  1. Those that are unlikely to have any detectable impact on formulation quality and performance.
  2. Those that could have a significant impact on formulation quality and performance. Tests and filing documentation for a Level 2 change vary depending on three factors: therapeutic range, solubility, and permeability. Therapeutic range is defined as either narrow or non-narrow. […] Drug solubility and drug permeability are defined as either low or high. So­lu­bi­lity is calculated based on the minimum concentration of drug (mg/mL), in the largest dosage strength, determined in the physiological pH range (pH 1 to 8) and temperature (37 ±0.5 ℃). High so­lu­bi­lity drugs are those with a dose/solubility volume of less than or equal to 250 mL. Per­me­abi­lity Pe (cm/s) is defined as the effective human jejunal wall perme­ability of a drug and includes an apparent resistance to mass transport to the intestinal membrane. High per­me­ability drugs are generally those with an extent of absorption greater than 90% in the absence of documented instability in the gastrointestinal tract, or those whose permeability attributes have been determined experimentally.
  3. Those that are likely to have a significant impact on formulation quality and performance. Tests and filing documentation vary depending on the following three factors: therapeutic range, solubility, and permeability.

Under certain conditions of Level 2, demonstration of in vitro similarity by \(\small{f_2\geq 50\%}\)96 in the application/compendial medium at 15, 30, 45, 60 and 120 minutes (or until an asym­pto­te is reached) of at least 12 units is sufficient.

\[f_2=50\,\log_{10}\left\{100\left[1+\frac{1}{n}\sum_{i=1}^{i=n}(\text{R}_i-\text{T}_i)^2\right]^{-0.5}\right\}\small{\textsf{,}}\tag{12}\]

where \(\small{\text{R}_i}\) and \(\small{\text{T}_i}\) are the cumulative percent dissolved at \(\small{1\ldots\ n}\) time points of \(\small{\text{R}}\) and \(\small{\text{T}}\), respectively.
For Level 3 changes in vivo testing (BE) is mandatory.

It must be mentioned that comparing formulations by \(\small{f_2}\) can be problematic, especially if the shapes of dissolution curves are different and/or if they intersect. \(\small{f_2}\) is not a statistic and, therefore, it is impossible to evaluate false positive and negative rates of decisions for approval of drug products based on \(\small{f_2}\).97
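
A minimal sketch of \(\small{(12)}\) in base R; the dissolution profiles below are invented for illustration only.

f2 <- function(R.diss, T.diss) {
  # similarity factor acc. to (12); arguments: cumulative % dissolved
  n <- length(R.diss)
  50 * log10(100 / sqrt(1 + sum((R.diss - T.diss)^2) / n))
}
R.diss <- c(47, 71, 82, 91) # reference at 15, 30, 45, 60 minutes (made up)
T.diss <- c(40, 66, 79, 89) # test at the same time points (made up)
cat(sprintf("f2 = %.2f (similar: %s)\n",
            f2(R.diss, T.diss), f2(R.diss, T.diss) >= 50))
# f2 = 66.08 (similar: TRUE)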

Two (of five) sessions of the BioInternational ’92 conference in Bad Homburg dealt with BE of Highly Variable Drugs.98 99 Various approaches were discussed: multiple dose instead of single dose studies, the metabolite instead of the parent compound, stable isotope techniques,100 add-on designs, and – for the first time – replicate designs.

Although the BioInternational 2 in Munich 1994 was, with over 600 participants, the largest conference in the series, no substantial progress for HVD(P)s was achieved.101 Following a suggestion102 at a joint AAPS/FDA workshop in 1995, widening the conventional acceptance limits of 80.00 – 125.00% was considered.103

For some highly variable drugs and drug products, the bioequivalence standard should be modified by changing the BE limits while maintaining the current confidence interval at 90%. […] the bioequivalence limits should be determined based in part upon the intrasubject varia­bility for the reference product.
Shah et al. (1996)103

A hot topic ever since… Why have we been discussing it for 35 (‼) years (since the first BioInternational conference)? Is it really that complicated104 or are we too stupid?

Studies in steady-state were proposed as an option for HVD(P)s in a European draft guideline105 in order to reduce variability, but it was re­moved from the final version of 2001.93

Validation of bioanalytical methods106 107 108 109 was partly covered in Australia and Canada. However, no specific guideline existed. A series of conferences (informally known as ‘Crys­tal City’) was initiated in 1990.110 Procedures stated in the conference report111 were discussed at the Bio­In­ter­na­tio­nal 2 in Munich 1994 and were quickly adopt­ed by bioanalytical sites. Updates were subsequently published.112 113

top of section ↩︎ previous section ↩︎

21st century

    

Poland happily adopted Germany’s ‘Positive List’92 when it wanted to join the European Union, only to learn that in the meantime Germany had abandoned it to comply with the 2001 guideline.93
Until 2015 a similar (but shorter) list existed in The Netherlands for »strict national market authorisation«. It must have been a schizophrenic situation for assessors of the Medicines Evaluation Board: In the morning, a dossier for national MA without any in vivo comparison: accepted. In the afternoon, another dossier of the same product in the course of a European submission, BE performed but the CI 80.00–125.01%: rejected. Outright bizarre.
Until 2012 Denmark required for NTIDs that the 90% CI had to include 100% (i.e., that there is no significant treatment effect). Bizarre as well. For details see Example 3 in this article.

In February 2005 the FDA published the Electronic Orange Book (EOB), which is updated daily. It can be searched by proprietary name, active ingredient, applicant (company), application number, dosage form, route of administration, or patent number. It also gives a list of newly added or delisted patents.

The first bioanalytical method validation guidance was published by the FDA in 2001 and revised in 2018.114 115 Before the European draft guideline was published in 2009,116 some inspectors raised an eyebrow if sites worked according to the FDA’s guidance.

The validation of bioanalytical methods and the analysis of study samples should be per­form­ed in accordance with the principles of Good Laboratory Practice (GLP). However, as human bio­ana­ly­ti­cal studies fall outside of the scope of GLP […], the sites con­duct­ing the human studies are not required to be mo­ni­tored as part of a national GLP compliance programme.
EMEA (2009)116

Well roared, lions! My CRO (in Austria) had been GLP-certified since 1991, although we performed only phase I studies. In other countries (e.g., Spain), this was not possible. In Germany GLP is subject to state law; hence, it was possible to get certified in one federal state but not in another… However, this ‘issue’ was resolved with the final guideline published in 2011117 and the ICH M10 guideline of 2022,118 119 superseding all local guidelines.

It must be mentioned that the EMA requires different PK metrics to assess the minimum concentration of modified release (MR) formulations in steady-state.120 Originators have to assess the minimum concentration within the dosing interval \(\small{(C_\text{ss,min})}\), where­as generic companies have to assess the minimum concentration at the end of the dosing interval \(\small{(C_{\text{ss}\,,\tau})}\). If there is a lag-time, the latter is more difficult due to its higher vari­abi­lity.121 Why double standards?

In June 2010 the FDA started to publish Product-Specific Guidances (PSGs).122 They are available online (as of May 16, 2024 an amazing 2,213) and can be searched by active ingredient or RLD. Many PSGs remain drafts for a long time. For example, of the 131 PSGs starting with the letter P, only ten are final and some have been in draft state for 13 years.

top of section ↩︎ previous section ↩︎

PBE/IBE

    

After a wealth of – controversial – publications in the 1990s,79 80 81 123 124 125 126 127 128 129 130 131 the FDA introduced two new concepts as alternatives to average bioequivalence (ABE), namely population bioequivalence (PBE) and individual bioequivalence (IBE).132 ABE focuses only on the comparison of population averages of the PK metrics and not on the variances of the formulations. Nor does it assess a subject-by-formulation interaction variance, that is, the variation in the average \(\small{\text{T}}\) and \(\small{\text{R}}\) difference among individuals. In contrast, PBE and IBE include comparisons of both averages and variances of PK metrics. The PBE approach assesses the total variability of the PK metrics in the population. The IBE approach assesses the within-subject variability of the \(\small{\text{T}}\) and \(\small{\text{R}}\) formulations, as well as the subject-by-formulation interaction.

Demonstrated PBE would support ‘Prescribability’ (i.e., a drug-naïve patient could start treatment), whereas IBE would support ‘Switchability’ (i.e., a patient could switch formulations during treatment).131 Contrary to ABE, both PBE and IBE require studies in a full replicate design, which means that both \(\small{\text{T}}\) and \(\small{\text{R}}\) are administered twice. The acceptance limits for ABE were kept at 80.00 – 125.00% but for the others scaling to the variability of the reference was possible. That would mean an incentive for test formulations with lower variability than the reference but a penalty for ones with higher variability.

However, the underlying statistical concepts were not trivial and the results practically incomprehensible for non-statisticians. Furthermore, both approaches had a discontinuity (when moving from constant- to reference-scaling), which led to an inflated type I error (patient’s risk) of approximately 6.5%.128 129 132 133 134
PBE/IBE faced criticism, e.g.,

responses [to the guidance] were still doubt-filled as to whether the new bioequivalence criteria really provided added value compared to average bioequivalence135

and was regarded a

‘theoretical’ solution to a ‘theoretical’ problem136 137

leading to its omission from a subsequent guid­ance,138 and a return to con­ventional ABE.139

Average bioequivalence should suffice based upon grounds of ‘practicality, plausibility, his­to­ri­cal adequacy, and purpose’ and ‘because we have better things to do.’ […] ‘Sta­tis­ti­ci­ans have a bad track record in bioequivalence, […] the literature is full of ludicrous recommendations from statisticians, […] regulatory recommendations (of dubious validity) have been hastily implemented, and practical realities have been ignored’.
Stephen Senn (2000)140
Individual bioequivalence is a promising, clinically relevant method that should theoretically provide further confidence to cli­ni­cians and patients that generic drug products are indeed equi­va­lent in an individual patient.
Even today, considering the studies summarized and analyzed by the FDA, the data is inadequate to validate the theoretical approach and provide confidence to the scientific community that the methodology required and the expense entailed are justified.
At this time, individual bioequivalence still remains a theoretical solution to solve a theoretical clinical problem. We have no evidence that we have a clinical problem, either a safety or an efficacy issue, and we have no evidence that if we have the problem that individual bioequivalence will solve the problem.

I remember a Dutch regulator standing up in the BioInternational conference in London 2003, saying:

I’m glad that PBE and IBE are dead. I never understood them.

top of section ↩︎ previous section ↩︎

SABE

    
Another possibility to deal with Highly Variable Drug Products – which require extremely large sample sizes in ABE – is the concept of Scaled Average Bioequivalence (SABE). It was a topic at meetings of the FDA Advisory Committee for Pharmaceutical Science (7 May 1997, 23 September 1999, 14 April 2004, 10 October 2006). In a meeting of the Therapeutic Products Directorate (26–27 June 2003), Health Canada stated:

We don’t see a problem, our database search showed that even HVDPs comply with the usual BE criteria.

However, this observation was based on its requirement that only the PE of \(\small{C_\text{max}}\) has to lie within 80.0–125.0%.89
SABE was also discussed at the Bio­Inter­national 2005 (London).141 As already suggested by Benet at the Bio­Inter­na­tio­nal in 1994,101 innovators should be encouraged to provide an estimate of the within-subject variability upon approval.

The EMA published a concept paper in 2006, containing valuable points for discussion.142

  • What are the best methods to provide evidence that a medicinal product is a HVDP?
  • Describe different approaches to bioequivalence of HVDP, with benefits and drawbacks for regulatory purposes.
  • For the SABE concept:
    • Define the recommended study designs.
    • Define the acceptance range for this new approach.
    • Suggest the recommended statistical and computational analyses, including the estimation of the within-subject variances of the two formulations and the determination of BE. A technical appendix will describe the recommended computational methods.
    • Decide whether any additional constraints are necessary.
    • Decide what to do if the within-subject variance ratio shows that the test product is more variable than the reference product.
    • Decide how to define and how to handle outliers with this approach.
The concept paper was deleted from the EMA’s website in October 2007 – an unprecedented case…

Application of SABE was not limited to a certain PK metric. Furthermore, a comparison of \(\small{s_{\text{wT}}^{2}}\) with \(\small{s_{\text{wR}}^{2}}\) would require a full replicate design.

Who controls the past controls the future: who controls the present controls the past.

SABE was introduced in 2010, first by the EMA,143 shortly afterwards by the FDA,144 145 in 2017 by the WHO,146 and in 2018 by Health Canada.147

Terminology:

  1. A Highly Variable Drug (HVD) shows a within-subject Coefficient of Variation of the Reference (\(\small{CV_\text{wR}}\)) > 30% if administered as a solution in a replicate design. The high variability is an intrinsic property of the drug (absorption, permeation, clearance – in any combination).
  2. A Highly Variable Drug Product (HVDP) shows a \(\small{CV_\text{wR}}\) > 30% in a replicate design.148
    

The concept of SABE is based on the following considerations:

  1. HVD(P)s are safe and efficacious despite their high variability because:
    1. They have a wide therapeutic index (i.e., a flat dose-response curve). Consequently, even substantial changes in concentrations have only a limited impact on safety and efficacy.
      If they had a narrow therapeutic index, adverse effects (due to high concentrations) and lacking effects (due to low concentrations) would have been observed in phase II (or in phase III at the latest) and therefore, the originator’s product would not have been approved in the first place.149
    2. Once approved, the product has a documented safety / efficacy record in phase IV and in clinical practice. If problems had become evident, the product would have been taken off the market.
  2. Given that, the conventional ‘clinically relevant difference’ \(\small{\Delta=20\%}\) in ABE is considered overly conservative and hence, requires large sample sizes.
  3. Thus, a more relaxed \(\small{\Delta>20\%}\) was proposed. A natural approach is to scale (expand / widen) the limits based on the within-subject variability of the reference product \(\small{\sigma_\text{wR}}\).150
    

The conventional model of ABE by \(\small{(7)}\) is modified in SABE to \[H_0:\;\frac{\mu_\text{T}}{\mu_\text{R}}\Big{/}\sigma_\text{wR}\not\subset\left\{\theta_{\text{s}_1},\,\theta_{\text{s}_2}\right\}\;vs\;H_1:\;\theta_{\text{s}_1}<\frac{\mu_\text{T}}{\mu_\text{R}}\Big{/}\sigma_\text{wR}<\theta_{\text{s}_2},\tag{13}\] where \(\small{\sigma_\text{wR}}\) is the standard deviation of the reference. The scaled limits \(\small{\left\{\theta_{\text{s}_1},\,\theta_{\text{s}_2}\right\}}\) of the acceptance range depend on conditions given by the agency.

    

Reference-Scaled Average Bioequivalence (RSABE)151 is recommended by the FDA and China’s CDE. Average Bioequivalence with Expanding Limits (ABEL)152 is another variant of SABE, recommended in all other jurisdictions. In order to apply these methods, the following conditions have to be fulfilled:

  1. The study has to be performed in a replicate design, i.e., at least the reference product has to be administered twice.
  2. The observed within-subject variability of the reference product has to be high (in RSABE \(\small{s_\text{wR}\geq 0.294}\)153 and in ABEL \(\small{CV_\text{wR}>30\%}\)); the relationship between these two cut-offs is sketched after this list.
    Agencies are only interested in the va­ri­ability of the reference product, al­though for the spon­sor the one of the test is ‘nice to know’ as well.
  3. ABEL only:
    1. A clinical justification must be given that the expanded limits will not impact safety / efficacy.
    2. There is an ‘upper cap’ of scaling (\(\small{uc=50\%}\), except for Health Canada, where \(\small{uc\approx57.382\%}\)147), i.e., the expansion is limited to 69.84 – 143.19% or 67.7 – 150.0%, respectively (see this article for the con­tra­diction with \(\small{uc=57.4\%}\) given in the Canadian guidance).
    3. It has to be demonstrated that the high variability of the reference is not caused by ‘outliers’.
      It should be noted that large deviations between geometric mean ratios arise as a natural, direct consequence of the high variability. Since extreme values are common for HVD(P)s, assessment of ‘outliers’ is not required by Brazil’s ANVISA and Chile’s ANAMED.
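
The two switching conditions of item 2 above are almost, but not exactly, identical; for log-normal data \(\small{CV_\text{w}=\sqrt{\exp(s_\text{w}^{2})-1}}\). A minimal sketch in base R:

swR.RSABE <- 0.294 # switching condition of RSABE
CVwR.ABEL <- 0.30  # switching condition of ABEL
cat(sprintf("swR = %.3f corresponds to CVwR = %.2f%%\n",
            swR.RSABE, 100 * sqrt(exp(swR.RSABE^2) - 1)))
cat(sprintf("CVwR = %g%% corresponds to swR = %.7f\n",
            100 * CVwR.ABEL, sqrt(log(CVwR.ABEL^2 + 1))))
# swR = 0.294 corresponds to CVwR = 30.05%
# CVwR = 30% corresponds to swR = 0.2935604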

In all methods a point estimate constraint is imposed: even if a study passes the scaled limits, the PE has to lie within 80.00 – 125.00%. Whilst the PE-constraint is statistically not justified, it was implemented in all jurisdictions ‘for political reasons’.154

  1. There is no scientific basis or rationale for the point estimate recommendations
  2. There is no belief that addition of the point estimate criteria will improve the safety of approved generic drugs
  3. The point estimate recommendations are only “political” to give greater assurance to clinicians and patients who are not familiar (don’t understand) the statistics of highly variable drugs

Compared to ABE, SABE leads to a substantial reduction in sample sizes (see this article). However, both RSABE and ABEL may result in an inflated type I error (patient’s risk),104 which was already described in 2009151 155 (before [sic] SABE was implemented) and is still an unresolved issue156 157 (see also this article).

top of section ↩︎ previous section ↩︎

RSABE

    

The FDA published SAS code144 151 but it is a mystery why a fixed-effects model for the partial replicate design and a mixed-effects model for a full replicate design were recommended. If you understand why, please let me know.

    

If \(\small{s_\text{wR}<0.294}\), ABE has to be assessed by \(\small{(7)}\) and \(\small{\Delta=20\%}\) (90% CI entirely within 80.00 – 125.00%).
It must be mentioned that if the study was performed in a partial replicate design, the model is over-specified and the optimizer of any (‼) software might not converge (for details see this article).

    

If \(\small{s_\text{wR}\geq0.294}\), RSABE should be applied. The regulatory constant is given by \[\theta_\text{s}=\frac{\log_{e}1.25}{s_0}\approx 0.8925742\ldots\small{\textsf{,}}\tag{14}\] where \(\small{s_0}\) is the regulatory switching condition \(\small{0.25}\). The point estimate \(\small{PE}\) is given by \(\small{\overline{Y}_\text{T}-\overline{Y}_\text{R}}\), where \(\small{\overline{Y}_\text{T}}\) and \(\small{\overline{Y}_\text{R}}\) are the means of \(\small{\log_{e}}\)-transformed PK-metrics obtained for the test and reference products, respectively. The standard error \(\small{se}\) of the \(\small{PE}\) is \[se=\sqrt{\frac{\widehat{s}}{{N_{s}}^{2}}\sum \frac{1}{n_i}}\small{\textsf{,}}\tag{15}\] where \(\small{\widehat{s}}\) is the model’s residual mean square, \(\small{N_\text{s}}\) is the number of sequences, and \(\small{n_i}\) is the number of subjects in sequence \(\small{i}\). We start with the SABE model \(\small{(13)}\) and work with \(\small{\log_{e}\textsf{-}}\)transformed values for convenience \[-\theta_\text{s}\leq\frac{\mu_\text{T}-\mu_\text{R}}{\sigma_\text{wR}}\leq\theta_\text{s}\tag{16}\] and use its squared and linearized form \[\left(\mu_\text{T}-\mu_\text{R}\right)^2-{\theta_{s}}^{2}\cdot{\sigma_{\text{wR}}}^{2}\leq0\small{\text{.}}\tag{17}\] Upon inspecting part of the SAS code in the FDA’s guidance…151

  pointest=exp(estimate);
  x=estimate**2-stderr**2;
  theta=((log(1.25))/0.25)**2;
  y=-theta*s2wr;

…we see that stderr**2, i.e., \(\small{se^2}\) from \(\small{(15)}\), is inserted in the left-hand side of \(\small{(17)}\) – which is formulated in the true para­me­ters – yielding for the estimates \[PE^2-se^2-{\theta_{s}}^{2}\cdot {s_{\text{wR}}}^{2}\leq0\small{\textsf{.}}\tag{18}\] This is not stated as such in the formulas of the guidance. We are aware of only one reference,158 which is – regrettably – not in the public domain.

The statistical approach we use is very similar to that proposed by Tothfalusi, Endrenyi, et al. 2001,159 with a minor difference (use of an unbiased estimator for \(\small{\left(\mu_\text{T}-\mu_\text{R}\right)^2}\)).
Donald Schuirmann (2016)158

Then \[\eqalign{ E_\text{m}&=PE^2-se^2\\ E_\text{s}&=-{\theta_{s}}^{2}\cdot {s_{\text{wR}}}^{2} }\tag{19}\] are calculated, where \(\small{E_\text{m}}\) and \(\small{E_\text{s}}\) are the estimates of the true parameters (\(\small{se^2}\) acts again as a bias correction). Since their distributions are known, their upper confidence limits \(\small{C_\text{m}}\) and \(\small{C_\text{s}}\) can be calculated by \[\eqalign{ C_\text{m}&=\left(\left|PE\right|+t_{1-\alpha,\,\nu}\cdot se\right)^2\\ C_\text{s}&=E_\text{s}\cdot \nu\big{/}\chi_{1-\alpha,\,\nu}^{2}\small{\textsf{,}} }\tag{20}\] where \(\small{\nu}\) are the degrees of freedom given by \(\small{\sum n-N_\text{s}}\). A modification160 of Howe’s approximation161 is used in order to get the CI of a sum of random variables from the individual CIs. The squared lengths of the individual CIs are: \[\eqalign{ L_\text{m}&=\left(C_\text{m}-E_\text{m}\right)^2\\ L_\text{s}&=\left(C_\text{s}-E_\text{s}\right)^2\small{\textsf{.}} }\tag{21}\] Finally we calculate the 95% upper confidence bound: \[\small{\textsf{bound}}=E_\text{m}+E_\text{s}+\sqrt{L_\text{m}+L_\text{s}}\tag{22}\]

    

In order to pass RSABE:

  1. \(\small{\textsf{bound}}\) by \(\small{(22)}\) has to be \(\small{\leq0}\).
  2. The PE has to lie within 80.00 – 125.00%.
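
A minimal sketch of \(\small{(19)-(22)}\) in base R, taking the point estimate, its standard error \(\small{(15)}\), \(\small{s_\text{wR}}\), and the degrees of freedom as given; the numbers below are invented for illustration only.

PE      <- log(1.10)          # point estimate (log-scale), assumed
se      <- 0.08               # standard error of the PE, assumed
swR     <- 0.35               # within-subject standard deviation of R, assumed
nu      <- 45                 # degrees of freedom, assumed
theta.s <- log(1.25) / 0.25   # regulatory constant (14)
Em      <- PE^2 - se^2        # (19)
Es      <- -theta.s^2 * swR^2 # (19)
Cm      <- (abs(PE) + qt(1 - 0.05, nu) * se)^2 # (20)
Cs      <- Es * nu / qchisq(1 - 0.05, nu)      # (20)
Lm      <- (Cm - Em)^2        # (21)
Ls      <- (Cs - Es)^2        # (21)
bound   <- Em + Es + sqrt(Lm + Ls)             # (22)
cat(sprintf("bound = %.3f (%s), PE = %.2f%% (%s)\n",
            bound, ifelse(bound <= 0, "pass", "fail"),
            100 * exp(PE),
            ifelse(exp(PE) >= 0.80 & exp(PE) <= 1.25, "pass", "fail")))
# bound = -0.038 (pass), PE = 110.00% (pass)

With a negative bound and the PE within 80.00 – 125.00%, this hypothetical study would pass RSABE.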

top of section ↩︎ previous section ↩︎

ABEL

    

Although the EMA’s concept paper stated142 that the statistical and computational methods would be given in the guideline, this was not the case.143 SAS code and two example data sets were published later in a Q&A document.162 The evaluation has to be done with a simple ANOVA, i.e., assuming identical within-subject variances of the test and reference products. Methods to identify and handle outliers were not given.

If \(\small{CV_\text{wR}\leq30\%}\), ABE has to be demonstrated by \(\small{(7)}\) and \(\small{\Delta=20\%}\) (90% CI entirely within 80.00 – 125.00%).

Otherwise, ABEL can be applied and the limits expanded to \(\small{\left\{L,U\right\}=100\exp(\mp k\cdot s_\text{wR})}\), with the regulatory constant \(\small{k=0.76}\). The scaling is capped at 50% for all agencies (maximum expansion 69.84 – 143.19%), except for Health Canada at ≈57.382% (67.7 – 150.0%).

invisible(library(PowerTOST))
CVwR      <- 100 * sort(c(seq(0.3, 0.6, 0.05), 0.57382))
EL        <- data.frame(CVwR  = CVwR,
                        EMA.L = NA_real_, EMA.U = NA_real_,
                        HC.L  = NA_real_, HC.U  = NA_real_)
EMA       <- scABEL(CV = CVwR / 100, regulator = "EMA")
HC        <- scABEL(CV = CVwR / 100, regulator = "HC")
EL[, 1]   <- sprintf("%.3f%%", EL[, 1])
EL[, 2:3] <- sprintf("%.2f%%", 100 * EMA)
EL[, 4:5] <- sprintf("%.1f%%", 100 * HC)
names(EL)[2:5] <- c("L (EMA)", "U (EMA)", "L (HC)", "U (HC)")
print(EL, row.names = FALSE)
#     CVwR L (EMA) U (EMA) L (HC) U (HC)
#  30.000%  80.00% 125.00%  80.0% 125.0%
#  35.000%  77.23% 129.48%  77.2% 129.5%
#  40.000%  74.62% 134.02%  74.6% 134.0%
#  45.000%  72.15% 138.59%  72.2% 138.6%
#  50.000%  69.84% 143.19%  69.8% 143.2%
#  55.000%  69.84% 143.19%  67.7% 147.8%
#  57.382%  69.84% 143.19%  66.7% 150.0%
#  60.000%  69.84% 143.19%  66.7% 150.0%

It has to be demonstrated that the high \(\small{CV_\text{wR}}\) is not caused by outliers. If outliers are detected, they have to be excluded and \(\small{CV_\text{wR}}\) as well as \(\small{\left\{L,U\right\}}\) recalculated. However, the 90% CI has to be calculated with complete data.

    

In order to pass ABEL:

  1. The 90% CI has to lie entirely within \(\small{\left\{L,U\right\}}\).
  2. The PE has to lie within 80.00 – 125.00%.
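
A minimal sketch of the two criteria in base R; \(\small{CV_\text{wR}}\), the point estimate, and its 90% CI below are invented for illustration only.

CVwR <- 0.42                           # observed within-subject CV of R (assumed)
PE   <- 1.07                           # point estimate T/R (assumed)
CI   <- c(lower = 0.88, upper = 1.31)  # 90% confidence interval (assumed)
k    <- 0.76                           # regulatory constant
uc   <- 0.50                           # upper cap of scaling (EMA)
swR  <- sqrt(log(min(CVwR, uc)^2 + 1)) # swR, scaling capped at uc
L.U  <- exp(c(-1, +1) * k * swR)       # expanded limits {L, U}
pass <- CI["lower"] >= L.U[1] & CI["upper"] <= L.U[2] &
        PE >= 0.80 & PE <= 1.25
cat(sprintf("expanded limits %.2f%% – %.2f%%: %s\n",
            100 * L.U[1], 100 * L.U[2],
            ifelse(pass, "pass", "fail")))
# expanded limits 73.61% – 135.84%: pass

The same limits are obtained with scABEL(CV = 0.42) of PowerTOST used above.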

top of section ↩︎ previous section ↩︎

NTIDs

    

Based on \(\small{\Delta=10\%}\) for Narrow Therapeutic Index Drugs (EMA) and Critical Dose Drugs (Health Canada) the BE limits may need to be narrowed143 147 or scaled150 163 (FDA and China’s CDE).

  1. For the EMA this generally means acceptance limits of 90.00 – 111.11% for \(\small{AUC}\) only. Where \(\small{C_\text{max}}\) is of particular importance for safety, efficacy, or drug level monitoring, 90.00 – 111.11% should be applied as well. It is not possible to define a set of criteria to categorize drugs as NTIDs; it must be decided case by case, based on clinical considerations, whether an active substance is an NTID. However, according to all PSGs published so far, 90.00 – 111.11% is recommended only for \(\small{AUC}\).
  2. For Health Canada acceptance limits are 90.0 – 112.0% (see also this article) for \(\small{AUC}\), whereas for \(\small{C_\text{max}}\) the 90% CI has to be assessed for the conventional limits of 80.0 – 125.0%.
  3. The FDA and China’s CDE require RSABE based on \(\small{s_\text{wR}}\). The study has to be performed in a 2-treatment 2-sequence 4-period full replicate design, thus allowing a comparison of \(\small{s_\text{wT}}\) with \(\small{s_\text{wR}}\).

With the regulatory switch­ing condition \(\small{s_0=0.10}\) we get the regulatory constant by \[\theta_\text{s}=\frac{\log_{e}1.11111}{s_0}\approx 1.053595\ldots\tag{23}\] The 95% upper confidence bound is determined with \(\small{\theta_\text{s}}\) by \(\small{(15)-(22)}\).
The upper CL for \(\small{\sigma_\text{wT}/\sigma_\text{wR}}\) is calculated by \[\frac{s_\text{wT}/s_\text{wR}}{\sqrt{F_{{1-\alpha/2},\nu_1,\nu_2}}}\small{\textsf{,}}\tag{24}\] where \(\small{s_\text{wT}}\) is the estimate of \(\small{\sigma_\text{wT}}\) with \(\small{\nu_1}\) degrees of freedom, \(\small{s_\text{wR}}\) is the estimate of \(\small{\sigma_\text{wR}}\) with \(\small{\nu_2}\) degrees of freedom, and \(\small{F}\) is the value of the F-distribution with \(\small{\nu_1}\) (numerator) and \(\small{\nu_2}\) (denominator) degrees of freedom for \(\small{\alpha=0.1}\).
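
A minimal sketch of \(\small{(24)}\) in base R; the estimates and degrees of freedom below are invented for illustration only. Note that qf() expects cumulative probabilities, i.e., the lower \(\small{\alpha/2}\) quantile (upper-tail probability \(\small{1-\alpha/2}\)) gives the upper confidence limit.

swT   <- 0.11 # estimate of sigma-wT (assumed)
swR   <- 0.10 # estimate of sigma-wR (assumed)
nu1   <- 22   # degrees of freedom of swT (assumed)
nu2   <- 22   # degrees of freedom of swR (assumed)
alpha <- 0.1
# lower alpha/2 quantile of F (upper-tail probability 1 - alpha/2)
upper <- (swT / swR) / sqrt(qf(alpha / 2, df1 = nu1, df2 = nu2))
cat(sprintf("upper CL of swT/swR = %.4g (%s)\n",
            upper, ifelse(upper <= 2.5, "pass", "fail")))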

    

In order to pass:

  1. \(\small{\textsf{bound}}\) by \(\small{(22)}\) has to be \(\small{\leq0}\).
  2. The upper CL by \(\small{(24)}\) has to be \(\small{\leq 2.5}\).
  3. ABE has to be demonstrated by \(\small{(7)}\) and \(\small{\Delta=20\%}\) (90% CI entirely within 80.00 – 125.00%).
    

The last condition is equivalent to capping the ‘implied’ limits \(\small{\left\{L,U\right\}}\) of RSABE at \(\small{CV_\text{wR}\approx21.42\%}\). Otherwise, for any larger \(\small{CV_\text{wR}}\) they would be wider than 80.00 – 125.00%. Of course, that is not what we want for an NTID. We can show that numerically.

fun <- function(x, Delta, sigma.0) { # x is CVwR
  theta.s   <- log(Delta) / sigma.0  # regulatory constant
  swR       <- sqrt(log(x^2 + 1))    # within subject standard deviation of R
  U         <- exp(theta.s * swR)    # upper ‘implied’ (scaled) limit
  objective <- U - 1.25              # target zero
  return(objective)
}
Delta   <- 1.11111 # approximate acc. to the guidance (not the exact 1/0.9)
sigma.0 <- 0.10    # regulatory switching condition
# numerically find the CV where U ≈1.25
CVcap   <- 100 * uniroot(fun, interval = c(0, 0.3), tol = 1e-8,
                         Delta, sigma.0)$root
# check the ‘implied’ limits
CVwR    <- sort(c(CVcap / 100, seq(0.05, 0.3, 0.05)))
comp    <- data.frame(CVwR = CVwR, L.implied = NA_real_, U.implied = NA_real_,
                      L.capped = NA_real_, U.capped = NA_real_)
f       <- c(-1, +1)
for (i in seq_along(CVwR)) {
  comp[i, 2:5] <- 100 * exp(f * log(Delta) / sigma.0 *
                            sqrt(log(CVwR[i]^2 + 1)))
  if (comp$CVwR[i] >= CVcap / 100) {
    comp[i, 4:5] <- 100 * exp(f * log(Delta) / sigma.0 *
                              sqrt(log((CVcap / 100)^2 + 1)))
  }
}
comp$CVwR <- 100 * comp$CVwR
txt       <- sprintf("The ‘implied’ limits in RSABE are capped at CVwR %.9g%%.\n", CVcap)
cat(txt); print(comp, row.names = FALSE)
# The ‘implied’ limits in RSABE are capped at CVwR 21.4189888%.
#      CVwR L.implied U.implied L.capped U.capped
#   5.00000  94.87150  105.4057 94.87150 105.4057
#  10.00000  90.02367  111.0819 90.02367 111.0819
#  15.00000  85.45665  117.0184 85.45665 117.0184
#  20.00000  81.16742  123.2021 81.16742 123.2021
#  21.41899  80.00000  125.0000 80.00000 125.0000
#  25.00000  77.15013  129.6174 80.00000 125.0000
#  30.00000  73.39651  136.2463 80.00000 125.0000

top of section ↩︎ previous section ↩︎

BCS-based Biowaivers

    

Introduced by the FDA in 2000,164 138 by the EMA in 2010,143 by the WHO in 2017,146 and adopted by the ICH in 2019,165 BCS-based biowaivers are an alternative to in vivo testing of IR products. They are based on the Biopharmaceutic Classification System, where drugs are classified by their solubility and permeability.85

\[\small{\begin{array}{cc}\hline \textbf{Class I} & \textbf{Class II}\\\hline \text{High solubility} & \text{Low solubility}\\ \text{High permeability} & \text{High permeability}\\\hline \textbf{Class III} & \textbf{Class IV}\\\hline \text{High solubility} & \text{Low solubility}\\ \text{Low permeability} & \text{Low permeability}\\\hline\end{array}}\]

The idea behind waiving an in vivo study is based on the fact that such studies are not required for aqueous solutions (see above). Thus, if a drug product dissolves very rapidly, it can be expected to behave similarly to a solution.

A BCS-based biowaiver may be acceptable if the drug substance has been proven to exhibit high solubility and complete absorption (Class I), either very rapid (> 85% within 15 min) or similarly rapid (85% within 30 min) in vitro dissolution characteristics of the test and reference product have been demonstrated considering specific requirements, and excipients that might affect BA are qualitatively and quantitatively the same. In general, the use of the same excipients in similar amounts is preferred.

BCS-based biowaivers may also be acceptable if the drug substance has been proven to exhibit high solubility and limited absorption (Class III), very rapid (> 85% within 15 min) in vitro dissolution of the test and reference product has been demonstrated considering specific requirements, excipients that might affect BA are qualitatively and quantitatively the same, and other excipients are qualitatively the same and quantitatively very similar.

The following conditions should be employed in the comparative dissolution studies to characterize the dissolution profile of the products:165

  • Paddle or basket apparatus.
  • Volume of dissolution medium: ≤ 900 mL; preferably the volume of the QC test.
  • Temperature of the dissolution medium: 37 ±1 ℃.
  • Agitation: 50 rpm (paddle apparatus), 100 rpm (basket apparatus).
  • ≥ 12 units of the reference and test product.
  • Three buffers: pH 1.2, 4.5, and 6.8. Pharmacopoeial buffers should be employed. Additional investigation may be required at the pH of minimum solubility (if different from the buffers above).
  • Organic solvents are not acceptable and no surfactants should be added.
  • Samples should be filtered during collection, unless an in-situ detection method is used.
  • For gelatin capsules or tablets with gelatin coatings where cross-linking has been demonstrated, the use of enzymes may be acceptable, if appropriately justified.

When high variability or coning is observed in the paddle apparatus at 50 rpm for both reference and test products, the use of the basket apparatus at 100 rpm is recommended. Additionally, alternative methods (e.g., the use of sin­kers or other appropriately justified approaches) may be considered to overcome issues such as coning, if scientifically substantiated.165

The evaluation of the similarity factor \(\small{f_2}\) is based on the following conditions:165

  • A minimum of three time points (zero excluded).
  • The time points should be the same for the two products.
  • Mean of the individual values for every time point for each product.
  • Not more than one mean value of ≥ 85% dissolved for either of the products.
  • The coefficient of variation (CV) of mean values should not be > 20% at early time points (≤ 10 min) and should not be > 10% at other time points. When the CV is too high, the \(\small{f_2}\) calculation is considered inaccurate and a conclusion on similarity in dissolution cannot be made.

A risk assessment of potential bioinequivalence by application of a biowaiver has to be provided, which has to be stricter for Class III than for Class I drugs.143 Biowaivers for NTIDs are not possible.

top of section ↩︎ previous section ↩︎

Open Issues

    
Approaches are not harmonized – unbelievable after six GBHI workshops!

  • Uncomplicated drugs
    • FDA, EMA, WHO: ABE (any metric):150 143 146 90% CI within 80.00–125.00%
    • Health Canada: ABE:147 \(\small{AUC}\) 90% CI within 80.0–125.0%; \(\small{C_\text{max}}\) PE within 80.0–125.0%
  • HVD(P)s
    • FDA: RSABE150
    • EMA: ABEL (\(\small{C_\text{max}}\),143 \(\small{\textsf{p}AUC}\)120); \(\small{uc}\) 50%
    • WHO: ABEL (\(\small{C_\text{max}}\),146 \(\small{AUC}\)166); \(\small{uc}\) 50%
    • Health Canada: ABEL/ABE:147 \(\small{AUC}\) \(\small{uc}\) 57.382%; \(\small{C_\text{max}}\) PE within 80.0–125.0%
  • NTIDs
    • FDA: RSABE150
    • EMA, WHO: ABE, 90% CI within 90.00–111.11% (EMA PSGs: only for \(\small{AUC}\))
    • Health Canada: ABE:147 \(\small{AUC}\) 90% CI within 90.0–112.0%; \(\small{C_\text{max}}\) 90% CI within 80.0–125.0%

This lack of harmonization leads to the paradoxical (though hypothetical) situation that the same study may pass in one jurisdiction but fail in another.104 156

Still unresolved, outlook:

  1. Control of the type I error in SABE?104 156 157
  2. Outliers in ABEL: Why and how?142 145 146 167
  3. Innovators should perform a study in a replicate design in order to obtain estimates of the within-subject variability of PK metrics.100 103 140 This would allow fixed, wider BE limits to be defined in PSGs and the type I error to be strictly controlled.104 157
  4. Method for NTIDs: Fixed narrower acceptance limits143 145 146 or reference-scaling?151 168 169
  5. Comparison of ‘early exposure’170 171 172 if clinically relevant (i.e., \(\small{t_\text{max}}\) by a nonparametric method or first partial \(\small{AUC}\))? See also this article.
  6. Selection of cut-off times of partial \(\small{AUC}\textsf{s}\) of multiphasic release formulations (i.e., based on PD – like the FDA – or on PK – like the EMA)? If a PK/PD relationship is lacking, the selection of cut-off times is challenging, to say the least.173
  7. Remove174 assessment3 175 of the group-by-treatment interaction?
  8. Is the questionable176 177 recommendation for the inclusion of male and female subjects3 146 151 driven by ‘gender politics’ rather than science?
  9. Use \(\small{C_\text{max}/AUC}\) as an alternative surrogate for the rate of absorption?38 39 40
  10. Reduce variability of the extent of absorption of HVDs by using \(\small{AUC/\widehat{t}_{1/2}}\) or \(\small{AUC/\widehat{\lambda}_z}\)?178 179 180 181
  11. Use \(\small{\widehat{C}_{\text{t}_\text{last}}}\)167 for the extrapolation of \(\small{AUC}\); see also this article.
  12. The requirement that \(\small{AUC_{0-\text{t}}}\) of IR products has to cover ≥ 80% of \(\small{AUC_{0-\infty}}\).3 143 146 It appeared out of the blue in the APV guideline of 198768 without any justification. Thoughtless copy & paste ever since? It is questionable because at \(\small{2-4\,\times}\) \(\small{t_\text{max}}\) absorption is practically complete.182 183
  13. While for all IR products \(\small{AUC_{0-72\text{h}}}\) (instead of \(\small{AUC_{0-\text{t}}}\)) is acceptable for the EMA and the WHO,143 146 the ICH, the FDA, and Health Canada accept that only for drugs with a half-life of > 24 hours.3 150 147 Why?
  14. The elastic clause »appropriate sample size«3 143 was borrowed from ICH M9.184 However, the latter was intended for a statistical audience knowing what that means. The 2001 guideline of the EMA,93 as well as the ones of the WHO146 and Health Canada167 are more specific.
  15. Testing of sequence, period, and formulation effects with an »explanation« for significant effects167 is ludicrous. Period effects cancel out in crossover studies; see this article about sequence effects as well as that one about treatment effects.
  16. Should studies in fed state be mandatory?
  17. Should the requirement of multiple dose studies of extended release products120 146 be abandoned?151 185 186
  18. Use \(\small{C_\text{ss,min}}\) in multiple dose studies also for generics?121
  19. Adaptive sequential Two-Stage Designs (TSDs).187 188 Only exact189 or simulation-based methods190 191 192 193 194 as well?
  20. Limit the potency-correction3 143 146 not only to cases where measured contents differ by more than 5%?72 The Ca­na­di­an requirement147 of demonstrating BE on both potency‐corrected and uncorrected data is peculiar.

<nitpick>

It is beyond me why the EMA’s guideline143 (based on the European legislation195) refers – apart from salts and esters – to different ethers.
  • Different salts dissociate to the same base and different anions.
  • Different esters hydrolyse to the same base and different alcohols.
  • Cleavage of ethers is impossible under physiological conditions.
    This statement demonstrates a lack of understanding of basic organic chemistry. Different ethers are different active moieties!

No other jurisdiction contains such a ridiculous statement.

Those people who think they know everything
are a great annoyance to those of us who do.

</nitpick>

top of section ↩︎ previous section ↩︎


TODO &c. 

  • Line-extensions and dose-proportionality biowaivers143 146 151
  • In Vivo-In Vitro Correlation120 143 196 197
  • Transdermal Drug Delivery Systems120 151
  • Fixed Dose Combination (FDC) products198 199
  • Orally Inhaled Products200 201 202
  • BE-drift203 204 205
  • Locally applied & locally acting (LALA) products206
  • Global comparator207 208 209 210 211
  • Adjusted indirect com­pa­ri­sons212 213 214 215 216 217 218
  • Group-sequential and Two-Stage Designs
  • Biosimilars
  • Current guidelines in various jurisdictions

See also a – somewhat outdated – collection of guidelines, my presentations, and further readings.134 135 173 219 220 221 222 223 224 225 226 227 228 229 230 231 To whom it may concern.232 233 234

    

A word of warning: The textbooks dealing mainly with statistics (marked with ) are rather tough cookies and not recommended for beginners.

Acknowledgments

Henning Blume and José A. Guimarães Morais for sharing memories about the Bio­Inter­national conferences and the early period of bioequivalence.

Postscript

I tried to give online resources as far as possible. Others were published before the internet was developed. I have them on yellowed or even faded thermal paper from (yes!) FAX machines. Some books are out of print; perhaps you can get them used. No, I will not sell any of them.

Licenses

CC BY 4.0 Helmut Schütz 2024
R GPL 3.0, klippy MIT, pandoc GPL 2.0.
1st version April 9, 2024. Rendered May 19, 2024 15:56 CEST by rmarkdown via pandoc in 0.11 seconds.

Abbreviations

Abbreviation Meaning
\(\small{\alpha}\) Nominal level of the test, probability of Type I Error (patient’s risk)
ABE Average Bioequivalence
ABEL Average Bioequivalence with Expanding Limits
ANDA Abbreviated New Drug Application (generics; FDA term)
ANOVA Analysis of Variance
API Active Pharmaceutical Ingredient
\(\small{AUC}\) Area Under the Curve
\(\small{AUC_{0-\text{t}}}\) \(\small{AUC}\) from the time of administration to the time of the last measurable concentration
\(\small{AUC_{0-72\text{h}}}\) \(\small{AUC}\) from the time of administration to 72 hours (IR products)
\(\small{AUC_{0-\infty}}\) \(\small{AUC}\) from the time of administration extrapolated to infinite time
BA Bioavailability
BCS Biopharmaceutic Classification System
BE Bioequivalence
BfArM Bundesinstitut für Arzneimittel und Medizinprodukte (German competent authority)
\(\small{\beta}\) Probability of Type II Error (producer’s risk), where \(\small{\beta=1-\pi}\)
\(\small{\textsf{bound}}\) 95% upper confidence bound in RSABE
CDE Center for Drug Evaluation (China)
CDER Center for Drug Evaluation and Research (FDA)
cGMP current Good Manufacturing Practices
CI Confidence Interval
CL Confidence Limit
\(\small{CL}\) Clearance
\(\small{C_\text{max}}\) Maximum concentration
\(\small{C_\text{ss,min}}\) Minimum concentration in steady-state within the dosing interval
\(\small{C_{\text{ss}\,,\tau}}\) Concentration in steady-state at the end of the dosing interval
\(\small{C_{\text{t}_\text{last}}}\) Last measured concentration
\(\small{\widehat{C}_{\text{t}_\text{last}}}\) Estimated concentration at the time point of \(\small{C_{\text{t}_\text{last}}}\)
\(\small{CV_\text{wR},CV_\text{wT}}\) Observed within-subject Coefficient of Variation of the Reference and Test product
\(\small{D}\) Dose
\(\small{\Delta}\) Clinically relevant difference
EMA European Medicines Agency
\(\small{E_\text{max}}\) Maximum effect
EOB Electronic Orange Book (FDA)
\(\small{f}\) Fraction absorbed
\(\small{f_2}\) Similarity factor
FDA U.S. Food and Drug Administration
FDC Fixed Dose Combination (Product)
GLP Good Laboratory Practice
\(\small{H_0}\) Null hypothesis
\(\small{H_1}\) Alternative hypothesis
HVD(P) Highly Variable Drug (Product)
IBE Individual Bioequivalence
ICH International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use
IR Immediate Release (product)
\(\small{k}\) Regulatory constant (0.76) in SABE
\(\small{k\,_\text{a}}\) Absorption rate constant
\(\small{k\,_\text{el}}\) Elimination rate constant
\(\small{L}\) Lower expanded limit in ABEL
\(\small{L,U}\) Expanded limits in ABEL
\(\small{\widehat{\lambda}_\text{z}}\) Apparent terminal rate constant (estimated)
LALA Locally applied, locally acting (product)
MR Modified release (product)
\(\small{\mu_\text{T}/\mu_\text{R}}\) True \(\small{\text{T}/\text{R}}\)-ratio
\(\small{\nu}\) Degrees of freedom
\(\small{n}\) Sample size
\(\small{n_1,n_2}\) Number of subjects in sequences 1 and 2 of a 2×2×2 crossover design
NDA New Drug Application (originators; FDA term)
NTID Narrow Therapeutic Index Drug (Canada: Critical Dose Drug)
OGD Office of Generic Drugs (FDA)
\(\small{\text{p}AUC}\) Partial \(\small{AUC}\)
\(\small{\pi}\) Prospective (a priori) power, where \(\small{\pi=1-\beta}\)
\(\small{\widehat{\pi}}\) Retrospective (post hoc, estimated) power
\(\small{p}\) Probability
PBE Population Bioequivalence
PD Pharmacodynamics
PE Point Estimate of \(\small{\mu_\text{T}/\mu_\text{R}}\)
PK Pharmacokinetics
PSG Product-Specific Guidance
\(\small{\text{R}}\) Reference product
RLD Reference Listed Drug (FDA term)
RSABE Reference-Scaled Average Bioequivalence
\(\small{s_0}\) Switching condition in RSABE: for HVD(P)s 0.25 and for NTIDs 0.1.
SABE Scaled Average Bioequivalence
SUPAC Scale-Up and Postapproval Changes (FDA)
\(\small{s_\text{wR},s_\text{wT}}\) Observed within-subject standard deviation of the Reference and Test product
\(\small{s_{\text{wR}}^{2},s_{\text{wT}}^{2}}\) Observed within-subject variance of the Reference and Test product
\(\small{\sigma_\text{wR}}\) True within-subject standard deviation of the Reference product
\(\small{\text{T}}\) Test product
TE Therapeutic Equivalence
\(\small{t_\text{last}}\) Time of the last measured concentration \(\small{C_{\text{t}_\text{last}}}\)
\(\small{t_\text{max}}\) Time of \(\small{C_\text{max}}\)
\(\small{\theta_\text{s}}\) Regulatory constant in RSABE: for HVD(P)s 0.8925742… and for NTIDs 1.053595…
\(\small{\theta_0}\) True (in sample size estimation assumed) \(\small{\text{T}/\text{R}}\)-ratio
\(\small{\theta_1,\theta_2}\) Fixed lower and upper limits of the BE acceptance range
\(\small{\theta_{\text{s}_1},\theta_{\text{s}_2}}\) Scaled lower and upper limits of the BE acceptance range
TIE Type I Error
TOST Two One-Sided Tests
TSD Two-Stage Design
\(\small{U}\) Upper expanded limit in ABEL
\(\small{uc}\) Upper cap of expansion in ABEL
\(\small{V}\) Apparent volume of distribution
WHO World Health Organization
2×2×2 2-treatment 2-sequence 2-period crossover design

Footnotes and References


  1. Vitti TG, Banes D, Byers TE. Bioavailability of Digoxin. N Engl J Med. 1971; 285(25): 1433–4. doi:10.1056/NEJM197112162852512.↩︎

  2. DeSante KA, DiSanto AR, Chodos DJ, Stoll RG. Antibiotic Batch Certification and Bioequivalence. JAMA. 1975; 232(13): 1349–51. doi:10.1001/jama.1975.03250130033016.↩︎

  3. ICH. Bioequivalence for Immediate-Release Solid Oral Dosage Forms. M13A. Draft version 20 December 2022. Online.↩︎

  4. Hall DG, In: Hearing Before the Subcommittee on Monopolies Select Committee on Small Business. U.S. Senate, Government Printing Office, Washington D.C. 1967: 258–81.↩︎

  5. Tyrer JH, Eadie MJ, Sutherland JM, Hooper WD. Outbreak of anticonvulsant intoxication in an Australian city. Br Med J. 1970; 4: 271–3. doi:10.1136/bmj.4.5730.271. Open Access Open Access.↩︎

  6. Bochner F, Hooper WD, Tyrer JH, Eadie MJ. Factors involved in an outbreak of pheny­to­in intoxications. J Neurol Sci. 1972; 16(4): 481–7. doi:10.1016/0022-510x(72)90053-6.↩︎

  7. Lund L. Clinical significance of generic inequivalence of three different pharmaceutical preparations of phenytoin. Eur J Clin Phar­ma­col. 1974; 7: 119–24. doi:10.1007/bf00561325.↩︎

  8. Lindenbaum J, Mellow MH, Blackstone MO, Butler VP. Variations in biological activity of digoxin from four preparations. N Engl J Med. 1971; 285(24): 1344–7. doi:10.1056/NEJM197112092852403.↩︎

  9. Wagner JG, Christensen M, Sakmar E, Blair D, Yates JD, Willis PW 3rd, Sedman AJ, Stoll RG. Equivalence lack in digoxin plasma levels. JAMA, 1973; 224(2): 199–204. PMID 4739492.↩︎

  10. Lindenbaum J, Preibisz JJ, Butler VP Jr., Saha JR. Variation in digoxin bioavailability: a continuing problem. J Chron Dis. 1973; 16: 749–54. doi:10.1056/nejm197112092852403.↩︎

  11. Levy G, Gibaldi M. Bioavailability of Drugs. Focus on Digoxin. Circulation. XLIX(3); 1974: 391–4. doi:10.1161/01.CIR.49.3.391. Open Access Open Access.↩︎

  12. Jounela AJ, Pentikäinen PJ, Sothmann. Effect of particle size on the bioavailability of digoxin. Eur J Clin Pharmacol. 1975; 8(5): 365–70. doi:10.1007/BF00562664.↩︎

  13. Richton-Hewett S, Foster E, Apstein CS. Medical and Economic Consequences of a Blinded Oral Anticoagulant Brand Change at a Municipal Hospital. Arch Intern Med. 1988; 148(4): 806–8. doi:10.1001/archinte.1988.00380040046010.↩︎

  14. Weinberger M, Hendeles L, Bighley L, Speer J. The Relation of Product Formulation to Absorption of Oral Theo­phyl­line. N Engl J Med. 1978; 299(16): 852–7. doi:10.1056/nejm197810192991603.↩︎

  15. Bielmann B, Levac TH, Langlois Y, Tetreault L. Bioavailability of primidone in epileptic patients. Int J Clin Pharmacol. 1974; 9(2): 132–7. PMID 4208031↩︎

  16. Skelly JP, Knapp G. Biologic availability of digoxin tablets. JAMA. 1973; 224(2): 243. doi:10.1001/jama.1973.03220150051015.↩︎

  17. Skelly JP. A History of Biopharmaceutics in the Food and Drug Administration 1968–1993. AAPS J. 2010; 12(1): 44–50. doi:10.1208/s12248-009-9154-8. PMC Free Full Text Free Full Text.↩︎

  18. APhA Academy of Pharmaceutical Sciences. Guidelines for Biopharmaceutic Studies in Man. Washington D.C. February 1972.↩︎

  19. Skelly JP. Bioavailability and Bioequivalence. J Clin Phar­ma­col. 1976; 16(10/2): 539–45. doi:10.1177/009127007601601013.↩︎

  20. Gardener S (Acting Commissioner of Food and Drugs). CFR, Title 21, Vol. 5, Chapter I, Part 320. Bioavailability and Bio­equi­va­lence Requirements. Procedures for Determining the In Vivo Bioavailability of Drug Products. December 30, 1976. Effective July 7, 1977, In: FR, Vol. 42, No. 5. January 7, 1977. Online.↩︎

  21. Schuirmann DJ. A comparison of the Two One-Sided Tests Procedure and the Power Ap­proach for Assessing the Equivalence of Av­er­age Bioavailability. J Pharmacokin Bio­pharm. 1987; 15(6): 657–80. doi:10.1007/BF01068419.↩︎

  22. Metzler CM. Bioavailability – A Problem in Equivalence. Biometrics. 1974; 30(2): 309–17. PMID 4833140.↩︎

  23. Westlake WJ. Symmetrical Confidence Intervals for Bioequivalence Trials. Bio­me­trics. 1976; 32(4): 741–4. PMID 1009222.↩︎

  24. Mantel N. Do We Want Confidence Intervals Symmetrical About the Null Value? Bio­me­trics. 1977; 33: 759–60. [Letter to the Editor]↩︎

  25. Westlake WJ. Design and Evaluation of Bioequivalence Studies in Man. In: Blanchard J, Sawchuk RJ, Brodie BB, editors. Prin­cip­les and perspectives in Drug Bio­avail­abi­li­ty. Basel: Karger; 1979. p. 192–210. ISBN 3-8055-2440-4.↩︎

  26. Fieller EC. Some Problems In Interval Estimation. J Royal Stat Soc B. 1954; 16(2): 175–85. doi:10.1111/j.2517-6161.1954.tb00159.x.↩︎

  27. Locke CS. An Exact Confidence Interval from Untransformed Data for the Ratio of Two Formulation Means. J. Phar­ma­co­kin. Bio­pharm. 1984; 12(6): 649–55. doi:10.1007/bf01059558.↩︎

  28. U.S. Department of Health and Human Services, FDA, Office of Medical Products and Tobacco, CDER, OGD, OGDP. Ap­proved Drug Products with Therapeutic Equi­va­lence Evaluations. 44th Edition. 2024. Download.↩︎

  29. U.S. Department of Health and Human Services, FDA, Office of Medical Products and Tobacco, CDER, OGD, OGDP. Ap­proved Drug Products with Therapeutic Equi­va­lence Evaluations. Cumulative Supplement. Download.↩︎

  30. Public Law 98-417. Sept. 24, 1984. Online.↩︎

  31. In phase III we try to demonstrate that verum performs ‘better’ than placebo, i.e., one-sided tests for non-inferiority (effect) and non-superiority (adverse reactions). Such studies are already large: Approving sta­tins and CO­VID-19 vaccines required ten thousands volunteers. Can you imagine how many it would need to detect a 20% difference between two treatments?↩︎

  32. Benet LZ. Why Do Bioequivalence Studies in Healthy Volunteers? Presentation at: 1st MENA Regulatory Conference on Bio­equi­va­lence, Bio­wai­vers, Bioanalysis and Dissolution. Amman; 23 September 2013.  Internet Archive.↩︎

  33. Office of the Federal Register. Code of Federal Regulations, Title 21, Part 320, Sub­part A, § 320.23(a)(1) Online.↩︎

  34. This is an assumption, i.e., based on the labelled content instead of the measured potency.↩︎

  35. Yet another assumption. Incorrect for highly variable drugs and, thus, inflates the confidence interval.↩︎

  36. Tóthfálusi L, Endrényi L. Estimation of Cmax and Tmax in Populations After Single and Multiple Drug Ad­mi­ni­stra­tion. J Pharma­co­kin Pharma­codyn. 2003; 30(5): 363–85. doi:10.1023/b:jopa.0000008159.97748.09.↩︎

  37. These formulas are only valid for a one-compartment model with zero order absorption and first order elimination. In all other models \(\small{t_\text{max}}\) (and thus, \(\small{C_\text{max}}\)) cannot be analytically derived. In software numeric optimization is employed to locate the maxi­mum of the function.↩︎

  38. Endrényi L, Fritsch S, Yan W. Cmax/AUC is a clearer measure than Cmax for absorption rates in investigations of bio­equi­va­lence. Int J Clin Pharmacol Ther Toxicol. 1991; 29(10): 394–9. PMID 1748540.↩︎

  39. Schall R, Luus HG. Comparison of absorption rates in bioequivalence studies of immediate release drug formulations. Int J Clin Pharmacol Ther Toxicol. 1992; 30(5): 153–9. PMID 1592542.↩︎

  40. Endrényi L, Yan W. Variation of Cmax and Cmax/AUC in investigations of bio­equi­va­lence. Int J Clin Pharm Ther To­xi­col. 1993; 31(4): 184–9. PMID 8500920.↩︎

  41. Haynes JD. Statistical simulation study of new proposed uniformity requirement for bioequivalency studies. J Pharm Sci. 1981; 70(6): 673–5. doi:10.1002/jps.2600700625.↩︎

  42. Cabana BE. Assessment of 75/75 Rule: FDA Viewpoint. Pharm Sci. 1983; 72(1): 98–9. doi:10.1002/jps.2600720127.↩︎

  43. Haynes JD. FDA 75/75 Rule: A Response. Pharm Sci. 1983; 72(1): 99–100.↩︎

  44. Nitsche V, Mascher H, Schütz H. Comparative bioavailability of several phenytoin preparations marketed in Austria. Int J Clin Pharmacol Ther Toxicol. 1984; 22(2): 104–7. PMID 6698663.↩︎

  45. Klingler D, Nitsche V, Schmidbauer H. Hydantoin-Intoxikation nach Austausch schein­bar gleich­wertiger Di­phenyl­hy­dan­toin-Prä­pa­rate. Wr Med Wschr. 1981; 131: 295–300. [German]↩︎

  46. Glazko AJ, Chang T, Bouhema J, Dill WA, Goulet JR, Buchanan RA. Metabolic disposition of diphenylhydantoin in normal human subjects following intravenous administration. Clin Pharmacol Ther. 1969; 10(4): 498–504. doi:10.1002/cpt1969104498.↩︎

  47. Bochner F, Hooper WD, Tyrer JH, Eadie MJ. Effect of dosage increments on blood phenytoin concentrations. J Neurol Neurosurg Psychiatr. 1972; 35(6): 873–6. doi:10.1136/jnnp.35.6.873.↩︎

  48. Kirkwood TBL. Bioequivalence Testing – A Need to Rethink [reader reaction]. Biometrics. 1981; 37: 589–91. doi:10.2307/2530573.↩︎

  49. Westlake WJ. Response to Bioequivalence Testing – A Need to Rethink [reader reaction response]. Biometrics. 1981, 37: 591–3.↩︎

  50. Westlake WJ. Bioavailability and Bioequivalence of Pharmaceutical Formulations. In: Pearce KE, editor. Bio­phar­ma­ceu­tical Sta­tistics for Drug Development. New York: Marcel Dekker; 1988. p. 329–53. ISBN 0-8247-7798-0.↩︎

  51. Rodda BE, Davis RL. Determining the probability of an important difference in bio­availability. Clin Pharmacol Ther. 1980; 28: 247–52. doi:10.1038/clpt.1980.157.↩︎

  52. Mandallaz D, Mau J. Comparison of Different Methods for Decision-Making in Bio­equi­valence Assessment. Bio­me­trics. 1981; 37: 213–22. PMID 6895040.↩︎

  53. Fluehler H, Hirtz J, Moser HA. An Aid to Decision-Making in Bioequivalence Assess­ment. J Pharmacokin Bio­pharm. 1981; 9: 235–43. doi:10.1007/BF01068085.↩︎

  54. Selwyn MR, Hall NR. On Bayesian Methods for Bioequivalence. Biometrics. 1984; 40: 1103–8. PMID 6398710.↩︎

  55. Fluehler H, Grieve AP, Mandallaz D, Mau J, Moser HA. Bayesian Approach to Bio­equivalence Assessment: An Example. J Pharm Sci. 1983; 72(10): 1178–81. doi:10.1002/jps.2600721018.↩︎

  56. Anderson S, Hauck WW. A New Procedure for Testing Bioequivalence in Comparative Bioavailability and Other Clinical Trials. Commun Stat Ther Meth. 1983; 12(23): 2663–92. doi:10.1080/03610928308828634.↩︎

  57. Steinijans VW, Diletti E. Statistical Analysis of Bioavailability Studies: Parametric and Nonparametric Confidence Intervals. Eur J Clin Pharmacol. 1983; 24: 127–36. doi:10.1007/BF00613939.↩︎

  58. Steinijans VW, Diletti E. Generalization of Distribution-Free Confidence Intervals for Bioavailability Ratios. Eur J Clin Phar­ma­col. 1985; 28: 85–8. doi:10.1007/BF00635713.↩︎

  59. Steinijans VW, Schulz H-U, Beier W, Radtke HW. Once daily theophylline: multiple-dose comparison of an encapsulated micro-osmotic system (Euphylong) with a tablet (Uniphyllin). Int J Clin Pharm Ther Toxi­col. 1986; 24(8): 438–47. PMID 3759279.↩︎

  60. Steinijans VW. Pharmacokinetic Characteristics of Controlled Release Products and Their Biostatistical Analysis. In: Gundert-Remy U, Möller H, editors. Oral Controlled Release Products – Therapeutic and Biopharmaceutic Assess­ment. Stutt­gart: Wis­sen­schaftliche Verlagsanstalt; 1988, p. 99–115.↩︎

  61. Blume H, Siewert M, Steinijans V. Bioäquivalenz von per os applizierten Retard-Arz­nei­mitteln; Konzeption der Stu­dien und Ent­scheidung über Austauschbarkeit. Pharm Ind. 1989; 51: 1025–33. [German]↩︎

  62. Wijnand HP, Timmer CJ. Mini-computer programs for bioequivalence testing of pharmaceutical drug formulations in two-way cross-over studies. Comput Programs Bio­med. 1983; 17(1–2): 73–88. doi:10.1016/0010-468x(83)90027-2.↩︎

  63. Where did it come from? Two stories:
    Les Benet told that there was a poll at the FDA and – essentially based on gut feeling – the 20% saw the light of day.
    I’ve heard another one, which I like more. Wilfred J. Westlake, one of the pioneers of BE was a statistician at SKF. During a coffee and cig break (everybody was smoking in the 1970s) he asked his fellows of the clinical pharmacology department »Which difference in blood concentrations do you consider relevant?« Yep, the 20% were born.↩︎

  64. Rheinstein P. Report by the Bioequivalence Task Force on Recommendations from the Bioequivalence Hearing conducted by the Food and Drug Administration. September 29 – October 1986. January 1988.↩︎

  65. APV. Richtlinie und Kommentar. Pharmazeutische Industrie. 1985; 47(6): 627–32. [German]↩︎

  66. Arbeitsgemeinschaft Pharmazeutische Verfahrenstechnik (APV). International Symposium. Bioavail­abi­lity/Bio­equi­va­lence, Pharmaceutical Equivalence and The­ra­peu­tic Equivalence. Würzburg. 9–11 February, 1987.↩︎

  67. Junginger H. APV-Richtlinie – »Untersuchungen zur Bioverfügbarkeit, Bioäquivalenz« Pharm Ztg. 1987; 132: 1952–55. [German]↩︎

  68. Junginger H. Studies on Bioavailability and Bioequivalence – APV Guideline. Drugs Made in Germany. 1987; 30: 161–6.↩︎

  69. Blume H, Kübel-Thiel K, Reutter B, Siewert M, Stenzhorn G. Nifedipin: Monographie zur Prüfung der Bio­ver­füg­bar­keit / Bio­äqui­va­lenz von schnell-freisetzenden Zubereitungen (1). Pharm Ztg. 1988; 133(6): 398–93. [German]↩︎

  70. TGA. Guidelines for Bioavailability and Bioequivalency Studies. Draft C06:6723c (29/11/88).↩︎

  71. Blume H, Mutschler E. Bioäquivalenz – Qualitätsbewertung wirkstoffgleicher Fertigarzneimittel: An­lei­tung-Me­tho­den-Ma­te­ri­a­lien. Frank­furt/Main: Govi-Ver­lag; 1989. [German]↩︎

  72. McGilveray IJ, Midha KK, Skelly JP, Dighe S, Doluiso JT, French IW, Karim A, Burford R. Consensus Report from “Bio In­ter­na­tional ’89”: Issues in the Evaluation of Bioavailability Data. J Pharm Sci. 1990; 79(10): 945–6. doi:10.1002/jps.2600791022.↩︎

  73. Keene ON. The log transformation is special. Stat Med. 1995; 14(8): 811–9. doi:10.1002/sim.4780140810. Open Access Open Access.↩︎

  74. Diletti E, Hauschke D, Steinijans VW. Sample size determination for bioequivalence assessment by means of confidence intervals. Int J Clin Pharm Ther Toxicol. 1991; 29(1): 1–8. PMID 2004861.↩︎

  75. Diletti E, Hauschke D, Steinijans VW. Sample size determination: Extended tables for the multiplicative model and bioequivalence ranges of 0.9 to 1.11 and 0.7 to 1.43. Int J Clin Pharm Ther Toxicol. 1992; 30(Suppl.1): S59–62. PMID 1601533.↩︎

  76. Hauschke D, Steinijans VW, Diletti E. A distribution-free procedure for the statistical analysis of bioequivalence studies. Int J Clin Pharm Ther Toxicol. 1990; 28(2): 72–8.↩︎

  77. Steinijans VW, Hauschke D. Update on the statistical analysis of bioequivalence studies. Int J Clin Pharm Ther To­xi­col. 1990; 28(3): 105–10. PMID 2318545.↩︎

  78. Steinijans VW, Hartmann M, Huber R, Radtke HW. Lack of pharmacokinetic interaction as an equivalence problem. Int J Clin Pharm Ther To­xi­col. 1991; 29(8): 323–8. PMID 1835963.↩︎

  79. Anderson S, Hauck WW. Consideration of individual bioequivalence. J Pharmacokinet Biopharm. 1990; 18(3): 259–73. doi:10.1007/bf01062202.↩︎

  80. Schall R, Luus HG. On population and individual bioequivalence. Stat Med. 1993; 12(12): 1109–24. doi:10.1002/sim.4780121202.↩︎

  81. Schall R. A unified view of individual, population, and average bioequivalence. In: Blume HH, Midha KK, editors. Bio-Inter­na­tio­nal 2. Bioavailability, Bioequivalence and Pharmacokinetic Studies. Stuttgart: med­pharm; 1995: p. 91–106. ISBN 3-88763-040-8.↩︎

  82. Chow S-C, Liu J-p. Design and Analysis of Bioavailability and Bioequivalence Studies. New York: Marcel Dekker; 1992. ISBN 0-8247-8682-3. ↩︎

  83. CPMP Working Party. Investigation of Bioavailability and Bioequivalence: Note for Guidance. III/54/89-EN, 8th Draft. June 1990.↩︎

  84. Commission of the European Community. Investigation of Bioavailability and Bioequivalence. Brussels. December 1991. BEBAC Archive.↩︎

  85. Amidon GL, Lennernäs H, Shah VP, Crison JR. A Theoretical Basis for a Biopharmaceutic Drug Classification: The Correlation of in Vitro Drug Product Dissolution and in Vivo Bioavailability. Pharm Res. 1995; 12(3): 413–20. doi:10.1023/a:1016212804288. Open Access.↩︎

  86. Steinijans VW, Hauschke D. International Harmonization of Regulatory Bioequivalence Requirements. Clin Res Reg Aff. 1993; 10(4): 203–20.↩︎

  87. FDA, CDER. Guidance for Industry. Statistical Procedures for Bioequivalence Studies using a Standard Two-Treat­ment Crossover Design. Rockville. Jul 1992.  Internet Archive.↩︎

  88. If Subject 1 is randomized to sequence \(\small{\text{TR}}\), there is not another Subject 1 randomized to sequence \(\small{\text{RT}}\). Ran­dom­iza­tion is not like Schrödinger’s cat. Hence, the nested term in the guidelines is an insult to the mind.↩︎

  89. Health Canada, HPFB. Guidance for Industry. Conduct and Analysis of Bioavailability and Bioequivalence Studies – Part A: Oral Dosage Formulations Used for Systemic Effects. Ottawa. 1992. BEBAC Archive.↩︎

  90. Health Canada, HPFB. Guidance for Industry. Conduct and Analysis of Bioavailability and Bioequivalence Studies – Part B: Oral Modified Release Formulations. Ottawa. 1996. BEBAC Archive.↩︎

  91. WHO. Marketing Authorization of Pharmaceutical Products with Special Reference to Multisource (Generic) Products: A Manual for Drug Regulatory Authorities. Geneva. 1998. Internet Archive.↩︎

  92. Gleiter CH, Klotz U, Kuhlmann J, Blume H, Stanislaus F, Harder S, Paulus H, Poethko-Müller C, Holz-Slomczyk M. When Are Bioavailability Studies Required? A German Proposal. J Clin Pharmacol. 1998; 38: 904–11. doi:10.1002/j.1552-4604.1998.tb04385.x. Open Access.↩︎

  93. EMEA, CPMP. Note for Guidance on the Investigation of Bioavailability and Bio­equi­va­lence. London. 26 July 2001. Online.↩︎

  94. FDA, CDER. Guidance for Industry. Immediate Release Solid Oral Dosage Forms. Scale-Up and Post­appro­val Changes: Chemistry, Manufacturing, and Controls, In Vitro Dissolution Testing, and In Vivo Bioequivalence Do­cu­men­tation. Rockville. November 1995. Download.↩︎

  95. FDA, CDER. Guidance for Industry. SUPAC-MR: Modified Release Solid Oral Dosage Forms. Scale-Up and Postapproval Changes: Chemistry, Manufacturing, and Controls, In Vitro Dissolution Testing, and In Vivo Bioequivalence Documentation. Rockville. September 1997. Download.↩︎

  96. Shah VP, Tsong Y, Sathe P, Liu J-p. In vitro dissolution profile comparison – statistics and analysis of the similarity factor f2. Pharm Res. 1998; 15: 889–96. doi:10.1023/a:1011976615750.↩︎

  97. Liu J-p, Ma M-C, Chow S-C. Statistical Evaluation of Similarity Factor f2 as a Criterion for Assessment of Similarity Between Dis­so­lu­tion Profiles. Drug Inf J. 1997; 31: 1255–71. doi:10.1177/009286159703100426.↩︎

  98. Midha KK, Blume HH, editors. Bio-International. Bioavailability, Bio­equi­va­lence and Pharmacokinetics. Stutt­gart: med­pharm; 1993. ISBN 3-88763-019-X.↩︎

  99. Blume HH, Midha KK. Bio-International 92, Conference on Bioavailability, Bio­equi­va­lence, and Pharmacokinetic Studies. J Pharm Sci. 1993; 82(11): 1186–9. doi:10.1002/jps.2600821125.↩︎

  100. Simultaneous administration of a stable isotope labelled IV dose would allow the true clearance to be calculated in each period. Then it would no longer be necessary to assume identical clearances in \(\small{(3)}\), and the problem of highly variable drugs (inflating the CI) could be avoided. However, this would require an IV formulation manufactured according to cGMP and distinct from the internal standard used in MS, which is generally not feasible. Such an approach is mentioned only in the Japanese guidelines.↩︎

  101. Blume HH, Midha KK, editors. Bio-International 2. Bioavailability, Bioequivalence and Pharmacokinetic Studies. Stutt­gart: med­pharm; 1995. ISBN 3-88763-040-8.↩︎

  102. Boddy AW, Snikeris FC, Kringle RO, Wei GCG, Opperman JA, Midha KK. An approach for widening the bio­equi­va­lence acceptance limits in the case of highly variable drugs. Pharm Res. 1995; 12(12): 1865–8. doi:10.1023/a:1016219317744.↩︎

  103. Shah VP, Yacobi A, Barr WH, Benet LZ, Breimer D, Dobrinska MR, Endrényi L, Fairweather W, Gillespie W, Gonzalez MA, Hooper J, Jackson A, Lesko LL, Midha KK, Noonan PK, Patnaik R, Williams RL. Workshop Report. Evaluation of Orally Ad­mi­nis­tered Highly Variable Drugs and Drug Formulations. Pharm Res. 1996; 13(11): 1590–4. doi:10.1023/a:1016468018478.↩︎

  104. Schütz H, Labes D, Wolfsegger MJ. Critical Remarks on Reference-Scaled Average Bioequivalence. J Pharm Pharmaceut Sci. 2022; 25: 285–96. doi:10.18433/jpps32892.↩︎

  105. EMEA Human Medicines Evaluation Unit / CPMP. Note for Guidance on the In­ves­tigation of Bioavailability and Bio­equi­va­lence. Draft. London. 17 December 1998.↩︎

  106. Brooks MA, Weifeld RE. A Validation Process for Data from the Analysis of Drugs in Biological Fluids. Drug Devel Ind Pharm. 1985; 11: 1703–28.↩︎

  107. Pachla LA, Wright DS, Reynolds DL. Bioanalytical Considerations for Phar­ma­co­ki­netic and Biopharmaceutic Studies. J Clin Phar­ma­col. 1986; 26(5): 332–5. doi:10.1002/j.1552-4604.1986.tb03534.x.↩︎

  108. Buick AR, Doig MV, Jeal SC, Land GS, McDowall RD. Method Validation in the Bioanalytical Laboratory. J Pharm Biomed Anal. 1990; 8(8–12): 629–37. doi:10.1016/0731-7085(90)80093-5. Open Access.↩︎

  109. Karnes ST, Shiu G, Shah VP. Validation of Bioanalytical Methods. Pharm Res. 1991; 8(4): 421–6. doi:10.1023/a:1015882607690.↩︎

  110. AAPS, FDA, FIP, HPB, AOAC. Analytical Methods Validation: Bioavailability, Bio­equi­valence and Pharma­co­ki­netic Studies. Arlington, VA. December 3–5, 1990.↩︎

  111. Shah VP, Midha KK, Dighe S, McGilveray IJ, Skelly JP, Yacobi A, Layloff T, Viswanathan CT, Cook CE, McDowall RD, Pittman, Spector S. Analytical methods validation: Bioavailability, bioequivalence and pharmacokinetic studies. Eur J Drug Metabol Pharmacokinet. 1991; 16(4): 249–55. doi:10.1007/bf03189968.↩︎

  112. Shah VP, Midha KK, Findlay JWA, Hill HM, Hulse JD, McGilveray IJ, McKay G, Miller KJ, Patnaik RN, Powell ML, Tonelli A, Viswanathan CT, Yacobi A. Bioanalytical Method Validation – A Revisit with a Decade of Progress. Pharm Res. 2000; 17: 1551–7. doi:10.1023/a:1007669411738.↩︎

  113. Viswanathan CT, Bansal S, Booth B, DeStefano AJ, Rose MJ, Sailstad J, Shah VP, Skelly JP, Swann PG, Weiner R. Workshop / Conference Report – Quantitative Bioanalytical Methods Validation and Implementation: Best Practices for Chromatographic and Ligand Binding Assays. Pharm Res. 2007; 24(10): 1962–73. doi:10.1007/s11095-007-9291-7.↩︎

  114. FDA, CDER, CVM. Guidance for Industry. Bioanalytical Method Validation. Rockville. May 2001.  Internet Archive.↩︎

  115. FDA, CDER, CVM. Guidance for Industry. Bioanalytical Method Validation. Silver Spring. May 2018. Download.↩︎

  116. EMEA, CHMP. Guideline on Validation of Bioanalytical Methods. Draft. London. 19 November 2009. Online.↩︎

  117. EMA, CHMP. Guideline on Validation of Bioanalytical Methods. London. 21 July 2011. Online.↩︎

  118. ICH. Bioanalytical Method Validation And Study Sample Analysis. M10. 22 May 2022. Online.↩︎

  119. EMA, CHMP. Implementation strategy of ICH Guideline M10 on bioanalytical method validation. Amsterdam. 04 April 2024. Online.↩︎

  120. EMA, CHMP. Guideline on the pharmacokinetic and clinical evaluation of modified release dosage forms. London. 20 November 2014. Online.↩︎

  121. Schütz H. Primary and secondary PK metrics for evaluation of steady state studies, \(\small{C_\text{min}}\) vs. \(\small{C_\tau}\), relevance of \(\small{C_\text{min}}\)/\(\small{C_\tau}\) or fluctuation for bioequivalence assessment. Pre­sentation at: 4th GBHI Workshop. Amsterdam; 12 April 2018. Online.↩︎

  122. FDA, CDER, OGD. Guidance for Industry. Bioequivalence Recommendations for Specific Products. Silver Spring. June 2010. Download.↩︎

  123. Anderson S. Individual Bioequivalence: A problem of Switchability. Biopharm Rep. 1993; 2(2): 1–11.↩︎

  124. Endrényi L, Schulz M. Individual Variation and the Acceptance of Average Bio­equi­va­lence. Drug Inform J. 1993; 27(1): 195–201. doi:10.1177/009286159302700135.↩︎

  125. Endrényi L. A method for the evaluation of individual bioequivalence. Int J Clin Phar­ma­col. 1994; 32(9): 497–508. PMID 7820334.↩︎

  126. Esinhart JD, Chinchilli VM. Extension to use of tolerance intervals for the assessment of individual bioequivalence. J Biopharm Stat. 1994; 4: 39–52. doi:10.1080/10543409408835071.↩︎

  127. Chow S-C, Liu J-p. Current issues in bioequivalence trials. Drug Inform J. 1995; 29: 795–804. doi:10.1177/009286159502900302.↩︎

  128. Chen ML. Individual bioequivalence. A regulatory update. J Biopharm Stat. 1997; 7(1): 5–11. doi:10.1080/10543409708835162.↩︎

  129. Hauck WW, Anderson S. Commentary on individual bioequivalence by ML Chen. J Biopharm Stat. 1997; 7(1): 13–6. doi:10.1080/10543409708835163.↩︎

  130. Liu J-p, Chow S-C. Some thoughts on individual bioequivalence. J Biopharm Stat. 1997; 7(1): 41–8. doi:10.1080/10543409708835168.↩︎

  131. Midha KK, Rawson MJ, Hubbard JW. Prescribability and switchability of highly variable drugs and drug products. J Contr Rel. 1999; 62(1-2): 33–40. doi:10.1016/s0168-3659(99)00050-4.↩︎

  132. FDA, CDER. Guidance for Industry. Statistical Approaches to Establishing Bio­equi­va­lence. Rockville. Jan 2001. Download.↩︎

  133. Chow S-C, Shao J, Wang H. Individual bioequivalence testing under 2 × 3 designs. Stat Med. 2002; 21(5): 629–48. doi:10.1002/sim.1056.↩︎

  134. Chow S-C, Liu J-p. Design and Analysis of Bioavailability and Bioequivalence Studies. Boca Raton: Chapman & Hall/CRC Press; 3rd edition 2009. ISBN 978-1-58488-668-6. p. 596–8. ↩︎

  135. Hauschke D, Steinijans VW, Pigeot I. Bioequivalence Studies in Drug Development. Methods and Applications. Chichester: Wiley; 2007. ISBN 0-470-09475-3. p. 209. ↩︎

  136. Benet LZ. Individual Bioequivalence: Have the Opinions of the Scientific Community Changed? In: FDA Advisory Committee for Pharmaceutical Sciences and Clinical Pharmacology Meeting Transcript. US Food and Drug Administration Dockets. Nov 29, 2001.  Internet Archive.↩︎

  137. Patterson S. A Review of the Development of Biostatistical Design and Analysis Techniques for Assessing In Vivo Bioequivalence: Part Two. Ind J Pharm Sci. 2001; 63(3): 169–86. Open Access.↩︎

  138. FDA, CDER. Guidance for Industry. Bioavailability and Bioequivalence Studies for Orally Administered Drug Pro­ducts — General Considerations. Rockville. March 2003.  Internet Archive.↩︎

  139. Schall R, Endrényi L. Bioequivalence: tried and tested. Cardiovasc J Afr. 2010; 21(2): 69–70. PMCID 3721767. PMC Free Full Text.↩︎

  140. Senn S. Conference Proceedings: Challenging Statistical Issues in Clinical Trials. De­ci­sions and Bioequivalence. 2000.↩︎

  141. Midha KK, Shah VP, Singh GJP, Patnaik R. Conference Report: Bio-International 2005. J. Pharm Sci. 2007; 96(4): 747–54. doi:10.1002/jps.20786.↩︎

  142. EMA, CHMP. Concept Paper for an Addendum to the Note for Guidance on the Investigation of Bioavailability and Bioequivalence: Evaluation of Bioequivalence of Highly Variable Drugs and Drug Products. London. 27 April 2006. BEBAC Archive.↩︎

  143. EMEA, CHMP. Guideline on the Investigation of Bioequivalence. London. 20 January 2010. Online.↩︎

  144. FDA, OGD. Draft Guidance on Progesterone. Recommended Apr 2010; Revised Feb 2011. Online↩︎

  145. Davit BM, Chen ML, Conner DP, Haidar SH, Kim S, Lee CH, Lionberger RA, Makhlouf FT, Nwakama PE, Patel DT, Schuirmann DJ, Yu LX. Implementation of a Reference-Scaled Average Bioequivalence Approach for Highly Variable Generic Drug Products by the US Food and Drug Administration. AAPS J. 2012; 14(4): 915–24. doi:10.1208/s12248-012-9406-x. Open Access.↩︎

  146. WHO Expert Committee on Specifications for Pharmaceutical Preparations. Multisource (generic) pharmaceutical products: guidelines on registration requirements to establish interchangeability. Fifty-first report. Technical Report Series, No. 992, Annex 6. Geneva. April 2017. Download.↩︎

  147. Health Canada. Guidance Document. Comparative Bioavailability Standards: For­mu­la­tions Used for Sys­temic Effects. Ottawa. 2018/06/08. Online.↩︎

  148. Some gastro-resistant formulations of diclofenac are HVDPs and practically all topical formulations are HVDPs, whereas diclofenac itself is not an HVD (\(\small{CV_\text{w}}\) of a solution ~8%).↩︎

  149. An exception is dabigatran, the first univalent direct thrombin (IIa) inhibitor. The originator withheld information about severe bleeding events. Although dabigatran is highly variable, reference-scaling is not justified. For dabigatran, rivaroxaban, and edoxaban the FDA requires 4-period full replicate studies with the conventional limits and, additionally, a comparison of \(\small{s_\text{wT}}\) with \(\small{s_\text{wR}}\).↩︎

  150. Note that the model of SABE is based on the true \(\small{\sigma_\text{wR}}\), whereas in practice the observed \(\small{s_\text{wR}}\) is used. This may lead to a misclassification and thus an inflated type I error.104↩︎
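  To see the mechanism, note that the observed \(\small{s_\text{wR}}\) scatters around the true \(\small{\sigma_\text{wR}}\); under normality \(\small{df\,s_\text{wR}^{2}/\sigma_\text{wR}^{2}\sim\chi_{df}^{2}}\). The sketch below is only an illustration – the design, sample size, and true \(\small{CV_\text{wR}}\) are assumptions for the sake of the example, not taken from any guideline.

```r
# Sketch only: how often would a product whose true CV(wR) lies below 30% be
# 'observed' as highly variable? Assumptions: 2-sequence, 4-period full
# replicate design, 24 subjects, true CV(wR) = 27.5%.
set.seed(123456)
CV2sigma <- function(CV) sqrt(log(CV^2 + 1))   # CV (as fraction) -> sigma(w)
sigma2CV <- function(s)  sqrt(exp(s^2) - 1)    # sigma(w) -> CV (as fraction)
n     <- 24                                    # subjects (assumption)
df    <- n - 2                                 # degrees of freedom of s(wR) in this design
sigma <- CV2sigma(0.275)                       # true sigma(wR)
s.wR  <- sigma * sqrt(rchisq(1e5, df) / df)    # sampling distribution of the observed s(wR)
mean(sigma2CV(s.wR) > 0.30)                    # share of studies with observed CV(wR) > 30%
# ~0.25 with these assumptions: about one quarter of studies would falsely
# expand the limits, contributing to the inflation of the type I error.
```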

  151. FDA, CDER. Guidance for Industry. Bioequivalence Studies With Pharmacokinetic Endpoints for Drugs Submitted Under an ANDA. Draft. Silver Spring. August 2021. Download.↩︎

  152. Tóthfalusi L, Endrényi L, García-Arieta A. Evaluation of bioequivalence for highly variable drugs with scaled average bioequivalence. Clin Pharmacokinet. 2009; 48: 725–43. doi:10.2165/11318040-000000000-00000.↩︎

  153. Picky: \(\small{CV_\text{wR}=100\sqrt{\exp(0.294^2)-1}=30.04689\ldots\%\neq 30\%}\)!↩︎
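  For the curious, a two-liner in R reproduces the arithmetic; the second line back-calculates the \(\small{\sigma_\text{wR}}\) corresponding to exactly 30% (nothing official, just the inverse of the formula above).

```r
100 * sqrt(exp(0.294^2) - 1)  # 30.04689..., i.e., not exactly 30%
sqrt(log(0.30^2 + 1))         # 0.2935604..., the sigma(wR) giving exactly CV(wR) = 30%
```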

  154. Benet L. Why Highly Variable Drugs are Safer. Presentation at: FDA Advisory Committee for Pharmaceutical Science. Rockville; 06 October 2006. Internet Archive.↩︎

  155. Endrényi L, Tóthfalusi L. Regulatory Conditions for the Determination of Bioequivalence of Highly Variable Drugs. J Pharm Pharmaceut Sci. 2009; 12(1): 138–49. doi:10.18433/j3zw2c. Open Access.↩︎

  156. Endrényi L, Tóthfalusi L. Bioequivalence for highly variable drugs: regulatory agreements, disagreements, and harmonization. J. Phar­ma­co­kin Phar­ma­co­dyn. 2019; 46: 117–26. doi:10.1007/s10928-019-09623-w.↩︎

  157. Schütz H. Highly Variable Drugs and Type I Error. Presentation at: 6th International Workshop – GBHI 2024. Rockville, MD. 16 April 2024. Online.↩︎

  158. Schuirmann D. U.S. FDA Perspective: Statistical Aspects of OGD’s Approach to Bioequivalence (BE) Assessment for Highly Variable Drugs. Presentation at the 2nd conference of the Global Bioequivalence Harmonisation Initiative (GBHI). Rockville. September 15–16, 2016.↩︎

  159. Tóthfalusi L, Endrényi L, Midha KK, Rawson MJ, Hubbard JW. Evaluation of the Bioequivalence of Highly-Va­ri­able Drugs and Drug Products. Pharm Res. 2001; 18(6): 728–33. doi:10.1023/a:1011015924429.↩︎

  160. Hyslop T, Hsuan F, Holder DJ. A small sample confidence interval approach to assess individual bioequivalence. Stat Med. 2000; 19: 2885–97. doi:10.1002/1097-0258(20001030)19:20<2885::aid-sim553>3.0.co;2-h.↩︎

  161. Howe WG. Approximate Confidence Limits on the Mean of X+Y Where X and Y are Two Tabled In­de­pen­dent Ran­dom Variables. J Am Stat Assoc. 1974; 69(347): 789–94.↩︎

  162. EMA, CHMP PKWP. Questions & Answers: Positions on specific questions addressed to the Phar­ma­co­ki­ne­tics Work­ing Party. London. 26 January 2011. Online.↩︎

  163. FDA, OGD. Draft Guidance on Warfarin Sodium. Recommended Dec 2012. Download.↩︎

  164. FDA, CDER. Guidance for Industry. Waiver of In Vivo Bioavailability and Bioequivalence Studies for Immediate-Release Solid Oral Dosage Forms Based on a Biopharmaceutics Classification System. Rockville. August 2000. BEBAC Archive.↩︎

  165. ICH. Biopharmaceutic Classification System-based Biowaivers. M9. 20 November 2019. Online.↩︎

  166. WHO/PQT: medicines. Application of reference-scaled criteria for AUC in bioequivalence studies conducted for sub­mis­sion to PQT/MED. Geneva. 02 July 2021. Online.↩︎

  167. Health Canada. Guidance Document: Conduct and Analysis of Comparative Bio­avail­abi­lity Studies. Ottawa. 2018/06/08. Online.↩︎

  168. Jiang W, Makhlouf F, Schuirmann DJ, Zhang X, Zheng N, Conner D, Yu LX, Lionberger R. A Bioequivalence Approach for Generic Narrow Therapeutic Index Drugs: Evaluation of the Reference-Scaled Approach and Variability Comparison Criterion. AAPS J. 2015; 17(4): 891–901. doi:10.1208/s12248-015-9753-5. Open Access. Correction: AAPS J. 2015; 17(6): 1519. doi:10.1208/s12248-015-9786-9. Open Access.↩︎

  169. Paixão P, García Arieta A, Silva N, Petric Z, Bonelli M, Morais JAG, Blake K, Gouveia LF. A Two-Way Proposal for the Determination of Bioequivalence for Narrow Therapeutic Index Drugs in the European Union. Pharmaceut. 2024; 16: 598. doi:10.3390/pharmaceutics16050598. Open Access.↩︎

  170. Endrényi L, Csizmadia F, Tóthfalusi L, Chen M-L. Metrics Comparing Simulated Early Concentration Profiles for the Determination of Bioequivalence. Pharm Res. 1998; 15(8): 1292–9. doi:10.1023/a:1011912512966.↩︎

  171. Hofmann J. Bioequivalence of early exposure: tmax & pAUC. Presentation at: Bio­Bridges; Prague. 21 September 2023. Online.↩︎

  172. Almeida S. Early Exposure in IR Products: pAUC and Alternative Approaches. View from the Generic Industry. Pre­sen­ta­tion at: 6th International Workshop – GBHI 2024. Rockville, MD. 17 April 2024.↩︎

  173. Yu LX, Li BV, editors. FDA Bioequivalence Standards. New York: Springer; 2014. p. 16. ISBN 978-1-4939-1251-0.↩︎

  174. Schütz H, Burger DA, Cobo E, Dubins DD, Farkás T, Labes D, Lang B, Ocaña J, Ring A, Shitova A, Stus V, Tomashevskiy M. Group-by-Treatment Interaction Effects in Comparative Bioavailability Studies. AAPS J. 2024; 26(3): 50. doi:10.1208/s12248-024-00921-x. Open Access.↩︎

  175. FDA, CDER. Draft Guidance for Industry. Statistical Approaches to Establishing Bio­equivalence. Revision 1. Silver Spring. December 2022. Download.↩︎

  176. González-Rojano E, Marcotegui J, Ochoa D, Román M, Álvarez C, Gordon J, Abad-Santos F, García-Arieta A. In­ves­ti­ga­tion on the Existence of Sex-By-Formulation Interaction in Bioequivalence Trials. Clin Pharm Ther. 2019; 106(5): 1099–112. doi:10.1002/cpt.1539.↩︎

  177. Schütz H. Sex- and group-related problems in BE. A delusion. Presentation at: Bio­Bridges; Prague. 21 September 2023. Online.↩︎

  178. Wagner JG. Method of Estimating Relative Absorption of a Drug in a Series of Clinical Studies in Which Blood Levels Are Mea­sured After Single and/or Multiple Doses. J Pharm Sci. 1967; 56(5): 652–3. doi:10.1002/jps.2600560527.↩︎

  179. Schall R, Hundt HKL, Luus HG. Pharmacokinetic characteristics for extent of absorption and clearance in drug/drug interaction studies. Int J Clin Phar­ma­col Ther. 1994; 32(12): 633–7. PMID 7881699.↩︎

  180. Abdallah HY. An area correction method to reduce intrasubject variability in bioequivalence studies. J Pharm Pharmaceut Sci. 1998; 1(2): 60–5. Open Access.↩︎

  181. Lucas AJ, Ogungbenro K, Yang S, Aarons L, Chen C. Evaluation of area under the concentration curve adjusted by the terminal-phase as a metric to reduce the impact of variability in bioequivalence testing. Br J Clin Pharmacol. 2022; 88(2): 619–27. doi:10.1111/bcp.14986.↩︎

  182. Midha KK, Hubbard JW, Rawson MJ. Retrospective evaluation of relative extent of absorption by the use of partial areas under plasma concentration versus time curves in bioequivalence studies on conventional release products. Eur J Pharm Sci. 1996; 4(6): 381–4. doi:10.1016/0928-0987(95)00166-2.↩︎

  183. Scheerans C, Derendorf H, Kloft C. Proposal for a Standardised Identification of the Mono-Exponential Ter­mi­nal Phase for Oral­ly Administered Drugs. Biopharm Drug Dispos. 2008; 29(3): 145–57. doi:10.1002/bdd.596.↩︎

  184. ICH. Statistical Principles for Clinical Trials. E9. 5 February 1998. Online.↩︎

  185. Paixão P, Gouveia LF, Morais JAG. An alternative single dose parameter to avoid the need for steady-state studies on oral ex­tend­ed-release drug products. Eur J Phar­ma­ceut Bio­phar­ma­ceut. 2012; 80(2): 410–7. doi:10.1016/j.ejpb.2011.11.001.↩︎

  186. ANVISA. Resolução - RDC Nº 742. Dispõe sobre os critérios para a condução de estudos de biodisponibilidade relativa / bio­equi­valência (BD/BE) e estudos far­ma­co­ci­né­ticos. Brasilia. August 10, 2022. Effective July 3, 2023. Online. [Por­tu­guese]↩︎

  187. Schütz H. Two-stage designs in bioequivalence trials. Eur J Clin Pharmacol. 2015; 71(3): 271–81. doi:10.1007/s00228-015-1806-2.↩︎

  188. Lee J, Feng K, Xu M, Gong X, Sun W, Kim J, Zhang Z, Wang M, Fang L, Zhao L. Applications of Adaptive Designs in Generic Drug Development. Clin Pharm Ther. 2020; 110(1): 32–5. doi:10.1002/cpt.2050.↩︎

  189. Maurer W, Jones B, Chen Y. Controlling the type 1 error rate in two-stage sequential designs when testing for average bioequivalence. Statist Med. 2018; 37(10): 1–21. doi:10.1002/sim.7614.↩︎

  190. Potvin D, DiLiberti CE, Hauck WW, Parr AF, Schuirmann DJ, Smith RA. Sequential design approaches for bio­equi­va­lence studies with crossover designs. Pharm Stat. 2008; 7: 245–62. doi:10.1002/pst.294.↩︎

  191. Montague TH, Potvin D, DiLiberti CE, Hauck WW, Parr AF, Schuirmann DJ. Additional results for ‘Sequential design approaches for bioequivalence studies with crossover designs’. Pharm Stat. 2011; 11: 8–13. doi:10.1002/pst.483.↩︎

  192. Fuglsang A. Sequential Bioequivalence Trial Designs with Increased Power and Con­trolled Type I Error Rates. AAPS J. 2013; 15: 659–61. doi:10.1208/s12248-013-9475-5.↩︎

  193. Fuglsang A. Sequential Bioequivalence Approaches for Parallel Designs. AAPS J. 2014; 16: 373–8. doi:10.1208/s12248-014-9571-1.↩︎

  194. Molins E, Labes D, Schütz H, Cobo E, Ocaña J. An iterative method to protect the type I error rate in bioequivalence studies under two-stage adaptive 2×2 crossover designs. Biom J. 2021; 63(1): 122–33. doi:10.1002/bimj.201900388.↩︎

  195. European Parliament and Council. Directive 2001/83/EC on the Community code relating to medicinal products for human use. Article 10 2.(b). 6 November 2001, last amended 12 April 2022. Online.↩︎

  196. FDA, CDER. Guidance for Industry. Extended Release Oral Dosage Forms: Development, Evaluation, and Appli­ca­tion of In Vitro/In Vivo Cor­relations. Rockville. September 1997. Download.↩︎

  197. EMA, CHMP. Guideline on quality of oral modified release products. London. 20 March 2014. Online.↩︎

  198. EMA, CHMP. Guideline on clinical development of fixed combination medicinal products. London. 23 March 2017. Online.↩︎

  199. Feřtek M. Clinical Aspects of the Development of Fixed Dose Com­bi­nation Products. Presentation at: BioBridges; Prague. 21 September 2017. Online.↩︎

  200. Lu D, Lee SL, Lionberger RA, Choi S, Adams W, Caramenico HN, Chowdhury BA, Conner DP, Katial R, Limb S, Peters JR, Yu L, Seymour S, Li BV. International Guidelines for Bioequivalence of Locally Acting Orally Inhaled Drug Products: Similarities and Differences. AAPS J. 2015; 17(3): 546–57. doi:10.1208/s12248-015-9733-9. PMC Free Full Text.↩︎

  201. FDA, OGD. Draft Guidance on Budesonide. Silver Spring. February 2024. Download.↩︎

  202. EMA, CHMP. Guideline on the requirements for demonstrating therapeutic equivalence between orally inhaled pro­ducts (OIP) for asthma and chronic obstructive pulmonary disease (COPD). Draft. Amsterdam. 16 March 2024. Online.↩︎

  203. Anderson S, Hauck WW. The transitivity of bioequivalence testing: potential for drift. Int J Clin Pharmacol Ther. 1996; 34(9): 369–74. PMID 8880284.↩︎

  204. Hauck WW, Anderson S. Some Issues in the Design and Analysis of Equivalence Trials. Drug Inf J. 1999; 33(1): 109–18. doi:10.1177/009286159903300114.↩︎

  205. Karalis V, Bialer M, Macheras P. Quantitative assessment of the switchability of generic products. Eur J Pharm Sci. 2013; 50(3-4): 476–83. doi:10.1016/j.ejps.2013.08.023.↩︎

  206. EMA, CHMP. Guideline on equivalence studies for the demonstration of therapeutic equivalence for locally applied, locally acting products in the gastrointestinal tract. London. 18 October 2018. Online.↩︎

  207. Wang YL, Hsu LF. Evaluating the Feasibility of Use of a Foreign Reference Product for Generic Drug Applications: A Retrospective Pilot Study. Eur J Drug Metab Pharmacokinet. 2017; 42(6): 935–42. doi:10.1007/s13318-017-0409-y.↩︎

  208. Gwaza L, Gordon J, Leufkens H, Stahl M, García-Arieta A. Global Harmonization of Comparator Products for Bio­equi­va­lence Studies. AAPS J. 2017; 19(3): 603–6. doi:10.1208/s12248-017-0068-6.↩︎

  209. Almeida S. An opportunity or a mirage: Single global development for generic products. Presentation at: Bio­Bridges; Prague. 27 September 2019. Online.↩︎

  210. Almeida S. Road Map to an International BE Reference Product? Pre­sen­ta­tion at: 4th International GBHI Work­shop; Bethesda, MD. 13 December 2019.↩︎

  211. Almeida S. Single global development of generic medicines. Pre­sen­ta­tion at: me­di­ci­nes for europe, 2nd BE Work­shop; Brussels. 26 April 2023.↩︎

  212. Gwaza L, Gordon J, Welink J, Potthast H, Hanson H, Stahl M, García-Arieta A. Statistical approaches to indirectly compare bio­equivalence between generics: a comparison of methodologies employing artemether/lu­me­fan­trine 20/120 mg tablets as prequalified by WHO. Eur J Clin Pharmacol. 2012; 68(12): 1611–8. doi:10.1007/s00228-012-1396-1.↩︎

  213. Herranz M, Morales-Alcelay S, Corredera-Hernández MA, de la Torre-Alvarado JM, Blázquez-Pérez A, Suárez-Gea ML, Álvarez C, García-Arieta A. Bioequivalence between generic tacrolimus products marketed in Spain by adjusted indirect comparison. Eur J Clin Pharmacol. 2013; 69(5): 1157–62. doi:10.1007/s00228-012-1456-6.↩︎

  214. Gwaza L, Gordon J, Welink J, Potthast H, Leufkens H, Stahl M, García-Arieta A. Adjusted indirect treatment com­pa­ri­son of the bioavailability of WHO-prequalified first-line generic antituberculosis medicines. Clin Pharmacol Ther. 2014; 96(5): 580–8. doi:10.1038/clpt.2014.144.↩︎

  215. Ring A, Morris TBS, Hohl K, Schall R. Indirect bioequivalence assessment using network meta-analyses. Eur J Clin Pharmacol. 2014; 70(8): 947–55. doi:10.1007/s00228-014-1691-0.↩︎

  216. Yu Y, Teerenstra S, Neef C, Burger D, Maliepaard M. Investigation into the interchangeability of generic formulations using immunosuppressants and a broad selection of medicines. Eur J Clin Pharmacol. 2015; 71(8): 979–80. doi:10.1007/s00228-015-1878-z.↩︎

  217. Gwaza L, Gordon J, Potthast H, Welink J, Leufkens H, Stahl M, García-Arieta A. Influence of point estimates and study power of bioequivalence studies on establishing bioequivalence between generics by adjusted indirect comparisons. Eur J Clin Phar­ma­col. 2015; 71(9): 1083–9. doi:10.1007/s00228-015-1889-9.↩︎

  218. Pejčić Z, Vučićević K, García-Arieta A, Miljković B. Adjusted indirect comparisons to assess bioequivalence between generic clopidogrel products in Serbia. Br J Clin Pharmacol. 2019; 85: 2059–65. doi:10.1111/bcp.13997.↩︎

  219. Welling PG, Tse FLS, Dighe SV, editors. Pharmaceutical Bioequivalence. New York: Marcel Dekker; 1991.↩︎

  220. Jackson AJ, editor. Generics and Bioequivalence. Boca Raton: CRC Press; 1994, Re­issued 2019. ISBN 978-0-367-20831-8.↩︎

  221. Millard SP, Krause A, editors. Applied Statistics in the Pharmaceutical Industry. New York: Springer; 2001. ISBN 0-387-98814-9. ↩︎

  222. Senn S. Cross-over Trials in Clinical Research. Chichester: Wiley; 2nd edition 2002. ISBN 0-471-49653-7. ↩︎

  223. Wellek S. Testing Statistical Hypotheses of Equivalence. Boca Raton: Chapman & Hall/CRC; 2003. ISBN 978-1-5848-8160-5. ★★↩︎

  224. Amidon G, Lesko L, Midha K, Shah V, Hilfinger J. International Bioequivalence Standards: A New Era. Ann Arbor: TSRL; 2006. ISBN 10-0-9790119-0-6.↩︎

  225. Senn S. Statistical Issues in Drug Development. Chichester: John Wiley; 2nd edition 2007. ISBN 978-0-470-01877-4. ↩︎

  226. Kanfer I, Shargel L, editors. Generic Product Development. International Regulatory Requirements for Bio­equi­va­lence. New York: informa healthcare; 2010. ISBN 978-0-8493-7785-3.↩︎

  227. Bolton S, Bon C. Pharmaceutical Statistics. Practical and Clinical Applications. New York: informa healthcare; 5th edition 2010. ISBN 978-1-4200-7422-2. ↩︎

  228. Davit B, Braddy AC, Conner DP, Yu LX. International Guidelines for Bioequivalence of Systemically Available Orally Administered Generic Drug Products: A Survey of Similarities and Differences. AAPS J. 2013; 15(4): 974–90. doi:10.1208/s12248-013-9499-x. PMC Free Full Text.↩︎

  229. Jones B, Kenward MG. Design and Analysis of Cross-Over Trials. Boca Raton: CRC Press. 3rd edition 2015. ISBN 978-1-4398-6142-4. ↩︎

  230. Kanfer I, editor. Bioequivalence Requirements in Various Global Jurisdictions. New York: Springer; 2017. ISBN 978-3-319-88542-1.↩︎

  231. Patterson S, Jones B. Bioequivalence and Statistics in Clinical Pharmacology. Boca Raton: CRC Press; 2nd edition 2019. ISBN 978-0-3677-8244-3. ↩︎

  232. Goldacre B. Bad Science. London: Harper­Collins; 2009. ISBN 978-0-00-728487-0. ↩︎

  233. Goldacre B. Bad Pharma. How Drug Companies Mislead Doctors and Harm Patients. London: Harper­Collins; 2012. ISBN 978-0-00-735074-2. ↩︎

  234. Eban K. Bottle of Lies. The Inside Story of the Generic Drug Boom. New York: Harper­Collins; 2019. ISBN 978-0-06-233878-5. ↩︎