Multiple  methods for trials with multiple endpoints
Source:vignettes/Multiple_Endpoints.Rmd
Suppose we are planning a drug development program testing the superiority of an experimental treatment over a control treatment. Our drug development program consists of an exploratory phase II trial which is, in case of promising results, followed by a confirmatory phase III trial.
The drugdevelopR package enables us to optimally plan such programs using a utility-maximizing approach. For a brief introduction, see the very basic example of how the package works in Introduction to planning phase II and phase III trials with drugdevelopR. In contrast to the basic setting, where only one phase II and one phase III trial were conducted, this scenario investigates what happens when multiple endpoints are of interest. The functions currently implemented only support trials with up to two different endpoints. The package covers two kinds of programs: those where only one endpoint has to show a significant positive treatment effect, and those where all endpoints have to show a significant positive treatment effect.
After installing the package according to the installation instructions, we can load it using the following code, and start right into the two examples:
library(drugdevelopR)
#> Loading required package: doParallel
#> Loading required package: foreach
#> Loading required package: iterators
#> Loading required package: parallel
Only one endpoint has to be significant
Suppose we are developing a new tumor treatment, exper. The two endpoints we consider interesting are overall survival (OS) and progression-free survival (PFS). These are time-to-event outcome variables. Therefore, we will use the function optimal_multiple_tte, which calculates optimal sample sizes and decision threshold values for time-to-event outcomes when multiple endpoints are investigated.
Within our drug development program, we will compare our experimental treatment to the control treatment contro. The treatment effect measure of exper for each endpoint \(i \in \{1,2\}\) is given by \(\theta_i = -\log(HR_i)\), where the hazard ratio \(HR_i\) is the ratio of the hazard rates between the experimental treatment and the control treatment for endpoint \(i\). For more information on the hazard ratio, see the vignette on time-to-event endpoints.
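As a small illustration of the effect measure defined above, the following sketch converts hazard ratios to the log scale. The hazard ratios are the ones assumed later in this example; the variable names are chosen for illustration only.

```r
# Assumed true hazard ratios from this example (experimental vs. control)
hr <- c(OS = 0.8, PFS = 0.75)

# Treatment effect on the log scale: theta_i = -log(HR_i)
theta <- -log(hr)

# A hazard ratio below 1 gives a positive theta, i.e. a benefit of exper
print(round(theta, 3))
```

A hazard ratio of exactly 1 (no difference between treatments) corresponds to a treatment effect of 0 on this scale.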
Defining all necessary parameters
The parameters in the setting with multiple endpoints differ slightly from the other cases (bias adjustment and multi-trial). As in the basic setting, the treatment effect may be fixed (as in this example) or may follow a prior distribution (see Fixed or Prior). Note that in this case, contrary to the other settings, the second treatment effect describes the effect on the second endpoint and is also relevant for fixed = TRUE. Furthermore, some options to adapt the program to your specific needs are also available in this setting (see More parameters); however, skipping phase II and choosing different treatment effects in phase II and III are not possible.

hr1 is the assumed hazard ratio for the endpoint OS and hr2 is the assumed hazard ratio for the endpoint PFS. We assume that our experimental treatment exper leads to a hazard reduction of 20% compared to the control treatment contro for the endpoint OS and a reduction of 25% for PFS. Therefore, we set hr1 = 0.8 and hr2 = 0.75.
n2min and n2max specify the minimal and maximal number of participants (not events) for the phase II trial. The package will search for the optimal sample size within this region. For now, we want the program to search for the optimal sample size in the interval between 20 and 400 participants. In addition, we tell the program to search this region in steps of ten participants at a time by setting stepn2 = 10.
Differing from the basic setting, one has to provide two benefit triples: one triple b11, b21, b31 for the case that the endpoint OS is significant (independent of PFS being significant or not) and one triple b12, b22, b32 for the case that OS is not significant, i.e. only PFS is. The intuition is that overall survival is the "more relevant" endpoint: if this endpoint shows a significant result, we have higher gains than if only the "less important" endpoint PFS shows a significant result. We assign benefit triples of {1000, 2000, 3000} for the case that OS is significant and {1000, 1500, 2000} if not (all in units of 10^5 $).
Finally, the parameter rho determines the correlation between the treatment effects of the two endpoints. In our case, we set rho = 0.6, indicating that the correlation between the treatment effects of OS and PFS is 0.6.
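To make the two benefit triples easier to compare, they can be laid out side by side. This is only an illustrative sketch: the object names (`benefits`, `benefits_usd`, `extra_gain_os`) are invented here and are not part of drugdevelopR.

```r
# Benefit triples from this example, in units of 10^5 $
benefits <- rbind(
  os_significant       = c(small = 1000, medium = 2000, large = 3000),
  only_pfs_significant = c(small = 1000, medium = 1500, large = 2000)
)

# Convert to dollars
benefits_usd <- benefits * 1e5

# Additional gain from a significant OS result, per effect size category
extra_gain_os <- benefits["os_significant", ] - benefits["only_pfs_significant", ]
print(extra_gain_os)
```

With these values, a significant OS result only pays off more than a PFS-only result when the effect is of medium or large size.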
set.seed(123)
res1 <- optimal_multiple_tte(hr1 = 0.8, hr2 = 0.75, # define assumed true HRs
  id1 = NULL, id2 = NULL,
  n2min = 20, n2max = 400, stepn2 = 10, # define optimization set for n2
  hrgomin = 0.7, hrgomax = 0.9, stephrgo = 0.05, # define optimization set for HRgo
  alpha = 0.025, beta = 0.1, # drug development planning parameters
  c2 = 0.75, c3 = 1, c02 = 100, c03 = 150, # define costs for phase II and III
  K = Inf, N = Inf, S = -Inf, # set constraints (here: none active)
  steps1 = 1, stepm1 = 0.95, stepl1 = 0.85, # effect size categories
  b11 = 1000, b21 = 2000, b31 = 3000, # expected benefits if OS is significant
  b12 = 1000, b22 = 1500, b32 = 2000, # expected benefits if only PFS is significant
  rho = 0.6, fixed = TRUE, # correlation and fixed treatment effect
  num_cl = 2) # number of cores for parallel computing
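The size of the search region defined by the call above can be sketched in base R. This assumes that every combination of phase II sample size and threshold value is a candidate; the object names are illustrative and not part of the package.

```r
# Candidate phase II sample sizes: n2min to n2max in steps of stepn2
n2_grid <- seq(from = 20, to = 400, by = 10)

# Candidate go/no-go thresholds: hrgomin to hrgomax in steps of stephrgo
hrgo_grid <- seq(from = 0.7, to = 0.9, by = 0.05)

# All combinations of sample size and threshold (assumed full grid)
grid <- expand.grid(n2 = n2_grid, HRgo = hrgo_grid)

length(n2_grid)   # 39 candidate sample sizes
length(hrgo_grid) # 5 candidate thresholds
nrow(grid)        # 195 combinations
```

Smaller step sizes give a finer search at the price of more combinations, which is where parallel computing via num_cl becomes useful.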
Interpreting the output
After setting all these input parameters and running the function, let’s take a look at the output of the program.
res1
#> Optimization result:
#> Utility: 106.61
#> Sample size:
#> phase II: 90, phase III: 301, total: 391
#> Probability to go to phase III: 0.79
#> Total cost:
#> phase II: 168, phase III: 420, cost constraint: Inf
#> Fixed cost:
#> phase II: 100, phase III: 150
#> Variable cost per patient:
#> phase II: 0.75, phase III: 1
#> Effect size categories:
#> small: 1 medium: 0.95 large: 0.85
#> Expected gains if endpoint 1 is significant:
#> small: 1000 medium: 2000 large: 3000
#> Expected gains if only endpoint 2 is significant:
#> small: 1000 medium: 1500 large: 2000
#> Success probability: 0.4
#> Success probability for a trial with:
#> two arms in phase III: 0.21, three arms in phase III: 0.07
#> Probability of endpoint 1 being significant in phase III: 0.71
#> Significance level: 0.025
#> Targeted power: 0.9
#> Decision rule threshold: 0.85 [HRgo]
#> Assumed true effects [HR]:
#> endpoint 1: 0.8, endpoint2 2: 0.75
#> Correlation between endpoints: 0.6
The program returns a data frame, with the following results:

res1$n2 is the optimal number of participants for phase II and res1$n3 the resulting number of participants for phase III. We see that the optimal scenario requires 90 participants in phase II and 301 participants in phase III.
res1$HRgo is the optimal threshold value for the go/no-go decision rule. We see that we need a hazard ratio of less than 0.85 in phase II in order to proceed to phase III.
res1$u is the expected utility of the program for the optimal sample size and threshold value. In our case it amounts to 106.61, i.e. we have an expected utility of 10 661 000 $.
res1$OS is the probability that the endpoint OS is significant, given that the program is successful overall. The probability of a successful program is 0.4. If the program is successful, the probability that the endpoint OS is significant, so that we obtain the higher benefits, is 0.71.
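The cost figures in the printed output can be reproduced approximately from the inputs. The sketch below rests on an assumption inferred from the printed numbers, not taken from the package documentation: the phase III sample size is treated as an expected value, and the fixed phase III cost is weighted by the probability to go to phase III.

```r
# Inputs of the example (costs in units of 10^5 $)
c02 <- 100; c2 <- 0.75  # fixed and per-patient cost, phase II
c03 <- 150; c3 <- 1     # fixed and per-patient cost, phase III

# Values read off the printed output (rounded there)
n2 <- 90; n3 <- 301; pgo <- 0.79

# Phase II cost: fixed cost plus per-patient cost
k2 <- c02 + c2 * n2        # 167.5, printed as 168

# Phase III cost under the assumption stated above
k3 <- c03 * pgo + c3 * n3  # about 419.5, printed as 420

# Expected utility converted from units of 10^5 $ to dollars
utility_usd <- 106.61 * 1e5
```

Because the printed probability is rounded to two digits, the phase III figure is only matched up to rounding.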
Both endpoints have to be significant
In this scenario, suppose we are developing a new Alzheimer medication, exper. The two endpoints we consider interesting are improving cognition (cognitive endpoint) and improving activities of daily living (functional endpoint). We want both outcomes to show statistically significant differences for the trial to be successful. These are normally distributed outcome variables. Therefore, we will use the function optimal_multiple_normal, which calculates optimal sample sizes and decision threshold values for normally distributed outcomes when multiple endpoints are investigated.
Within our drug development program, we will compare our experimental treatment to the control treatment contro. The treatment effect measure of exper for each endpoint \(i \in \{1,2\}\) is given by \(\Delta_i = \mu_{exper} - \mu_{contro}\), which is the difference in means between the experimental treatment and the control treatment for endpoint \(i\). Note that this is not divided by the standard deviation, hence it is not standardized. For more information, see the vignette on normally distributed outcomes.
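The difference between the unstandardized effect used here and a standardized effect can be seen in a short sketch. The group means are hypothetical values chosen so that the difference matches the effect assumed for the cognitive endpoint later in this example; the standard deviation of 2 is the value used there as well.

```r
# Hypothetical mean outcomes (illustration only)
mu_exper  <- 1.3
mu_contro <- 0.5

# Unstandardized treatment effect, as used in this setting
delta <- mu_exper - mu_contro

# Standardized effect for comparison (NOT what this setting expects as input)
sigma <- 2
delta_standardized <- delta / sigma

print(c(delta = delta, standardized = delta_standardized))
```

In this setting the standard deviation is passed as its own parameter, so the effect itself should be supplied on the original scale.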
Defining all necessary parameters
Again, the parameters in the setting with multiple endpoints differ slightly from the other cases (bias adjustment and multi-trial). As in the basic setting, the treatment effect may be fixed (as in this example) or may follow a prior distribution (see Fixed or Prior). Note that in this case, contrary to the other settings, the second treatment effect describes the effect on the second endpoint and is also relevant for fixed = TRUE. Furthermore, some options to adapt the program to your specific needs are also available in this setting (see More parameters); however, skipping phase II and choosing different treatment effects in phase II and III are not possible.

Delta1 is our assumed true treatment effect, given as the difference in means between exper and contro, for the cognitive endpoint, and Delta2 is the corresponding assumed true treatment effect for the functional endpoint. We assume true treatment effects of 0.8 and 0.5 for the two endpoints respectively, hence Delta1 = 0.8 and Delta2 = 0.5.
n2min and n2max specify the minimal and maximal number of participants for the phase II trial. The package will search for the optimal sample size within this region. For now, we want the program to search for the optimal sample size in the interval between 20 and 200 participants. In addition, we tell the program to search this region in steps of ten participants at a time by setting stepn2 = 10.
We have to include two parameters that account for the standard deviations \(\sigma_i\) for \(i \in \{1,2\}\). We set sigma1 = 2 and sigma2 = 1 for our example.
The parameter relaxed decides how the effect sizes of the two endpoints are combined into one overall effect size (large, medium or small overall treatment effect). If relaxed = FALSE, we consider a strict rule assigning a large overall effect if both endpoints show an effect of large size, a small overall effect if at least one of the endpoints shows a small effect, and a medium overall effect otherwise. If relaxed = TRUE, we investigate the characteristics of a more relaxed rule assigning a large overall effect if at least one of the endpoints shows a large effect, a small overall effect if both endpoints show a small effect, and a medium overall effect otherwise. For our example, we set relaxed = TRUE.
Finally, the parameter rho determines the correlation between the treatment effects of the two endpoints. In our case, we set rho = 0.5, indicating that the correlation between the treatment effects is 0.5.
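The strict and the relaxed combination rules described above can be written down as a small function. This is a sketch of the rule as stated in the text, not the package's internal implementation; the function name `combine_effect` and its arguments are invented for illustration.

```r
# Combine the effect size categories of two endpoints into one overall category.
# cat1, cat2: "small", "medium" or "large"; relaxed: which rule to apply.
combine_effect <- function(cat1, cat2, relaxed = FALSE) {
  cats <- c(cat1, cat2)
  stopifnot(all(cats %in% c("small", "medium", "large")))
  if (relaxed) {
    # Relaxed rule: one large effect suffices; small only if both are small
    if (any(cats == "large")) return("large")
    if (all(cats == "small")) return("small")
  } else {
    # Strict rule: large only if both are large; one small effect makes it small
    if (all(cats == "large")) return("large")
    if (any(cats == "small")) return("small")
  }
  "medium"
}

combine_effect("large", "small", relaxed = FALSE) # "small"
combine_effect("large", "small", relaxed = TRUE)  # "large"
```

The two rules only disagree when the endpoints fall into different categories; if both endpoints show the same effect size, both rules return that category.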
set.seed(123)
res2 <- optimal_multiple_normal(Delta1 = 0.8, Delta2 = 0.5, # define assumed true treatment effects
  in1 = NULL, in2 = NULL, sigma1 = 2, sigma2 = 1, # standard deviations
  n2min = 20, n2max = 200, stepn2 = 10, # define optimization set for n2
  kappamin = 0.02, kappamax = 0.2, stepkappa = 0.02, # define optimization set for kappa
  alpha = 0.025, beta = 0.1, # drug development planning parameters
  c2 = 0.75, c3 = 1, c02 = 100, c03 = 150, # define fixed and variable costs for phase II and III
  K = Inf, N = Inf, S = -Inf, # set constraints (here: none active)
  steps1 = 0, stepm1 = 0.5, stepl1 = 0.8, # effect size categories
  b1 = 1000, b2 = 2000, b3 = 3000, # define expected benefits
  rho = 0.5, relaxed = TRUE, # correlation and relaxed combination rule
  fixed = TRUE, # fixed treatment effect
  num_cl = 2) # number of cores for parallel computing
Interpreting the output
After setting all these input parameters and running the function, let’s take a look at the output of the program.
res2
#> Optimization result:
#> Utility: 342.07
#> Sample size:
#> phase II: 120, phase III: 171, total: 291
#> Probability to go to phase III: 0.98
#> Total cost:
#> phase II: 190, phase III: 318, cost constraint: Inf
#> Fixed cost:
#> phase II: 100, phase III: 150
#> Variable cost per patient:
#> phase II: 0.75, phase III: 1
#> Effect size categories (expected gains):
#> small: 0 (1000), medium: 0.5 (2000), large: 0.8 (3000)
#> Success probability: 0.824
#> Success probability by effect size:
#> small: 0.798, medium: 0.026, large: 0
#> Significance level: 0.025
#> Targeted power: 0.9
#> Decision rule threshold: 0.02 [Kappa]
#> Assumed true effects [Delta]:
#> endpoint 1: 0.8, endpoint 2: 0.5
#> Correlation between endpoints: 0.5
The program returns a data frame, with the following results:

res2$n2 is the optimal number of participants for phase II and res2$n3 the resulting number of participants for phase III. We see that the optimal scenario requires 120 participants in phase II and 171 participants in phase III.
res2$Kappa is the optimal threshold value for the go/no-go decision rule. We see that we need an estimated treatment effect of more than 0.02 in phase II in order to proceed to phase III.
res2$u is the expected utility of the program for the optimal sample size and threshold value. In our case it amounts to 342.07, i.e. we have an expected utility of 34 207 000 $.
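As a quick consistency check on the printed output, the success probabilities by effect size should add up to the overall success probability. The sketch below copies the numbers from the output above; the object names are illustrative.

```r
# Success probabilities by overall effect size, from the printed output
p_success_by_size <- c(small = 0.798, medium = 0.026, large = 0)

# Their sum should match the reported overall success probability of 0.824
p_success <- sum(p_success_by_size)
print(p_success)

# Expected utility converted from units of 10^5 $ to dollars
utility_usd <- 342.07 * 1e5
```

Almost all of the success probability falls into the small category, so the expected benefit in this program is driven mainly by the smallest benefit value.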
Where to go from here
In this article we presented examples when multiple endpoints are of interest.
For more information on how to use the package, see:
 Introduction to drugdevelopR: See how the package works in a basic example.
 Different outcomes: Apply it to binary endpoints and time-to-event endpoints.
 Interpreting the rest of the output: Obtain further details on your drug development program.
 Fixed or prior: Model the assumed treatment effect on a prior distribution.
 More parameters: Define custom effect size categories. Put constraints on the optimization by defining maximum costs, the total expected sample size of the program or the minimum expected probability of a successful program. Define an expected difference in treatment effect between phase II and III. Skip phase II.
 Complex drug development programs: Adapt to situations with biased effect estimators, multiple phase III trials or multi-arm trials.
 Parallel computing: Be faster at calculating the optimum by using parallel computing.