Baseline and treatment effect heterogeneity for survival times
between centers using a random effects accelerated failure time
model with flexible error distribution
Arnošt Komárek1,‡ , Emmanuel Lesaffre1,∗,† and Catherine Legrand2,§
1 Biostatistical Centre, Katholieke Universiteit Leuven, Kapucijnenvoer 35, 3000 Leuven, Belgium
2 European Organisation for Research and Treatment of Cancer, E. Mounierlaan 83/11, 1200 Brussels, Belgium
This is a preprint of an article accepted for publication in Statistics in Medicine. Copyright © 2007 John Wiley & Sons, Ltd.
SUMMARY
Nowadays, most clinical trials are conducted in different centers and even in different countries. In most
multi-center studies, the primary analysis assumes that the treatment effect is constant over centers.
However, it is also recommended to perform an exploratory analysis to highlight possible center by
treatment interaction, especially when several countries are involved. We propose in this paper an
exploratory Bayesian approach to quantify this interaction in the context of survival data. To this end
we used and generalized a random effects accelerated failure time model. The generalization consists
in using a penalized Gaussian mixture as an error distribution on top of multivariate random effects
which are assumed to follow a normal distribution. For computational convenience, inference
is based on Markov chain Monte Carlo techniques. The proposed method is illustrated on the
disease-free survival times of early breast cancer patients collected in the EORTC trial 10854.
KEY WORDS: multi-center study; penalized Gaussian mixture; regression; survival analysis
‡ Current address: Department of Probability and Mathematical Statistics, Charles University, Sokolovská 83,
186 75 Praha 8-Karlín, the Czech Republic
† E-mail: [email protected]
§ Current address: Institute of Statistics, Université Catholique de Louvain, 20 voie du Roman Pays, 1348
Louvain-la-Neuve, Belgium
∗ Correspondence to: Emmanuel Lesaffre, Biostatistical Centre, Katholieke Universiteit Leuven, Kapucijnenvoer
35, 3000 Leuven, Belgium
Contract/grant sponsor: Research Funds Katholieke Universiteit Leuven; contract/grant number: PDM/06/242
Contract/grant sponsor: Belgian Federal Science Policy Office; contract/grant number: P6/03
Contract/grant sponsor: National Cancer Institute, U.S.A.; contract/grant number: 5U10 CA11488-35
1. INTRODUCTION
1.1. Motivation: EORTC trial 10854
The EORTC trial 10854 (Clahsen et al. [1]; van der Hage et al. [2]) is a large multi-center study
(n = 2 793 patients in N = 14 centers) aiming to compare perioperative polychemotherapy
(POP FAC arm) with no further treatment (control arm) on the disease free survival (DFS)
time in early breast cancer patients who underwent potentially curative surgery. The centers
are located in 5 geographical regions: the Netherlands, Poland, France, Southern Europe, and
South Africa. To improve the efficiency with which the treatment effect is evaluated, we wish
to account for known sources of variability – known patient- and center-specific characteristics
(covariates) and use an appropriate regression model. For many patients, the observed DFS
time is right-censored.
1.2. Heterogeneity
In multi-center, multinational studies, like the EORTC trial 10854, there are often unknown
sources of heterogeneity between centers, despite the use of a common protocol. This can
happen for many reasons: geographical differences, different working habits of the staff in
different centers, different patient populations attracted by different centers, etc. This applies
even more when several countries are involved, see, e.g., Anello, O’Neill and Dubey [3]. We need
to distinguish between two types of heterogeneity with respect to: (a) baseline characteristics
and (b) treatment efficacy. In the latter case one speaks of a treatment by center interaction.
If this interaction is large, the interpretation of the effect of treatment needs to be done with
caution, especially when the treatment by center interaction is qualitative (reverses in direction
from one center to another). Figure 1 shows Kaplan-Meier estimates of the DFS distribution
for the POP FAC arm and the control arm, separately for each center. From these curves, there
seems to be some heterogeneity among the centers. Not only the overall proportion of DFS
patients differs at each time point and in each treatment arm from center to center (baseline
heterogeneity) but also the effect of treatment on DFS, expressed by the relative position of the
two curves in the control and treatment arm seems to vary across centers both quantitatively
and qualitatively (treatment effect heterogeneity). A possible approach to take into account
the heterogeneity between centers is a model with the center indicator and the treatment by
center interaction as a part of the covariates (fixed effects model). However, in this paper, we
have opted for a random effects model, see Section 5 for a more detailed discussion of this
option.
<Figure 1 about here.>
1.3. Aim and outline of the paper
Random effects regression models constitute a widely used approach for regression analysis
when heterogeneity resulting from clustering of the data cannot be ruled out. Further, since
it is difficult to assess the distributional assumptions with censored data, it is preferred to
leave the distribution of survival times unspecified as, e.g., in Cox’s proportional hazards (PH)
model (Cox [4]) or, alternatively, to specify it in a flexible way.
The main objective of this paper is to present a random effects regression model in which
the distribution of survival times is not specified in a conventional parametric way like, e.g.,
log-normal, log-logistic, or Weibull. The PH model is certainly the most popular survival
regression model, probably owing to the elegant concept of partial likelihood
(Cox [5]). However, its popularity does not imply that the PH assumption can be taken for granted, as is
often done in practice. In this paper, we concentrate on the accelerated failure time (AFT) model (e.g., Kalbfleisch and
Prentice [6], Chap. 7), in which the covariates directly accelerate or decelerate the expected
survival time. It has been pointed out by David Cox (in Reid [7]) that the AFT models are “in
many ways more appealing because of their quite direct physical interpretation”. However, it
is not the purpose of this paper to balance the two approaches or to give statements preferring
one approach over the other.
To model a survival distribution flexibly, we use a smoothing technique based on a penalized
Gaussian mixture (PGM, see Section 2) exploited in Komárek, Lesaffre and Hilton [8] and
Komárek and Lesaffre [9]. Specifically, in Ref. [8] an AFT model for independent observations
is proposed. However, their model does not allow for the inclusion of random effects and hence
cannot be used to take heterogeneity into account. The extension allowing a random intercept
and hence modeling baseline heterogeneity is presented in Ref. [9]. Here, our main intention
is to modify the model from Ref. [9] by inclusion of multivariate random effects such that
heterogeneity of an arbitrary type can be considered. Further, we wish to illustrate the use of
this modification on modeling the baseline and treatment effect heterogeneity in the analysis
of the EORTC trial 10854.
The paper is organized as follows. Section 2 reviews the PGM approach of Ref. [8] and [9],
and explains its use in the AFT model with multivariate random effects. In Section 3, we
describe the inferential procedure for the suggested model based on Markov chain Monte
Carlo methodology and explain its motivation by the penalized maximum-likelihood method.
The analysis of the DFS time in early breast cancer patients is presented in Section 4. We
conclude the paper with a discussion in Section 5.
2. RANDOM EFFECTS AFT MODEL WITH PENALIZED GAUSSIAN MIXTURE AS
AN ERROR DISTRIBUTION
2.1. Random effects AFT model
Let $T_{i,l}$ ($i = 1, \dots, N$, $l = 1, \dots, n_i$) denote the event time for the $l$th patient in the $i$th center.
Our approach not only allows for right-censored data but also for left- or interval-censored
data. Therefore, assume that $T_{i,l}$ occurred within an interval of time $\lfloor t^L_{i,l}, t^U_{i,l} \rfloor$, where the
symbols $\lfloor$ and $\rfloor$ are used for the lower and upper limits of the interval which can be open,
closed or half-closed according to the context. For instance, for an exactly observed event time,
$\lfloor t^L_{i,l}, t^U_{i,l} \rfloor = [t_{i,l}, t_{i,l}]$, for a right-censored observation, $\lfloor t^L_{i,l}, t^U_{i,l} \rfloor = (t^L_{i,l}, \infty)$. Further, assume
that the observed intervals are a result of an independent censoring process.
The random effects AFT model is in fact a classical linear mixed model of Laird and Ware [10]
with the logarithmic link function, i.e.
$$\log(T_{i,l}) = \boldsymbol{b}_i' \boldsymbol{z}_{i,l} + \boldsymbol{\beta}' \boldsymbol{x}_{i,l} + \varepsilon_{i,l}, \qquad (i = 1, \dots, N,\; l = 1, \dots, n_i), \tag{1}$$
where $\varepsilon_{i,l}$ are i.i.d. error terms having the distribution of the baseline log-event time with
a density $g_\varepsilon$, $\boldsymbol{x}_{i,l} = (x_{i,l,1}, \dots, x_{i,l,s})'$ are vectors of patient- and center-specific covariates
assumed to have a homogeneous effect across centers and $\boldsymbol{\beta} = (\beta_1, \dots, \beta_s)'$ is a vector of
corresponding regression coefficients (fixed effects). Further, $\boldsymbol{z}_{i,l} = (z_{i,l,1}, \dots, z_{i,l,q})'$ are vectors
of factors with possibly varying (heterogeneous) effect across centers. For example, to model
the baseline and treatment effect heterogeneity between centers we take $\boldsymbol{z}_{i,l} = (1, \mathrm{treat}_{i,l})'$,
where $\mathrm{treat}_{i,l}$ is the treatment indicator for the $(i, l)$th patient. Finally, $\boldsymbol{b}_i = (b_{i,1}, \dots, b_{i,q})'$
are i.i.d. vectors of center-specific random effects with some (parametric) density $g_b$ having
a mean $\boldsymbol{\gamma} = (\gamma_1, \dots, \gamma_q)'$ which expresses an overall effect of the covariates included in $\boldsymbol{z}_{i,l}$.
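To make the structure of model (1) concrete, the following Python sketch simulates event times with a random intercept and a random treatment effect per center. It is an illustration under stated assumptions, not part of the analysis: the error term is taken to be normal (the paper uses a penalized Gaussian mixture), the two random effects are drawn independently, there is a single hypothetical binary covariate, and all parameter values as well as the function name `simulate_aft` are invented for the example.

```python
import math
import random

def simulate_aft(n_centers, n_per_center, beta, gamma2, sd_b, alpha, tau, seed=0):
    """Simulate event times from model (1) with z = (1, treat)'.

    Assumptions for illustration only: a normal error with location alpha and
    scale tau, and independent normal random effects.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n_centers):
        # Center-specific random intercept (mean 0) and treatment effect (mean gamma2).
        b1 = rng.gauss(0.0, sd_b[0])
        b2 = rng.gauss(gamma2, sd_b[1])
        for _ in range(n_per_center):
            treat = rng.randint(0, 1)
            x = rng.randint(0, 1)  # one hypothetical binary covariate
            eps = alpha + tau * rng.gauss(0.0, 1.0)
            log_t = b1 + b2 * treat + beta * x + eps
            rows.append((i, treat, x, math.exp(log_t)))
    return rows

# Hypothetical parameter values, chosen only to produce plausible output.
rows = simulate_aft(n_centers=14, n_per_center=5, beta=-0.5, gamma2=0.15,
                    sd_b=(0.3, 0.07), alpha=2.0, tau=1.0)
```

Each returned tuple holds the center index, the treatment indicator, the covariate value and the simulated event time; censoring is not simulated here.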
2.2. Baseline survival distribution
Any parametric assumption concerning the baseline survival distribution in the AFT model
(1) represented by the density gε is difficult to check with censored data. For this reason, it is
our intention to leave gε either unspecified or to specify it in a flexible way. This goal can be
achieved in different ways. For example, Pan and Louis [11] and Pan and Connett [12] consider
the random effects AFT model and estimate the distribution of the error term by inclusion of
a non-parametric Kaplan-Meier estimation step in their estimation procedure. Komárek and
Lesaffre [13] specify gε as a Gaussian mixture with unknown number of components, unknown
means and variances.
An alternative route, namely the use of smoothing techniques, was taken by Ref. [8] and [9]
and will be followed also here. In both papers, the error density gε is expressed as a shifted
and scaled penalized Gaussian mixture (PGM), which is specified as
$$g_\varepsilon(\varepsilon) = \tau^{-1} \sum_{j=-K}^{K} w_j(\boldsymbol{a})\, \varphi\bigl(\tau^{-1}(\varepsilon - \alpha) \,\big|\, \mu_j, \sigma^2\bigr), \tag{2}$$
where $\alpha$ and $\tau$ are the (unknown) intercept and scale parameter, respectively, and
$$w_j(\boldsymbol{a}) = \frac{\exp(a_j)}{\sum_{k=-K}^{K} \exp(a_k)}, \qquad j = -K, \dots, K \tag{3}$$
are (unknown) mixture weights. The weights in (3) are reparametrized to ensure that $g_\varepsilon$ is
a density for which we need $0 < w_j < 1$, $j = -K, \dots, K$ and $\sum_j w_j = 1$. Therefore, we will
work with the parameter vector $\boldsymbol{a} = (a_{-K}, \dots, a_K)'$ instead of the vector $\boldsymbol{w} = (w_{-K}, \dots, w_K)'$.
Further, µ = {µ−K , . . . , µK } is a fine grid of equidistant knots centered around zero (µ0 = 0)
and σ 2 is a fixed basis variance, common for all mixture components. In fact, Gaussian
densities ϕ(·|µ−K , σ 2 ), . . . , ϕ(·|µK , σ 2 ) form a set of basis functions which are used, through
the estimation of weights $\boldsymbol{w}$, to smooth the unknown density $g_\varepsilon$. In the analysis of Section 4,
we used $K = 15$, $\mu_{-K} = -4.5$, $\mu_K = 4.5$, $\sigma = 0.2$ and
$\mu_{j+1} - \mu_j = 0.3$, see Ref. [8] for a motivation.
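The density (2) with the weights (3) and the knot grid quoted above can be evaluated in a few lines. The Python sketch below is illustrative only; the function names are hypothetical and the vector of transformed weights used in the usage note is an arbitrary choice, not an estimate from the paper.

```python
import math

K = 15
SIGMA = 0.2
# Equidistant knots mu_{-K} = -4.5, ..., mu_K = 4.5 with spacing 0.3.
KNOTS = [0.3 * j for j in range(-K, K + 1)]

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def weights(a):
    """Transform (3): maps unconstrained a to weights in (0, 1) that sum to 1."""
    m = max(a)  # subtract the maximum for numerical stability
    e = [math.exp(aj - m) for aj in a]
    total = sum(e)
    return [ej / total for ej in e]

def pgm_density(eps, a, alpha, tau):
    """Shifted and scaled Gaussian mixture density (2) evaluated at eps."""
    w = weights(a)
    u = (eps - alpha) / tau
    return sum(wj * normal_pdf(u, mu, SIGMA) for wj, mu in zip(w, KNOTS)) / tau
```

For instance, `a = [-0.5 * mu ** 2 for mu in KNOTS]` gives weights that decay like a standard normal density over the knots, so `pgm_density(x, a, 0.0, 1.0)` is a smooth, roughly bell-shaped curve centered at zero.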
Identification problems stemming from a high number of unknown parameters (e.g., in
the analysis in Section 4, we have 31 unknown mixture weights) are prevented by putting
a roughness penalty on the weights, see Section 3. In survival analysis, the baseline survival
distribution is more often specified by modeling the hazard function. For example, similarly
to our PGM approach, Lambert and Eilers [14], Kneib [15], or Kneib and Fahrmeir [16] use
penalized B-splines to express the baseline hazard function. In the context of an AFT model, modeling the
density of the baseline log-event time (error term) seems to be more advantageous since this
density enters the likelihood directly (see Section 3).
2.3. Random effects distribution
We will assume that $g_b$, the distribution of random effects $\boldsymbol{b}_i$ ($i = 1, \dots, N$), is multivariate
normal with unknown mean $\boldsymbol{\gamma}$ and unknown covariance matrix $D$. For example, when modeling
the baseline and treatment effect heterogeneity in Section 4, we have $\boldsymbol{b}_i = (b_{i,1}, b_{i,2})'$,
$\boldsymbol{z}_{i,l} = (1, \mathrm{treat}_{i,l})'$, and
$$E(\boldsymbol{b}_i) = \boldsymbol{\gamma} = (0, \gamma_2)', \qquad \mathrm{var}(\boldsymbol{b}_i) = D = \begin{pmatrix} d_{1,1} & d_{1,2} \\ d_{1,2} & d_{2,2} \end{pmatrix}, \tag{4}$$
where $\gamma_2$ is the mean treatment effect and $d_{1,1}$, $d_{2,2}$, $d_{1,2}$ are the variance components of the random
effects distribution. Note that $\gamma_1$ (the mean of the random effect expressing the baseline
heterogeneity) cannot be distinguished from the PGM parameter $\alpha$ (intercept of the error
term). Hence, for identifiability reasons, $\gamma_1$ is fixed at zero.
The reasons for choosing a normal distribution for the random effects rather than a smoothing
approach as for the error term are: (i) The number of centers in our application is quite low
(14), providing only a small number of (moreover latent) “observations” to estimate the shape
of the distribution; (ii) It has been shown in the literature (Keiding, Andersen and Klein [17],
Lambert et al. [18]) that the regression parameters, which are usually of primary interest,
are robust against misspecification of the random effects distribution; (iii) When interest lies
in marginal characteristics like the hazard or survival functions, a possible misspecification
of the random effects distribution is at least partly corrected by the estimation of the error
distribution.
3. ESTIMATION AND INFERENCE
3.1. Penalized maximum likelihood
In the original work, Ref. [8], penalized maximum likelihood (PML) was used for estimation.
In the current context, we would have to maximize the penalized likelihood
$$L_{penal}(\boldsymbol{\theta}) = L(\boldsymbol{\theta}) \times \exp\Bigl\{ -\frac{\lambda}{2} \sum_{j=-K+d}^{K} (\Delta^d a_j)^2 \Bigr\} \tag{5}$$
with respect to $\boldsymbol{\theta} = \bigl(\boldsymbol{\beta}', \boldsymbol{\gamma}', \mathrm{vec}(D), \alpha, \tau, \boldsymbol{a}', \lambda\bigr)'$. In expression (5), $L(\boldsymbol{\theta})$ denotes the likelihood
which is equal to
$$L(\boldsymbol{\theta}) = \prod_{i=1}^{N} \int_{\mathbb{R}^q} \Bigl[ \prod_{l=1}^{n_i} \int_{t^L_{i,l}}^{t^U_{i,l}} p_{i,l}(t \mid \boldsymbol{a}, \alpha, \tau, \boldsymbol{\beta}, \boldsymbol{b})\, dt \Bigr] \varphi_q(\boldsymbol{b} \mid \boldsymbol{\gamma}, D)\, d\boldsymbol{b}, \tag{6}$$
where
$$p_{i,l}(t \mid \boldsymbol{a}, \alpha, \tau, \boldsymbol{\beta}, \boldsymbol{b}) = (t\tau)^{-1} \sum_{j=-K}^{K} w_j(\boldsymbol{a})\, \varphi\Bigl( \frac{\log(t) - \alpha - \boldsymbol{b}' \boldsymbol{z}_{i,l} - \boldsymbol{\beta}' \boldsymbol{x}_{i,l}}{\tau} \,\Big|\, \mu_j, \sigma^2 \Bigr). \tag{7}$$
Further, $\Delta^d$ denotes the $d$th-order difference operator ($d = 3$ was used in the analysis presented
in Section 4). The roughness penalty, $-\frac{\lambda}{2} \sum_{j=-K+d}^{K} (\Delta^d a_j)^2$, which can also be written as
$-\frac{\lambda}{2} \boldsymbol{a}' P_d' P_d \boldsymbol{a}$ for an appropriate difference operator matrix $P_d$, avoids identifiability problems
and overfitting of the data, see Eilers and Marx [19]. The trade-off between the smoothness of the
density $g_\varepsilon$ and the fit to the data is driven by the smoothing parameter $\lambda$, which has to be
estimated as well.
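The two equivalent forms of the roughness penalty, the sum of squared differences and the quadratic form in the difference operator matrix, can be checked numerically. The sketch below builds the matrix by repeatedly differencing the rows of an identity matrix; the function names are hypothetical, and the function returns the positive quantity that is subtracted on the log-likelihood scale.

```python
def difference_matrix(n, d):
    """Return the (n - d) x n matrix P_d of the d-th order difference operator."""
    rows = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # identity
    for _ in range(d):
        # One differencing step: row k becomes (row k+1) - (row k).
        rows = [[hi - lo for hi, lo in zip(rows[k + 1], rows[k])]
                for k in range(len(rows) - 1)]
    return rows

def roughness_penalty(a, lam, d=3):
    """Compute (lambda / 2) * sum_j (Delta^d a_j)^2 = (lambda / 2) * a' P_d' P_d a."""
    p = difference_matrix(len(a), d)
    diffs = [sum(pij * aj for pij, aj in zip(row, a)) for row in p]  # P_d a
    return 0.5 * lam * sum(v * v for v in diffs)
```

With d = 3, any vector whose entries follow a quadratic polynomial in the index has zero penalty, so the penalty only discourages departures from a locally quadratic pattern in the transformed weights.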
3.2. Bayesian specification
Maximization of the penalized likelihood (5) is quite difficult. However, as pointed out by
Wahba [20], the penalized likelihood is proportional to the posterior density in an appropriately
specified Bayesian model. Estimates and inference can then be based on a sample from this
posterior distribution obtained using the Markov chain Monte Carlo (MCMC) method (see,
e.g., Robert and Casella [21]). This approach was followed in Ref. [9] and will be used also
here, with some modifications stemming from the use of multivariate random effects combined
with covariates.
In agreement with Ref. [9], we specify the prior distribution of the model parameters, $p(\boldsymbol{\theta})$,
as a product of vague, but proper distributions and a Gaussian Markov random field (GMRF,
see, e.g., Rue and Held [22]) prior for transformed mixture weights $\boldsymbol{a}$, that is
$$p(\boldsymbol{\theta}) = \underbrace{\prod_{j=1}^{s} p(\beta_j) \times \prod_{j=2}^{q} p(\gamma_j) \times p(D) \times p(\alpha) \times p(\tau^{-2}) \times p(\lambda)}_{p(\boldsymbol{\theta}_{-a})} \times\, p(\boldsymbol{a} \mid \lambda), \tag{8}$$
where $p(\beta_j)$ ($j = 1, \dots, s$), $p(\gamma_j)$ ($j = 2, \dots, q$), $p(\alpha)$ are densities of the normal distribution
with (zero) mean and large variance, e.g., $N(0, 10^2)$ was used in the analysis of Section 4.
Further, $p(D)$ is the inverse Wishart distribution with a small number of degrees of freedom
$df_b$ and a diagonal scale matrix $S_b$ with small values on the diagonal. In Section 4, we used
$df_b = q = 2$ and $S_b = \mathrm{diag}(0.002)$. Furthermore, $p(\tau^{-2})$ and $p(\lambda)$, prior densities of the
parameters that can be interpreted as inverse variances, are densities of a dispersed gamma
distribution, e.g., Gamma(1, 0.005) distributions were used in Section 4. Finally, the GMRF
prior of the vector $\boldsymbol{a}$ is such that $p(\boldsymbol{a} \mid \lambda) \propto \exp\bigl\{-\frac{\lambda}{2} \boldsymbol{a}' P_d' P_d \boldsymbol{a}\bigr\}$, and corresponds to the penalty
part of the penalized likelihood (5). Using Bayes’ rule, the posterior distribution is
$$p(\boldsymbol{\theta} \mid \text{data}) \propto L(\boldsymbol{\theta}) \times p(\boldsymbol{\theta}) = L(\boldsymbol{\theta}) \times p(\boldsymbol{\theta}_{-a}) \times p(\boldsymbol{a} \mid \lambda) \tag{9}$$
and it is seen that provided $p(\boldsymbol{\theta}_{-a}) \overset{\text{approx.}}{\propto} 1$, the posterior distribution (9) is approximately
proportional to the penalized likelihood (5).
3.3. Bayesian data augmentation
When using a Bayesian approach, it is advantageous to consider latent quantities, further
denoted as $\boldsymbol{\psi}$, which are explicitly or implicitly integrated out from the likelihood (6), as
additional model parameters (Bayesian data augmentation, see Tanner and Wong [23]). For
convenience in notation we explain it in the case when all event times $t_{i,l}$ are censored, i.e.
$\text{data} = (t^L_{1,1}, t^U_{1,1}, \dots, t^L_{N,n_N}, t^U_{N,n_N})'$ and $t^L_{i,l} < t^U_{i,l}$ for all $i$ and $l$.
Let $\boldsymbol{\psi} = (\boldsymbol{t}', \boldsymbol{r}', B')'$, with $\boldsymbol{t} = (t_{1,1}, \dots, t_{N,n_N})'$, $\boldsymbol{r} = (r_{1,1}, \dots, r_{N,n_N})'$, and
$B = (\boldsymbol{b}_1', \dots, \boldsymbol{b}_N')'$, be the vector of latent exact event times, component labels (explained below),
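and random effects, respectively. The prior distribution of the full parameter vector $(\boldsymbol{\psi}', \boldsymbol{\theta}')'$
is specified as
$$p(\boldsymbol{\psi}, \boldsymbol{\theta}) = \underbrace{p(\boldsymbol{t} \mid \boldsymbol{r}, B, \boldsymbol{\theta}) \times p(\boldsymbol{r} \mid B, \boldsymbol{\theta}) \times p(B \mid \boldsymbol{\theta})}_{p(\boldsymbol{\psi} \mid \boldsymbol{\theta})} \times\, p(\boldsymbol{\theta}), \tag{10}$$
where
$$p(\boldsymbol{t} \mid \boldsymbol{r}, B, \boldsymbol{\theta}) = \prod_{i=1}^{N} \prod_{l=1}^{n_i} (t_{i,l}\,\tau)^{-1} \varphi\Bigl( \frac{\log(t_{i,l}) - \alpha - \boldsymbol{b}_i' \boldsymbol{z}_{i,l} - \boldsymbol{\beta}' \boldsymbol{x}_{i,l}}{\tau} \,\Big|\, \mu_{r_{i,l}}, \sigma^2 \Bigr), \tag{11}$$
$$p(\boldsymbol{r} \mid B, \boldsymbol{\theta}) = \Pr\bigl(\boldsymbol{r} = (j_{1,1}, \dots, j_{N,n_N})' \,\big|\, \boldsymbol{\theta}\bigr) = \prod_{i=1}^{N} \prod_{l=1}^{n_i} w_{j_{i,l}}(\boldsymbol{a}), \tag{12}$$
$$p(B \mid \boldsymbol{\theta}) = \prod_{i=1}^{N} \varphi_q(\boldsymbol{b}_i \mid \boldsymbol{\gamma}, D), \tag{13}$$
and $p(\boldsymbol{\theta})$ is given by (8). Prior distributions (11)–(13) follow directly from the original
expressions for the likelihood, eq. (6) and (7) and their product is in fact equal to the likelihood
if the latent data had been observed.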
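Specifically, the form of $p(B \mid \boldsymbol{\theta})$ stems from our assumption of normality of random effects.
The form of $p(\boldsymbol{r} \mid B, \boldsymbol{\theta})$ follows from the fact that, intrinsically, we can assume that the $(i, l)$th
residual log-event time belongs to one of the $2K + 1$ normal components, labeled by $r_{i,l}$.
Following our model of a penalized Gaussian mixture (7), the probability of belonging to the
$j$th mixture component is equal to $w_j(\boldsymbol{a})$ and hence the form of (12). Finally, the form for
$p(\boldsymbol{t} \mid \boldsymbol{r}, B, \boldsymbol{\theta})$ follows from the AFT model with the error distribution specified as a PGM.
However, the mixture is involved only implicitly by conditioning on the component labels $\boldsymbol{r}$.
Given the full parameter vector $(\boldsymbol{\psi}', \boldsymbol{\theta}')'$, the likelihood now has a trivial form
$$L(\boldsymbol{\psi}, \boldsymbol{\theta}) = p(\text{data} \mid \boldsymbol{\psi}, \boldsymbol{\theta}) = p(\text{data} \mid \boldsymbol{t}) \propto \prod_{i=1}^{N} \prod_{l=1}^{n_i} I\bigl[ t_{i,l} \in \lfloor t^L_{i,l}, t^U_{i,l} \rfloor \bigr]. \tag{14}$$
Marginal posterior characteristics of the original parameter vector $\boldsymbol{\theta}$, used for inference, are
the same irrespective of whether they are obtained from $p(\boldsymbol{\psi}, \boldsymbol{\theta} \mid \text{data}) \propto L(\boldsymbol{\psi}, \boldsymbol{\theta}) \times
p(\boldsymbol{\psi}, \boldsymbol{\theta})$ or from $p(\boldsymbol{\theta} \mid \text{data})$, eq. (9), which is also equal to $\int p(\boldsymbol{\psi}, \boldsymbol{\theta} \mid \text{data})\, d\boldsymbol{\psi}$.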
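Combining (11) with (14) suggests how a latent event time could be updated: given its component label, the log-event time is normal with mean α + b'z + β'x + τµ_r and standard deviation τσ, truncated to the observed interval. The Python sketch below draws from this truncated normal by the inverse-CDF method. It is an illustration derived from the formulas above, not a description of how the bayesSurv package implements the step; the function name and arguments are hypothetical, and the inverse-CDF approach can lose accuracy when the interval lies far in a tail.

```python
import math
import random
from statistics import NormalDist

def draw_latent_time(t_low, t_upp, lin_pred, alpha, tau, mu_r, sigma, rng):
    """Draw a latent event time in (t_low, t_upp) given the component label.

    lin_pred stands for b'z + beta'x. Conditionally on the label, log(T) is
    normal with mean alpha + lin_pred + tau * mu_r and sd tau * sigma.
    """
    dist = NormalDist(alpha + lin_pred + tau * mu_r, tau * sigma)
    # Probabilities of the interval limits on the log scale.
    p_low = dist.cdf(math.log(t_low)) if t_low > 0 else 0.0
    p_upp = dist.cdf(math.log(t_upp)) if math.isfinite(t_upp) else 1.0
    u = p_low + (p_upp - p_low) * rng.random()
    # Keep u strictly inside (0, 1), where inv_cdf is defined.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return math.exp(dist.inv_cdf(u))
```

A right-censored observation is handled by passing `math.inf` as the upper limit, e.g. `draw_latent_time(3.0, math.inf, 0.1, 1.0, 0.5, 0.3, 0.2, random.Random(1))`.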
3.4. Markov chain Monte Carlo
To sample from the posterior distribution, we used the Gibbs sampling algorithm (Geman and
Geman [24]). The majority of the full conditional distributions are identical to those given in
Ref. [9], to which we refer the reader. The remaining full conditional distributions pertain to
the random effects $\boldsymbol{b}_i$ ($i = 1, \dots, N$), the means of random effects $\boldsymbol{\gamma}$ and the covariance matrix
$D$ of the random effects. Each of them is either a multivariate normal or an inverse-Wishart
distribution. Details are given in the Appendix.
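As an example of one Gibbs step, the full conditional of a component label follows from (11) and (12): given the standardized residual u = (log t − α − b'z − β'x)/τ, the probability of label j is proportional to w_j φ(u | µ_j, σ²). The sketch below is a minimal Python illustration of that discrete draw, assuming strictly positive weights; the function names are hypothetical and labels are indexed 0, …, 2K rather than −K, …, K.

```python
import math
import random

def label_probabilities(u, w, knots, sigma):
    """Full conditional probabilities of the labels for a standardized residual u."""
    # Work on the log scale and subtract the maximum to avoid underflow.
    logp = [math.log(wj) - 0.5 * ((u - mu) / sigma) ** 2 for wj, mu in zip(w, knots)]
    m = max(logp)
    unnorm = [math.exp(v - m) for v in logp]
    total = sum(unnorm)
    return [v / total for v in unnorm]

def draw_label(u, w, knots, sigma, rng):
    """Draw an index into `knots` from the discrete full conditional."""
    probs = label_probabilities(u, w, knots, sigma)
    return rng.choices(range(len(knots)), weights=probs, k=1)[0]
```

Because the basis standard deviation (0.2) is smaller than the knot spacing (0.3), only the few components nearest to u receive non-negligible probability.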
The complexity of the MCMC scheme requires a degree of control over the computation that is difficult
to achieve with general purpose software (e.g., WinBUGS). For this reason, an R (R
Development Core Team [25]) package bayesSurv, freely available from the Comprehensive
R Archive Network on http://www.R-project.org, has been written to sample from the
posterior distribution of the model parameters (function bayessurvreg2) and draw the inference
(e.g., function predictive2). Detailed description on how to use the package for the analysis
shown in Section 4 is included in the documentation to the package.
3.5. Inference on the model parameters
For each component of the parameter vector $\boldsymbol{\theta}$ we derive summary statistics of the posterior
distribution $p(\boldsymbol{\theta} \mid \text{data})$, obtained from the MCMC sample $(\boldsymbol{\psi}^{(m)}, \boldsymbol{\theta}^{(m)})$ ($m = 1, \dots, M$).
For example, the posterior median values are approximated by the MCMC sample medians.
Highest posterior density (HPD) intervals are derived to express the uncertainty with which
the parameter is estimated.
To draw inference on the transformed parameter (vector) $\zeta(\boldsymbol{\theta})$, we use the posterior
distribution $p\bigl(\zeta(\boldsymbol{\theta}) \mid \text{data}\bigr)$ and the corresponding MCMC sample $\zeta\bigl(\boldsymbol{\theta}^{(m)}\bigr)$ ($m = 1, \dots, M$).
For example, in the context of the AFT model, rather than reporting the results for the fixed
effects $\beta_1, \dots, \beta_s$ or the means $\gamma_1, \dots, \gamma_q$ of the random effects, we prefer to report the
acceleration factors $e^{\beta_1}, \dots, e^{\beta_s}$, or $e^{\gamma_1}, \dots, e^{\gamma_q}$, respectively. Indeed, these quantities express
how much a unit change in the covariate accelerates ($e^{\beta} < 1$) or decelerates ($e^{\beta} > 1$) the
reference event time.
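The summaries described above are straightforward to compute from a stored chain. The following Python sketch approximates the posterior median of an acceleration factor by the sample median and the HPD interval by the shortest interval covering the requested fraction of the draws. This is a common empirical construction that implicitly assumes a unimodal posterior; it is not necessarily the algorithm used in the paper, and the function names are hypothetical.

```python
import math
import statistics

def hpd_interval(sample, prob=0.95):
    """Shortest interval containing a fraction `prob` of the sampled values."""
    xs = sorted(sample)
    n = len(xs)
    k = max(1, math.ceil(prob * n))  # number of draws the interval must cover
    # Slide a window of k consecutive order statistics and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]

def summarize_acceleration_factor(beta_draws, prob=0.95):
    """Posterior median and HPD interval of exp(beta) from MCMC draws of beta."""
    af = [math.exp(b) for b in beta_draws]
    return statistics.median(af), hpd_interval(af, prob)
```

Note that the HPD interval is computed on the transformed draws, since HPD intervals, unlike equal-tail intervals, are not invariant under the transformation β ↦ exp(β).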
3.6. Inference on the survival distribution
When interest lies in the survival distribution for a specific combination of covariates $\boldsymbol{x}_{pred}$
and $\boldsymbol{z}_{pred}$, we can compute the predictive survival function $S(t \mid \text{data}, \boldsymbol{x}_{pred}, \boldsymbol{z}_{pred})$, or the
predictive hazard function $\hbar(t \mid \text{data}, \boldsymbol{x}_{pred}, \boldsymbol{z}_{pred})$ ($t > 0$) from the MCMC output. The
procedure is analogous, with only an obvious change in notation, to that described in Ref. [9]
(Section 4.6), to which the reader is referred for details.
3.7. Inference on random effects
When interest lies in investigating and explaining the heterogeneity, we can use the (marginal)
posterior distribution $p(B \mid \text{data})$ of the random effects $\boldsymbol{b}_1, \dots, \boldsymbol{b}_N$, which is obtained from the
joint posterior distribution p(ψ, θ | data) by integrating out the remaining parameters. When
an MCMC sample from the joint posterior distribution is available, integration is achieved by
simply ignoring these remaining parameters in the sample.
4. THE ANALYSIS OF THE DFS TIME IN EARLY BREAST CANCER PATIENTS
For the analysis of the DFS time in early breast cancer patients in the EORTC trial 10854, we
fitted two random effects AFT models (1). In both models, we included the following covariates:
age group (<40, 40 – 50, >50 years), type of prior surgery (mastectomy, breast conserving),
tumor size (not palpable or <2 cm, ≥2 cm), axillary nodal status (negative, positive), presence
of other related disease (no, yes). The first AFT model (Model with region) contained also
dummies for the geographical location, whereas in the second AFT model (Model without
region), the geographical location was not included in the covariate vector for fixed effects.
Since centers are nested within geographical regions it should be possible to reveal, at least
partially, the regional structure of the centers from the estimates of the center-specific random
effects b1,1 , . . . , bN,1 in the model without region.
For inference we sampled a chain of length 125 000 with 1:5 thinning which took about 2.5
hours on a Pentium IV 2 GHz PC with 512 MB RAM. The last 25 000 iterations of the chain
were used to derive the summary statistics which are shown in Tables I and II.
<Table I about here.>
<Table II about here.>
4.1. Effect of covariates and the survival distribution
Table I shows the posterior summaries for the acceleration factors revealing the effect of
the covariates considered in both models. It is seen that the DFS time in the control arm is
shorter by a factor of approximately 0.86 than in the POP FAC arm or, conversely, it is approximately
1/0.86 ≈ 1.16 times longer in the POP FAC arm compared to the control arm. Based on
the model with region included, the DFS time for the middle age group (40 – 50 years) is by
a factor of 1.38 longer than the DFS time of the youngest group (<40 years). For the patients
from the oldest group (>50 years), the DFS time is by a factor of 1.33 longer than the DFS
time of the youngest group. In the group where breast conserving surgery was used, the DFS
time is longer by a factor of 1.26 compared to the mastectomy group. Further, in the bigger
tumors (≥2 cm) group, the DFS time is shorter by a factor of 0.63 compared to the smaller
tumors (<2 cm) group. In a group with the positive pathological nodal status the DFS time
is shorter by a factor of 0.55 compared to a group with the negative result. In a group where
other related disease is present, the DFS time is shorter by a factor of 0.72. From the regional
effects it is, for example, seen that South Africa performs far worse than all remaining regions.
In the model without region, the effect of the included covariates is estimated to be practically
the same as in the model with region. This illustrates, among other things, the general property
that the AFT model is robust towards omission of important covariates (Hougaard [26]).
However, the reason for this phenomenon is probably that the omitted covariate region is
strongly related to the center indicator which is nested within the region. Consequently, region
remains partially or completely included in both models. A complete view on the distribution
of the DFS time is given in Figure 2, which shows the predictive hazard and survival functions
in the POP FAC and control arm when fixing remaining covariates at their reference values.
<Figure 2 about here.>
4.2. Random effects, heterogeneity and the error density
Figure 4 shows the posterior medians and 95% HPD intervals for acceleration factors based
on the center-specific random effects in both models considered. For comparison purposes, the
plot related to the random intercepts bi,1 , (i = 1, . . . , 14) takes also into account the fixed
effect of a geographical region in the model with region explicitly included. In the left part of
Figure 4, France serves as a reference region (model with region) whereas an average over all
regions serves as a reference in the right part of Figure 4 (model without region). This causes
an overall shift when going from left to right in the upper panel of Figure 4. However, besides
that shift, the structure of the posterior medians of the random intercepts is quite similar in
both models. That is, the random intercepts in the model without region were to a large extent
able to capture the effect of the region.
<Figure 4 about here.>
As one could have expected, omission of the covariate region led to an increase in the
variability of the random intercept. Namely, its standard deviation, estimated by the posterior
median of $\sqrt{d_{1,1}}$, increased from 0.111 to 0.302, and the 95% HPD interval for $\sqrt{d_{1,1}}$ changed from
(0.015, 0.292) to (0.142, 0.513). The lower panel of Figure 4 shows that the heterogeneity of
the treatment effect between centers is of a lower magnitude than the baseline heterogeneity.
This is also seen from the posterior median of the parameter $\sqrt{d_{2,2}}$, the standard deviation of
$b_{i,2}$ ($i = 1, \dots, 14$), which equals 0.057 in the model with region and a slightly higher
value of 0.074 in the model without region. Most importantly, the entire increase in
variability caused by the omission of the important covariate (region) was captured by the
variance components of the random effects. The residual variability, which has a direct impact
on the precision with which the effect of the covariates is evaluated, remains practically the
same, see the row labeled as sd(ε) in Table II.
From the negative posterior estimate (median) of the correlation coefficient between the
two random effects we might conclude that patients in centers which perform relatively poorly
benefit more from the treatment than patients treated in better performing centers. However,
the HPD interval for the correlation is very wide, covering practically (−1, 1) and hence not
much can be concluded from the analysis including or excluding the region. Finally, Figure 3
shows a pointwise posterior mean of the error density gε which was modeled as the penalized
Gaussian mixture.
<Figure 3 about here.>
5. DISCUSSION
We have introduced here a possible approach to regression analysis of clustered survival
data that deals with heterogeneity between clusters (centers). Both the baseline
heterogeneity and the heterogeneity with respect to the effect of selected covariates
have been considered. The heterogeneity has been taken into account by including random
effects in the AFT model. Parametric assumptions concerning the baseline survival distribution
have been avoided by using the penalized Gaussian mixture as a model for the error terms in the
AFT model. In general, parametric models lead to a higher precision in the estimation of the
regression coefficients. Our approach can serve as an exploratory tool to choose an appropriate
parametric model. For example, the analysis of Section 4 might continue with an AFT model
with a skewed error distribution suggested by Figure 3.
Not surprisingly, we have also illustrated that the random effects are capable of revealing much
of the structure in the data arising from an omitted covariate. In fact, the random effects are
able to capture practically all the variability caused by omission of important covariates and
this leads to an improved precision of the estimated regression coefficients.
Earlier, Legrand et al. [27] analyzed the same clinical outcome of the EORTC trial
10854 using a frailty PH model. By considering a fixed treatment effect and a random
center effect their objective was to quantify heterogeneity in outcome over centers. However,
they did not include a treatment by center interaction and therefore did not account for
a possible heterogeneity in the treatment effect between centers. With respect to the baseline
heterogeneity between centers, we have found, as in Ref. [27], that it is largely explained by
geographical differences. Legrand et al. [27] investigated heterogeneity in DFS between centers
and interpreted it in terms of the density of predicted 5-year DFS rates over centers. A similar
comparison was performed in this paper using the posterior summaries of the acceleration
factors based on the center-specific random effects. Similar results were found by both methods
for the baseline heterogeneity. In our model, the heterogeneity can be assessed by inspecting the
diagonal elements of the matrix D (variances of random effects). However, a formal test for
heterogeneity is not possible with this approach. In our analysis, Figure 1 reveals that the
treatment by center interaction, while not large, cannot be automatically ruled out.
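As a minimal sketch of how the elements of D translate into the quantities reported in Table II, the following Python fragment converts a 2 x 2 covariance matrix of the random intercept b1 and the random treatment effect b2 into standard deviations and a correlation. The function name and the numeric values of D are hypothetical and chosen only for illustration; they are not estimates from the trial.

```python
import math

def summarize_covariance(D):
    """Return (sd(b1), sd(b2), corr(b1, b2)) for a 2x2 covariance matrix D."""
    d11, d12 = D[0]
    d21, d22 = D[1]
    if abs(d12 - d21) > 1e-12:
        raise ValueError("D must be symmetric")
    if d11 <= 0 or d22 <= 0:
        raise ValueError("variances must be positive")
    sd1 = math.sqrt(d11)                 # sqrt(d_{1,1}) = sd(b1)
    sd2 = math.sqrt(d22)                 # sqrt(d_{2,2}) = sd(b2)
    corr = d12 / math.sqrt(d11 * d22)    # d_{1,2} / sqrt(d_{1,1} d_{2,2})
    return sd1, sd2, corr

# Hypothetical values, for illustration only.
D_example = [[0.04, -0.006], [-0.006, 0.01]]
sd_b1, sd_b2, corr_b12 = summarize_covariance(D_example)
```

In an MCMC analysis the same transformation would be applied to every sampled D, and the resulting chains would then be summarized by their posterior medians and HPD intervals.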
Finally, we have opted for a random effects approach to model the effect of centers. The
choice between a fixed effects and a random effects model is still a matter of debate, see, e.g.,
Anello et al. [3] and Glidden and Vittinghoff [28]. Both approaches have their merits and there
is no general agreement on which one is preferable. The fixed effects approach, though, allows
one to test statistically for heterogeneity of the baseline and of the treatment effect by
including a center by treatment interaction in the covariate vector x_{i,l} for fixed effects.
On the other hand, as mentioned in the CPMP Guideline [29], assessment of
interaction terms based on statistical significance tests is of little value since (a) these tests
often lack statistical power, so that the absence of statistical evidence of an interaction is not
evidence that there is no clinically relevant interaction, and (b) conversely, an interaction
cannot be considered relevant on the sole basis of a significant test for interaction.
ACKNOWLEDGEMENTS
We thank the editor and the anonymous referees for their valuable comments that led to a considerable
improvement of the paper.
The research of the first author was performed in the framework of the postdoctoral mandate
PDM/06/242 financed by the Research Funds of Katholieke Universiteit Leuven. The authors further
acknowledge support from the Interuniversity Attraction Poles Program P6/03 – Belgian State
– Federal Office for Scientific, Technical and Cultural Affairs. The authors thank the European
Organisation for Research and Treatment of Cancer (EORTC) Breast Cancer Group for permission to
use the data from EORTC trial 10854 for this research. This research project was further supported by
grant number 5U10 CA11488-35 from the US National Cancer Institute (Bethesda, Maryland, USA).
The contents of this publication are solely the responsibility of the authors and do not necessarily
represent the official views of the National Cancer Institute.
REFERENCES
1. P. C. Clahsen, C. J. van de Velde, J. P. Julien, J. L. Floiras, T. Delozier, F. Y. Mignolet, and T. M.
Sahmoud. Improved local control and disease-free survival after perioperative chemotherapy for
early-stage breast cancer. A European Organization for Research and Treatment of Cancer Breast Cancer
Cooperative Group Study. Journal of Clinical Oncology, 14:745–753, 1996.
2. J. A. van der Hage, C. J. H. van de Velde, J.-P. Julien, J.-L. Floiras, T. Delozier, C. Vandervelden,
and L. Duchateau. Improved survival after one course of perioperative chemotherapy in early breast
cancer patients: long-term results from the European Organization for Research and Treatment of Cancer
(EORTC) Trial 10854. European Journal of Cancer, 37:2184–2193, 2001.
3. C. Anello, R. T. O’Neill, and S. Dubey. Multicenter trials: a US regulatory perspective. Statistical Methods
in Medical Research, 14:303–318, 2005.
4. D. R. Cox. Regression models and life-tables (with Discussion). Journal of the Royal Statistical Society,
Series B, 34:187–220, 1972.
5. D. R. Cox. Partial likelihood. Biometrika, 62:269–276, 1975.
6. J. D. Kalbfleisch and R. L. Prentice. The Statistical Analysis of Failure Time Data. John Wiley & Sons,
Chichester, Second edition, 2002.
7. N. Reid. A conversation with Sir David Cox. Statistical Science, 9:439–455, 1994.
8. A. Komárek, E. Lesaffre, and J. F. Hilton. Accelerated failure time model for arbitrarily censored data
with smoothed error distribution. Journal of Computational and Graphical Statistics, 14:726–745, 2005.
9. A. Komárek and E. Lesaffre. Bayesian accelerated failure time model with multivariate
doubly-interval-censored data and flexible distributional assumptions. To appear in Journal of the
American Statistical Association, 2007.
10. N. M. Laird and J. H. Ware. Random-effects models for longitudinal data. Biometrics, 38:963–974, 1982.
11. W. Pan and T. A. Louis. A linear mixed-effects model for multivariate censored data. Biometrics,
56:160–166, 2000.
12. W. Pan and J. E. Connett. A multiple imputation approach to linear regression with clustered censored
data. Lifetime Data Analysis, 7:111–123, 2001.
13. A. Komárek and E. Lesaffre. Bayesian accelerated failure time model for correlated censored data with a
normal mixture as an error distribution. Statistica Sinica, 17:549–569, 2007.
14. P. Lambert and P. H. C. Eilers. Bayesian proportional hazards model with time-varying regression
coefficients: A penalized Poisson regression approach. Statistics in Medicine, 24:3977–3989, 2005.
15. T. Kneib. Mixed model based inference in geoadditive hazard regression for interval censored survival
times. Computational Statistics and Data Analysis, 51:777–792, 2006.
16. T. Kneib and L. Fahrmeir. A mixed model approach for geoadditive hazard regression. Scandinavian
Journal of Statistics, 34:207–228, 2007.
17. N. Keiding, P. K. Andersen, and J. P. Klein. The role of frailty models and accelerated failure time models
in describing heterogeneity due to omitted covariates. Statistics in Medicine, 16:215–225, 1997.
18. P. Lambert, D. Collett, A. Kimber, and R. Johnson. Parametric accelerated failure time models with
random effects and an application to kidney transplant survival. Statistics in Medicine, 23:3177–3192,
2004.
19. P. H. C. Eilers and B. D. Marx. Flexible smoothing with B-splines and penalties (with Discussion).
Statistical Science, 11:89–121, 1996.
20. G. Wahba. Bayesian “confidence intervals” for the cross–validated smoothing spline. Journal of the Royal
Statistical Society, Series B, 45:133–150, 1983.
21. C. P. Robert and G. Casella. Monte Carlo Statistical Methods. Springer-Verlag, New York, Second
edition, 2004.
22. H. Rue and L. Held. Gaussian Markov Random Fields: Theory and Applications. Chapman & Hall/CRC,
Boca Raton, 2005.
23. M. A. Tanner and W. H. Wong. The calculation of posterior distributions by data augmentation. Journal
of the American Statistical Association, 82:528–550, 1987.
24. S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of
images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:721–741, 1984.
25. R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation
for Statistical Computing, Vienna, Austria, 2007. ISBN 3-900051-07-0.
26. P. Hougaard. Fundamentals of survival data. Biometrics, 55:13–22, 1999.
27. C. Legrand, L. Duchateau, R. Sylvester, P. Janssen, J. A. van der Hage, C. J. H. van de Velde, and
P. Therasse. Heterogeneity in disease free survival between centers: lessons learned from an EORTC
breast cancer trial. Clinical Trials, 3:10–18, 2006.
28. D. V. Glidden and E. Vittinghoff. Modelling clustered survival data from multicentre clinical trials.
Statistics in Medicine, 23:369–388, 2004.
29. Committee for Proprietary Medicinal Products. Points to Consider on Adjustment for Baseline
Covariates. The European Agency for the Evaluation of Medicinal Products, Evaluation of Medicines
for Human Use, London, May 2003. CPMP/EWP/2863/99.
APPENDIX: MARKOV CHAIN MONTE CARLO
In this appendix, we provide the full conditional distributions for the random effects b_i (i = 1, ..., N),
the means γ of the random effects and the covariance matrix D of the random effects. For notational
convenience, we assume that z_{i,l,1} ≡ 1, in which case γ_1 is fixed at zero for identifiability
reasons.
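Namely,
$$
b_i \mid \cdots \sim \mathcal{N}\bigl(\mathrm{E}(b_i \mid \cdots),\; \mathrm{var}(b_i \mid \cdots)\bigr), \qquad i = 1, \dots, N, \tag{15}
$$
with
$$
\mathrm{E}(b_i \mid \cdots) = \mathrm{var}(b_i \mid \cdots) \times \Bigl[ D^{-1}\gamma + (\sigma\tau)^{-2} \sum_{l=1}^{n_i} z_{i,l} \bigl\{ \log(t_{i,l}) - \alpha - \beta' x_{i,l} - \tau \mu_{r_{i,l}} \bigr\} \Bigr],
$$
$$
\mathrm{var}(b_i \mid \cdots) = \Bigl\{ D^{-1} + (\sigma\tau)^{-2} \sum_{l=1}^{n_i} z_{i,l} z_{i,l}' \Bigr\}^{-1}.
$$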
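The update (15) can be sketched in Python with NumPy as follows. This is an illustrative sketch, not the authors' implementation: the function name, the argument layout and the simulated inputs are assumptions. It is further assumed that `log_t` holds the log event times of one center (for censored observations, the currently augmented values) and that `mu_r` holds the mixture means currently allocated to the observations.

```python
import numpy as np

def sample_random_effects(log_t, X, Z, mu_r, alpha, beta, gamma, D, sigma, tau, rng):
    """Draw b_i for one center from the normal full conditional (15).

    log_t : (n_i,)   log event times of center i
    X     : (n_i, p) fixed-effects covariates x_{i,l}
    Z     : (n_i, q) random-effects covariates z_{i,l}
    mu_r  : (n_i,)   allocated mixture means mu_{r_{i,l}}
    """
    D_inv = np.linalg.inv(D)
    prec_err = (sigma * tau) ** -2                 # (sigma * tau)^{-2}
    # Residuals with the random-effects part left out.
    resid = log_t - alpha - X @ beta - tau * mu_r
    cond_var = np.linalg.inv(D_inv + prec_err * (Z.T @ Z))   # Z'Z = sum_l z z'
    cond_mean = cond_var @ (D_inv @ gamma + prec_err * (Z.T @ resid))
    draw = rng.multivariate_normal(cond_mean, cond_var)
    return draw, cond_mean, cond_var

# Simulated inputs for one hypothetical center (illustration only).
rng = np.random.default_rng(1)
n_i, alpha, tau, sigma = 40, 9.0, 1.5, 0.5
beta = np.array([-0.15, 0.3])
gamma = np.array([0.0, 0.0])
D = np.array([[0.04, 0.0], [0.0, 0.01]])
X = rng.normal(size=(n_i, 2))
treat = rng.integers(0, 2, size=n_i).astype(float)
Z = np.column_stack([np.ones(n_i), treat])         # random intercept, random treatment effect
mu_r = rng.choice([-1.0, 0.0, 1.0], size=n_i)
b_true = np.array([0.2, -0.1])
log_t = alpha + X @ beta + Z @ b_true + tau * (mu_r + sigma * rng.normal(size=n_i))

draw, cond_mean, cond_var = sample_random_effects(
    log_t, X, Z, mu_r, alpha, beta, gamma, D, sigma, tau, rng
)
```

Within a Gibbs sampler this draw would be repeated for every center i = 1, ..., N in each iteration.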
Further, let $\nu_{(-1)}$ be the vector of prior means of $\gamma_{(-1)} = (\gamma_2, \dots, \gamma_q)'$ and
$U_{(-1)}$ be a diagonal matrix having the prior variances of $\gamma_{(-1)}$ on the diagonal. Let $V_{(-1)}$
and $V_{(-1,1)}$ be the $(2, \dots, q)$-$(2, \dots, q)$ block and the $(2, \dots, q)$-$1$ block, respectively,
of the matrix $D^{-1}$. Finally, let $b_{i(-1)} = (b_{i,2}, \dots, b_{i,q})'$ ($i = 1, \dots, N$). Then
$$
\gamma_{(-1)} \mid \cdots \sim \mathcal{N}\bigl(\mathrm{E}(\gamma_{(-1)} \mid \cdots),\; \mathrm{var}(\gamma_{(-1)} \mid \cdots)\bigr), \tag{16}
$$
with
$$
\mathrm{E}(\gamma_{(-1)} \mid \cdots) = \mathrm{var}(\gamma_{(-1)} \mid \cdots) \times \Bigl( U_{(-1)}^{-1} \nu_{(-1)} + V_{(-1)} \sum_{i=1}^{N} b_{i(-1)} + V_{(-1,1)} \sum_{i=1}^{N} b_{i,1} \Bigr),
$$
$$
\mathrm{var}(\gamma_{(-1)} \mid \cdots) = \bigl( U_{(-1)}^{-1} + N\, V_{(-1)} \bigr)^{-1}.
$$
Finally,
$$
D \mid \cdots \sim \text{inverse-Wishart}\Bigl( \mathrm{df}_b + N,\; S_b + \sum_{i=1}^{N} (b_i - \gamma)(b_i - \gamma)' \Bigr). \tag{17}
$$
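The updates (16) and (17) can be sketched in Python as follows. The function names, the hyperparameter values and the simulated random effects are hypothetical. The sketch also assumes that the inverse-Wishart distribution is parametrized by a scale matrix, which is the convention of `scipy.stats.invwishart`; a different convention would require transforming the scale argument.

```python
import numpy as np
from scipy.stats import invwishart

def sample_gamma_rest(b, D, nu_rest, U_rest_diag, rng):
    """Draw gamma_(-1) = (gamma_2, ..., gamma_q)' from (16); gamma_1 is fixed at 0.

    b           : (N, q) matrix whose rows are the center-specific b_i (q >= 2)
    nu_rest     : (q-1,) prior means of gamma_(-1)
    U_rest_diag : (q-1,) prior variances of gamma_(-1)
    """
    N = b.shape[0]
    D_inv = np.linalg.inv(D)
    V_rest = D_inv[1:, 1:]       # (2..q)-(2..q) block of D^{-1}
    V_rest_1 = D_inv[1:, 0]      # (2..q)-1 block of D^{-1}
    U_inv = np.diag(1.0 / np.asarray(U_rest_diag, dtype=float))
    cond_var = np.linalg.inv(U_inv + N * V_rest)
    cond_mean = cond_var @ (
        U_inv @ nu_rest
        + V_rest @ b[:, 1:].sum(axis=0)
        + V_rest_1 * b[:, 0].sum()
    )
    return rng.multivariate_normal(cond_mean, cond_var)

def sample_D(b, gamma, df_b, S_b, rng):
    """Draw D from the inverse-Wishart full conditional (17)."""
    centered = b - gamma                       # rows are (b_i - gamma)'
    scale = S_b + centered.T @ centered        # S_b + sum_i (b_i - gamma)(b_i - gamma)'
    return invwishart.rvs(df=df_b + b.shape[0], scale=scale, random_state=rng)

# Simulated random effects and hypothetical hyperparameters (illustration only).
rng = np.random.default_rng(2007)
N, q = 14, 2
b = rng.normal(scale=0.2, size=(N, q))
D = np.array([[0.04, 0.0], [0.0, 0.01]])
gamma_rest = sample_gamma_rest(
    b, D, nu_rest=np.zeros(q - 1), U_rest_diag=np.full(q - 1, 100.0), rng=rng
)
gamma = np.concatenate([[0.0], gamma_rest])    # gamma_1 = 0 for identifiability
D_new = sample_D(b, gamma, df_b=2.0, S_b=0.01 * np.eye(q), rng=rng)
```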
Table I. Posterior medians and 95% highest posterior density (HPD) intervals for the acceleration
factors (exp(γ) and exp(β) parameters).

                                              Model with region          Model without region
Effect                                        Posterior  95% HPD         Posterior  95% HPD
                                              median     interval        median     interval
Treatment group (reference: POP FAC arm)
  control arm                                 0.858      (0.712, 1.010)  0.860      (0.729, 1.009)
Age group (reference: <40 years)
  40 – 50 years                               1.384      (1.035, 1.762)  1.411      (1.064, 1.819)
  >50 years                                   1.330      (1.019, 1.656)  1.368      (1.061, 1.738)
Type prior surgery (reference: mastectomy)
  breast conserving                           1.257      (1.041, 1.483)  1.281      (1.070, 1.509)
Tumor size (reference: <2 cm)
  ≥2 cm                                       0.630      (0.521, 0.748)  0.625      (0.515, 0.745)
Nodal status (reference: negative)
  positive                                    0.549      (0.461, 0.635)  0.546      (0.459, 0.639)
Other disease (reference: absent)
  present                                     0.724      (0.538, 0.930)  0.716      (0.536, 0.926)
Region (reference: France)
  The Netherlands                             0.669      (0.457, 0.943)
  Poland                                      1.417      (0.845, 2.154)
  Southern Europe                             0.713      (0.465, 1.007)
  South Africa                                0.479      (0.295, 0.700)
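Table I reports posterior medians and 95% HPD intervals of acceleration factors. As a minimal sketch of how such summaries can be obtained from MCMC output, the fragment below computes the median and the shortest interval covering 95% of the sampled values, which estimates the HPD interval when the posterior is unimodal. The function name and the simulated draws are hypothetical and do not reproduce the values in the table.

```python
import math
import random
import statistics

def hpd_interval(draws, prob=0.95):
    """Shortest interval containing a fraction `prob` of the sampled values."""
    x = sorted(draws)
    n = len(x)
    k = math.ceil(prob * n)          # number of draws the interval must cover
    if k < 2 or k > n:
        raise ValueError("need more draws for the requested probability")
    # Slide a window of k consecutive order statistics and keep the shortest one.
    start = min(range(n - k + 1), key=lambda i: x[i + k - 1] - x[i])
    return x[start], x[start + k - 1]

# Hypothetical draws of a regression coefficient beta (illustration only).
random.seed(10854)
beta_draws = [random.gauss(-0.15, 0.09) for _ in range(4000)]
af_draws = [math.exp(b) for b in beta_draws]     # acceleration factor exp(beta)
af_median = statistics.median(af_draws)
af_lower, af_upper = hpd_interval(af_draws, 0.95)
```

Because the median is invariant under monotone transformations while the HPD interval is not, the interval is computed here directly on the exp(β) scale rather than by exponentiating an interval for β.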
Table II. Posterior medians and 95% highest posterior density intervals for the moments of the error
distribution and variance components of random effects.

                                                   Model with region           Model without region
Effect                                             Posterior  95% HPD          Posterior  95% HPD
                                                   median     interval         median     interval
Moments of the error distribution
  E(ε)                                             9.155      (8.763, 9.513)   8.967      (8.559, 9.340)
  sd(ε)                                            1.481      (1.341, 1.640)   1.470      (1.345, 1.628)
Variance components of the random effects
  sqrt(d_{1,1}) = sd(b_1)                          0.111      (0.015, 0.292)   0.302      (0.142, 0.513)
  sqrt(d_{2,2}) = sd(b_2)                          0.057      (0.014, 0.180)   0.074      (0.015, 0.212)
  d_{1,2} / sqrt(d_{1,1} d_{2,2}) = corr(b_1, b_2) −0.219     (−0.997, 0.939)  −0.675     (−0.998, 0.967)
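The moments of the error distribution in Table II can be related to the mixture parameters with a short sketch. It assumes, consistently with the residual in (15) and with the caption of Figure 3, that the error term has the form ε = α + τ ε*, where ε* follows a normal mixture with weights w_j, means μ_j and a common standard deviation σ. The function name and the three-component mixture below are hypothetical.

```python
import math

def error_moments(alpha, tau, weights, means, sigma):
    """Mean and standard deviation of eps = alpha + tau * eps_star, where
    eps_star follows the normal mixture sum_j w_j N(mu_j, sigma^2)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to one")
    m1 = sum(w * mu for w, mu in zip(weights, means))                         # E(eps_star)
    m2 = sum(w * (mu * mu + sigma * sigma) for w, mu in zip(weights, means))  # E(eps_star^2)
    mean_eps = alpha + tau * m1
    sd_eps = tau * math.sqrt(m2 - m1 * m1)
    return mean_eps, sd_eps

# Hypothetical three-component mixture, for illustration only.
mean_eps, sd_eps = error_moments(
    alpha=9.0, tau=1.5, weights=[0.25, 0.5, 0.25], means=[-1.0, 0.0, 1.0], sigma=0.5
)
```

Applying such a function to the parameter values sampled at each MCMC iteration would give chains of E(ε) and sd(ε) that can be summarized in the same way as the other parameters.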
[Figure 1: one Kaplan-Meier panel per center. Panel titles: The Netherlands (11), (12), (13);
Poland (21), (22); France (31), (32), (33), (34); South Europe (41), (42), (43), (44);
South Africa (51). Horizontal axis: Days; vertical axis: Survivor.]
Figure 1. Kaplan-Meier estimates of the DFS time distribution separately for each center and each
treatment group. Solid line: POP FAC arm, dotted-dashed line: control arm. Further, we report the
sample size and overall DFS proportion after 5 years and 10 years per center.
[Figure 2: two panels, Hazard and Survival. Horizontal axis: t (Days); vertical axes: ℏ(t) and S(t).]
Figure 2. Model with region. Predictive hazard and survival function for the POP FAC arm (solid
line) and control arm (dotted-dashed line) and remaining covariates fixed to the reference values.
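To make the link between the AFT model and the curves in Figure 2 concrete, the following sketch evaluates the survival function S(t) = 1 − G(log t − η) and the hazard g(log t − η) / {t S(t)} for a fixed linear predictor η, where g and G are the density and distribution function of the error term. It assumes an error of the form α + τ ε* with ε* a normal mixture, and it uses fixed, hypothetical parameter values; the predictive curves in the paper additionally average over the posterior distribution, which is not done here.

```python
import math

def mixture_pdf_cdf(y, alpha, tau, weights, means, sigma):
    """Density and CDF at y of alpha + tau * eps_star,
    with eps_star ~ sum_j w_j N(mu_j, sigma^2)."""
    pdf = cdf = 0.0
    for w, mu in zip(weights, means):
        loc, scale = alpha + tau * mu, tau * sigma
        z = (y - loc) / scale
        pdf += w * math.exp(-0.5 * z * z) / (scale * math.sqrt(2.0 * math.pi))
        cdf += w * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return pdf, cdf

def aft_survival_hazard(t, linear_predictor, alpha, tau, weights, means, sigma):
    """S(t) and hazard of T when log(T) = linear_predictor + error."""
    y = math.log(t) - linear_predictor
    pdf, cdf = mixture_pdf_cdf(y, alpha, tau, weights, means, sigma)
    survival = 1.0 - cdf
    hazard = pdf / (t * survival)      # density of T divided by S(t)
    return survival, hazard

# Hypothetical parameter values, for illustration only.
params = dict(alpha=9.0, tau=1.5, weights=[0.25, 0.5, 0.25],
              means=[-1.0, 0.0, 1.0], sigma=0.5)
surv_5y, haz_5y = aft_survival_hazard(5 * 365.25, 0.0, **params)
```

A covariate with regression coefficient β shifts the linear predictor by β and therefore multiplies the time scale by exp(β), which is the acceleration factor interpretation used in Table I.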
[Figure 3: one panel, Error density. Horizontal axis: ε; vertical axis: gε(ε).]
Figure 3. Model with region. Pointwise posterior mean of the error density gε . The density gε involves
a shift by the intercept α and a scaling by the parameter τ and thus it is not standardized.
[Figure 4: panel rows Intercept (+ region effect) and Treatment effect; panel columns Model with
region and Model without region. Vertical axis: Institution (11 to 51, grouped by region NL, PL, F,
SE, SA); horizontal axes: exp{b1 + β(region)} or exp(b1) for the intercept, exp(b2) for the
treatment effect.]
Figure 4. Posterior medians and 95% highest posterior density intervals for center-specific random
effects based acceleration factors. Random intercepts in the model with region are further shifted by
a corresponding region main effect β(region).