# User manual | Chapter 11 - individual covariates

CHAPTER 11
Individual covariates
In many of the analyses we’ve looked at so far in this book, we’ve partitioned variation in one or
more parameters among different levels of what are commonly referred to as ‘classification’ factors.
For example, comparing survival probabilities between male and female individuals (where ‘sex’ is
the classification factor), good and poor breeding colonies (where ‘colony’ is the classification factor),
among age-classes, and so on.
However, in many cases, there may be one or more factors which you might think are important
determinants of variation among parameters which do not have natural ‘classification’ levels. For
example, consider body size. It is often hypothesized that survival of individuals may be significantly
influenced by individual differences in body size. While it is possible to take individuals and classify
them as ‘large’, ‘medium’ or ‘small’ (based on some criterion), such classifications are artificial, and
arbitrary. For a continuous covariate such as body size, there are an infinite number of possible
classification levels you might create. And, your results may depend upon how many classification
levels for body size (or some other continuous factor) you use, and exactly where these levels fall.
As such, it would be preferable to be able to use the real, continuous values for body size (for example)
in your analysis – each individual in the data set has a particular body size, so you want to constrain
the estimates of the various parameters in your model to be linear functions of one or more continuous
individual covariates. The use of the word ‘covariate’ might tweak some memory cells – think ‘analysis
of covariance’ (ANCOVA), which looks at the influence of one or more continuous covariates on some
response variable, conditional on one or more classification variables. For example, suppose you have
measured the resting pulse rate for male and female children in a given classroom. You believe that
pulse rate is influenced by the sex of the individual, and their body weight. So, you might set up a linear
model where SEX is entered as a classification variable (with 2 levels: male and female), and WEIGHT is
entered as a continuous linear covariate. You might also include an interaction term between SEX and
WEIGHT.
In analysis of data from marked individuals, you essentially do much the same thing. Of course,
there are a couple of ‘extra steps’ in the process, but essentially, you use the same mechanics for model
building and model selection we’ve already considered elsewhere in the book. The major differences
concern: data formatting, modifying the design matrix, and reconstituting parameter estimates. We will
introduce the basic ideas with a series of worked examples.
Before we begin, though, it is important that you fully understand the semantic and functional
distinction between an ‘individual covariate’ (a covariate that applies to that individual; e.g., body size at
birth), and an ‘environmental’ or ‘group’ covariate (a covariate which applies to all individuals encountered
at a particular occasion or over a particular interval; e.g., weather).
04.18.2017
11.1. ML estimation and individual covariates
Conceptually, the idea behind modeling survival or recapture (or any other parameter) as a function of
an individual covariate isn’t particularly difficult. It stems from the realization that it is possible to write
the likelihood as a product of individual ’contributions’ to the overall likelihood. Consider the following
example. Suppose you have 8 individuals, which you mark and release. You go out next year, and find
3 of them alive (we’ll ignore issues of encounter probability and so forth for the moment). We know
from Chapter 1 that the MLE for the estimate of survival probability S is simply (3/8) = 0.375. More
formally, the (binomial) likelihood of observing 3 survivors out of 8 individuals marked and released
is given as (where Y = 3, and N = 8):
$$L(S \mid \text{data}) = \binom{N}{Y} S^{Y} (1 - S)^{N-Y}$$
Or, dropping the binomial probability term (which is a constant, and not a function of the parameter
– see Chapter 1):
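$$L(S \mid \text{data}) \propto S^{Y} (1 - S)^{N-Y}$$
If we let Q = (1 − S), then we could re-write this likelihood as
$$L(S \mid \text{data}) \propto S^{Y} Q^{N-Y} = S^{3} Q^{5}$$
We could rewrite this likelihood expression as
$$L(S \mid \text{data}) \propto S^{3} Q^{5} = (S \cdot S \cdot S)(Q \cdot Q \cdot Q \cdot Q \cdot Q) = \prod_{i=1}^{3} S_i \prod_{i=4}^{8} Q_i$$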
Alternatively, we might define a variable a, which we use to indicate whether or not the animal is
found alive (a = 1) or dead (a = 0). Thus, we could write the likelihood for the ith individual as
$$L(S \mid N, \{a_1, a_2, \ldots, a_8\}) \propto \prod_{i=1}^{8} S^{a_i} Q^{(1 - a_i)}$$
Try it and confirm this is correct. Let S = the MLE = 0.375. Then, (0.375)^3 (1 − 0.375)^5 = 0.00503, which
is equivalent to
$$
\begin{aligned}
&(0.375)^{1}(0.625)^{(1-1)} \, (0.375)^{1}(0.625)^{(1-1)} \, (0.375)^{1}(0.625)^{(1-1)} \\
&\quad \times (0.375)^{0}(0.625)^{(1-0)} \, (0.375)^{0}(0.625)^{(1-0)} \, (0.375)^{0}(0.625)^{(1-0)} \, (0.375)^{0}(0.625)^{(1-0)} \, (0.375)^{0}(0.625)^{(1-0)} \\
&= (0.05273) \times (0.09537) \\
&= 0.00503
\end{aligned}
$$
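The arithmetic above can be checked with a few lines of Python. This is only a sketch of the 8-animal example from the text; the function name `likelihood` and the list `fates` are labels chosen here for illustration.

```python
import math

# Fates of the 8 marked individuals: 1 = found alive, 0 = not found alive.
fates = [1, 1, 1, 0, 0, 0, 0, 0]

def likelihood(s, fates):
    """Product of individual contributions S^a * (1 - S)^(1 - a)."""
    q = 1.0 - s
    return math.prod(s**a * q**(1 - a) for a in fates)

s_hat = sum(fates) / len(fates)             # MLE = Y / N = 3/8
print(s_hat)                                # 0.375
print(round(likelihood(s_hat, fates), 5))   # 0.00503

# The same value from the summarized form S^Y * (1 - S)^(N - Y).
y, n = sum(fates), len(fates)
assert math.isclose(likelihood(s_hat, fates), s_hat**y * (1 - s_hat)**(n - y))
```

Evaluating `likelihood` at values on either side of 0.375 (say 0.30 and 0.45) returns smaller numbers, which is what we expect if 0.375 is the MLE.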
In each of these 3 forms of the likelihood the individual ‘fate’ has its own probability term (and
the likelihood is simply the product of these individual probabilities). Written in this way there is
a straightforward and perhaps somewhat obvious way to introduce individual covariates into the
likelihood. All we need to do to model the survival probability of the individuals is to express the
survival probability of each individual $S_i$ as some function of an individual covariate $X_i$.
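For example, we could use
$$S_i = \frac{e^{\beta_1 + \beta_2 (X_i)}}{1 + e^{\beta_1 + \beta_2 (X_i)}} = \frac{1}{1 + e^{-(\beta_1 + \beta_2 (X_i))}}$$
or, equivalently,
$$\ln\left(\frac{S_i}{1 - S_i}\right) = \beta_1 + \beta_2 (X_i)$$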
Then, we simply substitute this expression for $S_i$ into
$$L(S \mid N, \{a_1, a_2, \ldots, a_8\}) \propto \prod_{i=1}^{8} S^{a_i} Q^{(1 - a_i)}$$
Written this way, the MLEs for $\beta_1$ and $\beta_2$ (intercept and slope, respectively) become the focus of
the estimation.
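To make the substitution concrete, here is a minimal sketch of the individual-by-individual log-likelihood with a logistic submodel. The covariate values in `sizes` are hypothetical (they do not come from the text), and the function names are chosen here for illustration; this is not how MARK itself is implemented.

```python
import math

def survival(beta1, beta2, x):
    """Logistic submodel: S_i = 1 / (1 + exp(-(beta1 + beta2 * x_i)))."""
    return 1.0 / (1.0 + math.exp(-(beta1 + beta2 * x)))

def log_likelihood(beta1, beta2, fates, covariates):
    """Sum of individual log contributions a_i*log(S_i) + (1 - a_i)*log(1 - S_i)."""
    total = 0.0
    for a, x in zip(fates, covariates):
        s = survival(beta1, beta2, x)
        total += a * math.log(s) + (1 - a) * math.log(1.0 - s)
    return total

fates = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical covariate values (e.g., body size); not taken from the text.
sizes = [12.1, 11.4, 10.9, 9.8, 10.2, 8.7, 9.1, 9.9]

# With a zero slope every S_i is the same, so the covariate model collapses
# to the simple binomial example: S = 0.375 gives a likelihood of 0.00503.
beta1_null = math.log(0.375 / 0.625)
null_ll = log_likelihood(beta1_null, 0.0, fates, sizes)
print(round(math.exp(null_ll), 5))   # 0.00503

# A positive slope lets larger individuals have higher survival, which
# raises the log-likelihood for these made-up data.
print(log_likelihood(-20.0, 2.0, fates, sizes) > null_ll)   # True
```

In practice the two β values are found by numerical optimization of this function, which is the step MARK carries out for us.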
Pretty slick, eh? Well, it is, with one caveat. The likelihood expression gets ‘really ugly’ to write down.
It becomes a very long, cumbersome expression (which fortunately MARK handles for us), and because
of the way it is constructed, numerically deriving the estimates takes somewhat longer than it does when
the likelihood is not constructed from individuals. Also, there are a couple of things to keep in mind.
First, it is important to realize that the survival probabilities are replaced by a logistic submodel of the
individual covariate(s). Conceptually, then, every animal i has its own survival probability, and this may
be related to the covariate. During the analysis, the covariate of the ith animal must correspond to the
survival probability of that animal. MARK handles this, and it is this sort of ‘book-keeping’ that slows
down the estimation (relative to analyses that don’t include individual covariates).
OK – enough background. Let’s look at some examples, and how you handle individual covariates
in MARK.
11.2. Example 1 – normalizing selection on body weight
Consider the following example. You believe that the survival probability of some small bird is a function
of the mass of the bird at the time it was marked. However, you believe that there might be normalizing
selection on body mass, such that there is a penalty for being either ‘too light’ or ‘too heavy’, relative to
some ‘optimal’ body mass.
Now, a key assumption – we’re going to assume that survival probability for each individual bird
is potentially influenced by the mass of the bird at the time it was first marked and released. Now,
you might be saying to yourself ‘hmmm, but body mass is likely to change from year to year?’. True –
and this is an important point to keep in mind – we assume that the individual covariate (in this case,
body mass) is fixed over the lifetime of the individual bird. We will consider using ‘temporally variable
covariates’ later on. For now, we will assume that the mass of the bird when it is marked and released
is the important factor.
We simulated some capture-recapture data according to the following function relating survival
probability (ϕ) to body mass (mass):
$$\varphi = -0.039 + 0.0107(\text{mass}) - 0.000045(\text{mass}^2)$$
To help visualize how survival varies as a function of body mass, based on this equation, consider
the following figure:
We see that survival first rises with increasing body mass, then eventually declines – this represents
‘normalizing’ selection, since survival is ‘maximized’ for birds that are neither too heavy nor too light
(right about now, some of the hard core evolutionary ecologists among you may be rolling your eyes,
but it is a reasonable simplification. . .).
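The rise and fall of survival with body mass can be tabulated directly from the simulating function. The sketch below uses only the coefficients given in the text; the function name `true_survival` and the range of masses shown are choices made here for illustration.

```python
def true_survival(mass):
    """Quadratic function used to simulate the data (from the text)."""
    return -0.039 + 0.0107 * mass - 0.000045 * mass**2

# Tabulate survival across a range of body masses.
for mass in range(60, 181, 20):
    print(mass, round(true_survival(mass), 3))

# The vertex of the parabola a + b*m + c*m^2 is at m = -b / (2c).
optimal_mass = 0.0107 / (2 * 0.000045)
print(round(optimal_mass, 1), round(true_survival(optimal_mass), 3))  # 118.9 0.597
```

So, under this function, survival peaks at roughly 119 mass units and declines for birds that are lighter or heavier than that.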
We simulated data for 8 occasions, 500 newly marked birds per release cohort (i.e., per year). We
also made our life simple (for this example) by assuming that survival probability does not vary as a
function of time, only body mass. We set recapture probability to be 0.7 for all birds, whereas survival
probability was set as a function of a randomly generated body mass (with mean of 110 mass units).
We’ll deal with the complications of time-variation in a later example.
Here is a ‘piece’ of the simulated data set (contained in indcov1.inp):
    11111111  1  120.71  14570.24;
    11111110  1   86.26   7440.76;
    11111110  1  118.23  13978.42;
    11111110  1   72.98   5325.47;
    11111110  1  101.52  10305.69;
Several things to note. First, and perhaps obviously, in order to use individual covariate data, you
must include the encounter history for each individual in the data file – you can’t summarize your data
by calculating the frequency of each encounter history as you may have done earlier (see Chapter 2 for
the basic concepts if you’re unsure). Each line of the .INP file contains an individual encounter history.
The encounter history is followed immediately by a single digit ‘1’, to indicate that the frequency of this
individual history is 1 (or, that each line of data in the .inp file corresponds to 1 individual).
What about the next 2 columns? Consider the following line from the data file:
    11111111  1  120.71  14570.24;
The values 120.71 and 14,570.24 refer to the mass of this individual bird (i.e., mass in the equation), and
the square of the mass (i.e., mass² in the equation; 14,570.24 ≈ (120.71)²). Now, in this example, we’ve
‘hard-coded’ the value of the square of body mass right in the .INP file. While this may, on occasion, be
convenient, we’ll see later on that there are situations where you don’t want to do this, where it will be
preferable to let MARK handle the calculation of the covariate functions (squaring mass, in this case)
for you.
So, for each bird, we have the encounter history, the number ‘1’ to indicate 1 bird per history, and then
one or more columns of ‘covariates’ – these are the individual values for each bird – in this example,
corresponding to mass and the square of the mass, respectively.
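If you want to check a data file before handing it to MARK, the record layout just described is easy to read with a short script. The following is a minimal sketch, assuming whitespace-separated fields terminated by a semicolon as in the excerpt above; the function `parse_inp_line` is hypothetical (it is not part of MARK), and it does not handle comments or multiple group columns.

```python
def parse_inp_line(line, covariate_names):
    """Split one record into history, frequency, and named covariates.

    Assumes: encounter history, then a single frequency column, then one
    value per covariate, with a trailing semicolon.
    """
    fields = line.strip().rstrip(";").split()
    history, freq = fields[0], int(fields[1])
    values = [float(v) for v in fields[2:]]
    if len(values) != len(covariate_names):
        raise ValueError("unexpected number of covariate columns")
    return {"history": history, "freq": freq,
            "covariates": dict(zip(covariate_names, values))}

record = parse_inp_line("11111111 1 120.71 14570.24;", ["mass", "mass2"])
print(record["history"], record["freq"], record["covariates"]["mass"])
# 11111111 1 120.71
```

A check like this makes it easy to confirm that every record has the same number of covariate columns and that no frequency differs from 1.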
Finally, what about missing values? Suppose you have individual covariate data for some, but not all
of the individuals in your data set. Well, unfortunately, there is no simple way to handle missing values.
You can either (i) use the mean value of the covariate, calculated among all the other individuals in the
data set, in place of the missing value, or (ii) discard the individual from the data set. Or, alternatively,
you can discretize the covariates, and use a multi-state approach. The general problem of missing
covariates, time-varying covariates and so forth is discussed later in this chapter (section 11.6).
That’s about it really, as far as data formatting goes. The next step involves bringing these data into
MARK, and specifying which covariates you want to use in your analyses, and how.
11.2.1. Specifying covariate data in MARK
Start program MARK, and begin a new project – ‘recaptures only’. We will use the live encounter
data contained in indcov1.inp – 8 occasions, ‘standard’ mark-recapture ‘LLLLL’ format. The encounter
data for each individual are accompanied by 2 individual covariates for each individual, which we’ll
call mass (for mass) and mass2 (for mass²). At this point, we need to ‘tell’ MARK we have 2 individual
covariates (below):
Next, we want to give the covariates some ‘meaningful’ names, so we click the ‘Enter Ind. Cov.
Names’ button. We’ll use mass and mass2 to refer to body mass and body mass-squared, respectively
(shown at the top of the next page). That’s it! From here on, we refer to the covariates in our analyses
by using the assigned labels mass and mass2.
11.2.2. Executing the analysis
In this example, we simulated data with a constant survival and recapture probability over time. Thus,
for our starting model, we will modify the model structure to reflect this – in other words, we’ll start by
fitting model {ϕ. p.}. Go ahead and set up this model using your preferred method (by either modifying
the PIMs directly, or modifying the PIM chart), and run it. When you run MARK, you’ll notice that it
seems to take a bit longer to start the analysis. This is a result of the fact that this is a fairly large simulated
data set, and that you are not using summary encounter histories – because we’ve told MARK that the
data file contains individual covariates, MARK will build the likelihood piece by piece – or, rather,
individual by individual. This process takes somewhat longer than building the likelihood from data
summarized over individuals.
Add the results to the browser. Let’s have a look at the 2 reconstituted parameter estimates:
Start with parameter 2 – the recapture probability. The estimate of 0.7009 is very close to the ‘true’
value of p = 0.70 used in simulating the data (not surprising they should be so close given the size of
the data set). What about the first parameter estimate – ϕ̂ = 0.568? This is the estimate of the apparent
survival probability assuming (i) no time variation, and (ii) all individuals are the same. Clearly, it is
this second assumption which is most important here, since we know (in this case) that all individuals
in this data set are not the same – there is heterogeneity among individuals in survival probability, as
a function of individual differences in body mass.
Thus, we expect that a model which accounts for this heterogeneity will fit significantly better than a
model which ignores it. Where does the value of 0.568 come from? Remember that the actual probability
of survival was set in the simulation to be a function of body mass:
$$\varphi = -0.039 + 0.0107(\text{mass}) - 0.000045(\text{mass}^2)$$
The data were simulated using a normal distribution with mean 110 mass units, and a standard
deviation of 25. Thus, the value of 0.568 is the mean survival probability expected given the normal
distribution of body mass values, and the function relating survival to body mass. However, if you put
the value of ‘110’ into this equation, you get an estimate of survival of ϕ̂ = 0.594, which is somewhat
different from the reported value of ϕ̂ = 0.568. Why? Because what MARK is reporting is the mean
survival of the data set as a whole: if you were to take all of the mass data in the input file, run each
individual value for mass through the preceding equation, and take the mean of all of the generated
values of ϕ, you would get an estimate of ϕ̂ = 0.566, which is basically identical to the value reported
by MARK.
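The difference between ‘survival at the mean mass’ and ‘mean survival’ can be reproduced without the full data file. The sketch below relies on the sample means of mass and mass2 that MARK reports for this data set (109.97 and 12,707.46, quoted later in the chapter). Because the simulating function is linear in mass and mass², averaging it over individuals only requires those two means. The variable names are chosen here for illustration.

```python
# Coefficients of the simulating function phi = a + b*mass + c*mass^2.
a, b, c = -0.039, 0.0107, -0.000045

# Sample means reported in the MARK output for this data set.
mean_mass = 109.97
mean_mass_sq = 12707.46

# Survival evaluated at a mass of 110 units.
phi_at_110 = a + b * 110 + c * 110**2

# Mean survival over individuals: the mean of phi equals the function
# applied to the means of mass and mass^2, since phi is linear in both.
mean_phi = a + b * mean_mass + c * mean_mass_sq

print(f"{phi_at_110:.4f}")  # 0.5935
print(f"{mean_phi:.4f}")    # 0.5658
```

The mean survival is lower than the survival of an ‘average’ bird because the function is concave: the spread of body masses around the mean pulls the average down.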
But, back to the question at hand – as suggested, we expect a model which incorporates individual
covariates (body mass) to fit better than a model which ignores these differences. How do we go about
fitting models with covariates? Simple – we include the individual covariate(s) in the design matrix.
All linear models which include individual covariates must be built using a design matrix!
In fact, including individual covariates in the design matrix is often straightforward. For our present
example, we’re effectively performing a multiple regression analysis. We want to take our starting model
{ϕ. p.} and constrain the estimates of survival to be functions of body mass, and (if we believe that
normalizing selection is operating), the square of body mass. These were the 2 covariates contained in
the input file (mass and mass2, respectively).
To fit a model with both mass and mass2, we need to modify the design matrix for our starting model.
We can do this in several ways, but as a test of your understanding of the design matrix (discussed
at length in Chapter 6), we’ll consider it the following way. Our starting model is model {ϕ. p.}, with one
parameter each for survival and recapture probability. Thus, the starting design matrix will be
a (2 × 2) matrix. We want to modify this starting model to now include terms for mass and mass2. We
want to constrain survival probability to be a function of both of these covariates.
Remembering what you know about linear models and design matrices, you should recall that this
means an intercept term, and one term (‘slope’) for mass and mass2, respectively. Thus, 3 terms in total, or,
more specifically, 3 columns in the design matrix for survival, and 1 column for the recapture probability.
Let’s look at how to do this. Select ‘design matrix | reduced’.
This will spawn a window asking you to specify the number of covariate columns you want.
Translation – how many total columns do you want in your design matrix. As noted above, we want 4
columns – 3 to specify the survival parameter, and 1 to specify the encounter probability (since this is
the parameter structure specified by the PIMs we created when we started). So, enter ‘4’.
Once you have entered the number of covariate columns you want in the design matrix, and clicked
the ‘OK’ button, you’ll be presented with an ‘empty’ design matrix with 2 rows and 4 columns.
To start with, let’s move the grey ‘Parm’ column one column to the right, just to make things a bit
clearer.
Now, all we need to do is add the appropriate values to the appropriate cells of the design matrix.
If you remember any of the details from Chapter 6, you might at this moment be thinking in terms of
‘0’ and ‘1’ dummy variables. Well, you’re not far off. We do more or less the same thing here, with one
twist – we use the names of the covariates explicitly, rather than dummy variables, for those columns
corresponding to the covariates.
Let’s start with the probability of survival. We have 3 columns in the design matrix to specify survival:
1 for the intercept, and 1 each for the covariates mass and mass2, respectively. For the intercept, we enter
a ‘1’ in the first cell of the first column. However, for the 2 covariate columns (columns 2 and 3), we enter
the labels we assigned to the covariates, mass and mass2. For the recapture parameter, we simply enter
a ‘1’ in the lower right-hand corner. The completed design matrix for our model is shown below:
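One way to think about what the covariate labels in the design matrix mean: for each individual, the label is replaced by that individual’s covariate value before the row is multiplied by the β vector. The following sketch illustrates the idea. The β values are hypothetical, chosen only to show the mechanics, and `linear_predictor` is a helper written here for illustration, not a MARK function.

```python
# Design matrix for model phi(mass mass2)p(.): one row per real parameter,
# one column per beta. Covariate names stand in for individual values.
design_matrix = [
    [1, "mass", "mass2", 0],   # phi: intercept, mass, mass2
    [0, 0, 0, 1],              # p:   intercept only
]

def linear_predictor(row, betas, covariates):
    """Dot product of one design-matrix row with the beta vector,
    substituting an individual's covariate values for the named cells."""
    total = 0.0
    for cell, beta in zip(row, betas):
        value = covariates[cell] if isinstance(cell, str) else cell
        total += value * beta
    return total

# Hypothetical beta values, chosen only to show the mechanics.
betas = [0.5, 0.01, -0.0001, 0.85]
bird = {"mass": 120.71, "mass2": 14570.24}

logit_phi = linear_predictor(design_matrix[0], betas, bird)
logit_p = linear_predictor(design_matrix[1], betas, bird)
print(logit_phi)  # logit(phi) for this bird
print(logit_p)    # logit(p), the same for every bird
```

Only the survival row changes from bird to bird; the recapture row contains no covariate labels, so it gives the same value for every individual.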
That’s it! Go ahead and run this model. When you click on the ‘Run’ icon, you’ll be presented with
the ‘Setup Numerical Estimation Run’ window. We need to give our model a title. We’ll use ‘phi(mass
mass2)p(.)’ for the model specified by this design matrix. Again, notice that the sin link is no longer
available – recall from Chapter 6 that the sin link is available only when the identity design matrix is
used. The new ‘default’ is the logit link. We’ll go ahead and use this particular link function.
Now, before we run the model, the first ‘complication’ of modeling individual covariates. On the right
hand side of the ‘Setup Numerical Estimation Run’ window, you’ll notice a list of various options. Two
of these options refer to ‘standardizing’ – the first, refers to standardizing the individual covariates. The
second, specifies that you do not want to standardize the design matrix. These two ‘standardization’
check boxes are followed by a nested list of suboptions (which have to do with how the real parameter
estimates from the individual covariates are presented – more on this later).
The first check box (standardize individual covariates) essentially causes MARK to ‘z-transform’ your
individual covariates. In other words, take the value of the covariate for some individual, subtract from it
the mean value of that covariate (calculated over all individuals), and divide by the standard deviation of
the distribution of that covariate (again, calculated over all individuals). The end result is a distribution
for the transformed covariate which has a mean of 0.0, and a standard deviation of 1.0, with individual
transformed values ranging from approximately (−3 → +3) (depending on the distribution of the
individual data). One reason to standardize individual covariates in this way is to make all of your
covariates have the same mean and variance, which can be useful for some purposes.
Another reason is as an ad hoc method for accommodating any missing values in your data – if you use
the z-transform standardization, the mean of the covariates over all individuals is 0, and thus missing
data could simply be coded with 0 (which, again, is the mean of the transformed distribution). If you
compute the mean of the non-missing values of an individual covariate, and then scale the non-missing
values to have a mean of zero, the missing values can be included in the analysis as zero values, and
will not affect the slope of the estimated β. However, this ‘trick’ is not advisable for a covariate with
a large percentage of missing values because you will have little to no power. [The issue of ‘missing
values’ is treated more generally in a later section of this chapter.] While these seem fairly reasonable
and innocuous reasons to use this standardization option, there are several reasons to be very careful
when using this option, as discussed in the following -sidebar-. In fact, it is because of some of these
complications that the default condition for this standardization option is ‘off’.
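For readers who want to see the z-transform and the ‘code missing values as 0’ trick side by side, here is a minimal sketch. It uses the five body masses from the data excerpt earlier in the chapter plus one hypothetical missing value; the use of the sample standard deviation (n − 1 divisor) is an assumption made here, not a statement about how MARK computes it.

```python
import statistics

def z_transform(values):
    """Z-transform the non-missing values; code missing values (None) as 0."""
    observed = [v for v in values if v is not None]
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)  # sample SD (n - 1 divisor) is an assumption
    return [0.0 if v is None else (v - mean) / sd for v in values]

# Five body masses from the data excerpt, plus one hypothetical missing value.
masses = [120.71, 86.26, 118.23, 72.98, 101.52, None]
z = z_transform(masses)
print([round(v, 2) for v in z])

# The observed values now have mean 0 and SD 1; the missing value sits at the mean.
observed_z = z[:-1]
assert abs(statistics.mean(observed_z)) < 1e-9
assert abs(statistics.stdev(observed_z) - 1.0) < 1e-9
assert z[-1] == 0.0
```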
What about the second option – ‘Do not standardize (the) design matrix’? As noted in the MARK
help file, it is often helpful to scale the values of the covariates to ensure that the numerical optimization
algorithm finds the correct parameter estimates. The current version of MARK defaults to scaling your
covariate data for you automatically (without you even being aware of it). This ‘automatic scaling’ is
done by determining the maximum absolute value of the covariates, and then dividing each covariate
by this value. This results in each column scaled to between -1 and 1. This internal scaling is purely
for purposes of ensuring the success of the numerical optimization – the parameter values reported
by MARK (i.e., in the output that you see) are ‘back-transformed’ to the original scale. There may be
reasons you don’t want MARK to perform this ‘internal standardization’ – if so, you simply check the
‘Do not standardize (the) design matrix’ button.
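The idea behind this kind of internal scaling can be sketched in a few lines. This illustrates the description above (divide by the maximum absolute value, then convert the estimate back); it is not MARK’s actual code, and the slope value used is hypothetical.

```python
import math

def scale_column(values):
    """Divide a covariate column by its maximum absolute value."""
    max_abs = max(abs(v) for v in values)
    return [v / max_abs for v in values], max_abs

mass2 = [14570.24, 7440.76, 13978.42, 5325.47, 10305.69]
scaled, max_abs = scale_column(mass2)
print(min(scaled), max(scaled))   # all values now lie within [-1, 1]

# A slope estimated on the scaled column is converted back by dividing by
# the same constant, so the linear predictor is unchanged.
beta_scaled = -0.65               # hypothetical value on the scaled column
beta_original = beta_scaled / max_abs
for raw, s in zip(mass2, scaled):
    assert math.isclose(beta_original * raw, beta_scaled * s)
```

Because the scaling is a simple multiplication by a constant, it changes the numerical conditioning of the problem but not the fitted model.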
begin sidebar
when to standardize – careful!
While using the z-transform standardization on your individual covariates may appear reasonable, or
at the least, innocuous, you do need to think carefully about when, and how, to standardize individual
covariates. For example, when you specify a model with a common intercept but 2 or more slopes for
the individual covariate, and instruct MARK to standardize the individual covariate, you will get a
different value of the deviance than from the model run with unstandardized individual covariates.
This behavior is because the centering effect of the standardization method affects the intercept
differently depending on the value of the slope parameter. The effect is caused by the nonlinearity
of the logit link function. You get the same effect if you standardize variables in a logistic regression,
and run them with a common intercept. The result is that the estimates are not scale independent,
but depend on how much centering is performed by subtracting the mean value. In other words,
situations can arise where the real parameter estimates and the model’s AIC differ between runs
using the standardized covariates and the unstandardized covariates. This situation arises because
the z-transformation affects both the slope and intercept of the model. For example, with a logit link
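function and the covariate $x_1$,
$$
\begin{aligned}
\text{logit}(S) &= \beta_1 + \beta_2 \left( \frac{x_1 - \bar{x}_1}{SD_1} \right) \\
&= \left[ \beta_1 - \frac{\beta_2 \bar{x}_1}{SD_1} \right] + \left[ \frac{\beta_2}{SD_1} \right] x_1
\end{aligned}
$$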
where the intercept is the quantity shown in the first set of brackets, and the second bracket is the
slope. This result shows the conversion between the β parameter estimates for the standardized
covariate and the β parameter estimates for the untransformed covariate, i.e., the intercept for the
untransformed analysis would correspond to the quantity in the first set of brackets, and the slope for
the untransformed analysis would correspond to the quantity in the second set of brackets. All well and
good so far, because the model with a standardized covariate and the model with the unstandardized
covariate will result in identical models with identical AICc values.
However, now consider the case where we have 2 groups, and want to build a model with different
slope parameters for each group’s individual covariate values, but a common intercept. In this example,
$x_1$ and $x_2$ are considered to be the same individual covariate, each standardized to the overall mean
and SD, but with values specific to group 1 ($x_1$) or group 2 ($x_2$). The unstandardized model would look
like:
Group 1: $\text{logit}(S_1) = \beta_1 + \beta_2 x_1$
Group 2: $\text{logit}(S_2) = \beta_1 + \beta_3 x_2$
Unfortunately, when the individual covariates are standardized, the result is:
Group 1: $\text{logit}(S_1) = (\beta_1 - \beta_2 \bar{x}_1/SD) + (\beta_2/SD) x_1$
Group 2: $\text{logit}(S_2) = (\beta_1 - \beta_3 \bar{x}_2/SD) + (\beta_3/SD) x_2$
In this case, the intercepts for the 2 groups are no longer the same with the standardized covariates,
resulting in a different model with a different AICc value than for the unstandardized case. The AICc
values differ because the real parameter estimates differ between the 2 models.
An alternative to this z-transformation is to use the product function in the design matrix (c.f. p. 20)
to multiply the individual covariate by a scaling value. As an example, suppose the individual covariate
Var ranges from 100 to 900. Using the design matrix function product(Var,0.001) in the entries of the
design matrix would result in values ranging from 0.1 to 0.9, and would result in 3 more significant
digits being reported in the estimates of the β parameter for this individual covariate.
end sidebar
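The conversion between standardized and unstandardized β values described in the preceding sidebar can be checked numerically. The sketch below uses hypothetical estimates and covariate summaries (none of the numbers come from the text); `raw_scale_betas` is a helper name chosen here.

```python
import math

def raw_scale_betas(b1_std, b2_std, mean, sd):
    """Convert (intercept, slope) fitted to a z-transformed covariate
    into the equivalent intercept and slope for the raw covariate."""
    intercept = b1_std - b2_std * mean / sd
    slope = b2_std / sd
    return intercept, slope

# Hypothetical estimates and covariate summary, for illustration only.
b1_std, b2_std = 0.25, 1.2
mean, sd = 110.0, 25.0
intercept, slope = raw_scale_betas(b1_std, b2_std, mean, sd)

# Both parameterizations give the same logit for any covariate value.
for x in (60.0, 110.0, 160.0):
    std_logit = b1_std + b2_std * (x - mean) / sd
    raw_logit = intercept + slope * x
    assert math.isclose(std_logit, raw_logit, abs_tol=1e-12)

# With two groups sharing one standardized intercept but different slopes,
# the implied raw-scale intercepts are no longer equal.
int_g1, _ = raw_scale_betas(b1_std, 1.2, mean, sd)
int_g2, _ = raw_scale_betas(b1_std, 0.4, mean, sd)
print(int_g1, int_g2)
```

The last two lines show the problem case: a ‘common intercept’ on the standardized scale implies two different intercepts on the raw scale whenever the group slopes differ.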
Acknowledging the need for caution discussed in the preceding -sidebar-, for purposes of demonstration, we’ll go ahead and run our model, using the z-transformation on the covariate data (by checking
the ‘Standardize Individual Covariates’ checkbox). Add the results to the browser.
First, we notice right away that the model including the 2 covariates fits much better than the model
which doesn’t include them – so much so that it is clear there is effectively no support for our naïve
starting model.
Do we have any evidence to support our hypothesis that there is normalizing selection on body mass?
Well, to test this, we might first want to run a model which does not include the mass2 term. Recall that it
was the inclusion of this second order term which allowed for a decrease in survival with mass beyond
some threshold value. How do you run the model with mass, but not mass2? The easiest way to do this
is to simply eliminate the column corresponding to mass2 from the design matrix. So, simply bring the
design matrix for the current model up on the screen (by retrieving the current model), and delete the
column corresponding to mass2 (i.e., delete column 3 from the design matrix). The modified design
matrix now looks like:
Go ahead and run this model – again using standardized covariates. Call this model ‘phi(mass)p(.)’.
Add the results to the results browser (shown at the top of the next page).
Note that the model with mass only (but not the second order term) fits better than our general starting
model, but nowhere near as well as the model including both mass and mass2 – it has essentially no
support. In other words, our model with both mass and mass2 is clearly the best model for these data
(this is not surprising, since this is the very model we used to simulate the data in the first place!).
So, at this stage, we could say with some assurance that there is fairly strong support for the hypothesis
that there is normalizing selection on body mass. However, suppose we want to actually look at the
‘shape’ of this function. How can we derive the function relating survival to mass, given the results
from MARK? In fact, it’s fairly easy, if you remember the details concerning the logit transform, and how
we standardized our data.
To start, let’s look at the output from MARK for the model including mass and mass2 (shown at
the top of the next page). In this case, it’s easier to use the ‘full results’ option (i.e., the option in
the browser toolbar which presents all of the details of the numerical estimation). Scroll down until
you come to the section shown at the top of the next page. Note that we have 3 sections of the output
at this point. In the first section we see the estimated logit function parameters for the model. There
are 4 β values, corresponding to the 4 columns of the design matrix (the intercept, mass, mass2 and
the encounter probability, p, respectively). These parameters, in fact, are what we need to specify the
function relating survival to body weight.
In fact, if you think about it, only the first 3 of these logit parameters are needed – the last one refers
to the encounter probability, which is not a function of body mass. What is our function? Well, it is
$$\text{logit}(\hat{\varphi}) = 0.256733 + 1.1750545(\text{mass}_s) - 1.0555046(\text{mass2}_s)$$
Note that for the two mass terms, we have added a small subscript ‘s’ – reflecting the fact that these
are ‘standardized’ masses. Recall that we standardized the covariates by subtracting the mean of the
covariate, and dividing by the standard deviation. Thus, for each individual,
$$\text{logit}(\hat{\varphi}) = 0.256733 + 1.17505 \left( \frac{m - \bar{m}}{SD_{m}} \right) - 1.0555 \left( \frac{m2 - \overline{m2}}{SD_{m2}} \right)$$
In this expression, m refers to mass and m2 refers to mass2.
The output from MARK (shown at the top of the next page) actually gives you the mean and standard
deviations for both covariates. For mass, mean = 109.97, and SD = 24.79, while for mass2, the mean =
12,707.46, and the SD = 5,532.03. The ‘value’ column shows the standardized values for mass and mass2
(0.803 and 0.752) for the first individual in the data file. Let’s look at an example. Suppose the mass of
the bird was 110 units. Thus mass = 110, mass2 = 110² = 12,100. Thus,
$$\text{logit}(\hat{\varphi}) = 0.2567 + 1.17505 \left( \frac{110 - 109.97}{24.79} \right) - 1.0555 \left( \frac{12{,}100 - 12{,}707.46}{5{,}532.03} \right) = 0.374.$$
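So, if logit(ϕ̂) = 0.374, then how do we get the reconstituted values for survival? Recall that
$$\text{logit}(\theta) = \log\left( \frac{\theta}{1 - \theta} \right) = \alpha + \beta x$$
and
$$\theta = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}}$$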
Thus, if logit(ϕ̂) = 0.374, then the reconstituted estimate of ϕ, transformed back from the logit scale
is $e^{0.374}/(1 + e^{0.374}) = 0.592$. Thus, for an individual weighing 110 units, expected annual survival
probability is 0.592. How well does the estimated function match with the ‘true’ function used to
simulate the data? Let’s plot the observed versus expected values:
As you can see from the plot, the fit between the values expected given the ‘true’ function (solid
black line) and those based on the function estimated from MARK (red dots) is quite close, as it
should be. The slight deviation between the two arises because the simulated data are just one
realization of the stochastic process governed by the underlying survival and recapture parameters.
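The estimated curve in such a plot can be generated by back-transforming the linear predictor over a range of masses. The following is a minimal sketch, assuming the coefficients and standardization constants quoted earlier; the helper names and the grid of masses are our own choices for illustration.

```python
import math

B0, B_MASS, B_MASS2 = 0.256733, 1.1750545, -1.0555046
MEAN_M, SD_M = 109.97, 24.79
MEAN_M2, SD_M2 = 12707.46, 5532.03


def inv_logit(x):
    """Back-transform from the logit scale to the [0, 1] probability scale."""
    return math.exp(x) / (1.0 + math.exp(x))


def survival(mass):
    """Reconstituted survival probability for an unstandardized mass."""
    eta = (B0
           + B_MASS * (mass - MEAN_M) / SD_M
           + B_MASS2 * (mass ** 2 - MEAN_M2) / SD_M2)
    return inv_logit(eta)


print(round(inv_logit(0.374), 3))  # 0.592
for mass in range(60, 181, 20):    # illustrative grid of masses
    print(mass, round(survival(mass), 3))
```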
Note: in the preceding, we’ve described the mechanics of reconstituting the parameter estimate – this
basically involves back-transforming from the logit scale to the normal [0, 1] probability scale. What
about reconstituting the variance, or SE of the estimate, on the normal scale? This is somewhat more
complicated. As briefly introduced in Chapter 6, reconstituting the sampling variance on the normal
scale involves use of something known as the ‘Delta method’. The Delta method, and its application to
reconstituting estimates of sampling variance is discussed at length in Appendix B.
begin sidebar
AIC, BIC – example of the difference
Back in Chapter 4, we briefly introduced two different information theoretic criteria which can be used
to assist in model selection, the AIC (which we’ve made primary use of), and the BIC. Recall that we
briefly discussed the differences between the two – noting that (in broad, simplified terms), the AIC
has a tendency to pick overly complex models – especially if the ‘true’ model structure is relatively simple,
whereas the BIC has a tendency to pick overly simple models when the reverse is true.
We can demonstrate these differences by contrasting the results of model selection using AIC or
BIC for our analysis of the normalizing selection data. To highlight differences between the two, we’ll
consider the following 4 models: {ϕ. p.}, {ϕ(mass) p.}, {ϕ(mass,mass2) p.}, and {ϕ(mass,mass2,mass3) p.}.
Recall that the true model used to generate the simulated data was model {ϕ(mass,mass2) p.}. So, our
candidate model set consists of two models which are simpler than the ‘true’ model, and one model
that is more complex than the ‘true’ model.
Here are the results from fitting the model set to the data, using AIC as the model selection criterion:
Note that although model {ϕ(mass,mass2) p.} is the true generating model, it was not the most
parsimonious model using AIC – in fact, it was 5-6 times less well supported than was a more complex
model {ϕ(mass,mass2,mass3) p.}.
What happens if we use BIC as our model selection criterion? (Remember this can be accomplished
by changing MARK’s preferences; ‘File | Preferences’). If you look at the results browser at the top
of the next page, you’ll see that the BIC selected what we know to be the ‘true’ model {ϕ(mass,mass2) p.}
– the next best model {ϕ(mass,mass2,mass3) p.} was 5-6 times less well supported than was the most
parsimonious model.
So, is this an example of BIC ‘doing better’ when the true model is relatively simple? Or is the fact
that the BIC picked the right model an artifact of the inclusion of the right model in the candidate model
set (a point of some contention in the larger discussion)? Our point here is not to make conclusions one
way or the other. Rather, it is merely to demonstrate the fact that different model selection criteria can
yield quite different results (conclusions) – so much so (at least on occasion) that it will be worth
spending some time thinking hard about the general question, and reading the pertinent literature.
Particularly good starting points are Burnham & Anderson (2004) and Link & Barker (2006).
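To see mechanically how the two criteria can rank the same pair of models differently, consider the standard definitions AIC = −2ln(L) + 2K and BIC = −2ln(L) + K·ln(n). The sketch below uses invented values of −2ln(L), K and n (they are not the results from this analysis), and ignores the small-sample correction and the particular sample-size definition that MARK may apply.

```python
import math


def aic(neg2lnl, k):
    """Akaike's criterion: penalty of 2 per estimated parameter."""
    return neg2lnl + 2 * k


def bic(neg2lnl, k, n):
    """Bayesian (Schwarz) criterion: penalty of ln(n) per parameter."""
    return neg2lnl + k * math.log(n)


def evidence_ratio(delta):
    """Relative support of the better model, given the difference in criterion."""
    return math.exp(delta / 2.0)


# HYPOTHETICAL fits: (-2lnL, number of parameters K); n is also invented
quadratic = (1000.0, 4)
cubic = (994.6, 5)
n = 500

d_aic = aic(*quadratic) - aic(*cubic)        # > 0: AIC prefers the cubic model
d_bic = bic(*cubic, n) - bic(*quadratic, n)  # > 0: BIC prefers the quadratic model
print(round(d_aic, 2), round(evidence_ratio(d_aic), 1))
print(round(d_bic, 2), round(evidence_ratio(d_bic), 1))
```

With these made-up numbers, the extra parameter buys a drop of 5.4 in −2ln(L), which is enough to pay the AIC penalty of 2 but not the BIC penalty of ln(500) ≈ 6.2.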
end sidebar
11.3. A more complex example – time variation
In the preceding example, we made life simple by simulating some data where there was no variation
in either survival or recapture rates over time. In this example, we’ll consider the more complicated
problem of handling data where there is potential variation in survival over time.
We’ll use the same approach as before, except this time we will simulate some data where survival
probability is a complex function of both mass and cohort. In this case, we simulated a data set having
normalizing selection in early cohorts, with a progressive shift towards diversifying selection in later
cohorts. Arguably, this is a rather ‘artificial’ example, but it will suffice to demonstrate some of the
considerations involved in using MARK to handle temporal variation in the relationship between
estimates of one or more parameters and one or more individual covariates.
The data for this example are contained in indcov2.inp. We simulated 8 occasions, and assumed a
constant recapture rate (p = 0.7) for all individuals in all years. The data file contains 2 covariates – mass
and mass2 (as in the previous example). As with the first example, we start by creating a new project,
and importing the indcov2.inp data file. Label the two covariates mass and mass2 (respectively).
We will start by fitting model {ϕt p.}, since this is structurally consistent with the data, and will
provide a reasonable starting point for comparisons with subsequent models. Go ahead and add the
results of this model to the results browser.
Now, to fit models with both individual covariates, and time variation in the relationship between
survival and the covariates, we need to think a bit more carefully than in our first example. If you
understood the first example, you might realize that to do this, we need to modify the design matrix.
However, how we do this will depend on what hypothesis we want to test. For example, we might
believe that the relationship between survival and mass changes with each time interval. Alternatively,
we might suppose there is a common intercept, but different slopes for each interval. It is important to
consider carefully what hypothesis you want to test before proceeding.
We’ll start with the hypothesis that the relationship between survival and mass changes with each
time interval. With a bit of thought, you might guess how to construct this design matrix. In the previous
example, we used 3 columns to specify this relationship – representing the intercept, mass and mass2,
respectively. However, in the first example, we assumed that this model was constant over all years.
So, what do we do if we believe the relationship varies from year to year? Easy, we simply have 3
columns for each interval in the design matrix for survival (with 1 additional column at the end for
the constant recapture probability). So, 7 intervals = 21 columns for survival, plus 1 column for the
recapture probability. How many rows? Remembering from Chapter 6, 8 rows total – 7 rows for the 7
survival intervals, and 1 for the constant recapture rate.
So, let’s go ahead and construct the design matrix for this model, using the ‘Design | Reduced’
menu option we discussed previously. We’ll start simply, using a DM based on the basic structure of
the identity matrix – recall that for an identity DM, each row corresponds to a ‘time-specific regression
model’, since each row has its own intercept (see Chapter 6). Or, put another way, each interval ‘has
its own multiple regression line – separate intercept, separate slope(s) – relating survival to mass and
mass2’.
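The layout of this identity-like design matrix can be sketched programmatically. The Python below is purely illustrative (MARK builds the matrix through its own interface; the function name is ours): it generates the 8 × 22 grid of cell entries, with one block of intercept, mass and mass2 per survival interval, and a final column for the constant encounter probability.

```python
def separate_regression_dm(n_intervals=7, covariates=("mass", "mass2")):
    """Return the DM as a list of rows of cell strings.

    Each survival interval gets its own block: intercept + covariates.
    The last row and last column belong to the constant recapture parameter.
    """
    block = ["1", *covariates]
    width = n_intervals * len(block) + 1
    rows = []
    for t in range(n_intervals):
        row = ["0"] * width
        start = t * len(block)
        row[start:start + len(block)] = block   # fill this interval's block
        rows.append(row)
    p_row = ["0"] * width
    p_row[-1] = "1"                             # constant p
    rows.append(p_row)
    return rows


dm = separate_regression_dm()
print(len(dm), "rows x", len(dm[0]), "columns")
for row in dm[:2]:
    print(" ".join(row[:7]), "...")
```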
This matrix (shown below) is sufficiently big such that it’s rather difficult to see the entire structure
at once.
To help you visualize it, let’s look at just a small piece of this design matrix:
As you can see, for each survival interval, we have 3 columns – 1 intercept, and 1 column each for
mass and mass2, respectively. So, the columns B1, B2 and B3 correspond to interval 1, B4, B5 and B6 for
interval 2, and so on. You simply do this for each of the 7 survival intervals. The bottom right-hand cell
of the matrix (shown on the preceding page) contains a single ‘1’ for the constant encounter probability.
Call this model ‘phi(t * mass mass2)p(.) - separate intcpt’, and run it – remember to standardize
the covariates before running the model. Add the results to the browser.
Again, note that the model constrained to be a function of mass and mass2 fits much better than our
naïve starting model. Not surprising, since the data were simulated under the assumption that survival
varies as a function of mass and mass2, and that the function relating survival to both covariates changes
over time (i.e., we just fit the true model to the data).
Of course, in practice, we don’t know what the true model is, so we fit a set of approximating models.
How do we construct those models if they include one or more individual covariates? In the following, we
discuss various ways to construct design matrices – in principle, we use the same ideas and mechanics
introduced in Chapter 6. However, the design matrices ‘look somewhat different’ when they include
one or more individual covariates.
11.4. The DM & individual covariates – some elaborations
Suppose you want to fit a model with different intercepts and different slopes for each year. In other
words, the same model we just built. Start by considering what such a model means. In the following
figure, each line represents the relationship (which we assume here is strictly linear) between the
parameter, ϕ, and the individual covariate, mass, for each of the 7 years in the study (i.e., separate slope
and intercept for each year):
As we’ve already seen (above), you could accomplish this by adding an ‘intercept’ and ‘slope’
parameter(s) to each row for the parameter in question (i.e., using an identity-like structure, have a
‘separate regression’ for each interval). So, for a simple linear model of survival as a function of mass,
we could use something like the following:
However, a more flexible way to model this would have been to use:
In other words, a column of ‘1’s for the intercept, a column for the covariate (mass, m), and then the
columns of dummy variables corresponding to each of the time intervals (t1 → t6), and then columns
reflecting the interaction of the covariate and time. You might recognize this as the same analysis
of covariance (ANCOVA) design you saw back in Chapter 6. If you take this design matrix, and run it,
you’ll see that you get exactly the same results as you did with the design matrix we used initially –
each leads to time-specific estimates of the slope and intercept.
So, if they both yield the ‘same results’, why even consider this more formal design matrix? As we
noted in Chapter 6, the biggest advantage is that using this more complete (formal) design matrix allows
you to test some models which aren’t possible using the first approach.
For example, consider the additive model – where we have different intercepts, but a common slope
among years:
In other words, testing model
ϕ time + mass
as opposed to the first model which included the (time.mass) interaction (i.e., where the slopes and
intercepts vary among years):
ϕ time + mass + time.mass
As we discussed in Chapter 6, this sort of additive model can only be fit using this formal design matrix approach.
So, to fit this model – where we have different intercepts, but a common slope among years – we
simply delete the interaction columns. It’s that simple!
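The relationship between the interaction and additive structures can be sketched in Python. This is an illustration only: the function name is ours, only the survival rows are generated, and the final interval is assumed to be the reference level (all time dummies zero).

```python
def ancova_dm(n_intervals=7, cov="mass", interaction=True):
    """Survival rows of an ANCOVA-style DM as lists of cell strings.

    Columns: intercept, covariate, (n_intervals - 1) time dummies, and,
    if requested, (n_intervals - 1) covariate-by-time interaction columns.
    """
    rows = []
    for t in range(n_intervals):
        dummies = ["1" if t == j else "0" for j in range(n_intervals - 1)]
        row = ["1", cov] + dummies
        if interaction:
            # interaction cell = covariate where the time dummy is 1, else 0
            row += [cov if d == "1" else "0" for d in dummies]
        rows.append(row)
    return rows


full = ancova_dm(interaction=True)       # time + mass + time.mass
additive = ancova_dm(interaction=False)  # time + mass

print(len(full[0]), len(additive[0]))    # 14 8
for row in additive:
    print(" ".join(row))
```

Dropping the last 6 columns of the full matrix gives the additive matrix exactly, which is all that ‘deleting the interaction columns’ amounts to.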
Here is the reduced design matrix:
If instead you wanted a common intercept for all years, but different slopes for mass for each year,
then the DM would look like:
Now that you have the general idea, let’s consider constructing a set of models to test various (made-up) hypotheses concerning the encounter data in indcov2.inp.
We’ll suppose that we’re interested in fluctuating selection for survival as a function of body mass.
Meaning, we suspect that survival varies as a function of body mass (in a potentially non-linear way),
and that the pattern of variation varies over time. So, we’ll consider a set of models where we fit
both first- and second-order polynomials of survival as a function of mass (i.e., survival = f(mass), and
survival = f(mass + mass2)), with and without variation in that function over time. We’ll start with the
most general model – survival as a second-order function of mass, with time variation in the slope and
the intercept of that function: {ϕ time.(m+m²), p.}.
In fact, we built precisely this model in the preceding section, but, using the following design matrix,
with a separate intercept for each time interval:
Here, we’ll build the exact same model, but using a common intercept for all time intervals. If you
followed what we did earlier in this section, you should have a pretty good guess what it might look
like. We know from above that we need 21 columns for survival.
Here is the DM:
The models are entirely equivalent – in terms of fit, and reconstituted parameter estimates. So is there
an advantage of one over the other (i.e., common intercept, versus separate intercepts)? The common
intercept approach makes it easier to fit models with specific types of constraint – for example, additive
models. On the other hand, interpreting interval-specific intercepts and slopes from the DM built using
separate intercepts is somewhat more straightforward than when using a common intercept.
For example, if you look at the parameter (β) estimates from the ‘separate intercept’ approach, you will
see that they correspond to what we expected (given the model under which the data were simulated):
in the early cohorts the sign of the slope for mass is positive, and for mass2 is negative – consistent with
normalizing selection. In later cohorts, the signs are consistent with increasingly disruptive selection.
In contrast, to figure out what is going on when you use a ‘common intercept’ approach, where each
estimated slope is interpreted relative to a reference level (by default, the final time interval), requires
more work.
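The ‘more work’ amounts to adding each interval’s offsets to the reference-level terms. A minimal sketch follows, assuming the final interval is the reference level and using made-up β values for a 3-interval case (these are not estimates from indcov2.inp; the function name is ours).

```python
def interval_specific(b0, b_mass, b_mass2, time_fx, mass_by_time, mass2_by_time):
    """Convert 'common intercept' betas to per-interval (intercept, slope, slope2).

    Each list holds the offsets for intervals 1..T-1; the last interval is the
    reference level, so its offsets are zero.
    """
    offsets = zip(time_fx + [0.0], mass_by_time + [0.0], mass2_by_time + [0.0])
    return [(b0 + dt, b_mass + dm, b_mass2 + dm2) for dt, dm, dm2 in offsets]


# HYPOTHETICAL beta estimates, for illustration only
per_interval = interval_specific(
    b0=0.25, b_mass=1.0, b_mass2=-1.0,
    time_fx=[0.5, -0.25],
    mass_by_time=[0.5, 0.25],
    mass2_by_time=[0.25, 0.5],
)
for t, (icpt, slope_m, slope_m2) in enumerate(per_interval, start=1):
    print(t, icpt, slope_m, slope_m2)
```

The tuples produced this way are what the ‘separate intercept’ DM reports directly as its β estimates.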
This distinction between the ‘separate intercept’ approach (which in effect amounts to using an
identity matrix), and the ‘common intercept’ approach (where the slopes reflect variation of levels
of a factor – say, time – relative to a reference level of that factor) was introduced in Chapter 6. We’ll
consider a more direct way to ‘parse out the pattern’ – by graphing the relationships directly – later in
this chapter.
For the moment, we’ll continue building models using the ‘common intercept’-based DM as our
starting structure. Let’s now consider a model that does not have time variation in the relationship
between survival and body mass. All we need do is modify our general DM (with the common intercept
for all time intervals), by eliminating the time columns, and the columns showing the interaction of mass
with time:
Finally, suppose you want to test the hypothesis that there is a common intercept for each year, but a
different slope. How would you modify the design matrix for our general model to reflect this? Well, by
now you might have guessed – you simply have 1 column for an intercept for all 7 intervals, and then
multiple columns for the mass and mass2 terms for each interval:
which you might now realize is entirely equivalent to
It is worth noting that when you specify a model with a common intercept but 2 or more slopes
for the individual covariate, and standardize the individual covariate, you will get a different value of
the deviance than from the model run with unstandardized individual covariates. This is because the
centering effect of the standardization method affects the intercept differently depending on the value of
the slope parameter. The effect is caused by the nonlinearity of the logit link function. You get the same
effect if you standardize variables in a logistic regression, and run them with a common intercept. The
result is that the estimates are not scale independent, but depend on how much centering is performed
by subtracting the mean value.
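One way to see the centering effect is to expand the standardized linear predictor for interval t. Here β0 is the common intercept, β_t the interval-specific slope, and m̄ and SD_m the mean and SD used in the standardization (the notation is ours, for a single covariate):

```latex
\beta_0 + \beta_t\left(\frac{m - \bar{m}}{SD_m}\right)
  = \left(\beta_0 - \frac{\beta_t\,\bar{m}}{SD_m}\right) + \frac{\beta_t}{SD_m}\, m
```

On the original scale of m, the implied intercept contains β_t, so it differs among intervals. In other words, the standardized model constrains the interval-specific lines to share a value at m = m̄, whereas the unstandardized model constrains them to share a value at m = 0. These are different constraints, so the two models need not have the same deviance.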
begin sidebar
Design Matrix Functions
A number of special functions are allowed as entries in the design matrix: add, product, power, min, max,
log, exp, eq (equal to), gt (greater than), ge (greater than or equal to), lt (less than), and le (less than
or equal to). These names can be either upper- or lower-case. Do not include blanks within
these function specifications, so that MARK can properly retrieve models with these functions in their
design matrix.
As shown below, these functions can be nested to create quite complicated expressions, which may
require setting a larger value of the design matrix cell size (something you can specify by changing
MARK’s preferences – ‘File | Preferences’).
1. add and product functions
These two functions require 2 arguments. The add function adds the 2 arguments together, whereas
the product function multiplies the 2 arguments. The arguments for both functions must be one of the
3 types allowed: numeric constant, an individual covariate, or another function call.
The following design matrix demonstrates the functionality of these 2 functions, where wt is an
individual covariate.
Col 1   Col 2   Col 3    Col 4   Col 5           Col 6
1       1       1        wt      product(1,wt)   product(wt,wt)
1       1       2        wt      product(2,wt)   product(wt,wt)
1       1       3        wt      product(3,wt)   product(wt,wt)
1       0       add(…)   wt      product(1,wt)   product(wt,wt)
1       0       add(…)   wt      product(2,wt)   product(wt,wt)
1       0       add(…)   wt      product(3,wt)   product(wt,wt)
The use of the add function in column 3 is just to demonstrate examples; it would not be used in
a normal application. In each case, a continuous variable is created by adding constant values. The
results are the values 1, 2, and 3, in rows 4, 5, and 6, respectively.
Column 5 of the design matrix demonstrates creating an interaction between an individual covariate
and another column (the first 3 rows) or a constant and an individual covariate (the last 3 rows). Column
6 of the design matrix demonstrates creating a quadratic effect for an individual covariate. Note that
if the 2 arguments were different individual covariates, an interaction effect between 2 individual
covariates would be created in column 6.
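To make the evaluation of these cells concrete, the two functions can be emulated in Python for a single individual. This is a sketch only: the Python functions merely mimic the behaviour described above, and the value of wt is invented.

```python
def add(x, y):
    """Emulates the DM 'add' function: sum of the 2 arguments."""
    return x + y


def product(x, y):
    """Emulates the DM 'product' function: product of the 2 arguments."""
    return x * y


wt = 2.5  # HYPOTHETICAL covariate value for one individual

col5 = [product(k, wt) for k in (1, 2, 3)]  # constant-by-covariate interaction
col6 = product(wt, wt)                      # quadratic term
nested = add(1, product(2, wt))             # an argument may itself be a function call

print(col5, col6, nested)  # [2.5, 5.0, 7.5] 6.25 6.0
```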
2. IF functions: eq (equal to), gt (greater than), ge (greater than or equal to), lt (less than), le (less
than or equal to)
These five functions require 2 arguments. The eq, gt, ge, lt, and le functions will return a zero if
the operation is false and a one if the operation is true. For each of these functions, 2 arguments (x1
and x2) are compared based on the function.
For example, eq(x1,x2) returns 1 if x1 equals x2, and zero otherwise; gt(x1,x2) returns 1 if x1
is greater than x2, zero otherwise; and le(x1,x2) returns 1 if x1 is less than or equal to x2, zero
otherwise. The arguments for these functions must be one of the 3 types allowed: numeric constant,
column variable, or an individual covariate.
The following design matrix demonstrates the functionality of both the add function and the IF
function (eq), where age is an individual covariate.
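[Design matrix: 6 rows shown; column 1 is the intercept (all 1s); columns 2 and 3 contain the add and eq entries described below.]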
In this particular example, the individual covariate age corresponds to the number of days before
a bird fledges from its nest (fledge day 0) and subsequently enters the study. Suppose an individual
fledges from its nest during the fourth survival period. Its encounter history (LDLD format) would
consist of ‘00 00 00 10’ and the individual would have -3 as its age covariate because the individual
did not fledge from its nest until the fourth survival period. A bird that did not fledge from its nest
until survival period 20 would have -19 as its age covariate. Think of the use of negative numbers as
an accounting technique to help identify when the individual fledges.
Column 2 of the design matrix demonstrates the use of the add function to create a continuous age
covariate for each individual by adding a constant to age. The value returned in the first row of the
second column is -3 (0 + (−3) = −3). The value returned in the second row of the second column is -2
(1 + (−3) = −2). The value returned in the fourth row of the second column is zero and corresponds
to fledge day 0 (3 + (−3) = 0). The value returned in the fifth row of the second column is one and
corresponds to fledge day 1. Thus, column 2 is producing a trend effect of age on survival, with the
intercept of the trend model being age zero. A trend model therefore models a constant rate of change
with age on the logit scale, so that each increase in age results in a constant change in survival, either
positive or negative depending on the sign of β2 .
Now, suppose that survival is thought to be different on the first day that a bird fledges, i.e., the first
day that the bird enters the encounter history. To model survival as a function of fledge day 0, use the
eq function to create the necessary dummy variable. This is demonstrated in the third column. The eq
function returns a value of one only when the statement is true, which only occurs on the first day the
bird is fledged. Recall that the value for age of this individual is -3; therefore, the add function column
will return a value of -3 (0 + (−3) = −3) in the first row. The eq function in the third column would
return a value of zero because age (-3) is not equal to zero. The eq function in the third column, fourth
row would return a value of one because age (0) is equal to (0). Note this will only be true for row four
for this particular individual; all other rows return a value of zero because they are false. Thus, the eq
function will produce a dummy variable allowing for a different survival probability on the first day
after fledging from the trend model for age which applies thereafter.
Note that the eq function in this example is using the same results of the add function from the
preceding column, and illustrates the nesting of functions.
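The values described above can be tabulated with a short Python emulation for the individual with age = −3. The cell forms add(k,age) and eq(add(k,age),0) are our assumption about how the entries are written (inferred from the description, with k = row − 1); the Python functions only mimic the behaviour of the DM functions.

```python
def add(x, y):
    """Emulates the DM 'add' function."""
    return x + y


def eq(x, y):
    """Emulates the DM 'eq' function: 1 if the arguments are equal, else 0."""
    return 1 if x == y else 0


age = -3  # bird fledges in the fourth survival period

table = []
for row in range(1, 7):
    day = add(row - 1, age)        # column 2: days since fledging
    first_day = eq(day, 0)         # column 3: dummy for fledge day 0
    table.append((row, day, first_day))
    print(row, day, first_day)
```

Only row 4 returns a 1 in the third column, matching the description in the text.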
3. power function
This function requires 2 arguments (x,y). The first argument is raised to the power of the second
argument; i.e., the result is x^y. As an example, to create a squared term of the individual covariate
length, you would use power(length,2). To create a cubic term, power(length,3). So, in our normalizing selection example (first example of this chapter), we did not need to explicitly include mass2 in
the .INP file – we could have used power(mass,2) to accomplish the same thing.
4. min/max functions
The min function returns the minimum of the 2 arguments, whereas the max function returns
the maximum of the 2 arguments. These functions allow the creation of thresholds with individual
covariates. So, with the individual covariate length, the function min(5,length) would use the value
of length when the variable is < 5, but replace length with the value 5 for all lengths > 5. Similarly,
max(3,length) would replace all lengths < 3 with the value 3.
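The thresholding behaviour of min and max, and the power function from the preceding item, match Python’s built-ins of the same names, so a quick check is easy. The length values below are invented for illustration.

```python
lengths = [2, 4, 5, 7]  # HYPOTHETICAL values of the covariate 'length'

capped = [min(5, x) for x in lengths]    # lengths above 5 are replaced by 5
floored = [max(3, x) for x in lengths]   # lengths below 3 are replaced by 3
squared = [pow(x, 2) for x in lengths]   # equivalent of power(length,2)

print(capped)   # [2, 4, 5, 5]
print(floored)  # [3, 4, 5, 7]
print(squared)  # [4, 16, 25, 49]
```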
5. Log, Exp functions
These functions are equivalent to the natural logarithm function and the exponential function.
Each only requires one argument. So, for the individual covariate length = 2, log(length) returns
0.693147181, and exp(length) returns 7.389056099.
Example
These functions are useful for constructing a design matrix when using the nest survival analysis
(Chapter 17). Here, the add and ge functions are demonstrated. Stage-specific survival (egg or nestling)
could be estimated only if nests were aged and frequent nest checks were done to assess stage of failure.
[Design matrix: 19 rows shown; column 1 is the intercept (all 1s); columns 2 to 4 contain the function entries described below.]
In this particular example, the age covariate corresponds to the day that the first egg was laid in a
nest (nest day 0). Suppose a nest is initiated during the fourth survival period. Its encounter history
(LDLD format) would consist of 00 00 00 10 and the nest would have -3 as its age covariate because
the first egg was not laid in the nest until the fourth survival period.
Column 2 of the design matrix demonstrates the use of the add function to create a continuous
age covariate for each nest. The value returned in the first row of the second column is -3. The value
returned in the second row of the second column is -2. The value returned in the fourth row of the
second column is a zero and corresponds to the initiation of egg laying. The value returned in the fifth
row of the second column is one (the nest is one day old).
To model survival as a function of stage, use the ge function to quickly create the necessary dummy
variable. This is demonstrated in the third column. The value of 15 is used in this example because it
corresponds to the number of days before a nest will hatch young birds. Day 0 begins with the laying
of the first egg, so values of 0 → 14 correspond to the egg stage. Values of 15 → 23 correspond to the
nestling stage. The ge function will return a value of one (nestling stage) only when the statement is
true.
Because the value of age for this nest is -3, the add function column returns a value of -3 (since
0 + (−3) = −3) for the first row. The ge function (third column) returns a value of zero because the
statement is false; age (-3) is not greater than or equal to 15. A value of one appears for the first time
in row 19; here, the add function returns a value of 15 (since 18 + (−3) = 15). The ge function returns
a value of one because the statement is true; add(18,age) results in 15 which is greater than or equal
to 15.
The fourth column produces an age slope variable that will be zero until the bird reaches 15 days of
age, and then becomes equal to the bird’s age. The result is that the age trend model of survival now
changes to a different intercept and slope once the bird hatches.
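The three function columns of this example can be emulated in Python for the nest with age = −3. As before, this is a sketch: the way the fourth column is written (a product of the ge dummy and the nest age) is our assumption, chosen because it reproduces the behaviour described in the text.

```python
def add(x, y):
    return x + y


def ge(x, y):
    """Emulates the DM 'ge' function: 1 if x >= y, else 0."""
    return 1 if x >= y else 0


def product(x, y):
    return x * y


age = -3        # first egg laid in the fourth survival period
HATCH_DAY = 15  # nest age at which the nestling stage begins

rows = []
for k in range(22):                              # k = row number - 1
    nest_age = add(k, age)                       # column 2: nest age in days
    nestling = ge(nest_age, HATCH_DAY)           # column 3: stage dummy
    nestling_age = product(nestling, nest_age)   # column 4: age slope after hatch
    rows.append((nest_age, nestling, nestling_age))

print(rows[0], rows[17], rows[18])  # (-3, 0, 0) (14, 0, 0) (15, 1, 15)
```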
Some useful tricks
An easy way to prepare these complicated sets of functions is to use Excel to prepare the values
and then paste them into the design matrix. The following illustrates how to use the concatenate
function in Excel to concatenate together a column and a closing ‘)’ to create a complicated column of
functions that duplicate the above example.
Other details
The design matrix values can have up to 60 characters, and unlimited nesting of functions (within the
60 character limit). As an example, the following is a very complicated way of computing a value of 1:
log(exp(log(exp(product(max(0,1),min(1,5))))))
Before the design matrix is submitted to the numerical optimizer, each entry in the design matrix is
checked for a valid function name at the outermost level of nesting, plus that the number of ‘(’ matches
the number of ‘)’.
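The kind of check described here is easy to sketch. The Python below is our own illustration of such a check (it is not MARK’s code): it tests the 60-character limit, the outermost function name, and the parenthesis counts, and then confirms by direct evaluation that the nested example above does return 1.

```python
import math

VALID_NAMES = {"add", "product", "power", "min", "max", "log", "exp",
               "eq", "gt", "ge", "lt", "le"}


def looks_valid(cell, max_len=60):
    """Rough emulation of the checks described in the text."""
    cell = cell.strip()
    if len(cell) > max_len:
        return False
    if "(" not in cell:
        return ")" not in cell           # plain constant or covariate name
    outer = cell.split("(", 1)[0].lower()  # names may be upper- or lower-case
    return outer in VALID_NAMES and cell.count("(") == cell.count(")")


def product(x, y):
    return x * y


cell = "log(exp(log(exp(product(max(0,1),min(1,5))))))"
value = math.log(math.exp(math.log(math.exp(product(max(0, 1), min(1, 5))))))

print(looks_valid(cell), len(cell), value)
print(looks_valid("log(exp(wt)"))   # False: unbalanced parentheses
print(looks_valid("foo(1,2)"))      # False: unknown function name
```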
In previous versions of MARK, the design matrix functions were allowed to reference a value in
one of the preceding columns. This capability was removed when the ability to nest functions was
installed. No flexibility was lost with the removal of the ‘Colxx’ capability, and a considerable increase
in versatility was obtained with the nested design matrix function calls. As shown in the Excel ‘Tricks’
example above, the ability to use values from other columns is still available. The ‘Colxx’ capability
was also a very error prone method in that a column could be inserted ahead of the column being
referenced, and the entire model would become nonsense without the user realizing that a mistake had been made.
end sidebar
11.5. Plotting + individual covariates
In the first example presented in this chapter, we considered the relationship between survival and
individual body mass, under the hypothesis that there was strong ‘normalizing selection’ on mass – i.e.,
that the relationship between survival and mass was quadratic. We found that a quadratic model
$$\text{logit}(\hat{\phi}) = 0.256733 + 1.1750545(\text{mass}_s) - 1.0555046(\text{mass2}_s)$$
had good support in the data. We discussed briefly the mechanics of reconstituting the estimates of
survival on the normal probability scale – the complication is that you need to generate a reconstituted
value for each plausible value of the covariate(s) in the model. In fact, this is not particularly challenging
for simple models such as this. Because the linear model consists of a covariate (mass) plus a function
of the covariate (mass2 ), it is relatively trivial to code this into a spreadsheet and generate a basic plot
of predicted survival values over a range of values for mass. In fact, this is effectively what was done to
generate the plot of predicted versus observed values we saw earlier (example on p. 14).
But, there are no confidence bounds on the predicted value function. The calculation of 95% CI for
this function requires use of the Delta method – although not overly difficult to apply (the Delta method
is discussed at length in Appendix B), it can be cumbersome and time consuming to program.
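To give a sense of what such a program involves, here is a minimal sketch of the first-order Delta method approximation for a logit-linear model: the variance of the linear predictor is x′Σx, and the SE of the back-transformed estimate is approximately ϕ̂(1 − ϕ̂) times the SE of the linear predictor. The β values are those quoted in the text, but the variance-covariance matrix Σ below is entirely made up for illustration, and we make no claim that this is exactly how MARK computes its own intervals.

```python
import math

BETA = [0.256733, 1.1750545, -1.0555046]   # intercept, mass_s, mass2_s
# HYPOTHETICAL variance-covariance matrix of the three betas
VCOV = [[0.010, 0.002, -0.002],
        [0.002, 0.090, -0.085],
        [-0.002, -0.085, 0.090]]
MEAN_M, SD_M = 109.97, 24.79
MEAN_M2, SD_M2 = 12707.46, 5532.03


def predict_with_ci(mass, z=1.96):
    """Return (phi, lower, upper) using the Delta method on the logit link."""
    x = [1.0, (mass - MEAN_M) / SD_M, (mass ** 2 - MEAN_M2) / SD_M2]
    eta = sum(b * xi for b, xi in zip(BETA, x))
    var_eta = sum(x[i] * VCOV[i][j] * x[j]
                  for i in range(3) for j in range(3))
    phi = math.exp(eta) / (1.0 + math.exp(eta))
    se_phi = phi * (1.0 - phi) * math.sqrt(var_eta)   # Delta method SE
    return phi, phi - z * se_phi, phi + z * se_phi


for mass in (80, 110, 140):
    phi, lo, hi = predict_with_ci(mass)
    print(mass, round(phi, 3), round(lo, 3), round(hi, 3))
```

A symmetric interval built this way can fall outside [0, 1] when ϕ̂ is near a boundary; a common alternative is to form the interval on the logit scale and back-transform its endpoints.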
Fortunately, MARK has a plotting tool that makes it convenient to generate a plot of predicted values
from models with individual covariates, which includes the estimated 95% CI. MARK also makes it
possible to output the data (including the data corresponding to the 95% CI) to a spreadsheet.
Let’s demonstrate this for the analysis we previously completed on the normalizing selection data in
indcov1.inp. Open up the .DBF file corresponding to those results, and retrieve the most parsimonious
model from the model set we fit to those data {ϕ(mass,mass2) p.}. Then, click on the ‘Individual Covariate
Plot’ icon in the main MARK toolbar:
This will bring up a new window which will allow you to specify key attributes of the plot:
Notice that the title of the currently active model is already inserted in the title box. Next, are two
boxes where you specify (i) which parameter you want to plot, and (ii) which individual covariate you
want to plot. In our model, there are 2 different individual covariates – mass and mass2.
So, first question – which one to plot? If you look back at the figure at the bottom of p. 13, you’ll
see that we’re interested in plotting ‘survival’ versus ‘mass’. So, if our goal is to essentially replicate these
plots, with the addition of 95% CI, using this individual covariate plot tool in MARK, it would seem to
make sense that we should specify mass as the covariate we want to plot.
Finally, two boxes which allow us to specify the numerical range of the individual covariate to plot.
Also notice the check box you can check if you want to output the various estimates that go into the
OK – seems easy enough. Let’s start by clicking on the survival parameter ‘Phi’.
As soon as we do so, the window ‘updates’, and now presents you with the ‘Design Matrix Row’.
For this example, the DM has only 2 rows, so what is presented is in fact the linear model itself.
Next, we click on ‘mass’ to specify that as the individual covariate we want to plot. The window
immediately updates – and spawns a new box in the process.
As you can see, the range of covariate values has been updated showing the maximum and minimum
values that are actually in the .INP file. You can change these manually as you see fit (usual caveats about
extrapolating a plot outside the range of the data apply).
Now, what about the new box – showing mass2 set to 12,707.4638? First, you might recognize the
number 12,707.4638 as the mean value of mass2 over all individuals in the sample. But, why is a box
for mass2 there in the first place? It’s there because the linear model that MARK is going to plot has 2
covariates – mass and mass2.
OK – so what does MARK actually plot? Well, if you click the ‘OK’ button, MARK responds with a plot
which doesn’t look remotely like the quadratic curve we were expecting. What is actually being plotted?
Well, if you think about it for a moment, it should be clear that MARK is plotting the functional
relationship between survival and mass, holding the value of mass2 constant at the mean value! Different
values of mass2 would yield different plots.
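A quick numerical sketch makes the point. The Python below uses made-up β values (an assumption for illustration only; they are not the estimates from this analysis) and compares survival versus mass when mass2 is held at the constant 12,707.4638 with survival versus mass when mass2 is recomputed as mass × mass.

```python
import math

def expit(x):
    """Inverse logit: maps a value on the logit scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical beta values, chosen only for illustration (not MARK output)
B0, B1, B2 = -20.0, 0.4, -0.0018
MASS2_FIXED = 12707.4638  # the constant shown for mass2 in the plot window

def phi_mass2_fixed(mass):
    """Survival vs. mass with mass2 held constant."""
    return expit(B0 + B1 * mass + B2 * MASS2_FIXED)

def phi_quadratic(mass):
    """Survival vs. mass with mass2 tied to mass (mass2 = mass * mass)."""
    return expit(B0 + B1 * mass + B2 * mass * mass)

masses = list(range(80, 151, 10))
fixed = [phi_mass2_fixed(m) for m in masses]
quad = [phi_quadratic(m) for m in masses]
for m, f, q in zip(masses, fixed, quad):
    print(f"mass={m:4d}  mass2 fixed: {f:.3f}   mass2 = mass*mass: {q:.3f}")
```

With mass2 held constant, the linear predictor is a straight line in mass, so the plotted curve can only rise or only fall. The hump of a quadratic appears only when mass2 tracks mass.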
So, MARK isn’t doing anything wrong – it’s simply plotting what you told it to plot. MARK generates
a 2-D plot between some parameter and one covariate. If there are other covariates in the model, then
it needs to know what to do with them. Clearly, if there were only 2 covariates in the model, you could
construct a 3-D plot (the two covariates on the x- and y-axes, and the parameter on the z-axis), but what
if you had > 2 covariates? It would be difficult to program MARK to accommodate all permutations in
the plot specification window, so it defaults to 2-D plots, meaning (i) you plot a parameter against only
one covariate, and (ii) you need to tell MARK what to do with the other covariates.
So, how do you tell MARK to plot survival versus mass and mass2 together, as a single 2-D plot? The
key is in specifying the relationship between mass and mass2 explicitly – in effect, telling MARK that
mass2 is in fact just (mass × mass). MARK doesn’t ‘know’ that the second covariate (mass2) is a simple
function of the first (mass). MARK doesn’t know this because you haven’t told MARK that this is the
case. In your DM, you simply entered mass and mass2 as label names for the covariates, which were in
fact ‘hard-coded’ in the .INP file. You (the user) know what they represent, but all MARK sees are two
different covariates with two different labels.
So, if you can’t pass this information to MARK in the plot specification window, where can you do
so? Hint: what was the subject of the last -sidebar- presented several pages back? Looking back, you’ll
see that we introduced a series of ‘design matrix functions’, which included power and product. In our
current analysis, we coded for mass and mass2 explicitly in the DM by entering the labels corresponding
to the mass and mass2 covariates, which were hard-coded into the .INP file. As such, we know what
the covariates represent, but MARK doesn’t – it only knows the label names.
So, what if instead of the original DM, with its labels mass and mass2, we used a second, modified DM?
Look closely at this second DM – notice that we’ve used the power function. Recall that the power
function has two arguments – the first argument (mass, in this example) is raised to the power of the
second argument (2, in this case). Now, we have explicitly coded (i.e., told MARK) that the second
covariate is a power function of the first covariate. And because MARK now knows this, it knows what
to plot, and how.
Run this model, and add the results to the browser. As expected, the results are identical to what we
saw when we ran this model using the hard-coded mass2 in the INP file. But, more importantly, when
we plot this model, we get exactly what we were looking for:
Note that there are two other options in the ‘Individual Covariate Plot’ specification window:
you can (i) output the estimates into Excel, or (ii) plot only the actual estimates (meaning, plot only the
reconstituted estimates for the parameter for the actual covariates in the input file – the estimates are
presented without their estimated SE).
Beyond the mechanics of plotting individual covariate functions, which is clearly part of the intent of
this section, this example also demonstrates one of the ‘hidden’ advantages of using the DM functions
to handle coding any functional relationships you might have among your covariates. Not only does
this save you from having to do those calculations by hand while you construct the INP file, it also
provides a convenient mechanism to make those functional relationships ‘known’ to MARK.
Plotting model averaged models with covariates is possible in MARK (see section 11.8), and using
RMark (see Appendix C – discussion of the covariate.predictions function).
begin sidebar
plotting ‘environmental’ covariates as ‘individual’ covariates
In Chapter 6, section 6.8.2, we considered the plotting of the functional relationship between some
parameter of interest and a particular ‘environmental’ covariate. One of the things noted in Chapter 6
was the lack of a direct option in MARK to plot this functional relationship.
But, we can, in fact, generate exactly the plot we’re looking for, within MARK, by using a ‘trick’
that involves individual covariates. The ‘trick’ is to get MARK to treat environmental covariates as
individual covariates, and then use the individual covariate plotting capabilities in MARK that we
introduced in the preceding section.
The basic idea is actually quite simple – if you remember the difference between an ‘environmental’
and ‘individual’ covariate. The key is the idea that an ‘environmental covariate’ is a covariate that
applies to all individuals. So, how do we use individual covariates to model/plot environmental
covariates? Easy – you simply add the value of the environmental covariate to each individual in
the INP file, as if it were an individual covariate.
We’ll demonstrate this using the dipper data (what else?). Assume that we believe that annual
apparent survival, ϕ is a function of some measure of rainfall. The dipper data consists of live capture
data over 7 occasions (6 intervals).
Here are the ‘rain data’ we’ll use in our model.
interval    1    2    3    4    5    6
rain        1   10    8   15    3    6
For this demonstration, we’ll use the full dipper data (ed.inp) – 7 occasions, 2 attribute groups (males
and females). The first step involves entering the environmental covariate data into the .INP file, such
that each value of the environmental covariate (rain) will be a time-specific individual covariate, with
the values of those covariates repeated for all of the individuals in the data set.
The easiest way to explain this is by demonstration. First, here are the top few lines of the full
dipper encounter history file (which consists of 294 individuals). There are 2 frequency columns after
the encounter history – the first column corresponds to males, while the second corresponds to females.
The first few lines of the .INP file happen to be for male individuals.
1111110 1 0;
1111000 1 0;
1100000 1 0;
Now, all we need to do is enter the environmental covariates as a set of time-specific individual
covariates.
Here is what the modified .INP file will look like (ed_mod.inp) – again, we’ll only show the first few
lines of the file:
1111110 1 0 1 10 8 15 3 6;
1111000 1 0 1 10 8 15 3 6;
1100000 1 0 1 10 8 15 3 6;
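Editing every record by hand is tedious and error-prone, so you might prefer to script this step. Here is a minimal Python sketch, assuming each record sits on one line, ends with a semicolon, and that the file contains no comment lines (the function name is our own invention, not part of MARK):

```python
def add_environmental_covariates(inp_lines, covariate_values):
    """Append the same covariate values to every encounter-history record."""
    suffix = " ".join(str(v) for v in covariate_values)
    out = []
    for line in inp_lines:
        record = line.strip()
        if not record:
            continue  # skip blank lines
        if not record.endswith(";"):
            raise ValueError(f"record does not end with ';': {record!r}")
        # drop the trailing ';', add the covariates, then restore the ';'
        out.append(f"{record[:-1].rstrip()} {suffix};")
    return out

rain = [1, 10, 8, 15, 3, 6]  # rain values for intervals 1..6, as given earlier
original = ["1111110 1 0;", "1111000 1 0;", "1100000 1 0;"]
modified = add_environmental_covariates(original, rain)
for rec in modified:
    print(rec)
```

The same six values are appended to every record, which is exactly what makes an ‘environmental’ covariate behave like an individual covariate.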
OK, now that we have this modified .INP file, start a new project in MARK – 7
occasions, 2 attribute groups (males and females), and 6 individual covariates, which we’ll refer to as
{r1,r2,...,r6}, corresponding to time interval 1, time interval 2, and so on.
We’ll start by fitting model {ϕ_t p_.} – in other words, no sex differences in ϕ, but ϕ allowed to vary
over time, t. Encounter probability, p, is constant over time, with no sex differences.
To make our lives simpler, we’ll build the underlying parameter structure for our starting model
using the following PIM chart (we’ll assume that by now you know how to do this). Then, we’ll build
the DM corresponding to this PIM structure – again, this should all be familiar territory:
Go ahead and run this model, and add the results to the browser.
Next, we want to modify the DM to constrain ϕ to be a linear function of rain. Recall from Chapter
6 that all we need to do is (i) eliminate the time columns from the DM, and (ii) insert a column containing
the values for the environmental covariate, rain. The modified DM is shown at the top of the next page.
Go ahead and run the model, and add the results to the browser:
If we look at the β estimates, we see that the linear model for apparent survival is
logit(ϕ) = 0.3027129 + (−0.0076410)(rain)
So, as rain increases, apparent survival decreases, since the estimate for the coefficient for the ‘rain’
covariate is negative.
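If you want to see what this equation implies on the probability scale, you can back-transform it yourself. The following Python sketch takes the two β estimates and the rain values reported above and applies the inverse of the logit link (the variable and function names are ours):

```python
import math

B_INTERCEPT = 0.3027129   # beta estimate for the intercept (from the text)
B_RAIN = -0.0076410       # beta estimate for the rain slope (from the text)
rain = {1: 1, 2: 10, 3: 8, 4: 15, 5: 3, 6: 6}  # interval -> rain

def phi_from_rain(r):
    """Back-transform the linear predictor to apparent survival."""
    logit = B_INTERCEPT + B_RAIN * r
    return 1.0 / (1.0 + math.exp(-logit))

phi_by_interval = {i: phi_from_rain(r) for i, r in rain.items()}
for i, phi in phi_by_interval.items():
    print(f"interval {i}: rain = {rain[i]:2d}, phi = {phi:.4f}")
```

Because the slope is small, the reconstituted survival estimates change only modestly over the observed range of rain, but they do decline as rain increases.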
But, now, we’d like to plot this relationship, using MARK. To do this, we’re first going to duplicate
model {ϕ_rain p_.}, but this time using our individual covariates corresponding to the environmental
covariates – recall that we named them {r1, r2,..., r6} when we set up the specifications for the analysis.
How do we modify the DM to use these individual covariates? Easy – simply remember that each
covariate is time-specific. In other words, r1 corresponds to interval 1, r2 corresponds to interval 2, and
so on. Keeping this in mind, then here is what our modified DM will look like:
Go ahead and run this model – let’s name it ‘phi(ind rain cov)p(.)’. Let’s have a look at the browser:
We see that the model deviance for model ‘phi(rain)p(.)’ – built using the environmental
covariates ‘the usual way’, and the deviance of model ‘phi(ind rain cov)p(.)’, are identical. If you
compare reconstituted parameter estimates between the two models, they’re also the same.
Simply put, the 2 models are equivalent, in all but one important way. Because model ‘phi(ind
rain cov)p(.)’ was built using individual covariates, we can use the individual covariate plotting
capabilities in MARK to plot the functional relationship – and the uncertainty in that relationship –
between the parameter (in this case, ϕ), and the covariate (rain).
To generate the plot, simply click the ‘Individual covariate’ plot icon in the toolbar, which will
bring up the following window:
Now, all you need to do is pick any one of the 6 parameters to plot (1:Phi, 2:Phi,...), and (this is
important) the correct (matching) individual covariate. For example, parameter ‘1:Phi’ corresponds
to the first interval, which corresponds to time-specific individual covariate ‘r1’. Parameter ‘2:Phi’
corresponds to covariate ‘r2’, and so on. It doesn’t matter which parameter you pick, but it does matter
that you pick the appropriate covariate it matches to. For present purposes, we’ll select ‘1:Phi’ and
‘r1’.
Now, notice that on the far right-hand side, the range for ‘r1’ is shown as 1 for the minimum, and
1 for the maximum. That is because ‘r1’ corresponds to the rain covariate for the first interval, which
was 1. Needless to say, if we don’t adjust the range, the plot won’t be particularly interesting. Let’s
change the range to 1 for the minimum, and 20 for the maximum:
All that remains is to generate the plot (or export everything to Excel, by selecting the appropriate
output option). For now, we’ll simply generate the plot using the plotting capabilities in MARK: click the ‘OK’ button, and we get exactly the plot we’re after – the basic function, and the uncertainty
represented by the 95% CI.
end sidebar
11.6. Missing covariate values, time-varying covariates, and other complications...
Several strategies for handling missing individual covariates are available. Probably the best option is
to code missing individual covariate values with the mean of the variable for the population measured.
Replacing the missing value with the average means that the mean of the observed values will not
change, although the variance will be slightly smaller because all missing values will be exactly equal
to the mean and hence not variable.
The easiest way to accomplish this in MARK is to use the ‘standardize covariates’ option – if you
compute the mean of the non-missing values of an individual covariate, and then scale the non-missing
values to have a mean of zero, the missing values can be included in the analysis as zero values, and
will not affect the value of the estimated β term. (note: we don’t advise this trick for a covariate with
a large percentage of missing values because you have no power, but this approach does work for a
‘small’ number of missing values).
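To make the arithmetic concrete, here is a minimal Python sketch of the centering step, done outside MARK before the INP file is built. The mass values are hypothetical, and missing values are represented by None purely for the purposes of this example:

```python
from statistics import mean, pvariance

def center_with_missing(values):
    """Center on the mean of the non-missing values; code missing (None) as 0."""
    observed = [v for v in values if v is not None]
    m = mean(observed)
    centered = [0.0 if v is None else v - m for v in values]
    return centered, m

mass = [102.0, None, 95.5, 110.0, None, 98.5]   # hypothetical body masses
centered, observed_mean = center_with_missing(mass)
observed_centered = [v - observed_mean for v in mass if v is not None]

print("mean of observed values:", observed_mean)
print("centered covariate:", centered)
print("variance, observed only:", pvariance(observed_centered))
print("variance, with missing coded as 0:", pvariance(centered))
```

The centered covariate still has a mean of zero after the missing values are filled in, but its variance shrinks, which is the cost of this approach noted above.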
If you have lots of missing values, another option is to code the animals into 2 groups, where all the
missing values are in one group. Then, you can use both groups to estimate a common parameter, and
only apply the individual covariate to one group. This approach can be tricky, so think through what
you are doing before you try it.
What about covariates that vary through time? In all our examples so far, we’ve made the assumption
that the covariate is a constant over the lifetime of the animal. But, clearly, this will often (perhaps
generally) not be the case. For example, consider body mass. Body mass typically changes dynamically
over time, and if we believe that body mass influences survival or some other parameter, then we might
want to constrain our estimates to be functions of a dynamically changing covariate, rather than a static
one (typically measured at the time the individual was initially captured and marked). You can handle
time-varying covariates in one of a couple of ways.
First, you can include time-varying individual covariates in MARK files, but you must have a value
for every animal on every occasion, even if the animal is not captured. Typically, you can impute these
values if they are missing (not observed), but be sure to recognize what this imputation might do to
your estimates. As demonstrated in the preceding -sidebar-, you implement time-varying individual
covariates just like any other individual covariate, except that you have to have a different name for each
covariate corresponding to each time period.
For example, suppose you have a known fate model (which we’ll cover in chapter 16) with 5 occasions,
and you have estimated the parasite load for each animal at the beginning of each of the 5 occasions.
The 5 values for each animal are contained in the variables v1, v2, v3, v4, and v5.
A design matrix that would estimate the effect of the parasite load assuming that the effect is constant
across time would be:
1   v1
1   v2
1   v3
1   v4
1   v5
The second β estimate is the slope parameter associated with the time-varying individual covariates.
Note that you do not want to standardize these individual covariates, because standardizing them will
cause them to no longer relate to one another on the same scale (making a common slope parameter
nonsensical). Each would have a different scale after standardizing. If you need to standardize the
covariates, you must do so before the values are included in a MARK encounter histories input file, and
you must use a common mean and standard deviation across the entire set of variables and observations.
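As a sketch of what ‘a common mean and standard deviation’ means in practice, the Python below pools every value of v1 through v5 over all animals before standardizing. The parasite loads are hypothetical, and the use of the population standard deviation is our own choice for the example:

```python
from statistics import mean, pstdev

def standardize_common(records):
    """Standardize v1..vk for all animals using one pooled mean and SD."""
    pooled = [v for rec in records for v in rec]
    m, s = mean(pooled), pstdev(pooled)
    return [[(v - m) / s for v in rec] for rec in records], m, s

# hypothetical parasite loads (v1..v5), one row per animal
loads = [
    [3, 4, 6, 5, 7],
    [0, 1, 1, 2, 4],
    [10, 9, 12, 11, 15],
]
z, pooled_mean, pooled_sd = standardize_common(loads)
for rec in z:
    print(" ".join(f"{v:7.3f}" for v in rec))
```

Note that a raw load of 4 maps to the same standardized value whether it occurs in v2 or in v5. That would not be true if each column were standardized separately, which is why a common slope would then make no sense.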
The following design matrix would build a model where you assume the effect of parasite load is
different for each interval, but with the same survival probability for animals with no parasites (i.e., the
same intercept).
1   v1   0    0    0    0
1   0    v2   0    0    0
1   0    0    v3   0    0
1   0    0    0    v4   0
1   0    0    0    0    v5
The following model would allow different survival probabilities for each interval (i.e., time-specific
survival), but assumes the same impact of parasites on survival on the logit scale (assuming that a logit
link function is used). In other words, same slope, different intercept for each interval:
1   1   0   0   0   v1
1   0   1   0   0   v2
1   0   0   1   0   v3
1   0   0   0   1   v4
1   0   0   0   0   v5
Finally, a DM like the one shown below would allow a completely different survival probability and
parasite effect for each occasion:
1   1   0   0   0   v1   0    0    0    0
1   0   1   0   0   0    v2   0    0    0
1   0   0   1   0   0    0    v3   0    0
1   0   0   0   1   0    0    0    v4   0
1   0   0   0   0   0    0    0    0    v5
which is equivalent to specifying a separate function for each interval – this is perhaps illustrated
in a ‘more obvious’ fashion in the following DM, which is equivalent to the one above (although
interpretation of the β terms is clearly different).
1   v1   0   0    0   0    0   0    0   0
0   0    1   v2   0   0    0   0    0   0
0   0    0   0    1   v3   0   0    0   0
0   0    0   0    0   0    1   v4   0   0
0   0    0   0    0   0    0   0    1   v5
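If the equivalence of these last two DMs isn’t obvious, you can check it numerically. The Python sketch below builds both matrices for one animal, using hypothetical covariate values and hypothetical β values, re-expresses the β terms from the first coding in terms of the second, and confirms that the two linear predictors agree for every interval:

```python
def dm_offset_coding(v):
    """Intercept + 4 time dummies + one slope column per interval (10 columns)."""
    rows = []
    for i in range(5):
        time_dummies = [1.0 if i == j else 0.0 for j in range(4)]
        slopes = [v[i] if i == j else 0.0 for j in range(5)]
        rows.append([1.0] + time_dummies + slopes)
    return rows

def dm_separate_coding(v):
    """A separate (intercept, slope) pair of columns per interval (10 columns)."""
    rows = []
    for i in range(5):
        row = [0.0] * 10
        row[2 * i], row[2 * i + 1] = 1.0, v[i]
        rows.append(row)
    return rows

def linear_predictor(dm, beta):
    """Row-by-row product of the design matrix and the beta vector."""
    return [sum(x * b for x, b in zip(row, beta)) for row in dm]

def to_separate_betas(beta):
    """Re-express betas from the offset coding in the separate-function coding."""
    out = []
    for i in range(5):
        intercept = beta[0] + (beta[1 + i] if i < 4 else 0.0)
        out += [intercept, beta[5 + i]]
    return out

v = [2.0, 0.0, 5.0, 1.0, 3.0]   # hypothetical parasite loads v1..v5
beta_a = [0.8, 0.3, -0.2, 0.1, 0.4, -0.10, -0.05, -0.20, -0.15, -0.30]  # hypothetical
eta_a = linear_predictor(dm_offset_coding(v), beta_a)
eta_b = linear_predictor(dm_separate_coding(v), to_separate_betas(beta_a))
print(all(abs(a - b) < 1e-12 for a, b in zip(eta_a, eta_b)))
```

The two codings describe the same model. Only the meaning of the individual β terms changes: offsets from a reference interval in the first, and interval-specific intercepts and slopes in the second.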
Alternatively, you can ‘discretize’ the covariate, and use a multi-state model (chapter 10) to model
transitions as a function of the covariate ‘class’ the individual is in. For example, suppose you believe
that survival from time (i) to (i+1) is strongly influenced by the size of the organism at time (i). Now,
size is clearly a continuously distributed trait. But, perhaps you might reasonably classify each marked
individual as either ‘large’, ‘average’, or ‘small’ size. Then, each individual at each occasion is classified
into one of these 3 different size classes, and you use a multi-state approach to estimate the probability
of surviving as a function of being in a particular size class. If the covariate is not measured (typically,
if the individual is not captured), then the missing value is accounted for explicitly by including the
encounter probability p in the model. Moreover, you would also be able to look at the relationship
between survival as a function of size, and the probability of moving among size classes.
Sounds reasonable, but you need to consider a couple of things. First, in applying this approach, you
are discretizing a continuous distribution, and how many discrete categories you use, and how you
decide to partition them (e.g., what criterion you use to define a ‘large’ versus ‘small’ individual), may
strongly influence the results you get. However, when there are a large number of missing covariate
values, or if discretizing seems ‘reasonable’, this is a robust and easily implemented approach. Second,
you might need to be a bit ‘clever’ in setting up your design matrix to account for trends (relationships)
among states, as we’ll see in the following worked example.
11.6.1. Continuous individual covariates & multi-state models...
Let’s work through an example – not only to demonstrate an application of multi-state modeling to this
sort of problem (giving you a chance to practice what you learned in Chapter 10), but also to force you
to think deeply (yet again) about the building of a design matrix.
Consider a situation where we believe there is strong directional selection on (say) body size, where
larger individuals have higher survival than do smaller individuals. Suppose we have categorized
individuals as ‘small’ (S), ‘medium’ (M) and ‘large’ (L). For this example, we simulated a 6-occasion
data set (ms_directional.inp) according to the parameter values for ‘size-specific survival’ tabulated
below. If you look closely, you’ll see that within each interval, the latent survival probabilities used
in the simulation differ between adjacent size classes by a constant multiplicative factor, such that
there is an approximately linear increase in survival with size.
                  interval
state     1       2       3       4       5
S       0.500   0.700   0.600   0.700   0.700
M       0.525   0.749   0.624   0.749   0.749
L       0.551   0.801   0.649   0.801   0.801
However, if you look even more closely, you’ll note that the rate of this increase in survival with
size is not constant over intervals. So, imagine that for each time interval, you calculate the slope of the
relationship between survival and size. This slope should show heterogeneity among intervals (i.e., the
strength of directional selection on size varies over time).
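You can verify both claims directly from the tabulated values. This short Python sketch computes, for each interval, the factor between adjacent size classes and the least-squares slope of survival on size, with size coded 1, 2, 3. Note that the slope here is calculated on the probability scale purely for illustration; it is not the quantity MARK estimates.

```python
survival = {   # interval -> (S, M, L), from the simulation values tabulated above
    1: (0.500, 0.525, 0.551),
    2: (0.700, 0.749, 0.801),
    3: (0.600, 0.624, 0.649),
    4: (0.700, 0.749, 0.801),
    5: (0.700, 0.749, 0.801),
}

def summarize(values):
    """Return the adjacent-class ratios and the slope over sizes coded 1, 2, 3."""
    s, m, l = values
    ratios = (m / s, l / m)
    slope = (l - s) / 2.0   # least-squares slope for x = 1, 2, 3
    return ratios, slope

summary = {i: summarize(vals) for i, vals in survival.items()}
for i, (ratios, slope) in summary.items():
    print(f"interval {i}: ratios = {ratios[0]:.3f}, {ratios[1]:.3f}; slope = {slope:.4f}")
```

Within an interval the two ratios are essentially equal, while the slope differs among intervals 1, 2 and 3.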
To make things ‘fun’ (i.e., more realistic) we’ll also specify some size-specific transition parameters:
              to
from      S      M      L
S        0.7    0.2    0.1
M        0.0    0.8    0.2
L        0.0    0.0    1.0
So, small (S) and medium (M) individuals can stay in the same size class or grow over a given interval,
but individuals cannot get smaller. We’ll assume that the encounter probability for all size classes and
all intervals is the same; however, to make this even more realistic, we’ll assume that p = 0.7 for all size
classes – since p < 1, then we have ‘missing covariates’.
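Before fitting anything, it is worth checking the transition values for internal consistency. The following Python sketch (the data structure and names are ours) confirms that each ‘from’ row sums to 1, and lists the transitions that cannot occur, which are the ones any fitted model will need to treat as fixed at 0:

```python
STATES = ["S", "M", "L"]
PSI = {   # PSI[from][to], from the transition values tabulated above
    "S": {"S": 0.7, "M": 0.2, "L": 0.1},
    "M": {"S": 0.0, "M": 0.8, "L": 0.2},
    "L": {"S": 0.0, "M": 0.0, "L": 1.0},
}

# each row is a probability distribution over destination states
row_sums = {state: sum(row.values()) for state, row in PSI.items()}

# transitions between different states that have probability 0
impossible = [(a, b) for a in STATES for b in STATES
              if a != b and PSI[a][b] == 0.0]

print(row_sums)
print(impossible)
```

The impossible transitions are all moves to a smaller size class, consistent with the statement that individuals cannot get smaller.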
So, start MARK, and begin a new ‘multi-state’ analysis: select the ms_directional.inp file, and
specify 6 occasions, and 3 states: S, M, L. We’ll start with a general model in which survival varies
among states and among time intervals. We’ll make the encounter parameter p constant among
states and over time, and will make ψ constant within state.
This general structure is reflected in the following PIM chart:
Now, before we run this model, we have to consider if there are any parameters we need to fix (due to
logical constraints). As noted earlier, some of the transitions are not possible; specifically, ψ^MS = ψ^LM = ψ^LS = 0.
Thus, looking at our PIM chart, we see that this corresponds to setting parameters 19, 21 and
22 to 0.
Go ahead and fix these parameters in the numerical estimation setup window. Call this model
‘s(state*time)p(.)psi(state)’, run it, and add the results to the browser. If we look at the estimates,
we’ll see that, by and large, the values are consistent with the underlying model structure.
OK – on to the ‘clever’ design matrix we alluded to before. The model we just fit is a naïve model,
as far as our underlying hypothesis is concerned – it is a model which simply allows the estimates for
survival to vary among states, and over time. In essence, a simple heterogeneity model. By itself, this
is not particularly interesting, although it is arguably a reasonable null model.
But, we’re interested in a particular a priori hypothesis: specifically, that survival increases with size.
We may also suspect that the strength (magnitude) of this directional selection favoring larger sized
individuals varies over time. So, what we want to fit is a model where, within a given interval, survival
is constrained to be a linear function of size (i.e., follow a trend), and that the slope of this trend may
vary over time.
So, here’s the tricky bit – in effect, we’re now going to treat each time interval as a group, and ask
if the slope of a relationship between survival and size varies among levels of this group (i.e., among
time intervals). So, we need to figure out how to do two things: (1) build a design matrix where each
time interval is a group, and (2) within a time interval, have survival constrained to follow a trend with
size among states (i.e., an ordinal constraint on survival with increasing size). How do we do this?
Well, with a bit of thought, you might see your way to the solution. First, start by writing out the
linear model. We know we need an intercept (β1 ). There are 6 occasions, so 5 time intervals, meaning
we need 4 columns in the design matrix to code for the TIME grouping (β2 → β5 ). Next, we want to
impose a TREND over states. Recall from Chapter 6 how we handled trends: a single column consisting of
an ordinal series. So, for TREND, one column (β6). Next, the interaction term of TIME and TREND: (4 × 1) = 4
columns for the interaction terms (β7 → β10). So, for the survival parameter,
S = β1
    + β2(T1) + β3(T2) + β4(T3) + β5(T4)
    + β6(TREND)
    + β7(T1.TREND) + β8(T2.TREND) + β9(T3.TREND) + β10(T4.TREND)
Now, encounter probability p is constant among states and over time, so one column (β11 ) for that
parameter. For the ψ parameters, one for each of the estimated transitions. Remember that if there are n
states, there are n(n − 1) estimated transitions; so for 3 size states, 3(3 − 1) = 6 transitions, meaning
6 columns (β12 → β17). So, in total, our design matrix should have 17 columns. So, we tell MARK we
want to build a ‘reduced’ design matrix, with 17 columns. MARK will then respond by giving us a
‘blank’ design matrix with 17 columns.
Starting the process of specifying our design matrix is easy enough: a column of 15 ‘1’s for the
intercept. Then, looking back at our linear model, we see that we next want to code for the 5 TIME
intervals: 4 columns (β2 → β5). We use the same coding scheme we’re familiar with – all we want to do
is make sure the dummy-variable structure unambiguously indicates the time interval:
So far, so good. Now, for the ‘hard part’. We now need to code for TREND. But, remember, here, we’re
not coding for TREND over TIME, but rather, TREND over states within TIME. You might remember that if
we have 3 levels we want to constrain some estimate to follow a trend over, then we can use the ordinal
sequence 1, 2, and 3 as the TREND covariate (check the relevant sections of Chapter 6 if you’re unsure
here). But, where do we put these TREND coding variables?
The key is remembering – TREND among states within TIME interval. So, here is how we code TREND for
this model:
Holy smokes! OK, after you’ve caught your breath (or had a beer or two), it’s actually not that bad.
Remember, TREND among states within TIME interval. So, for the first interval for the 3 states, corresponding
to rows 1, 6, and 11, respectively, we enter 1, 2 and 3. Similarly, for the second interval for the 3 states,
corresponding to rows 2, 7, and 12, respectively, we again enter 1, 2 and 3, and so on for each of the
remaining intervals.
After all this, the interaction terms (and the encounter and transition parameters) are straightforward
(the full design matrix is shown below):
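As a cross-check on the survival columns, here is a minimal Python sketch that generates the 15 survival rows (intercept, TIME, TREND, and TIME.TREND columns). The row order follows the description above (rows 1 to 5 for state S, 6 to 10 for M, 11 to 15 for L). The choice of the last interval as the reference level for the TIME dummies is our assumption, so compare it against your own coding before relying on it.

```python
N_INTERVALS = 5
STATES = ["S", "M", "L"]   # coded 1, 2, 3 in the TREND column

def survival_design_rows(interaction=True):
    """Build the survival rows of the DM: intercept, TIME, TREND, TIME.TREND."""
    rows = []
    for trend, _state in enumerate(STATES, start=1):
        for interval in range(1, N_INTERVALS + 1):
            # ASSUMPTION: dummy k is 1 for interval k; interval 5 is the reference
            time = [1 if interval == k else 0 for k in range(1, N_INTERVALS)]
            row = [1] + time + [trend]
            if interaction:
                row += [t * trend for t in time]   # TIME.TREND columns
            rows.append(row)
    return rows

full = survival_design_rows()
additive = survival_design_rows(interaction=False)
# survival columns + 1 column for p + 6 columns for the psi parameters
N_COLUMNS = len(full[0]) + 1 + 6

for r in full:
    print(" ".join(str(x) for x in r))
print("total DM columns:", N_COLUMNS)
```

Calling the function with interaction=False drops the four TIME.TREND columns, which corresponds to an additive TIME + TREND structure for survival.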
Call this model ‘s(time*trend)p(.)psi(state)’, where the time*trend part
indicates an interaction of the trend with the TIME intervals. Run the model, and add the results to the
browser. Then, build the additive model (by deleting the interaction columns from the design matrix)
– call this model ‘s(time+trend)p(.)psi(state)’, and again, add the results to the browser.
We see clearly that our model constraining survival to show a trend among states, with full interaction
among time intervals, is by far the best supported model. Of course, this isn’t surprising, since the data
were simulated under this model.
So – a fairly complex example of using a multi-state approach to handle covariates which vary through
time. And, yet another example of why it is important to have a significant level of comfort with design
matrices – unless you do, you won’t be able to build the ‘fancy models’ you’d like to.
11.7. Individual covariates as ‘group’ variables
Suppose you were interested in whether or not survival probability differed between male and female
dippers. Having come this far in the book, you’ll probably regard this as a trivial exercise – you specify
the PIMs corresponding to the two sexes, perhaps construct the corresponding design matrix, and
proceed to fit the models in your candidate model set. This is all fairly straightforward, and easy to
implement – in part because the problem is sufficiently ‘small’ (meaning, only two parameters, relatively
few occasions, only two groups) that the overall number, and complexity of the PIMs you construct (and
the corresponding design matrix) is small. But, as we’ve seen, especially for ‘large’ problems (many
parameters, many occasions, many PIMs), manipulating all the PIMs and the design matrix can become
cumbersome (even given the convenience of manipulating the PIMs using the PIM chart).
Is there an option? Well, as you might guess, given that this chapter concerns the use of individual
covariates, you can, for a number of categorical models, use an alternative approach based on individual
covariates. Such an approach can in some cases be easier and more efficient to implement. We’ll consider
a couple of examples here, starting with the dippers.
11.7.1. Individual covariates for a binary classification variable
Let’s consider fitting the following 3 candidate models to the data collected for male and female dippers:
{ϕ_g*t p_.}, {ϕ_g+t p_.}, {ϕ_g p_.},
where g is the ‘grouping’ variable – in this case, sex (male or female). Recall that the dipper data
(dipper.inp) consist of live encounter data collected over 7 encounter occasions. We specify 2 attribute
groups in the data type specification window in MARK (which we’ll label m and f, respectively), and
proceed to fit the three models in the candidate model set.
To specify the underlying parameter structure for our general model {ϕ_g*t p_.}, we’ll use fully
time-dependent PIMs for survival, and constant PIMs for the encounter probability. The PIM chart looks
like
and the corresponding design matrix is
We’ll skip the details on how to modify this design matrix to specify the remaining two models in
the model set (you should be pretty familiar with this by now).
The results of fitting the three models to the dipper data are shown below:
Now, let’s consider using an individual covariate approach to fitting the same three models to the
dipper data. Our first step involves reformatting the input file. We need to reformat the input file to
specify gender as an individual covariate. Much like with the design matrix, you need to consider how
many covariates you need to specify group (in this case, sex). Clearly, in this case, the grouping variable
is binary (has only two states), and thus we need only a single covariate to indicate group (sex). How
do we reformat our data, using a single covariate to indicate sex? We’ll use ‘1’ to indicate males, and
‘0’ to indicate females. Now, we reformat the dipper data as follows – consider the following table of
different encounter histories (selected from the original dipper.inp file in ‘standard’ format), which
we’ve transformed to use an individual covariate approach:
standard          reformatted
1111110 1 0;      1111110 1 1;
1111100 0 1;      1111100 1 0;
1111000 1 0;      1111000 1 1;
1111000 0 1;      1111000 1 0;
1101110 1 0;      1101110 1 1;
The key is to remember that under the original ‘standard’ formatting, there is one column in the
input file for each of the groups: two sexes, two columns following the encounter history itself. So, a ‘1
0’ indicates male (1 in the male column, 0 in the female column), and a ‘0 1’ indicates a female (0 in the
male column, 1 in the female column). When using an individual covariates approach, you have only
one column for the covariate.
But, notice there are 2 columns after the encounter history. Why? Don’t we need just 1 covariate
column? Yes, but remember that we also need a column of ‘1’s to indicate the frequency, i.e., the number of
individuals with a given encounter history (and since we’re working with individual covariates, each
encounter history corresponds to one individual, hence the frequency column has a ‘1’ in it for each
individual history). The first column after the encounter history is the frequency, and the second column
is the covariate column for group (sex). So, a male in the original file (indicated by ‘1 0’) becomes ‘1
1’ in the reformatted file, and a female in the original file (indicated by ‘0 1’) becomes ‘1 0’ in the
reformatted file. The reformatted data are contained in the file dipper_ind.inp (we’ll leave it to you to
figure out an efficient way to transform your data from one format to the other).
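If you’d like a starting point, here is one possible approach, sketched in Python. It assumes each record sits on one line in the form ‘history male-frequency female-frequency;’, that frequencies are non-negative integers, and that the file contains no comment lines. The function name is ours, not part of MARK.

```python
def to_individual_covariate(inp_lines):
    """Convert 'history male_freq female_freq;' records to 'history 1 sex;'."""
    out = []
    for line in inp_lines:
        record = line.strip().rstrip(";").strip()
        if not record:
            continue  # skip blank lines
        history, male_freq, female_freq = record.split()
        males, females = int(male_freq), int(female_freq)
        if males < 0 or females < 0:
            raise ValueError(f"negative frequency not handled: {line!r}")
        # one output record per individual: frequency 1, then sex (1 = male, 0 = female)
        out += [f"{history} 1 1;"] * males
        out += [f"{history} 1 0;"] * females
    return out

standard = ["1111110 1 0;", "1111100 0 1;", "1111000 1 0;",
            "1111000 0 1;", "1101110 1 0;"]
reformatted = to_individual_covariate(standard)
for rec in reformatted:
    print(rec)
```

A record with a frequency greater than 1 is expanded into that many single-individual records, since each individual needs its own covariate value.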
Now, when we specify the data type in MARK, we do not indicate 2 attribute groups, but instead
change the default number of individual covariates from 0 to 1. We’ll call this covariate s (for sex). If
we make the encounter probability constant, the corresponding PIM chart should look like the one
pictured at the top of the next page. Note that there are now only 6 parameters in the PIM chart for
survival, instead of the 12 parameters specified in the PIM chart of our general model using the standard
input format. Obviously, we’re going to need to make up the difference somehow. In fact, you may have
already guessed – by entering the individual covariates into the design matrix.
For our general model ϕ g∗t p · , here is the corresponding design matrix using individual covariates:
We see that it has 13 columns, corresponding to 13 estimable parameters – we know from our initial
analysis that this model does indeed have 13 estimable parameters. From this design matrix, we can
build the other two models in the candidate model set, ϕ g+t p · and ϕ g p · , simply by deleting the
appropriate columns from the design matrix (e.g., for the additive model ϕ g+t p · , we simply delete
the interaction columns 8 → 12).
Here are the model fits for the 3 models, built using the individual covariates approach:
Compare them with the results obtained using the standard approach where sex was treated as an
‘attribute group’:
We see that the model AICc values, and the number of parameters, are identical between the two.
However, the deviances are different. Does this indicate a problem? No – not if you think about it for
a moment. If the AICc values and the number of parameters are the same, then the likelihoods for the
models are also the same (since the AICc is simply a function of the sum of the likelihood and the
number of parameters – if two out of the three are the same between the different analyses, then so
must the third (likelihood) be the same). In fact, if you look closely at the deviances, you’ll see that
the difference between the deviances – which is related to the likelihood (as discussed elsewhere) – is
identical. For example, (666.6762 − 659.6491) = (84.1991 − 77.1720) = 7.0271.
So, the results are identical, regardless of the approach taken (attribute groups versus individual
covariates coding for groups). And, it is pretty clear that there are fewer PIMs, and a smaller design
matrix (easier to handle, potentially less prone to errors), when using individual covariates. As such,
is there any reason not to use the individual covariate approach to handling groups?
There are at least two possible reasons why you might not want to use the individual covariate
approach for coding groups. First, as discussed earlier in this chapter, execution time generally increases
for models involving individual covariates. For very large, complex data sets, this can be a significant
issue.
Second, and perhaps more important, while the individual covariate approach might simplify aspects
of building the models, in fact it complicates derivation (reconstitution) of group-specific parameter
estimates. For example, take estimates of ϕ from our simplest model, ϕ g p · . Using the standard
attribute group approach, the estimates MARK reports for male and female survival are ϕ̂ m = 0.5702637
and ϕ̂ f = 0.5507352, respectively.
What does MARK report for the estimates for this model fit using the individual covariates approach?
Clearly, the reported estimates using individual covariates appear to be quite different. But, are
they? What does the value ϕ̂ = 0.5601242 represent? What about the value 0.4795918 reported for
the sex covariate, s? How can we reconstitute separate estimates of apparent survival for both males
and females?
The key is remembering that this analysis is based on individual covariates. Recall that MARK defaults
to reporting the parameter estimates for the mean value of the covariate. In this case, the sex covariate
is 1 (indicating male) or 0 (indicating female). If the sex-ratio of the sample was exactly 50:50, then
the mean value of the covariate would be 0.5. In fact, in the dipper data set, 47.96% of the individuals
are male. Does that number look familiar? It should – it is the value of 0.4796 reported (above) as the
average value of the covariate. And, the estimate ϕ̂ is the reconstituted value of survival for an
average individual. Thus, the value of 0.5601242 is essentially identical (to within rounding error) to the
weighted average of ϕ̂ m = 0.5702637 and ϕ̂ f = 0.5507352, which we obtained from the analysis using
attribute groups ([0.4796 × 0.5702] + [0.5204 × 0.5507] = 0.5601) – here, the weights are the frequencies
of males and females in the sample (i.e., the sex ratio of the sample).
OK – fine, but that still doesn’t answer the practical question of how to reconstitute separate survival
estimates for males and females. The ‘brute-force’ approach is to use a ‘user-specified covariate
value’ when you set up the numerical estimation. You do this by checking the appropriate radio button:
Now, when you click the ‘OK to run’ button, MARK will ask you to specify the individual covariate
value for that model – in this case, either a 1 (for male) or 0 (for female). If we enter a ‘1’, run the model,
and then look at the reconstituted parameter estimates, MARK shows ϕ̂ = 0.5703, which is exactly
what we expected for males. Similarly, if instead we enter a ‘0’ for the covariate value, MARK shows
ϕ̂ = 0.5507 for females, again, precisely matching the estimate from the model fit using attribute groups.
OK, that is a functional solution, but not one that is particularly elegant (it can also be cumbersome
if you have multiple levels of group, or a lot of interactions between one or more grouping variables
and – say – time). It also is somewhat devoid of ‘thinking’, which is rarely a good strategy, since not
understanding what MARK is doing when you ‘click this button’ or ‘that button’ will catch up with you
sooner or later. The key to understanding what is going on is to remember from earlier in this chapter
how parameter estimates were reconstituted for a given value of one or more individual covariates.
Essentially, all you need to do is calculate the value of the parameter on the logit scale (assuming you’re
using the default logit link), and then back-transform to the real probability scale. For model ϕ g p · ,
the linear model is
$$\operatorname{logit}(\hat{\varphi}) = \beta_1 + \beta_2(s) = 0.2036416 + 0.0792854(s)$$
So, if the value of the covariate is 1 (for males), then
$$\operatorname{logit}(\hat{\varphi}_m) = \beta_1 + \beta_2(s) = 0.2036416 + 0.0792854(1) = 0.282927$$
which, when back-transformed from the logit scale to the normal probability scale, gives
$$\hat{\varphi}_m = \frac{e^{0.282927}}{1 + e^{0.282927}} = 0.570264$$
which is identical (within rounding error) to the estimate for male survival MARK reports using either
the attribute group approach, or by specifying the value of the covariate in the numerical estimation
using the individual covariate approach. The same is true for reconstituting the estimate for females.
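If you want to check these calculations yourself, the back-transformation is easy to script. The following is a minimal sketch in Python, using the β estimates and the mean covariate value reported above; the function and variable names are our own, and nothing here is generated by MARK.

```python
import math


def inv_logit(x):
    """Back-transform a value from the logit scale to the probability scale."""
    return 1.0 / (1.0 + math.exp(-x))


# Estimates reported in the text for model phi(g) p(.), sex coded as covariate s.
BETA_INTERCEPT = 0.2036416
BETA_SEX = 0.0792854
MEAN_SEX = 0.4795918  # proportion of males in the sample


def survival(s):
    """Reconstituted apparent survival for covariate value s (1 = male, 0 = female)."""
    return inv_logit(BETA_INTERCEPT + BETA_SEX * s)


phi_male = survival(1)
phi_female = survival(0)
phi_mean = survival(MEAN_SEX)
print(f"male: {phi_male:.7f}  female: {phi_female:.7f}  mean covariate: {phi_mean:.7f}")
```

The three values should agree (to within rounding error) with the estimates for males, females, and the ‘average individual’ discussed above.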
While this is easy enough, it can get tiresome, especially if the linear model you’re working with is
‘big and ugly’. Even for fairly simple models like ϕ g∗t p . , the linear model you need to work with can
be cumbersome:
$$\begin{aligned}
\operatorname{logit}(\varphi) = {} & \beta_1 + \beta_2(s) + \beta_3(t_1) + \beta_4(t_2) + \beta_5(t_3) + \beta_6(t_4) + \beta_7(t_5) \\
& + \beta_8(s \cdot t_1) + \beta_9(s \cdot t_2) + \beta_{10}(s \cdot t_3) + \beta_{11}(s \cdot t_4) + \beta_{12}(s \cdot t_5)
\end{aligned}$$
Each extra term in the equation adds to the possibility you’ll make a calculation error. The complexity
of the linear equation you need to work with will clearly be increased if you have > 2 levels of a grouping
factor. We consider just such a situation in our final example.
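One way to reduce the risk of calculation errors is to let a short script build the row of covariate values and do the arithmetic for you. The sketch below (Python) does this for the 12-term linear model for ϕ shown above. It makes two assumptions you should check against your own design matrix: the 5 time columns are dummy variables for intervals 1 to 5, and interval 6 is the reference level. The function names are our own.

```python
import math

N_TIME_DUMMIES = 5  # 6 intervals, the last one treated as the reference (assumption)


def design_row(s, interval):
    """Covariate values multiplying beta_1..beta_12 for sex s and a given interval (1..6)."""
    time = [1.0 if interval == k else 0.0 for k in range(1, N_TIME_DUMMIES + 1)]
    interactions = [s * t for t in time]
    return [1.0, float(s)] + time + interactions


def reconstitute(beta, s, interval):
    """Reconstituted survival: inverse logit of the sum of beta_i times covariate_i."""
    eta = sum(b * x for b, x in zip(beta, design_row(s, interval)))
    return 1.0 / (1.0 + math.exp(-eta))


# For a male (s = 1) in interval 2, the intercept, sex, t2 and s.t2 terms are 'switched on'.
print(design_row(1, 2))
```

You would pass the 12 β estimates for ϕ from the MARK output as `beta`; no estimates are shown here, since they are not listed in the text.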
11.7.2. Individual covariates for non-binary classification variables
Here, we consider the analysis of a simulated data set with 3 levels of some grouping variable (we’ll
call the grouping variable colony, and the three levels ‘poor’, ‘fair’, and ‘good’, reflecting the impact of
some colony attribute on – say – apparent survival). The true model under which the simulated data
(contained in cjs3grp.inp) were generated is model ϕ g+t p . – additive survival differences among the
3 colonies (in fact, additive, and ordinal, such that ϕ g > ϕ f > ϕ p , although this ordinal sequencing isn’t
of primary interest here). In the input file, the group columns (from left to right) indicate the poor, fair
and good colonies, respectively. For our model set, we’ll fit the same models (structurally) that we used
for the dipper data in the preceding example: ϕ g∗t p · , ϕ g+t p · , ϕ g p · . Here are the results for
the analysis of the data formatted using the attribute grouping approach:
As expected, model ϕ g+t p · has virtually all of the support in the data (it should, given that it was
the true model under which the data were simulated in the first place).
Now, let’s recast this analysis in terms of individual covariates. As noted in the preceding example,
we need to specify enough covariates to correctly specify group association. Your first thought might
be to use a single column, with (say) a covariate value of 1, 2 or 3 to indicate a particular colony. This
would work, but the model you’d be fitting would be one where you’d be constraining the estimates to
follow a strict ordinal trend (this is strictly analogous to how you built trend models in Chapter 6).
What if we simply want to test for heterogeneity among colonies? This, of course, is the null hypothesis
of the standard analysis of variance. Since there are 3 colonies, then (perhaps not surprisingly) we need
2 columns of covariates to uniquely code for the different colonies. In effect, we’re using exactly the same
logic in constructing the covariate columns as we would in constructing corresponding columns in the
design matrix. In fact, it is reasonable to describe what we’re doing here – with individual covariates –
as ‘moving’ the basic linear structure out of the design matrix, and coding it explicitly in the input
file itself.
We’ll call the covariates c1 and c2. For dummy coding of the colonies, we’ll let ‘1 0’ indicate the
first (poor) colony, ‘0 1’ indicate the second (fair) colony, and ‘1 1’ indicate the third (good) colony.
So, the encounter history ‘111011 1 0 0’ in the original file (indicating an individual from the poor
colony) would be recoded as ‘111011 1 1 0’. Again, the first column after the encounter history after
recoding is the frequency column, and is a ‘1’ for all individuals (regardless of which colony they’re in).
The following two columns indicate values of the covariates c1 and c2, respectively. The reformatted
encounter histories are contained in csj3ind.inp.
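The recoding just described can be sketched in a few lines of Python. As before, this is only a minimal sketch under stated assumptions: one record per line of the form ‘history poor fair good;’, no comment lines, and a function name of our own choosing. The colony codes are exactly those given above.

```python
# (c1, c2) codes for the poor, fair and good colonies, in input-file column order.
COLONY_CODES = [(1, 0), (0, 1), (1, 1)]


def recode_colonies(records):
    """Convert 'history poor fair good;' records to 'history 1 c1 c2;' records."""
    recoded = []
    for raw in records:
        record = raw.strip().rstrip(";").strip()
        if not record:
            continue  # skip blank lines
        history, *counts = record.split()
        for (c1, c2), count in zip(COLONY_CODES, counts):
            # One line per individual: frequency 1, then the two covariates.
            recoded.extend(f"{history} 1 {c1} {c2};" for _ in range(int(count)))
    return recoded


print(recode_colonies(["111011 1 0 0;"]))
```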
Now, when we specify the data type in MARK, we set the number of individual covariates to 2, and
label them as c1 and c2, respectively. The design matrix corresponding to the most general model in
the candidate model set, ϕ g∗t p . , is shown at the top of the next page.
Column 1 is the intercept, columns 2-3 are the covariates c1 and c2 (respectively), columns 4 → 7
are the time intervals (6 occasions, 5 intervals, so 4 columns), columns 8 → 11 and 12 → 15 are the
interactions of the covariates c1 and c2 with time, respectively. Column 16 is the constant encounter probability.
Go ahead and fit this model to the data – notice immediately how much longer it takes MARK to do
the numerical estimation (again, one of the penalties in using the individual covariate approach is the
increased computation time required).
Here are the results for our candidate model set:
If you compare these results with those shown on the preceding page (generated using group
attributes rather than individual covariates), you’ll see they are identical (again, the differences among the
model deviances are identical, even if the individual model deviances are not). Again, using individual
covariates in this case seems like a reasonable ‘time-savings’ strategy, since the number of PIMs, and
the complexity of the general design matrix, is considerably reduced relative to what you’d face if you
worked directly with attribute groups in the ‘standard’ way.
However, as noted in our discussion of the preceding dipper analysis, there are other potential ‘costs’
which might temper your enthusiasm for using the individual covariate approach to coding ‘attribute
groups’. First, you’ll need to handle reconstituting parameter estimates from what might potentially
be a pretty sizeable linear model (for our present example, it’s sufficiently sizeable – 15 terms – that we
won’t write it out in full here). Second, you (instead of MARK) would have to handle the accompanying
calculation of the SE of the reconstituted estimates (using the Delta method – Appendix B).
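To give a sense of what the second task involves, here is a minimal Python sketch of the Delta method approximation for a parameter reconstituted through the logit link: the variance of the linear predictor is x′Vx (where x is the row of covariate values and V is the variance-covariance matrix of the β estimates), and the SE on the probability scale is approximately ϕ̂(1 − ϕ̂) times the square root of that variance. The β values are those from the dipper example; the variance-covariance matrix shown is hypothetical (it is not given in the text), so the printed SE is for illustration only.

```python
import math


def reconstitute_with_se(beta, vcov, row):
    """Return (estimate, approximate SE) on the probability scale for a logit-link model."""
    eta = sum(b * x for b, x in zip(beta, row))
    phi = 1.0 / (1.0 + math.exp(-eta))
    n = len(row)
    # Variance of the linear predictor: row' * vcov * row.
    var_eta = sum(row[i] * vcov[i][j] * row[j] for i in range(n) for j in range(n))
    # Delta method: d(phi)/d(eta) = phi * (1 - phi).
    se_phi = phi * (1.0 - phi) * math.sqrt(var_eta)
    return phi, se_phi


beta = [0.2036416, 0.0792854]          # intercept and sex effect, from the text
vcov = [[0.0100, -0.0100],             # HYPOTHETICAL variance-covariance matrix,
        [-0.0100, 0.0210]]             # for illustration only
phi_m, se_m = reconstitute_with_se(beta, vcov, [1.0, 1.0])  # males: s = 1
print(f"male survival {phi_m:.4f}, approximate SE {se_m:.4f}")
```

In practice you would replace `vcov` with the variance-covariance matrix of the β estimates that MARK reports for your model.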
However, while this is possible (albeit somewhat time consuming), what is not possible is the
derivation of the SE for the effect size (see Chapter 6 – section 6.12) for the difference between levels of
a discrete ‘attribute variable’ when you’ve coded the ‘attribute variable’ using an individual covariate
(e.g., ‘sex’ – see section 11.7; the dipper example in subsection 11.7.1). Calculation of the SE for the
‘effect size’ (i.e., the difference between the estimates for different levels of the ‘attribute variable’)
requires an estimate of the variance-covariance matrix between estimates for the different attribute
levels, which is not estimable when using the individual covariate approach. Finally, generating model
averaged parameter estimates from models with individual covariates is decidedly more complicated
(as discussed in the next section) than for models without individual covariates.
So, while there is a clear ‘up-front savings’ in terms of simpler PIMs, and simpler design matrices,
when using the individual covariate approach to handling attribute groups, the ‘after-the-fact cost’ of
the number of things you’ll need to do by hand (or, more typically, program into some spreadsheet)
to generate parameter estimates is not insubstantial, and may be more than the hassle of dealing with
lots of PIMs and big, ugly design matrices. An alternative to using individual covariates to simplify
model-building is to use the RMark package (see Appendix C).
11.8. Model averaging and individual covariates
In chapter 4 we introduced the important topic of model averaging. If you don’t remember the details,
or the motivation, it might be a good idea to re-read the relevant sections. In a nutshell, the idea behind
model averaging is pretty simple: there is uncertainty in our model set as to which model is ‘closest
to truth’. We quantify this uncertainty by means of normalized AIC weights – the greater the model
weight, the more support in the data for a given model in that particular model set. Thus, it seems
reasonable that any average parameter value must take this uncertainty into account. We do this by (in
effect) weighting the estimates over all models by the corresponding model weights (strictly analogous
to a weighted average that you’re used to from elementary statistics).
For models with individual covariates, you might guess that the situation is a bit more complex. The
model averaging provides average parameter values over the models, but what you’re often (perhaps
generally) most interested in with individual covariates is the ‘average survival probability for an
organism with a value of individual covariate XYZ’. For example, suppose you’ve done an analysis
of the relationship of body mass to survival, using individual body mass as a covariate in your analysis.
Some of your models may have body mass (mass) included, some may have mass and mass² (as in the
first example in this chapter). What would you report as the ‘average survival probability for an individual
with body mass X’?
Mechanically, what you would need to do, if doing it by hand, is take the reconstituted values of ϕ for
each model, for a given value of the covariate, then average them using the AIC weights as weighting
factors (for models without the covariate, the β for the covariate is, in fact, 0). This is fairly easy to do,
but a bit cumbersome by hand. Moreover, you have the problem of calculating the standard errors.
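To make the ‘by hand’ calculation concrete, here is a minimal Python sketch. It averages the reconstituted estimates using normalized model weights, and computes an unconditional SE using the estimator commonly attributed to Burnham and Anderson, in which each model contributes its weight times the square root of (its conditional variance plus the squared deviation of its estimate from the average); we assume this is the estimator you want, so check it against Chapter 4. The function name is ours, and the numbers in the example call are purely illustrative, not results from any analysis in this chapter.

```python
import math


def model_average(estimates, std_errors, weights):
    """Model-averaged estimate and unconditional SE over a set of models."""
    total = sum(weights)
    w = [x / total for x in weights]  # normalize, in case the weights don't sum to 1
    avg = sum(wi * ei for wi, ei in zip(w, estimates))
    se = sum(
        wi * math.sqrt(sei ** 2 + (ei - avg) ** 2)
        for wi, ei, sei in zip(w, estimates, std_errors)
    )
    return avg, se


# ILLUSTRATIVE values only: three models, each with a reconstituted estimate of
# survival (for the same covariate value), its SE, and its AICc weight.
avg, se = model_average([0.65, 0.58, 0.56], [0.02, 0.03, 0.04], [0.05, 0.70, 0.25])
print(f"model-averaged estimate {avg:.4f}, unconditional SE {se:.4f}")
```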
Fortunately, MARK has a couple of options to let you handle this ‘drudgery’ automatically. Basically,
you can either (i) specify (‘define’) the value of the individual covariate, and model average for that
value or (ii) you can calculate (and plot) the value of the model-averaged parameter over a range of
covariate values, using the individual covariate plot capability.
Consider the following example – here we’ve simulated a new live encounter data set (indcov1_avg.inp,
8 occasions), where survival (ϕ) is a function of body mass, m (over the range 85-140 mass units). The
form of the relationship used in simulating the encounter data is shown in the following figure:
Here, we see that the relationship between survival and body mass is non-linear – there is a tendency
for survival to increase with mass, but at higher mass values the rate of change asymptotes. The data
were simulated assuming no annual variation in the relationship between survival and mass, and no
temporal variation in the encounter probability.
We will start by building a candidate model set consisting of 3 models: {ϕ. p . }, {ϕ m p . }, and {ϕ m+m 2 p . }.
What is important to note about this model set is that we have 2 models which we anticipate will get
some significant support in the data (models {ϕ m p . }, and {ϕ m+m 2 p . }). We also have a model, {ϕ. p . },
which is notable because it does not contain the covariate. As we will discuss, this is an important
consideration – how does ‘model averaging’ account for models without the individual covariate?
If we fit these 3 models to the data,
we see that there is relatively strong support for the model where survival is a linear function of mass,
{ϕ m p . }, but there is non-negligible support for the non-linear model, {ϕ m+m 2 p . }.
Now, we might for some purposes want to know what the model-averaged survival probability is for
a particular mass – say, some value near the extremes of the range (a very light or very heavy individual),
or perhaps the mean value. MARK makes it very easy to do this. Simply build the models, each time
specifying whether you want MARK to provide real parameter estimates from either the first encounter
record, a user-defined set of values, or the mean of the covariates.
For purposes of demonstration, we’ll use a user-defined covariate value (which allows us to generate
a model-averaged estimate of survival for a covariate value we specify). Now, if you know you want to
do this before you run your models, then fine. Select the model you want to re-run, and then in
the ‘Setup Numerical Estimation Run’ window, check the ‘user-specified covariate values’
option box in the lower right-hand corner:
If you’ve checked the ‘user-specified covariate values’ radio button, once you click the ‘OK to
run’ button you’ll then be presented with another small window asking you to enter the values of the
covariate(s) you want to generate real parameter estimates for.
But quite often, you may run your models using the default covariate value (the mean), and then
‘after the fact’, decide you want to re-run the model, this time using a user-defined covariate value. In
fact, MARK makes this quite easy to do. Simply select ‘Run | Re-run models(s)’ from the main menu.
This will bring up the dialog window shown at the top of the next page.
All of the models currently in the browser are shown in the main part of this window. You select the
models you want to re-run (typically, ‘Select all’). Then, to specify individual covariate values to use
for re-running the models, simply check that box, as shown on the preceding page. When you click ‘OK’,
another window will pop up, asking you to enter the value of the covariate you want to use – say, 85
for mass (m):
Now, all that remains is to run the model averaging routine. For this example, using m=85 as the value
of the covariate, model averaged survival value is
One conceptual issue to consider – body mass (m) was contained in 2 of 3 models in our candidate
model set. What about the third model, {ϕ. p . }, which does not contain body mass? Well, clearly, if a
particular covariate does not show up in a model, then the β estimate for that covariate
is 0, for that model. But, our interest is (typically) in model averaging real parameter estimates, not β
estimates.
So how does MARK average real estimates over models including those that do not include the
covariate? You can get a partial clue by looking back at the table of estimates used in the model averaging
(above). Note that the reported estimate for survival for model {ϕ. p . } is 0.6524177.
Where does this value come from? Simple – it is the estimate of survival you would get if you ignore
the mass covariate (which is implicit in the model, which does not include mass), which in effect is
equivalent to assuming that all individuals in the sample have the same mass – i.e., the average mass
for the sample. You can confirm this for yourself by re-running all the models, and changing the
user-specified value for mass. If you do this, you will see that the reported estimates of survival for models
{ϕ m p . } and {ϕ m+m 2 p . } will change, since they both include mass as a term in the model. However, the
reported value for model {ϕ. p . } will not change.
While calculating model averaged survival for specific, user-defined values of the covariate (as above)
is straightforward, we’re often most interested in evaluating (and visualizing) the model averaged
parameter (in this example, survival) over a range of the covariate (mass). This is quite easy to do
in MARK. Simply select ‘Output | Model Averaging | Individual Covariate Plot’:
A dialog window nearly identical to the single model plot we considered earlier (section 11.5) is then
opened (top of the next page), and you select the real parameter you want to plot.
However, the design matrix entry now shows the names of the individual covariates available to
be plotted, because not all models in the results browser would normally use the same functional
relationship between the real parameter and the individual covariate that is to be plotted. For example,
some models with AICc weight in the results browser might not have any relationship between the
covariate and the real parameter to be plotted, meaning a flat line results for this model. As with the
single model plot, you select from the second list box the individual covariate to be plotted, and the
range over which to plot the function. If there were other covariates included in one or more of the
models in the model set, all of these other individual covariates are listed with the values used when
they are included in the model for the real parameter being plotted.
For our present example, the plotted model averaged values (below) don’t indicate much evidence
for any non-linearity in the relationship between survival and mass (in other words, this figure doesn’t
look very similar to the true generating function used to generate the data used in this analysis – p. 45).
However, this plot of model averaged values is entirely consistent with the previous observation
that the non-linear quadratic model in the candidate model set, {ϕ m+m 2 p . }, did not receive appreciable
support in the data. In fact, the linear model, {ϕ m p . }, had 2.6 times the support in the data as the
quadratic model – and this much stronger support for the linear model is reflected in the model averaged
estimates.
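The logic behind such a model-averaged plot can be sketched as follows (Python). For each value of mass, we reconstitute survival under each model and then average using the model weights; the model without mass simply contributes a flat line. Every β value and weight below is hypothetical, chosen only so that the code runs, and none of them is an estimate from the analysis above. We also assume the covariate is used unstandardized.

```python
import math


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


# HYPOTHETICAL betas (intercept, mass, mass^2) and AICc weights, for illustration only.
MODELS = {
    "phi(.)":     ((0.63,), 0.05),
    "phi(m)":     ((-2.0, 0.025), 0.70),
    "phi(m+m^2)": ((-6.0, 0.10, -0.0003), 0.25),
}


def model_averaged_survival(mass, models=MODELS):
    """Weighted average over models of survival reconstituted at a given mass."""
    total = 0.0
    for betas, weight in models.values():
        # Polynomial in mass on the logit scale; a missing term is a beta of 0.
        eta = sum(b * mass ** k for k, b in enumerate(betas))
        total += weight * inv_logit(eta)
    return total


for mass in range(85, 141, 11):
    print(mass, round(model_averaged_survival(mass), 4))
```

Because the averaging is done on the real (probability) scale, a model with most of the weight dominates the shape of the averaged curve, which is the pattern described above for the linear model.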
11.8.1. Careful! – traps to watch when model averaging
In the process of building some of your candidate models, you may have changed the definition of some
of the PIMs with the ‘Change PIM Definition’ menu choice. For example, consider a multi-state model
(Chapter 10) – if the first transition probability from strata A is defined in some models as ψ A→A , and
in other models as ψ A→B , and these real parameters are model averaged, the results may be incorrect.
Thus, be sure to check the model averaging results to verify that correct parameters were selected.
Another potential ‘gotcha’ might arise if you want to use the ‘individual covariate plot’ for
model averaging, and if you’ve used different PIM structures for some of your models in your
candidate model set (rather than using the same PIM structure for all your models, using the design
matrix to construct reduced parameter models based on that PIM structure). For example, consider the
example presented at the start of this section, based on the simulated data in indcov_avg1.inp. Recall
that for these data, we fit the following 3 candidate models: {ϕ. p . }, {ϕ m p . }, and {ϕ m+m 2 p . }.
However, what we didn’t discuss when we initially analyzed these data is what the underlying PIM
structure was. We noted that we assumed no temporal variation in ϕ or p. As such we could have used
either of the following PIMs and corresponding DM for (say) model {ϕ m+m 2 p . }:
which is entirely equivalent (in terms of fit to the data, and parameter estimation) to
For purposes of making the point, we’ll refer to the first approach as being based on ‘t-PIM’ (say, for
‘time-based PIM’), and the second approach being based on ‘simple PIM’ (no time-dependence in the
PIM). We’ll use the time-based PIMs for models {ϕ. p . } and {ϕ m p . }, and the ‘simple’ PIM for model
{ϕ m+m 2 p . }.
As you can see from the browser (below), the results of fitting these models to the data are identical
to what we saw before, even though we have used a different underlying PIM structure for one of the
models:
Make model {ϕ m p . } active, by selecting it in the browser, and retrieving it. Recall that this model
was built with the time-based PIM.
Now, select ‘Output | Model averaging | Individual covariate plot’. You’ll be presented with
the individual plot window shown at the top of the next page.
You’ll see that you have 7 parameters for ϕ (labeled 1:Phi → 7:Phi). Now, we ‘know’ that here, we
could select any of the 7 x:Phi, because our DM is set up to constrain them to be equivalent.
However, if instead we made model {ϕ m+m 2 p . } active, then we see the following when we select
‘Output | Model averaging | Individual covariate plot’:
Now, we see only 1 parameter for ϕ, not 7, as above. Why? Because we constructed model
{ϕ m+m 2 p . } using a ‘simple’ PIM structure for the underlying model.
Now, in this particular case, you’ll end up with the same model averaged estimates regardless of
which model was ‘active’, but that may not always be the case (especially for complicated models where
the functional relationship between the covariate(s) and the parameter varies over time). So, the general
recommendation is to use a common PIM structure over all your models, and if you do want/need to
use a different PIM structure for some models in your model set, be careful when model averaging.
A final trap concerns individual covariates in particular. The user can specify the values of individual
covariates to be used to compute the real and derived parameter values. If different values of the
individual covariate are specified for different models to be model averaged, the results will be nonsense.
Thus, be sure to use the same individual covariate values in all models to be model averaged, e.g.,
the mean value. The real and derived estimates can be changed to use a different individual covariate
value with the ‘ReGenerate Real Derived Model(s)’ option in the results browser ‘Run’ menu.
11.8.2. Model averaging and environmental covariates
In chapter 6 (section 6.16), we considered model averaging across models where survival or some other
parameter was constrained to be a function of one or more ‘environmental covariates’. Our interest is
in coming up with a way to estimate the relationship between the parameter and the covariate (similar
to what was presented in the -sidebar- starting on p. 28 of this chapter), but averaged over multiple
models.
As in Chapter 6, let’s consider, again, the full Dipper data set, where we hypothesize that the encounter
probability, p, might differ as a function of (i) the sex of the individual, (ii) the number of hours of
observation by investigators in the field, with (iii) the relationship between encounter probability and
hours of observation potentially differing between males and females.
Recall that our ‘fake’ observation hour covariates were:
Occasion   2      3      4      5      6      7
hours      12.1   6.03   9.1    14.7   18.02  12.12
Now, when we introduced this example earlier in this chapter, we fit only a single model to the data:
$$\operatorname{logit}(p) = \beta_1 + \beta_2(\text{SEX}) + \beta_3(\text{HOURS}) + \beta_4(\text{SEX.HOURS})$$
But, here, we acknowledge uncertainty in our candidate models, and will fit the following candidate
model set to our data:
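model M1
$$\operatorname{logit}(p) = \beta_1 + \beta_2(\text{SEX}) + \beta_3(\text{HOURS}) + \beta_4(\text{SEX.HOURS}),$$
model M2
$$\operatorname{logit}(p) = \beta_1 + \beta_2(\text{SEX}) + \beta_3(\text{HOURS}),$$
model M3
$$\operatorname{logit}(p) = \beta_1 + \beta_2(\text{HOURS}),$$
model M4
$$\operatorname{logit}(p) = \beta_1 + \beta_2(\text{SEX}).$$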
There are a couple of things to note. First, this is not intended to be an ‘exhaustive, well-thought-out’
candidate model set for these data. We’re using these models to introduce some of the considerations for
model averaging. In particular, we’re using this example where encounter probability is hypothesized
to be a function of a continuous environmental covariate, to force us to consider how – and what – we
model average when some models include the environmental covariate (HOURS), and some don’t.
Let’s fit these 4 candidate models (M1 → M4 ) to the full Dipper data set, treating sex as a categorical,
group attribute variable. We’ll build all of the models using a design matrix approach, using the
encounter data in ED.INP. Note that models M2 → M4 in the model set are all nested within the first
model, M1 . For all 4 models, we’ll assume that apparent survival, ϕ, varies over time, but not between
males and females.
The results of fitting our 4 candidate models to the full Dipper data are shown below:
We see from the AICc weights that there is considerable model selection uncertainty. In fact, the
∆AICc values for all models are < 4.
Now, we want to fit the same candidate model set, but coding both SEX and HOURS as individual
covariates. Recall from p. 28 that we code each occasion’s covariate value as an individual covariate.
This requires reformatting the .inp data. Here are the top few lines of the reformatted .inp file (which
we’ll call ED_cov.inp):
The first 7 columns comprise the encounter history for the individual. Column 9 is the frequency
(1) for that individual. Column 11 is the coding for SEX, as an individual covariate (SEX=1, male, SEX=0,
female), and columns 13 → 42 list the environmental covariates (HOURS), coded as occasion-specific
individual covariates.
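If you’d rather not reformat the file by hand, the conversion is easy to script. The following Python sketch is one way you might do it, under several assumptions that you should check against your own files: that each record in ED.INP is an encounter history followed by a male frequency and a female frequency and ends with a semicolon, that there are no comment lines, and that the HOURS values are as listed in HOURS below (only h1 = 12.1 is confirmed in this section; the remaining values are placeholders). The function names are hypothetical.

```python
# HOURS for occasions 2 to 7. Only the first value (12.1) appears in this
# section; treat the others as placeholders and substitute your own values.
HOURS = [12.1, 6.03, 9.1, 14.7, 18.02, 12.12]

def expand_record(line, hours):
    """Turn one 'history n_male n_female;' record into one row per individual."""
    fields = line.strip().rstrip(";").split()
    history = fields[0]
    n_male, n_female = int(fields[1]), int(fields[2])
    hours_text = " ".join(f"{h:g}" for h in hours)
    rows = []
    # SEX is coded 1 for males and 0 for females; frequency is 1 for every row.
    for sex, count in ((1, n_male), (0, n_female)):
        rows.extend([f"{history} 1 {sex} {hours_text};"] * count)
    return rows

def reformat_lines(lines, hours):
    """Apply expand_record to every non-blank line."""
    out = []
    for line in lines:
        if line.strip():
            out.extend(expand_record(line, hours))
    return out

example = reformat_lines(["1111110 1 0;", "1000000 2 1;"], HOURS)
for row in example:
    print(row)
```

With these inputs, each output row has the history in columns 1 to 7, the frequency in column 9, SEX in column 11, and the six HOURS values starting in column 13, which matches the layout described above.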
Now that we’ve reformatted our .inp file, let’s fit the same 4 candidate models. We’ll refer to the sex
covariate as sex, and the environmental covariates as h1,h2,h3,h4,h5, and h6, corresponding to HOURS for
each encounter occasion:
Compare these results with those shown in the browser at the top of this page. Note that the
reported deviances are quite different – because the underlying likelihood structures differ, depending
on whether you use individual covariates, or not. However, even though the deviances differ, the relative
AIC differences, and so on, are identical. And, if we looked at the reconstituted parameter estimates,
we’d see they were also identical.
OK, so we’ve just confirmed that our 4 candidate models built using the individual covariates
approach are ‘correct’, in that they match the models we built earlier, based on treating sex as a group
attribute variable, and entering the covariate values into the DM.
Now what? Well, now we can use the model averaging (and plotting) capabilities for individual
covariates in MARK, to generate model averaged estimates of the relationship between the parameter
(in this case, encounter probability, p), and the environmental covariate, HOURS.
In Chapter 6, we focussed on averaging over models for SEX=1 (males). Let’s try the same thing here.
Simply select ‘Output | Model Averaging | Individual Covariate Plot’. This will bring up the
model averaging window we’ve seen earlier in this chapter:
Now, have a look at what happens if we click the first encounter probability (7:p) and the first HOURS
covariate (h1):
On the right-hand side, we see the range of the individual covariate we want to plot (h1, corresponding
to encounter probability for sampling occasion 2, although it is not labeled as such). We’ll change this
range in a moment.
Below this are the other values of the covariates which will be ‘fixed’ during the averaging and
plotting. Note that the SEX covariate is reported as 0.4795911. Where does this number come from?
Remember, we coded males using SEX=1, and females using SEX=0. If we had an equal number of males
and females in our sample, then the average coding for SEX would be 0.5. However, in our sample,
we have slightly more females than males, and the average for SEX is 0.4795911 (which, in fact, is the
proportion of males in our sample).
Below the SEX covariate value are the values of the environmental covariate HOURS for each of the
remaining encounter occasions (h2 for occasion 3, h3 for occasion 4, and so on...).
To generate the plot we’re after, we’ll need to modify a few things (shown on the top of the next
page). First, since we are focussing on males (SEX=1), we’ll change the value of the SEX covariate to
1. In addition, we’ll change the range of the individual covariate h1 we want to average over, and plot,
from 12.1 → 12.1 to (say) 5 → 20. Remember, it doesn’t matter which encounter probability you plot
(p1, p2, ...), since all occasions share the same β coefficients in these models, so long as you select
the correct environmental covariate for that occasion (i.e., 7:p with h1, 8:p with h2, and so on...).
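To see what the plot is built from, here is a rough Python sketch of the calculation: fix SEX at 1, step HOURS across the chosen range (5 → 20), compute the encounter probability under each model, and take the weighted average of those probabilities. Everything numeric here is hypothetical (the weights and β values are invented, not MARK output), and the sketch produces point estimates only, with no confidence intervals.

```python
import math

def inv_logit(x):
    """Back-transform a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# model -> (hypothetical AICc weight, function returning logit(p)).
# Weights and coefficients are invented for illustration only.
MODELS = {
    "M1": (0.15, lambda sex, h: 0.5 + 0.2 * sex + 0.10 * h - 0.02 * sex * h),
    "M2": (0.30, lambda sex, h: 0.5 + 0.2 * sex + 0.10 * h),
    "M3": (0.35, lambda sex, h: 0.6 + 0.10 * h),
    "M4": (0.20, lambda sex, h: 1.8 + 0.2 * sex),  # no HOURS term
}

def model_averaged_p(sex, hours):
    """Weighted average of the back-transformed estimates from each model."""
    return sum(w * inv_logit(f(sex, hours)) for w, f in MODELS.values())

def averaged_curve(sex, low, high, n_points):
    """Model averaged p at n_points evenly spaced HOURS values in [low, high]."""
    step = (high - low) / (n_points - 1)
    xs = [low + i * step for i in range(n_points)]
    return [(x, model_averaged_p(sex, x)) for x in xs]

curve = averaged_curve(sex=1, low=5.0, high=20.0, n_points=16)
for hours, p_bar in curve:
    print(f"HOURS = {hours:5.1f}  model averaged p = {p_bar:.4f}")
```

Notice that M4 contributes a flat line to the average, since it does not include HOURS. Its weight simply pulls the averaged curve toward a constant.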
For convenience, we’ll also check the box to output everything to Excel.
Back in Chapter 6 (section 6.16), we hand-calculated model averaged estimates for male encounter
probability as a function of HOURS of observation, and their associated confidence intervals, which when
plotted, looked like the following:
How do the results from our ‘averaging over individual covariates’ compare? In fact, they are
essentially identical.∗ Here is the plot generated by MARK, which is a near-perfect match to the
hand-generated plot shown at the bottom of the previous page:
If you look back at section 6.16 in Chapter 6, you’ll see that doing the calculation(s) ‘by hand’ was
a lot of work. Using the individual covariate model averaging capabilities in MARK, demonstrated in
this section, is much faster, and likely far less error-prone. The only real ‘trade-off’ is that to use the
approach based on individual covariates, you need to re-format your .inp file such that everything in
your analysis is coded using individual covariates (all attribute grouping variables, all environmental
covariates, everything...). Depending on the scope of your data set, and the models you’re fitting to
those data, this can also require a fair bit of work.
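The reason the two sets of results are ‘essentially’ rather than exactly identical is the order of averaging and back-transforming noted in the footnote to this section. The small Python sketch below compares the two calculations for a single covariate value. The weights and per-model estimates are hypothetical, chosen only to show the size of the effect.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical AICc weights and per-model estimates of p (illustration only).
weights = [0.15, 0.30, 0.35, 0.20]
p_hats = [0.88, 0.90, 0.91, 0.86]

# (a) average the estimates on the logit scale, then back-transform
back_transform_of_average = inv_logit(
    sum(w * logit(p) for w, p in zip(weights, p_hats))
)

# (b) back-transform each estimate, then average on the probability scale
average_of_back_transforms = sum(w * p for w, p in zip(weights, p_hats))

difference = back_transform_of_average - average_of_back_transforms
print(f"(a) {back_transform_of_average:.5f}")
print(f"(b) {average_of_back_transforms:.5f}")
print(f"difference {difference:.5f}")
```

Because the inverse logit is a nonlinear function, (a) and (b) are not equal (Jensen’s inequality), but when the per-model estimates are close to each other, as here, the difference is small.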
11.9. GOF Testing and individual covariates
Well, now that we’ve seen how easy it is to handle individual covariates, now for the good news/bad
news part of the chapter. The good news is that individual covariates offer significant potential for
explaining some of the differences among individuals, which, as we know (see Chapter 5), is one
potential source of lack of fit of the model to the data.
OK – now the bad news. At the moment, we don’t have a good method for testing fit of models with
individual covariates. If you try to run one of the GOF tests based on simulation or resampling – say,
the median-ĉ – you’ll be presented with a pop-up warning that ‘the median c-hat only works for models
without individual covariates’. The Fletcher-ĉ isn’t even printed in the full output. And so on.
For the moment, the recommended approach is to perform GOF testing on the most general model
that does not include the individual covariates, and use the ĉ value for this general model on all of the
other models, even those including individual covariates. If individual covariates will serve to reduce
(or at least explain) some of the variation, then this would imply that the ĉ from the general model
without the covariates is likely to be too high, and thus, the analysis using this ĉ will be ‘somewhat
conservative’. So, keep this in mind...

∗ As discussed in Chapter 6, the back-transform of the model averaged value of logit(p̂) is not the same as the model averaged
value of the back-transforms of the individual estimates of p̂ from each model. This difference reflects Jensen’s inequality. In
Chapter 6, the reported and plotted model averaged estimates for the encounter probability, and associated 95% CI, were based
on the model averaged value of logit(p̂), while the values MARK uses for the individual covariate model averaging are based
on the model averaged value of the back-transforms of the individual estimates of p̂ from each model. The difference between
the two is generally very small.
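To see why a ĉ that is too high makes the analysis conservative, it can help to look at how ĉ enters model selection. The Python sketch below uses the commonly cited quasi-likelihood form, QAICc = −2ln(L)/ĉ + 2K + 2K(K+1)/(n − K − 1), where K is the number of parameters and n is the effective sample size. The −2ln(L) values, K, and n below are hypothetical, and this is an illustration of the general idea rather than a description of MARK’s internal calculations.

```python
def qaicc(neg2lnl, k, n, c_hat=1.0):
    """Quasi-likelihood AICc; with c_hat = 1 this reduces to the ordinary AICc."""
    return neg2lnl / c_hat + 2 * k + (2 * k * (k + 1)) / (n - k - 1)

# Hypothetical values for two models with the same number of parameters.
N_EFFECTIVE = 294
K = 10
NEG2LNL_WITH_COVARIATE = 660.0
NEG2LNL_WITHOUT_COVARIATE = 664.0

def gap(c_hat):
    """Difference in QAICc between the two models for a given c-hat."""
    return (qaicc(NEG2LNL_WITHOUT_COVARIATE, K, N_EFFECTIVE, c_hat)
            - qaicc(NEG2LNL_WITH_COVARIATE, K, N_EFFECTIVE, c_hat))

for c_hat in (1.0, 1.5, 2.0):
    print(f"c-hat = {c_hat:.1f}: QAICc difference = {gap(c_hat):.2f}")
```

As ĉ increases, the difference in fit between the two models counts for less, so support for the covariate model is discounted. If the ĉ you carry over from the general model is larger than it ‘should’ be, you err on the side of caution.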
begin sidebar
individual covariates and deviance plots
One approach to assessing the fit of a model to a particular set of data is to consider the deviance residual
plots. While this can prove useful – in particular, to assess lack of fit because the structure of the model
is not appropriate given the data (e.g., TSM models – see Chapter 7), if you try this approach for models
with individual covariates, you’ll quickly run into a problem.
For example, consider the deviance residual plot for the first example analysis presented in this
chapter (for model {ϕ. p.}).
Clearly, something ‘strange’ is going on – we see fairly discrete ‘clusters’ of residuals, virtually all
below the 0.000 line. Obviously, this is quite different than any other residual plot we’ve seen so far.
Why the difference? In simple terms, the reason that the residual plots change so much when an
individual covariate is added is because the number of animals in each observation changes. Without
individual covariates, the data are summarized for each unique capture history, so that variation within
a history due to the individual covariate is lost. However, when the covariate is added into the model,
each animal (i.e., each encounter history, even if it is the same as another history) is plotted as a separate
point. The result is quite different, obviously. Without individual covariates, the binomial functions
are the sample size, so animals are ‘pooled’. With individual covariates, the number of animals is the
sample size, each resulting in a unique residual.
In other words, the deviance residual plots for models with individual covariates are not generally
interpretable.
end sidebar
11.10. Summary
That’s it for Chapter 11! In this chapter, we looked at the basic mechanics of using MARK to fit models
where one or more parameters are constrained to be functions of individual covariates. Individual
covariates can be used with any of the models in MARK (not just recapture models). This is a significant
increase in the flexibility of analyses you can execute with MARK.
11.11. References
Burnham, K. P., and Anderson, D. R. (2004) Multimodel inference – understanding AIC and BIC in
model selection. Sociological Methods & Research, 33, 261-304.
Link, W. A., and Barker, R. (2006) Model weights and the foundations of multimodel inference. Ecology,
87, 2626-2635.