A processed data frame can only be analyzed using the model that was specified in the call to
process.data. The model value is used by the functions make.design.data and make.mark.model to
define the design data and the appropriate input file structure for MARK. Thus, if the data are going
to be analyzed with different underlying models, create a separate processed data object for each,
possibly using the type of model as an extension to the object name. For example,
dipper.cjs=process.data(dipper,model="CJS")
dipper.popan=process.data(dipper,model="POPAN")
The process.data function reports any inconsistencies in the lengths of the capture history
values and any invalid entries in the capture histories. For example, with the "CJS" model,
the capture history should contain only 0 and 1, whereas for "Barker" it can contain 0, 1 and 2. For
"Multistrata" models, the code automatically identifies the number of strata (nstrata) and the
strata labels (strata.labels) from the unique alphabetic codes used in the capture histories.
For "Robust" design models, the number of secondary occasions (nocc.secondary) is determined by
the specified time.intervals.
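The kind of checking described above can be pictured with a small base R sketch. The helper check_histories is hypothetical and is not RMark's implementation; it assumes the first history has the intended length and that the allowed codes are passed in by the caller.

```r
# Hypothetical helper (not part of RMark): flag capture histories whose
# length differs from the first one or that contain disallowed codes.
check_histories <- function(ch, allowed = c("0", "1")) {
  lens <- nchar(ch)
  bad_length <- which(lens != lens[1])
  chars <- strsplit(ch, "")
  ok <- vapply(chars, function(x) all(x %in% allowed), logical(1))
  list(bad_length = bad_length, bad_entry = which(!ok))
}

ch <- c("1010110", "110010", "1020000", "0000011")

# CJS-style check: only 0 and 1 are allowed
cjs <- check_histories(ch)
cjs$bad_length   # position of the history with the wrong length
cjs$bad_entry    # position of the history containing a 2

# Barker-style check: 0, 1 and 2 are allowed
barker <- check_histories(ch, allowed = c("0", "1", "2"))
barker$bad_entry
```

In practice process.data performs its own checks, so a helper like this is only useful for screening raw data beforehand.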
The argument begin.time specifies the time for the first capture/release occasion. This is used
in creating the levels of the time factor variable in the design data and for labeling parameters. If
begin.time varies by group, enter a vector of times with one for each group.
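To make the role of begin.time concrete, the following base R sketch shows how occasion times follow from a starting time and the interval lengths. The helper occasion_times, the start years and the group names are illustrative assumptions, not RMark code; the exact labels RMark assigns to each parameter are documented in ?make.design.data.

```r
# Hypothetical helper: time of each capture occasion given the time of the
# first occasion and the lengths of the intervals between occasions.
occasion_times <- function(begin.time, time.intervals) {
  begin.time + c(0, cumsum(time.intervals))
}

# Seven occasions separated by six unit intervals, starting in 1980
times <- occasion_times(1980, rep(1, 6))
times

# A different begin.time for each group, given as one value per group
group_begin <- c(Female = 1980, Male = 1981)
by_group <- lapply(group_begin, occasion_times, time.intervals = rep(1, 6))
by_group
```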
The argument groups can contain one or more character strings specifying the names of factor
variables contained in data. A group is created for each unique combination of the levels of the factor
variables. Further examples of grouping and use of age variables are given later, and they can also be
found in the R help documentation (?process.data and ?example.data).
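The rule that one group is formed per unique combination of factor levels can be sketched in base R. The data frame birds and its columns sex and colony are invented for illustration; with RMark itself the corresponding request would be made through the groups argument, e.g. groups=c("sex","colony") for a data frame holding those factors.

```r
# Invented example data: capture histories plus two factor variables
birds <- data.frame(
  ch = c("1010", "1100", "0110", "1001", "1111"),
  sex = factor(c("F", "M", "F", "M", "F")),
  colony = factor(c("A", "A", "B", "B", "A")),
  stringsAsFactors = FALSE
)

# One group per combination of factor levels that occurs in the data
group_id <- interaction(birds$sex, birds$colony, drop = TRUE)
n_groups <- nlevels(group_id)
n_groups
table(group_id)
```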
C.4.2. Function make.design.data
The next step is to create the design data and PIM structure. These depend on the selected type of
analysis model (e.g., CJS or Multistrata), the number of occasions, the grouping variables and other
attributes of the data that were defined in the processed data, which is the first and primary argument
to the function make.design.data. For parameters with triangular PIMS the
default design data are cohort, age and time, plus any grouping factor variables that were defined.
For parameters with square PIMS, there is only one row, so the cohort variable is not automatically
included in the design data, but a cohort structure can still be created in this case by using groups.
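As a rough illustration of why triangular PIMS carry cohort, age and time, the base R sketch below enumerates the cells of an upper-triangular PIM and attaches those three variables to each cell. The helper triangular_design is hypothetical; it assumes unit time intervals and an initial age of 0, and it does not reproduce the exact columns that make.design.data returns.

```r
# Hypothetical helper: one row per cell of an upper-triangular PIM with
# n_intervals rows (cohorts) and n_intervals columns (occasions).
triangular_design <- function(n_intervals, begin.time = 1) {
  cells <- expand.grid(occasion = seq_len(n_intervals),
                       cohort = seq_len(n_intervals))
  # Keep only cells on or above the diagonal: a cohort cannot be
  # observed before it is released.
  cells <- cells[cells$occasion >= cells$cohort, ]
  data.frame(
    cohort = factor(begin.time + cells$cohort - 1),
    age = factor(cells$occasion - cells$cohort),
    time = factor(begin.time + cells$occasion - 1)
  )
}

dd <- triangular_design(3)
dd
nrow(dd)          # number of cells in the triangle
nlevels(dd$age)   # one factor level per distinct age
```

A square PIM with a single row would yield only one cohort value in such a table, which is why cohort adds nothing there unless groups are used to represent it.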
In creating the factor variables for cohort, age, and time, a separate factor level is created for each
value of the variable. However, you can optionally bin the values into intervals in creating the factor
variable. For example, if birds were always classified as either young (< 1) or as adult (1+), then
age.bins could be specified in the call to make.design.data. However, if you wanted the option to
model age based on all levels of the factor and other models with some ages collapsed into intervals
then it is best to allow make.design.data to create the default factor variables and create additional
design data with the function add.design.data or using R statements and functions. There are many
other features of make.design.data, including restricting parameters to use "time" or "constant" PIMS,
setting the subtraction stratum for "Multistrata" models, and automatic removal of unused design
data. These features are described in the help files (?make.design.data and ?add.design.data) and
in more detail in later sections.
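The two binning routes described above might look as follows. This is a minimal sketch that assumes the RMark package is installed and uses the dipper data shipped with it. The dipper data carry no real age information, so age here is simply time since first release; the cut point 0.5 and the column name ageclass are illustrative choices, and the bins are assumed to be closed on the right as described in the help files.

```r
library(RMark)
data(dipper)

# Processed data for a CJS analysis (default unit time intervals)
dipper.cjs <- process.data(dipper, model = "CJS")

# Default design data: one factor level per distinct cohort, age and time
ddl <- make.design.data(dipper.cjs)
levels(ddl$Phi$age)

# Route 1: bin age for Phi at the time the design data are created.
# The cut point 0.5 separates age 0 from ages 1 and above.
ddl.binned <- make.design.data(
  dipper.cjs,
  parameters = list(Phi = list(age.bins = c(0, 0.5, 6)))
)
levels(ddl.binned$Phi$age)

# Route 2: keep the default age factor and add a binned copy under a new
# name, so that both versions are available for model formulas.
ddl <- add.design.data(dipper.cjs, ddl, parameter = "Phi", type = "age",
                       bins = c(0, 0.5, 6), name = "ageclass")
levels(ddl$Phi$ageclass)
```

The second route keeps both age and ageclass in the Phi design data, which matches the recommendation above when some models use every age level and others use collapsed intervals.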
For now, a simple example with the dipper data will suffice to illustrate this step and explain the
basic concepts. But before we do that we’ll reprocess the data to use annual time intervals rather than
the fictitious ones used above:
Chapter C. RMark - an alternative approach to building linear models in MARK