`vignettes/a05_SuggestedQualityControl.Rmd`

The specific requirements and assumptions for each function in `Rage`
vary. Here we provide notes giving an overview of these requirements,
which are also applicable to other functions/calculations in packages
such as `popbio` and `popdemo`. We make some suggestions of how users
should filter their dataset before analysis. To assist with this, the
function `Rcompadre::cdb_flag` conducts a series of checks on the
matrices in a `compadreDB` object and adds “flags” to facilitate the
filtering out of problematic matrices. This function (`cdb_flag`) can
be run automatically by `Rcompadre::cdb_fetch` using the argument
`flag = TRUE`.

The most obvious requirement for most `Rage` methods is that the
matrices contain no missing (`NA`) values, because missing values
prevent calculations using those matrices. Sometimes these `NA` values
are in one of the submatrices (i.e., **U**, **F** or **C**) of the
matrix model, but other submatrices are complete. For example, there
may be `NA` entries in the **F** submatrix, while the **U** matrix
remains complete. These issues are flagged with the columns
`check_NA_A`, `check_NA_U`, `check_NA_F` and `check_NA_C`.
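As a rough illustration of what such a check involves, the base R sketch below tests each matrix of a single model for missing values. The 2-stage matrices and the object names (`matU`, `matF`, `matC`, `matA`, `has_na`) are hypothetical; this is not the `cdb_flag` implementation.

```r
# Hypothetical 2-stage submatrices: U is complete, F has one missing entry.
matU <- matrix(c(0.2, 0.5,
                 0.0, 0.8), nrow = 2, byrow = TRUE)
matF <- matrix(c(0, NA,
                 0, 0), nrow = 2, byrow = TRUE)
matC <- matrix(0, nrow = 2, ncol = 2)
matA <- matU + matF + matC   # the NA in F propagates into A

# TRUE means "this matrix contains at least one NA"
has_na <- vapply(
  list(A = matA, U = matU, F = matF, C = matC),
  anyNA,
  logical(1)
)
has_na
```

Here only the **U** and **C** matrices are usable, so calculations that need **F** or **A** would have to skip this model.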

Submatrices composed entirely of zero values can also be problematic.
There may be good biological reasons for this phenomenon. Species that
do not reproduce clonally will have zero-value **C**
matrices. Another biologically reasonable explanation could be that in
the particular focal population in the particular focal year, there was
no sexual reproduction recorded, so the **F** matrix was
composed entirely of zeros. Nevertheless, zero-value submatrices can
cause some calculations to fail and it may be necessary to exclude them.
These issues are flagged with the columns `check_zero_F`,
`check_zero_C` and `check_zero_U`.
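A minimal base R sketch of an all-zero test is shown below. The helper `is_all_zero()` and the example matrices are hypothetical, and a matrix containing `NA` is treated here as "not all zero" so that missing values are handled by a separate check.

```r
# Hypothetical helper: TRUE only if the matrix is complete and every entry is 0
is_all_zero <- function(m) {
  !anyNA(m) && all(m == 0)
}

matU <- matrix(c(0.2, 0.5,
                 0.0, 0.8), nrow = 2, byrow = TRUE)
matF <- matrix(0, nrow = 2, ncol = 2)   # no sexual reproduction recorded
matC <- matrix(0, nrow = 2, ncol = 2)   # species does not reproduce clonally

zero_flags <- vapply(
  list(U = matU, F = matF, C = matC),
  is_all_zero,
  logical(1)
)
zero_flags
```

Whether a flagged matrix should be dropped depends on the analysis: an all-zero **C** matrix is expected for non-clonal species, whereas an all-zero **F** matrix would break calculations that need reproduction.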

In a biologically reasonable matrix population model, the sum of the
survival and growth transitions (i.e., in the **U** matrix)
from a particular stage cannot exceed 1. However, in some cases, errors
in the original matrices (including data entry and rounding errors)
cause this situation to occur and may persist in the data set. We can
check for this error using column sums of the **U** matrix,
and may wish to exclude matrices with any column sum greater than
1.

This issue is examined using the column `SurvivalIssue`, which gives
the maximum value of the column sums for `matU`. An additional column
`check_surv_gte_1` (produced with `cdb_flag`) reports whether any
**single** value is greater than or equal to 1.
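The column-sum check can be sketched in base R as follows. The 3-stage `matU` is hypothetical and deliberately contains one column that sums to more than 1; the small tolerance `tol` is an assumption added to avoid flagging floating-point noise.

```r
# Hypothetical U matrix; columns are the "from" stages
matU <- matrix(c(0.30, 0.00, 0.00,
                 0.50, 0.65, 0.00,
                 0.00, 0.40, 0.90), nrow = 3, byrow = TRUE)

col_surv <- colSums(matU)   # stage-specific survival probabilities
max_surv <- max(col_surv)   # analogous to the value stored in SurvivalIssue

tol <- 1e-8                 # assumed tolerance for floating-point error
bad_stages <- which(col_surv > 1 + tol)
bad_stages
```

In this example the second stage has a column sum of about 1.05, so the matrix would either be excluded or corrected after checking the source publication.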

At the opposite end of the survival spectrum, there may be some
matrices where some of the column sums of the **U** matrix
are zero, implying that there is no survival from that particular stage.
This may be a perfectly valid parameterisation for a particular
year/place but is biologically unreasonable in the longer term and users
may wish to exclude problematic matrices from their analysis. This issue
is indicated by the column `check_zero_U_colsum`.

Several matrix manipulations or calculations require that the MPM
(`matA`) be irreducible and ergodic (Stott et al. 2010).
Irreducible MPMs are those where parameterised transition rates
facilitate pathways from all stages to all other stages. Conversely,
reducible MPMs depict incomplete life cycles where pathways from all
stages to every other stage are not possible. Ergodic MPMs are those
where there is a *single* asymptotic stable state that does not
depend on initial stage structure. Conversely, non-ergodic MPMs are
those where there are multiple asymptotic stable states, which depend on
initial stage structure. MPMs that are reducible and/or non-ergodic are
usually biologically unreasonable, both in terms of their life cycle
description and their projected dynamics. They cause some calculations
in `Rage` (and elsewhere) to fail. Irreducibility is sufficient but not
necessary for ergodicity: irreducible MPMs are always ergodic, whereas
reducible MPMs may or may not be. These issues are flagged with
`check_irreducible` and `check_ergodic`. Even if `Rage` functions do
not fail due to these issues, the fact that they can indicate
biologically unreasonable life cycles may mean that users nevertheless
wish to exclude reducible or non-ergodic matrices from their analyses.
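The two properties can be tested directly on a matrix. The base R sketch below uses two standard criteria: a non-negative matrix **A** with *s* stages is irreducible if (**I** + **A**)^(s-1) has no zero entries, and it is ergodic if its dominant left eigenvector is strictly positive. The helper names, the tolerance and the example matrices are hypothetical, and this is not necessarily how `cdb_flag` performs the checks.

```r
# Irreducible if every stage can reach every other stage in at most s - 1 steps
is_irreducible <- function(A) {
  s <- nrow(A)
  step <- diag(s) + (A > 0)              # transition pattern plus self-loops
  reach <- diag(s)
  for (i in seq_len(s - 1)) {
    reach <- (reach %*% step > 0) * 1    # re-binarise to keep entries small
  }
  all(reach > 0)
}

# Ergodic if the dominant left eigenvector has no zero entries
is_ergodic <- function(A, tol = 1e-10) {
  e <- eigen(t(A))                       # eigenvectors of t(A) are left eigenvectors of A
  k <- which.max(Re(e$values))
  all(abs(Re(e$vectors[, k])) > tol)
}

# Hypothetical complete life cycle: all stages are connected
A_full <- matrix(c(0.0, 1.0, 2.0,
                   0.5, 0.0, 0.0,
                   0.0, 0.3, 0.6), nrow = 3, byrow = TRUE)

# Hypothetical life cycle with a post-reproductive stage 3 (a dead end)
A_post <- matrix(c(0.0, 1.0, 0.0,
                   0.5, 0.0, 0.0,
                   0.0, 0.3, 0.6), nrow = 3, byrow = TRUE)

c(irreducible = is_irreducible(A_full), ergodic = is_ergodic(A_full))
c(irreducible = is_irreducible(A_post), ergodic = is_ergodic(A_post))
```

In `A_post`, a population started entirely in stage 3 declines at its own rate rather than converging on the dynamics of stages 1 and 2, which is why the matrix fails both tests.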

Matrices are said to be singular if they cannot be inverted.
Inversion is required for many matrix calculations and, therefore,
singularity can cause some calculations to fail. This issue is flagged
with `check_singular_U`. Calculations for `longevity`,
`life_expect_mean`, `life_expect_var` and `net_repro_rate` fail with
singular matrices, so users may wish to exclude singular matrices when
conducting analyses using these functions.
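To see why singularity matters, the sketch below attempts the inversion that underlies life-expectancy calculations, namely the fundamental matrix **N** = (**I** - **U**)^-1. Testing **I** - **U** rather than **U** itself is an assumption made for this illustration (the text above does not say which matrix `check_singular_U` inverts), and the helper `is_singular()` and both example matrices are hypothetical.

```r
# Hypothetical helper: TRUE if solve() cannot invert the matrix
is_singular <- function(M) {
  inherits(try(solve(M), silent = TRUE), "try-error")
}

# Stage 2 loses 20% of individuals per time step, so I - U can be inverted
U_ok <- matrix(c(0.1, 0.0,
                 0.5, 0.8), nrow = 2, byrow = TRUE)

# Stage 2 survives with probability 1, so I - U has a column of zeros
U_bad <- matrix(c(0.5, 0.0,
                  0.3, 1.0), nrow = 2, byrow = TRUE)

I2 <- diag(2)
is_singular(I2 - U_ok)
is_singular(I2 - U_bad)

# With an invertible I - U, column sums of N give the expected time spent
# alive, starting from each stage
N <- solve(I2 - U_ok)
colSums(N)
```

The second matrix describes individuals that never die once they reach stage 2, which is the kind of parameterisation that makes longevity and life-expectancy calculations fail.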

A complete MPM (**A**) can be split into its component
submatrices (i.e. **U**, **F** and
**C**). The sum of these submatrices should equal the
complete MPM (i.e. **A** = **U** +
**F** + **C**). Sometimes, however, errors
occur so that the submatrices do NOT sum to **A**.
Normally, this is caused by rounding errors, but more significant errors
are possible. This problem is flagged with `check_component_sum` (only
relevant for divided (split) matrices). We recommend that users
carefully check their matrices for these errors and correct or exclude
them as appropriate.
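A simple way to inspect this by hand is to compare **A** with **U** + **F** + **C** element by element and look at the largest discrepancy. In the base R sketch below, the matrices, the helper `component_gap()` and the two tolerances are hypothetical; the example mimics an **A** matrix that was published to two decimal places while **U** was recorded to three.

```r
matU <- matrix(c(0.127, 0.0,
                 0.333, 0.8), nrow = 2, byrow = TRUE)
matF <- matrix(c(0, 1.5,
                 0, 0.0), nrow = 2, byrow = TRUE)
matC <- matrix(0, nrow = 2, ncol = 2)

# A as reported in a hypothetical source, rounded to two decimals
matA <- matrix(c(0.13, 1.5,
                 0.33, 0.8), nrow = 2, byrow = TRUE)

# Largest absolute difference between A and the sum of its components
component_gap <- function(A, U, Fmat, Cmat) {
  max(abs(A - (U + Fmat + Cmat)))
}

gap <- component_gap(matA, matU, matF, matC)
gap <= 1e-6    # strict comparison: the rounding difference is detected
gap <= 0.005   # tolerance that accepts two-decimal rounding
```

Looking at the size of the gap, rather than only a pass/fail flag, helps separate harmless rounding differences from genuine errors in the matrices.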

It is a general requirement for almost all `Rage` functions that the
matrices used as arguments do not include `NA` values. With divided
(split) matrices, `NA` values may be present in some submatrices but
not others. For example, the **U** matrix may be complete, but the
**F** matrix may have `NA` values. In this case, functions that require
an **F** matrix will fail, while those that only require a **U** matrix
will work. Users should filter the data to exclude entries with `NA`
values in the matrices required for their analysis. The functions
`mpm_split`, `mpm_rearrange` and `mpm_standardise` do not require
complete `NA`-free matrices.

For functions that use the **U** matrix, we further
suggest filtering the data to exclude the biologically unreasonable
entries where one or more of the `matU` columns sum to zero,
or to greater than 1 (see *Excessive Survival*, above).
Alternatively, users could examine the offending matrices and make
sensible corrections (e.g. to correct rounding errors).
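Once flags are available, the filtering itself is ordinary subsetting. The sketch below uses a hypothetical data frame whose columns are named after the flags described above; the values are invented, and it assumes that `TRUE` in `check_NA_U` and `check_zero_U_colsum` indicates a problem, which should be confirmed against the `cdb_flag` documentation.

```r
# Hypothetical flag table for five matrices (values invented for illustration)
flags <- data.frame(
  MatrixID            = 1:5,
  check_NA_U          = c(FALSE, TRUE,  FALSE, FALSE, FALSE),
  check_zero_U_colsum = c(FALSE, FALSE, TRUE,  FALSE, FALSE),
  SurvivalIssue       = c(0.95,  0.80,  0.90,  1.20,  1.00)
)

# Keep matrices with a complete U, no zero column sums, and no column sum > 1
keep <- with(flags, !check_NA_U & !check_zero_U_colsum & SurvivalIssue <= 1)
flags_ok <- flags[keep, ]
flags_ok$MatrixID
```

Only matrices 1 and 5 pass all three criteria in this example. Keeping the criteria in a single logical vector makes it easy to report how many matrices each rule removes.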

For functions that use the **F** matrix, and where
sexual reproduction is known to occur in the species, we suggest that
users consider filtering the data to exclude entries where
**F** is entirely zero. This is not always desirable
because there are some situations where zero recorded reproduction is
biologically reasonable. We suggest a similar approach for the
**C** matrix.

When using age-from-stage methods, users should be aware of the issue
of convergence to the quasi-stationary distribution (see ). Briefly, all
age-from-stage calculations produce age-trajectories that inevitably
asymptote as a mathematical consequence of describing the vital rates as
functions of discrete stages (Horvitz & Tuljapurkar, 2008). This
mathematical artefact can introduce bias into measures obtained using
age-from-stage methods. `Rage` provides a convenient and
principled way of correcting for this artefact by imposing a lower
probability threshold defined by the degree of convergence to the
quasi-stationary distribution (see ). We suggest that users filter out
from their analyses matrices that do not pass this threshold
criterion.

Users should also be aware of the issue of census type. For
populations that reproduce in a pulse once per year, the demographic
census may be carried out before or after the reproduction event. There
are thus two types of census: pre-reproductive and post-reproductive.
This distinction has potentially important implications for demographic
measures because of its effects on measured population structure. For
example, the fraction of individuals in the first age class will tend to
be larger in a post-reproductive census than in a pre-reproductive
census. There is a column in the com(p)adre metadata (`CensusType`)
that is intended to record this information but, because authors of
source publications have rarely stated it clearly, it is very
incomplete. For serious analyses we therefore recommend that users
carefully collect this information themselves from the source papers.

Although we highlight here a range of issues that could cause problems for MPM analyses, we have likely omitted some inadvertently. We therefore urge users to carefully consider issues that may pertain to their particular analyses.

Horvitz, C. C., & Tuljapurkar, S. (2008). Stage dynamics, period survival, and mortality plateaus. The American Naturalist, 172(2), 203–215.

Stott, I., Townley, S., & Carslake, D. (2010). On reducibility and ergodicity of population projection matrix models. Methods in Ecology and Evolution, 1(3), 242–252.