The moultmcmc package implements the regression models outlined in Underhill and Zucchini (1988) and Underhill, Zucchini, and Summers (1990). In their notation1, samples consist of $$I$$ pre-moult birds, $$J$$ birds in active moult, and $$K$$ post-moult birds. Birds in each category are observed on days $$t = t_1,\ldots,t_I$$; $$u = u_1,\ldots,u_J$$; $$v = v_1,\ldots,v_K$$, respectively. Moult scores for actively moulting birds, where available, are encoded as $$y = y_1,\ldots,y_J$$.

Each moult state has a probability of occurrence \begin{aligned} P(t) &= \Pr\{Y(t)=0\}&= &1-F_T(t)\\ Q(t) &= \Pr\{0<Y(t)<1\}& =&F_T(t)-F_T(t-\tau)\\ R(t) &= \Pr\{Y(t)=1\}&= &1-F_T(t-\tau)\\ \end{aligned} Further, assuming a linear progression of the moult indices over time, the probability density of a particular moult score at time $$t$$ is $f_Y(t)(y)=\tau f_T(t-y\tau),\quad0 < y < 1,$ In moultmcmc the unobserved start date of the study population is assumed to follow a normal distribution with mean $$\mu$$ and standard deviation $$\sigma$$, such that $F_T(t)=\Phi\left(\frac{t-\mu}{\sigma}\right)$ where $$\Phi$$ is the standard normal distribution function and $f_T(t) = \phi(t) = \frac{1}{\sqrt{2\pi}}\exp\frac{-t^2}{2}$.

We further assume that $$F_T(t)$$ has $$p$$ parameters $$\mathbf{\theta} = \theta_1, \theta_2, \ldots, \theta_p$$ and for convenience the start date $$\mu$$, duration $$\tau$$, and population standard deviation of moult $$\sigma$$ will be elements of $$\mathbf{\theta}$$.

## Likelihoods

### Type 1

Type 1 data consist of observations of categorical moult state (pre-moult, active moult, post-moult) and sampling is representative in all three categories. The likelihood of these observations is $\mathcal{L}(\boldsymbol\theta,t,u,v) = \prod_{i=1}^IP(t_i)\prod_{j = 1}^JQ(u_j)\prod_{k=1}^KR(v_k).$

### Type 2

Type 2 data consist of observations of birds in all three moult states (pre-moult, active moult, post-moult). Sampling is representative for all three categories, and for actively moulting birds a sufficiently linear moult index $$y$$ (e.g. percent feather mass grown) is known. The likelihood of these observations is $\mathcal{L}(\boldsymbol\theta,t,u,y,v) = \prod_{i=1}^IP(t_i)\prod_{j = 1}^Jq(u_j,y_j)\prod_{k=1}^KR(v_k),$ where $$q(u_j,y_j) = \tau f_T(u_j-y_j\tau).$$

### Lumped Type 2

Lumped type 2 data consist of observations of birds in all three moult states (pre-moult, active moult, post-moult), but where the pre-moult and post-moult states cannot be distinguished from each other, yielding $$I$$ observations of dates $$t_i$$ on which non-moulting birds were observed. Sampling is representative for all three categories, and for actively moulting birds a sufficiently linear moult index $$y$$ (e.g. percent feather mass grown) is known. The likelihood of these observations is $\mathcal{L}(\boldsymbol\theta,t,u,y) = \prod_{i=1}^IPR(t_i)\prod_{j = 1}^Jq(u_j,y_j),$ where $$q(u_j,y_j) = \tau f_T(u_j-y_j\tau)$$ and $$PR(t_i)=P(t_i)+R(t_i).$$

### Type 3

Type 3 data consist of observations of actively moulting birds only, and a sufficiently linear moult index $$y$$ (e.g. percent feather mass grown) is known for each individual. The likelihood of these observations is $\mathcal{L}(\boldsymbol\theta,u,y) = \prod_{j = 1}^J\frac{q(u_j,y_j)}{Q(u_j)}.$

### Type 4

Type 4 data consist of observations of birds in active moult and post-moult only. Sampling is representative for these two categories, and for actively moulting birds a sufficiently linear moult index $$y$$ (e.g. percent feather mass grown) is known. The likelihood of these observations is $\mathcal{L}(\boldsymbol\theta,u,y,v) = \prod_{j = 1}^J\frac{q(u_j,y_j)}{1-P(u_j)}\prod_{k=1}^K\frac{R(v_k)}{1-P(v_k)}.$

### Type 5

Type 5 data consist of observations of birds in pre-moult and active moult. Sampling is representative for these two categories, and for actively moulting birds a sufficiently linear moult index $$y$$ (e.g. percent feather mass grown) is known. The likelihood of these observations is $\mathcal{L}(\boldsymbol\theta,t,u,y) = \prod_{i=1}^I\frac{P(t_i)}{1-R(t_i)}\prod_{j = 1}^J\frac{q(u_j,y_j)}{1-R(u_j)}.$

### Type 1 + 2

As outlined in Underhill and Zucchini (1988) estimates can also be derived from mixtures of data types. Type 1 + 2 data consist of observations of birds in all three moult states (pre-moult, active moult, post-moult). Sampling is representative for all three categories, but a sufficiently linear moult index $$y$$ (e.g. percent feather mass grown) is known only for some of the actively moulting birds. This means the sample consist of $$I$$ pre-moult birds, $$J$$ birds in active moult with known indices, $$L$$ birds in active moult without known indices but known capture dates $$u'=u'_l,\ldots,u'_L$$, and $$K$$ post-moult birds. The likelihood of these observations is $\mathcal{L}(\boldsymbol\theta,t,u,y,u',v) = \prod_{i=1}^IP(t_i)\prod_{j = 1}^Jq(u_j,y_j)\prod_{l = 1}^{L}Q(u'_{l})\prod_{k=1}^KR(v_k),$

### Recapture models

moultmcmc currently implements a recaptures model which allows for heterogeneity in start dates $$\mu$$ but assumes a common moult duration $$\tau$$. When repeat observations are available an individual’s start date $$\mu_n$$ then becomes

$\begin{equation} \mu_n = \mu_0 + \mu'_n + \mathbf{x}_\mu\boldsymbol{\beta}_\mu \end{equation}$

where $$\boldsymbol{x}_\mu$$ is a row vector containing the values of individual-specific predictors (in the same format as $$\boldsymbol{X}_\mu$$), and $$\mu'_n$$ is an individual-level random effect intercept

$\begin{equation} \mu'_n \sim \mathrm{Normal}(0,\sigma_n) \end{equation}$ where $$\sigma_n$$ is the individual-specific standard deviation. We can then exploit the linearity assumption and treat observed moult scores as $\begin{equation} y_{ni} \sim \mathrm{Normal}(\mu_0 + \mu'_n + \tau * u_{ni}, \sigma_\tau) \end{equation}$ where $$\sigma_\tau$$ captures any unmodelled variance in $$\tau$$ as well as any measurement error in $$y$$.

The likelihood for the Type 3-like model for a sample of $$J$$ birds in active moult without repeated observations, and $$N$$ birds in active moult with a total of $$M$$ repeated observations $$u'=u'_m,\ldots,u'_M$$ and $$y'=y'_m,\ldots,y'_M$$ then is

$\mathcal{L}(\boldsymbol\theta,u,y,u',y') = \prod_{j = 1}^J\frac{q(u_j,y_j)}{Q(u_j)}\prod_{m = 1}^{M}f(u'_m,y'_m)\prod_{n=1}^N\phi(\mu'_n|0,\sigma_n),$ where $$f(u'_m,y'_m)$$ follows from above.

## Priors

Users have a choice between two set of priors for the intercept terms of the linear predictors on the start date $$\mu$$, the duration $$\tau$$, and the population standard deviation of the start date $$\sigma$$, respectively. By default flat priors are used for $$\mu_0$$ and $$\tau_0$$ and a vaguely informative normal prior on $$\ln(\sigma_0)$$

$$\mu_0 \sim \mathrm{Uniform(0,366)}$$
$$\tau_0 \sim \mathrm{Uniform(0,366)}$$
$$\ln(\sigma_0) \sim \mathrm{Normal(0,5)}$$

In some cases the models sample poorly with these priors, and better convergence can be achieved by setting the argument flat_prior = FALSE. In this case vaguely informative truncated normal priors are used for $$\mu_0$$ and $$\tau_0$$:

$$\mu_0 \sim \mathrm{TruncNormal(150,50,0,366)}$$
$$\tau_0 \sim \mathrm{TruncNormal(100,30,0,366)}$$

These priors work well for data from passerines in seasonal environments, i.e. when the sampling occasion data is encoded as days from mid-winter.

For any additional regression coefficients an improper flat prior is used as a default.

Underhill, Les G., and Walter Zucchini. 1988. “A Model for Avian Primary Moult.” Ibis 130: 358–72. https://doi.org/10.1111/j.1474-919x.1988.tb00993.x.

Underhill, L. G., W. Zucchini, and R. W. Summers. 1990. “A Model for Avian Primary Moult-Data Types Based on Migration Strategies and an Example Using the Redshank Tringa Totanus.” Ibis 132: 118–23. https://doi.org/10.1111/j.1474-919x.1990.tb01024.x.