Posts

Non-ignorable missingness

background
statistics, ignorability, missing data

Statistics is basically a missing data problem! – Little 2013

Nearly all samples – whether by design or by accident – are incomplete. We very rarely make a complete census of all individuals in a population or all sites on a landscape. Sometimes we don’t collect, or can’t collect, complete information for individual samples or measures. For instance, we might know an animal was alive when it was last seen, so we know it survived at least that long, but know nothing about its current status. ...

Disentangling concepts of status, trend, and trajectory

background
status, trends, trajectory

The terms status and trend are ubiquitous in resource monitoring and management settings. To be useful and robust, however, they require precise (mathematical) definitions. It has been my experience that misunderstanding these terms can lead to misapplication of model predictions and to researchers and managers drawing the wrong conclusions from the data. In this post we show how relatively simple, even intuitive, definitions for each of these terms clarify their intent and improve the insights provided by models of monitoring data. ...

Unequal inclusion probabilities

applications, simulation
travel time, stratification

The Sonoran Desert is among the most extreme environments on Earth. Sampling in these remote, rugged landscapes requires a different approach. When the Park Service established monitoring in Organ Pipe Cactus National Monument, it selected sites based on the cost of traveling to them across the broader landscape, visiting less “costly” sites with higher probability than more costly ones. The cost surface that defined each site’s probability of inclusion was developed from terrain data and a tool that estimates the travel time to any arbitrary location on the landscape. ...
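As a rough illustration of cost-based selection, the sketch below assigns each candidate site an inclusion probability that falls as travel time rises, then weights sampled values by the inverse of that probability (a Horvitz-Thompson style estimate of the total). The inverse-travel-time rule, the site data, and all names here are assumptions made for illustration, not the procedure actually used at Organ Pipe Cactus National Monument. The sketch is written in Python.

```python
import random

# Hypothetical travel times (hours) and measured values for six candidate sites.
travel_hours = [0.5, 1.0, 2.0, 4.0, 1.0, 8.0]
values = [3.0, 5.0, 4.0, 6.0, 2.0, 7.0]
expected_n = 2  # hypothetical expected sample size

# Assumed rule: inclusion probability proportional to 1 / travel time,
# scaled so the probabilities sum to the expected sample size.
inverse_cost = [1.0 / t for t in travel_hours]
incl_prob = [expected_n * w / sum(inverse_cost) for w in inverse_cost]

def ht_total(sample_idx):
    """Inverse-probability-weighted estimate of the population total."""
    return sum(values[i] / incl_prob[i] for i in sample_idx)

# Poisson sampling: each site enters the sample independently
# with its own inclusion probability.
rng = random.Random(1)
sample = [i for i in range(len(values)) if rng.random() < incl_prob[i]]

print("inclusion probabilities:", [round(p, 3) for p in incl_prob])
print("sampled sites:", sample)
print("estimated total:", ht_total(sample), "true total:", sum(values))
```

Because cheaper sites are over-represented in the sample, dividing each sampled value by its inclusion probability is what keeps the estimate of the total from being skewed toward easy-to-reach sites.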

Sampling and populations

background
statistics, sample, population

We sample for a very practical reason. It’s usually impossible to get information on the whole population, so we use a sample to make inferences about the population. In our case, the population is typically all sites in a stratum or all sites – in all strata – at the scale of an entire park. Typically, the inference we seek entails three questions. What’s the best estimate of the population mean? ...

Interpreting coefficients

applications
inference

Making sense of the effects of variables included as predictors

Some aspects of covariate effects are readily apparent – for instance, the sign of a coefficient in a model says at least something about the general directionality of the effect, positive or negative. However, a deeper understanding of a model typically requires inferences that go well beyond simple measures of the directionality or significance of effects – it requires understanding the size of effects. ...

Stratum-varying fixed effects

background
statistics, parameterizations

Assume we have three strata, \(s_0\), \(s_1\), and \(s_2\), where \(s_0\) is the “reference” stratum – in other words, \(s_0\) is the stratum for which the 0/1 indicator is 0 across the board in the indicator matrix below (the first row):

\[\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}\]

    B_0 + (B_1 + B_1_s1_offset * s1 + B_1_s2_offset * s2) * x_1
    # in stratum s0
    B_0 + (B_1) * x_1
    # in stratum s1
    B_0 + (B_1 + B_1_s1_offset * s1) * x_1
    # in stratum s2
    B_0 + (B_1 + B_1_s2_offset * s2) * x_1
    # lm(y~x1*x2) model. ...
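To make the parameterization concrete, here is a minimal Python sketch that evaluates the linear predictor for each row of the indicator matrix. The coefficient names follow the expressions in the excerpt; the numeric values are hypothetical and chosen only so the three slopes are easy to tell apart.

```python
# Hypothetical coefficient values, for illustration only.
B_0 = 1.0             # intercept shared by all strata
B_1 = 0.5             # slope of x_1 in the reference stratum s0
B_1_s1_offset = 0.2   # slope adjustment that applies in stratum s1
B_1_s2_offset = -0.3  # slope adjustment that applies in stratum s2

# Rows of the indicator matrix: intercept column, s1 dummy, s2 dummy.
indicators = [
    [1, 0, 0],  # stratum s0 (reference)
    [1, 1, 0],  # stratum s1
    [1, 0, 1],  # stratum s2
]

def linear_predictor(x_1, stratum):
    """Evaluate B_0 + (B_1 + offsets) * x_1 for stratum index 0, 1, or 2."""
    _, s1, s2 = indicators[stratum]
    slope = B_1 + B_1_s1_offset * s1 + B_1_s2_offset * s2
    return B_0 + slope * x_1

# The slope in each stratum is the change in the predictor per unit of x_1.
slopes = [linear_predictor(1.0, k) - linear_predictor(0.0, k) for k in range(3)]
for k, slope in enumerate(slopes):
    print(f"stratum s{k}: slope = {slope:.2f}")
```

In the reference stratum both dummies are zero, so the slope is just `B_1`; in the other two strata exactly one offset is switched on and added to it.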

The offset term

background
statistics

Counts of things naturally scale with the length or duration of observation, the area sampled, and sampling intensity (McElreath, R. (2018). Statistical Rethinking: A Bayesian Course with Examples in R and Stan. Chapman and Hall/CRC). For instance, the longer the river stretch we survey, the more fish we’ll tend to find. Offset terms are used to model rates – e.g., counts per unit area or time. ...
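A small Python sketch may help show what an offset does. It assumes an intercept-only Poisson model with a log link, where the log of survey effort enters the linear predictor with its coefficient fixed at 1. The fish counts and river lengths are made-up numbers, and the variable names are illustrative rather than taken from the post.

```python
import math

# Hypothetical survey data: fish counted and river length surveyed (km).
counts = [12, 30, 7, 45]
length_km = [1.0, 2.5, 0.5, 4.0]

# Intercept-only Poisson model with a log link and an offset:
#   log(mu_i) = b0 + log(length_i),  so  mu_i = exp(b0) * length_i
# Here exp(b0) is a rate (fish per km). For this particular model the
# maximum likelihood estimate of the rate is total count / total effort.
rate_hat = sum(counts) / sum(length_km)
b0_hat = math.log(rate_hat)

def expected_count(length):
    """Expected count for a stretch of the given length (km)."""
    return math.exp(b0_hat + math.log(length))

print(f"estimated rate: {rate_hat:.2f} fish per km")
for length in (1.0, 2.0, 4.0):
    print(f"{length:.1f} km -> expected count {expected_count(length):.2f}")
```

Doubling the surveyed length doubles the expected count while the estimated rate stays the same, which is the sense in which the offset turns a model of counts into a model of rates.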