Wednesday, 4 December 2019

Benchmarking Springtail Recording

Recently, I've been thinking a lot about occupancy models for invertebrates. Other taxa, notably birds (through the BTO Wetland Bird Survey) and butterflies (through Butterfly Conservation's UK Butterfly Monitoring Scheme, UKBMS), have good negative data, i.e. an indication of where species are absent as well as where they are present. For most invertebrate taxa, partly because of a lack of resource (recording effort) but mostly because recording is so inefficient (how can you be sure a springtail is absent from a particular area?), all we have are "White Holes" - gaps in the data which are difficult to interpret. This makes occupancy models difficult, if not impossible, to derive. The alternative is to fall back on benchmark species as indicators of recording coverage. Previously (Progress on the VC55 Springtail Atlas) I discussed the use of Orchesella cincta as a benchmark species for springtails. While the ubiquity of this species is a good reason to think that it is a valid choice, I've never actually tested the hypothesis - so here we go.

Heatmaps are pretty but inevitably focus attention on where the data is, rather than where it is missing. To switch the emphasis, I have used quadrat mapping - arbitrarily dividing VC55 into a grid and counting the records within each section. A 25x25 grid worked, but the intervals were a bit small, and a 10x10 grid is more informative (all VC55 Collembola records to end 2018):



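For anyone wanting to try the same kind of grid count, here is a minimal Python sketch (my own analysis was done in R, and this is not that code). It assumes each record is a simple (easting, northing) pair and that the area is covered by a rectangular bounding box; the function name `grid_counts`, the bounds and the example coordinates are all made up for illustration.

```python
from collections import Counter


def grid_counts(points, bounds, n_cells=10):
    """Count records per cell of an n_cells x n_cells grid.

    points: iterable of (easting, northing) pairs (hypothetical format).
    bounds: (min_easting, max_easting, min_northing, max_northing).
    Returns a Counter keyed by (column, row); cells with no records are absent.
    """
    min_e, max_e, min_n, max_n = bounds
    cell_w = (max_e - min_e) / n_cells
    cell_h = (max_n - min_n) / n_cells
    counts = Counter()
    for easting, northing in points:
        # Ignore records that fall outside the bounding box.
        if not (min_e <= easting <= max_e and min_n <= northing <= max_n):
            continue
        # Clamp so that points exactly on the upper edge land in the last cell.
        col = min(int((easting - min_e) / cell_w), n_cells - 1)
        row = min(int((northing - min_n) / cell_h), n_cells - 1)
        counts[(col, row)] += 1
    return counts


# Invented coordinates, not real VC55 records.
records = [(1.2, 1.1), (1.5, 1.8), (9.5, 9.9), (5.2, 5.5), (12.0, 3.0)]
counts = grid_counts(records, bounds=(0, 10, 0, 10), n_cells=10)

# Cells with no records at all are the "White Holes".
empty_cells = 10 * 10 - len(counts)
```

Changing `n_cells` from 10 to 25 gives the finer grid, at the cost of many more empty or near-empty cells.
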
The grid for Orchesella cincta looks like this:



To make sense of this, I converted the distributions into histograms:



These look pretty similar, but to be sure, I ran some further analysis:



There's a good correlation between the Orchesella cincta distribution and the overall Collembola dataset, and it is statistically significant (p < 2.2e-16). On this evidence, Orchesella cincta is a good benchmark for springtail recording effort (at least in VC55). Phew!
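
As a rough illustration of this kind of check, the Python sketch below compares per-cell counts for a benchmark species with per-cell counts for all records. It is not the analysis I ran: it assumes a Pearson correlation and swaps in a simple permutation test for the p-value so that it needs nothing beyond the standard library. The function names and the counts are invented.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the correlation between x and y.

    Shuffles y repeatedly and counts how often the shuffled |r| is at least
    as large as the observed |r|.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Invented per-cell counts for ten grid cells, not real VC55 data.
all_counts = [0, 3, 12, 40, 7, 0, 25, 1, 60, 15]
benchmark_counts = [0, 1, 2, 6, 1, 0, 4, 0, 9, 3]

r = pearson_r(all_counts, benchmark_counts)
p = permutation_p_value(all_counts, benchmark_counts)
```

Record counts per cell tend to be heavily skewed, so a rank-based correlation may be worth trying alongside Pearson.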



Acknowledgements:
All data Copyright Leicestershire and Rutland Environmental Records Centre.
Data visualization performed using the R platform, v. 3.6.1 (R Core Team (2014) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org).
J. Cann for assistance with data visualization.