Mark and recapture

From Infogalactic: the planetary knowledge core


Collar tagged Rock Hyrax
Jackdaw with a numbered aluminum ring on its left tarsus
Biologist is marking a Chittenango ovate amber snail to monitor the population.
right side view of a snail with a number 87 on its shell
Marked Chittenango ovate amber snail.

Mark and recapture is a method commonly used in ecology to estimate an animal population's size. A portion of the population is captured, marked, and released. Later, another portion is captured and the number of marked individuals within the sample is counted. Since the number of marked individuals within the second sample should be proportional to the number of marked individuals in the whole population, an estimate of the total population size can be obtained by dividing the number of marked individuals by the proportion of marked individuals in the second sample. The method is most useful when it is not practical to count all the individuals in the population. Other names for this method, or closely related methods, include capture-recapture, capture-mark-recapture, mark-recapture, sight-resight, mark-release-recapture, multiple systems estimation, band recovery, the Petersen method,[1] and the Lincoln method.

Another major application for these methods is in epidemiology,[2] where they are used to estimate the completeness of ascertainment of disease registers. Typical applications include estimating the number of people needing particular services (e.g., services for children with learning disabilities or for medically frail elderly people living in the community) or with particular conditions (e.g., illegal drug addicts or people infected with HIV).[citation needed]

Field work related to mark-recapture

Typically, a researcher visits a study area and uses traps to capture a group of individuals alive. Each of these individuals is marked with a unique identifier (e.g., a numbered tag or band) and is then released unharmed back into the environment. A mark-recapture method was first used for ecological study in 1896 by C.G. Johannes Petersen to estimate populations of plaice, Pleuronectes platessa.[3]

Sufficient time is allowed to pass for the marked individuals to redistribute themselves among the unmarked population.

Next, the researcher returns and captures another sample of individuals. Some individuals in this second sample will have been marked during the initial visit and are now known as recaptures. Other animals captured during the second visit will not have been captured during the first visit to the study area. These unmarked animals are usually given a tag or band during the second visit and then are released.

Population size can be estimated from as few as two visits to the study area. Commonly, more than two visits are made, particularly if estimates of survival or movement are desired. Regardless of the total number of visits, the researcher simply records the date of each capture of each individual. The "capture histories" generated are analyzed mathematically to estimate population size, survival, or movement.
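As a minimal sketch of how such records might be organized, the snippet below stores one capture history per individual as a list of 0/1 flags, one per visit, and derives the three counts a two-visit analysis needs. The tag names, the 0/1 encoding (rather than capture dates), and the function name `two_visit_counts` are assumptions made for illustration only.

```python
# Hypothetical capture histories: one entry per individual,
# one 0/1 flag per visit (1 = captured on that visit).
capture_histories = {
    "tag-001": [1, 0],
    "tag-002": [1, 1],
    "tag-003": [0, 1],
    "tag-004": [1, 1],
    "tag-005": [0, 1],
}


def two_visit_counts(histories):
    """Return (marked on visit 1, caught on visit 2, caught on both)."""
    first_marked = sum(1 for h in histories.values() if h[0] == 1)
    second_caught = sum(1 for h in histories.values() if h[1] == 1)
    recaptured = sum(1 for h in histories.values() if h[0] == 1 and h[1] == 1)
    return first_marked, second_caught, recaptured


print(two_visit_counts(capture_histories))
```

Histories with more than two visits fit the same structure by lengthening each list.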

In the epidemiological setting, different sources of patients take the place of the repeated field visits in ecology. To take a concrete example, in establishing a register of children with Type 1 diabetes, children were identified from hospital admission records, from general practitioners (family doctors), and from the records of the local Diabetes Association.[citation needed] None of these sources had a complete list, but by putting them together it was possible to do two things: first, to see how many children were identified in total, and second, to estimate how many more children with Type 1 diabetes were living in the community.
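The idea can be sketched for the simpler case of two sources, treating one list as the "marked" sample and the other as the "recapture" sample and applying the two-sample ratio estimator (the Lincoln–Petersen estimator described later in this article). The patient identifiers and the function name `two_source_estimate` are hypothetical, and the sketch assumes the two sources are independent and that records can be matched exactly.

```python
# Hypothetical patient identifiers from two incomplete sources.
hospital_records = {"p01", "p02", "p03", "p04", "p05", "p06"}
gp_records = {"p04", "p05", "p06", "p07", "p08"}


def two_source_estimate(source_a, source_b):
    """Return (number identified in total, estimated total cases)."""
    overlap = len(source_a & source_b)  # patients found in both sources
    if overlap == 0:
        raise ValueError("no overlap between sources; estimate is undefined")
    identified = len(source_a | source_b)  # patients found in either source
    estimated_total = len(source_a) * len(source_b) / overlap
    return identified, estimated_total


identified, estimated_total = two_source_estimate(hospital_records, gp_records)
print(identified, estimated_total)
```

The gap between the two returned numbers is the estimated count of cases missed by both sources.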

Notation

Let

N = Number of animals in the population
K = Number of animals marked on the first visit
n = Number of animals captured on the second visit
k = Number of recaptured animals that were marked

A biologist wants to estimate the size of a population of turtles in a lake. She captures 10 turtles on her first visit to the lake, and marks their backs with paint. A week later she returns to the lake and captures 15 turtles. Five of these 15 turtles have paint on their backs, indicating that they are recaptured animals. This example is (K, n, k) = (10, 15, 5). The problem is to estimate N.

Lincoln–Petersen estimator


The Lincoln–Petersen method[4] (also known as the Petersen–Lincoln index[3] or Lincoln index) can be used to estimate population size if only two visits are made to the study area. This method assumes that the study population is "closed"[citation needed]. In other words, the two visits to the study area are close enough in time so that no individuals die, are born, move into the study area (immigrate) or move out of the study area (emigrate) between visits. The model also assumes that no marks fall off animals between visits to the field site by the researcher, and that the researcher correctly records all marks.

Given those conditions, estimated population size is:

\hat{N} = \frac{Kn}{k},

Derivation

It is assumed[5] that all individuals have the same probability of being captured in the second sample, regardless of whether they were previously captured in the first sample (with only two samples, this assumption cannot be tested directly).

This implies that, in the second sample, the proportion of marked individuals that are caught (k/K) should equal the proportion of the total population that is caught (n/N). For example, if half of the marked individuals were recaptured, it would be assumed that half of the total population was included in the second sample.

In symbols,

\frac{k}{K} = \frac{n}{N}.

A rearrangement of this gives

\hat{N}=\frac{Kn}{k},

the formula used for the Lincoln–Petersen method.[5]
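The proportionality argument can be checked with a small simulation. The sketch below draws both samples uniformly at random from a closed population of known size and averages the resulting estimates; the population size, sample sizes, number of trials, seed, and the function name `simulate_estimates` are all arbitrary choices for illustration.

```python
import random


def simulate_estimates(N=1000, K=200, n=200, trials=200, seed=0):
    """Simulate repeated two-visit studies and return the Kn/k estimates."""
    rng = random.Random(seed)
    population = range(N)
    estimates = []
    for _ in range(trials):
        marked = set(rng.sample(population, K))   # first visit: mark K animals
        second = rng.sample(population, n)        # second visit: catch n animals
        k = sum(1 for animal in second if animal in marked)
        if k > 0:                                 # estimate is undefined when k = 0
            estimates.append(K * n / k)
    return estimates


estimates = simulate_estimates()
mean_estimate = sum(estimates) / len(estimates)
print(round(mean_estimate))
```

Under equal catchability the average estimate should land close to the true N, with individual estimates scattered around it.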

Sample calculation

In the example (K, n, k) = (10, 15, 5) the Lincoln–Petersen method estimates that there are 30 turtles in the lake.

\hat{N} = \frac{Kn}{k} = \frac{10\times 15}{5} = 30
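The same calculation as a short function, using the article's notation (K, n, k). The function name `lincoln_petersen` and the choice to raise an error when there are no recaptures are assumptions of this sketch.

```python
def lincoln_petersen(K, n, k):
    """Estimate population size as K * n / k.

    K: animals marked on the first visit
    n: animals captured on the second visit
    k: animals in the second sample that carry a mark
    """
    if k <= 0:
        raise ValueError("no recaptures; the estimate is undefined")
    return K * n / k


print(lincoln_petersen(10, 15, 5))
```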

Chapman estimator

The Lincoln–Petersen estimator is asymptotically unbiased as sample size approaches infinity, but is biased at small sample sizes.[6] An alternative, less biased estimator of population size is given by the Chapman estimator:[6]

\hat{N}_C = \frac{(K+1)(n+1)}{k+1} - 1

Sample calculation

The example (K, n, k) = (10, 15, 5) gives

\hat{N}_C = \frac{(K+1)(n+1)}{k+1} -1= \frac{11\times 16}{6}-1 = 28.3

Note that the answer provided by this equation must be truncated, not rounded. Thus, the Chapman method estimates 28 turtles in the lake.
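A minimal sketch of the Chapman calculation, including the truncation step. The function names `chapman` and `chapman_truncated` are illustrative. Unlike Kn/k, the expression remains defined when k = 0.

```python
import math


def chapman(K, n, k):
    """Chapman estimator: (K + 1)(n + 1) / (k + 1) - 1."""
    return (K + 1) * (n + 1) / (k + 1) - 1


def chapman_truncated(K, n, k):
    """Chapman estimate truncated to a whole number of animals."""
    return math.floor(chapman(K, n, k))


print(chapman(10, 15, 5), chapman_truncated(10, 15, 5))
```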

Surprisingly, Chapman's estimate was one conjecture from a range of possible estimators: "In practice, the whole number immediately less than (K+1)(n+1)/(k+1) or even Kn/(k+1) will be the estimate. The above form is more convenient for mathematical purposes."[6] (see footnote, page 144). Chapman also found the estimator could have considerable negative bias for small Kn/N[6] (page 146), but was unconcerned because the estimated standard deviations were large for these cases.

A Bayesian analysis[7] found that Chapman's estimator is the maximum a posteriori estimator for a situation when the first person searches for a fixed number of animals then marks and releases them, but the second person searches for as many animals as possible.

Variance

An approximately unbiased variance of {\textstyle \hat{N}_C} can be estimated as:

 \operatorname{var}(\hat{N}_C) = \frac{(K+1)(n+1)(K-k)(n-k)}{(k+1)(k+1)(k+2)}.

The moments of the hypergeometric equation studied by Chapman can be calculated exactly, giving the answers discussed below.
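The variance formula above can be evaluated directly for the turtle example (K, n, k) = (10, 15, 5). The function name `chapman_variance` is an assumption of this sketch; the square root of the result gives an approximate standard error.

```python
import math


def chapman_variance(K, n, k):
    """Variance estimate for the Chapman estimator, as given in the text."""
    numerator = (K + 1) * (n + 1) * (K - k) * (n - k)
    denominator = (k + 1) * (k + 1) * (k + 2)
    return numerator / denominator


variance = chapman_variance(10, 15, 5)
standard_error = math.sqrt(variance)
print(round(variance, 2), round(standard_error, 2))
```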

Confidence interval

An approximate 100(1-\alpha)% confidence interval for the population size N can be obtained as:

K + n - k + \frac{(K-k+0.5)(n-k+0.5)}{(k+0.5)}\exp(\pm z_{\alpha/2}\hat{\sigma}_{0.5}),

where {\textstyle z_{\alpha/2}} corresponds to the 1-\alpha/2 quantile of a standard normal random variable, and

\hat{\sigma}_{0.5}=\sqrt{\frac{1}{k+0.5}+\frac{1}{K-k+0.5}+\frac{1}{n-k+0.5}+\frac{k+0.5}{(n-k+0.5)(K-k+0.5)}}.

It has been shown that this confidence interval has actual coverage probabilities that are close to the nominal 100(1-\alpha)% level even for small populations and extreme capture probabilities (near to 0 or 1), in which cases other confidence intervals fail to achieve the nominal coverage levels.[8]
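As a sketch, the interval can be computed exactly as written above, with the normal quantile taken from Python's standard library. The function name `approx_confidence_interval` and the default alpha = 0.05 are assumptions for illustration.

```python
import math
from statistics import NormalDist


def approx_confidence_interval(K, n, k, alpha=0.05):
    """Return (lower, upper) limits of the approximate 100(1 - alpha)% interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # the 1 - alpha/2 standard normal quantile
    a = K - k + 0.5   # marked animals not recaptured, plus 0.5
    b = n - k + 0.5   # unmarked animals in the second sample, plus 0.5
    c = k + 0.5       # recaptures, plus 0.5
    sigma = math.sqrt(1 / c + 1 / a + 1 / b + c / (b * a))
    base = K + n - k  # number of distinct animals seen
    scale = a * b / c
    lower = base + scale * math.exp(-z * sigma)
    upper = base + scale * math.exp(z * sigma)
    return lower, upper


lower, upper = approx_confidence_interval(10, 15, 5)
print(round(lower, 1), round(upper, 1))
```

The lower limit can never fall below K + n - k, the number of distinct animals actually observed.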

Bayesian estimate

A Bayesian analysis is provided by Webster and Kemp (2013).[7] The final answer depends on the priors and the type of search assumed, but the approach gives, for Chapman's assumptions (and changing variables from the original notation),

Mean value ± standard deviation

N\approx \frac{(K-1)(n-1)}{k-2}\pm\sqrt{\frac{(K-1)(n-1)(K-k+1)(n-k+1)}{(k-2)(k-2)(k-3)}}

A derivation is found here: Talk:Mark and recapture#Statistical treatment.

Sample calculation

The example (K, n, k) = (10, 15, 5) gives the estimate N ≈ 42 ± 21.5
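The mean and standard deviation quoted above follow from evaluating the two expressions directly. In this sketch the function name `bayes_mean_sd` is illustrative, and the guard k > 3 is an assumption added so that both denominators stay positive.

```python
import math


def bayes_mean_sd(K, n, k):
    """Return (mean, standard deviation) from the approximation in the text."""
    if k <= 3:
        raise ValueError("need k > 3 for both expressions to be defined")
    mean = (K - 1) * (n - 1) / (k - 2)
    variance = ((K - 1) * (n - 1) * (K - k + 1) * (n - k + 1)
                / ((k - 2) * (k - 2) * (k - 3)))
    return mean, math.sqrt(variance)


mean, sd = bayes_mean_sd(10, 15, 5)
print(mean, round(sd, 1))
```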

More than two visits

The literature on the analysis of capture-recapture studies has blossomed since the early 1990s[citation needed]. There are very elaborate statistical models available for the analysis of these experiments.[9] A simple model that easily accommodates the three-source or three-visit study is a Poisson regression model. Sophisticated mark-recapture models can be fit using Rcapture,[10] a package for the open-source R programming language, or specialized programs such as MARK[11] or M-SURGE.[12] Other related methods that are often used include the Jolly–Seber model (used in open populations and for multiple census estimates) and Schnabel estimators (an extension of the Lincoln–Petersen method for closed populations). These are described in detail by Sutherland.[13]
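As a rough illustration of a multi-visit calculation, the sketch below implements the Schnabel estimator in its commonly cited form, the sum over visits of (animals caught × marked animals already at large) divided by the total number of recaptures. This form is not stated in the article and is an assumption here, as are the function name `schnabel`, the visit counts, and the premise that every unmarked animal caught is marked and released into a closed population.

```python
def schnabel(caught, recaptured):
    """Schnabel-type estimate from per-visit counts.

    caught[t]: animals caught on visit t
    recaptured[t]: animals caught on visit t that already carried a mark
    """
    marked_at_large = 0
    numerator = 0
    total_recaptures = 0
    for c, r in zip(caught, recaptured):
        numerator += c * marked_at_large
        total_recaptures += r
        marked_at_large += c - r  # newly marked animals join the marked pool
    if total_recaptures == 0:
        raise ValueError("no recaptures; the estimate is undefined")
    return numerator / total_recaptures


# With two visits this reduces to K * n / k.
print(schnabel([10, 15], [0, 5]))
# Hypothetical three-visit study.
print(schnabel([10, 15, 12], [0, 5, 6]))
```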

Integrated approaches

Modelling mark-recapture data is trending towards a more integrative approach,[14] which combines mark-recapture data with population dynamics models and other types of data. The integrated approach is more computationally demanding, but it extracts more information from the data, improving parameter and uncertainty estimates.[15]


References

  1. [citation text missing from source]
  2. Chao, A., Tsay, P. K., Lin, S. H., Shau, W. Y., and Chao, D. Y., 2001, The applications of capture-recapture models to epidemiological data, Statistics in Medicine, volume 20, issue 20, pages 3123–3157, doi 10.1002/sim.996
  3. Southwood, T.R.E. & Henderson, P. (2000) Ecological Methods, 3rd edn. Blackwell Science, Oxford.
  4. Seber, G.A.F. The Estimation of Animal Abundance and Related Parameters. Caldwell, New Jersey: Blackburn Press. ISBN 1-930665-55-5
  5. [citation text missing from source]
  6. [citation text missing from source]
  7. Webster and Kemp (2013).
  8. [citation text missing from source]
  9. McCrea, R.S. and Morgan, B.J.T. (2014).
  10. [citation text missing from source]
  11. [citation text missing from source]
  12. [citation text missing from source]
  13. [citation text missing from source]
  14. Maunder, M.N. (2003) Paradigm shifts in fisheries stock assessment: from integrated analysis to Bayesian analysis and back again. Natural Resource Modeling 16:465–475.
  15. Maunder, M.N. (2001) Integrated Tagging and Catch-at-Age Analysis (ITCAAN). In Spatial Processes and Management of Fish Populations, edited by G.H. Kruse, N. Bez, A. Booth, M.W. Dorn, S. Hills, R.N. Lipcius, D. Pelletier, C. Roy, S.J. Smith, and D. Witherell, Alaska Sea Grant College Program Report No. AK-SG-01-02, University of Alaska Fairbanks, pp. 123–146.

Further reading

  • Bonett, D.G., Woodward, J.A., & Bentler, P.M. (1986). "A Linear Model for Estimating the Size of a Closed Population", British Journal of Mathematical and Statistical Psychology, 39, 28–40.
  • Evans, M.A., Bonett, D.G., & McDonald, L. (1994). "A General Theory for Analyzing Capture-recapture Data in Closed Populations." Biometrics, 50, 396–405.
  • Lincoln, F. C. (1930). "Calculating Waterfowl Abundance on the Basis of Banding Returns". United States Department of Agriculture Circular, 118, 1–4.
  • Petersen, C. G. J. (1896). "The Yearly Immigration of Young Plaice Into the Limfjord From the German Sea", Report of the Danish Biological Station (1895), 6, 5–84.
  • Schofield, J. R. (2007). "Beyond Defect Removal: Latent Defect Estimation With Capture-Recapture Method", Crosstalk, August 2007; 27–29.
