Talk:Mark and recapture

{{WikiProject banner shell|class=Start|1=

{{WikiProject Statistics|importance=low}}

{{WikiProject Ecology|importance=top}}

{{WikiProject Biology|importance=mid|attention=yes}}

{{WikiProject Mathematics|importance=low}}

}}

{{annual readership|scale=log}}

Other methods

Jolly's method should also be mentioned.

Yes, the Cormack-Jolly-Seber method should be described here. It is of extreme importance. 24.147.119.33 17:51, 20 March 2007 (UTC) gukarma

Sources

Some sources, references, links and related topics would be nice... Would anyone with knowledge of the subject care to do it?

--Lucas Gallindo 22:52, 12 July 2007 (UTC)

Lincoln Index

Please see Talk:Lincoln Index to discuss relation of content here with that of the newer article. Melcombe (talk) 12:20, 2 December 2010 (UTC)

Merge discussion

I would like to propose that Tag and release is merged with Mark and recapture for two main reasons;

  1. Tag and release has very little content
  2. Tag and release is just another mark and recapture method

Jamesmcmahon0 (talk) 12:04, 2 May 2013 (UTC)

  • Oppose – Tag and release can include Mark and recapture, but it can also be quite different. For example small archival tags can be attached to marine animals like fish. These archival tags can be equipped with a camera or sensors that monitor and log things like salinity, temperature, depth, acceleration, and pitch and roll. They are designed to detach at a later date and float to the surface where some method can be used to retrieve the logged data. This has nothing to do with "marking" or recapture. Tag and release is often linked with catch and release, and is a term widely used, particularly in fisheries and by recreational fishermen. --Epipelagic (talk) 04:21, 3 May 2013 (UTC)
  • Support – In school we learned about mark and recapture as "tag and release"; I had never heard the term "mark and recapture" before. Mark and recapture is definitely known as tag and release in many textbooks and the MCPS curriculum. 173.79.218.246 (talk) 09:32, 29 May 2013 (UTC)

::Also, an example of an organization that refers to the practice as "tag and release" despite not being related to fishing: http://www.thevlm.org/turtle_tag_release.aspx 173.79.218.246 (talk) 09:36, 29 May 2013 (UTC)

  • Oppose - Despite Tag and release being a small page, it doesn't mention anything about the purpose of estimating population size. - Paul2520 (talk) 09:33, 21 October 2015 (UTC)

Statistical treatment

I was requested here to improve this article. However, my expertise is in Bayesian statistics, and so any contribution of mine on Mark and Recapture might be considered original research.

Assume that K animals out of a population of unknown size N have been marked. Later, n animals are captured, of which k turn out to be marked.

N−K animals are unmarked. N−n animals are uncaptured. n−k captured animals are unmarked. K−k marked animals are uncaptured. N−K−n+k uncaptured animals are unmarked. As all these numbers are non-negative, the following inequalities result: (N ≥ K+n−k) and (n ≥ k) and (K ≥ k) and (k ≥ 0).

Knowing K and n and k, the problem is to estimate N.

= Probability distribution =

The conditional probability (k|N) of observing k knowing N (and n and K) is the hypergeometric distribution:

:(k|N)=\frac{\binom K k \binom{N-K}{n-k}}{\binom N n}

But we are interested in estimating N knowing k.
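For anyone who wants to check the hypergeometric probability above numerically, here is a small sketch in Python (illustrative only; the computations elsewhere in this thread were done in J):

```python
from math import comb

def p_k_given_N(k, N, K, n):
    """Hypergeometric probability of observing k marked animals
    among n captured, when K of the N animals are marked."""
    if not (0 <= k <= min(K, n) and N >= K + n - k):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Sanity check: for fixed N, the probabilities over all possible k sum to 1
# (Vandermonde's identity).
K, n, N = 11, 10, 100
total = sum(p_k_given_N(k, N, K, n) for k in range(0, min(K, n) + 1))
```

The guard clause returns 0 outside the support, matching the inequalities N ≥ K+n−k, K ≥ k and n ≥ k stated above.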

= Credibility distribution =

When there is no prior knowledge regarding N and k, the credibility distribution (N|k) is proportional to the likelihood function (k|N).

:(N|k)=\frac{(k|N)}{\sum_{N=K+n-k}^\infty (k|N)}

Inserting the expression for (k|N) and cancelling the common factor:

:(N|k)=\frac{\frac{\binom{N-K}{n-k}}{\binom N n}}{\sum_{N=K+n-k}^\infty \frac{\binom{N-K}{n-k}}{\binom N n}}

The denominator series is convergent for k ≥ 2.
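The convergence claim can be illustrated numerically: the terms behave like N^(−k) for large N, so the partial sums keep growing for k = 1 but stabilize for k = 2. A Python sketch (illustrative only, with the thread's K = 11 and n = 10 as defaults):

```python
from math import comb

def partial_sum(k, N_max, K=11, n=10):
    """Partial sum of the normalizing series sum_N C(N-K, n-k) / C(N, n)."""
    return sum(comb(N - K, n - k) / comb(N, n)
               for N in range(K + n - k, N_max + 1))

# Terms decay like N**(-k), so the series diverges for k = 1
# (harmonic-like growth) and converges for k = 2.
grow_k1 = partial_sum(1, 100_000) - partial_sum(1, 50_000)
grow_k2 = partial_sum(2, 100_000) - partial_sum(2, 50_000)
```

Doubling the truncation point adds roughly n·ln 2 ≈ 6.9 to the k = 1 sum but less than 0.001 to the k = 2 sum, which is the divergence/convergence boundary asserted above.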

= Graphs =

 load 'plot' NB. plotting software
 LF =: 4 : 0 NB. Likelihood Function
 'K n k'=:x
 (N>:K+n-k)*((n-k)!N-K)%n!N=:i.y
 )
 g =: [: 'dot; labels 1 0 ; pensize 4' & plot LF
 11 10 0 g 701
 11 10 1 g 501
 11 10 2 g 301
 11 10 3 g 101
 11 10 4 g 71

File:Likelihood function for Mark and Retrieve. K=11, n=10, k=0.png

File:Likelihood function for Mark and Retrieve. K=11, n=10, k=1.png

File:Likelihood function for Mark and Retrieve. K=11, n=10, k=2.png

File:Likelihood function for Mark and Retrieve. K=11, n=10, k=3.png

File:Likelihood function for Mark and Retrieve. K=11, n=10, k=4.png

This J program created the five graphs to the right, showing likelihood functions for the total number of animals N, for K=11 marked animals and n=10 captured animals, and for k= 0, 1, 2, 3 and 4 recaptured marked animals.

When k=0 we did not recapture any marked animals, so the observation places no upper bound on how many animals there are. The likelihood function has no maximum, and so the maximum likelihood estimate is infinite.

When k=1 we recaptured a single marked animal, and the likelihood function has a maximum at N=110, but the median is infinite.

When k=2 we recaptured two marked animals. The maximum likelihood is at N=55, and the median is at N=137. A 95% confidence interval is 19≤N≤1367, but the mean value is infinite.

When k=3 we recaptured three marked animals. The maximum likelihood is at N=36, and the median is at N=57. A 95% confidence interval is 18≤N≤236. The mean value is finite but the standard deviation is infinite.

When k=4 the maximum likelihood is at N=27. The median is at N=36. The 95% confidence interval is 17≤N≤ 96.

The frequentist formulas from the article give the estimates N ≈ Kn/k = 27.5 and N ≈ (K+1)(n+1)/(k+1)−1 = 25.4.
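Those two frequentist estimates are trivial to reproduce; a Python sketch (the function names are mine, not from the article):

```python
def lincoln_petersen(K, n, k):
    """Classical point estimate N ~ K*n/k."""
    return K * n / k

def chapman(K, n, k):
    """Less biased variant N ~ (K+1)(n+1)/(k+1) - 1."""
    return (K + 1) * (n + 1) / (k + 1) - 1

lp = lincoln_petersen(11, 10, 4)   # 27.5
ch = chapman(11, 10, 4)            # 25.4
```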

   b=.6000 LF~a=.11 10 4
   +/0>:2-/\b
 27
   *`%/a
 27.5
   *`%/&.:>:a
 25.4
   +/0 0.95 0.5<:/~(%{:)+/\b
 17 96 36

Bo Jacoby (talk) 06:14, 6 March 2014 (UTC).

= Order of magnitude and statistical uncertainty =

Knowing the credibility distribution function, (N|k), one can compute the order of magnitude, μ, and the statistical uncertainty, σ, of the unknown number N.

:N\approx \mu \pm \sigma

where

:\mu =\sum_{N=K+n-k}^\infty (N|k)N

:\sigma^2+\mu^2 =\sum_{N=K+n-k}^\infty (N|k)N^2
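The two sums can be evaluated directly by truncating at a large N; for K=11, n=10, k=4 this reproduces the values μ = 45 and σ ≈ 35.5 quoted elsewhere in this thread. A Python sketch (illustrative; truncation point is my choice):

```python
from math import comb

def mu_sigma(K, n, k, N_max=200_000):
    """Mean and standard deviation of the credibility distribution (N|k),
    computed by truncating the sums at N_max. Needs k >= 4 for both
    moments to be finite."""
    s0 = s1 = s2 = 0.0
    for N in range(K + n - k, N_max + 1):
        w = comb(N - K, n - k) / comb(N, n)  # unnormalized (N|k)
        s0 += w
        s1 += w * N
        s2 += w * N * N
    mu = s1 / s0
    sigma = (s2 / s0 - mu * mu) ** 0.5
    return mu, sigma

mu, sigma = mu_sigma(11, 10, 4)
```

The σ sum converges slowly (terms decay like 1/N² for k = 4), so the truncation point matters more for σ than for μ.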

Bo Jacoby (talk) 14:10, 27 February 2014 (UTC).

= Summation =

A closed form for the above sums can be found using Gosper's algorithm. However, WolframAlpha does not do it immediately [http://www.wolframalpha.com/input/?i=sum_+{N%3Dn%2BK-k}^{m-2}%28+binomial%28N-K%2Cn-k%29*N^Q%2Fbinomial%28N%2Cn%29%29]. But the following detour does the trick.

Define the sums

:S_Q = {\sum_{N=K+n-k}^\infty \frac{\binom{N-K}{n-k}\binom{N}{Q}}{\binom{N}{n}}} for Q = 0, 1, 2

so

:\mu=\frac{S_1}{S_0}

and

:\frac{\sigma^2}{\mu}+\mu-1= 2\frac{S_2}{S_1}

The sums are evaluated [http://www.wolframalpha.com/input/?i=sum_+{N%3Dn%2BK-k}^{m-2}%28+binomial%28N-K%2Cn-k%29*binomial%28N%2CQ%29%2Fbinomial%28N%2Cn%29%29]

:\sum _{N=K+n-k}^{m-2} \frac{\binom{N-K}{n-k}\binom{N}{Q}}{\binom{N}{n}}=\frac{\binom{n+K-k}{Q}}{\binom{n+K-k}{n}}A_Q-\frac{\binom{m-1}{Q}}{\binom{m-1}{n}}\binom{m-K-1}{n-k}B_Q

where

:A_Q=\,_2F_1(1+K-k,1+n-k;1+K+n-k-Q;1)

and

:B_Q=\,_3F_2(1,m-K,m-n;m-Q,m-(K+n-k);1)

are generalized hypergeometric functions.

The limiting case for m → ∞ is

:S_Q=\frac{\binom{K+n-k}{Q}}{\binom{K+n-k}{n}}A_Q

and so

:Q \frac{S_Q}{S_{Q-1}}=(K+n-k-Q+1){A_Q\over A_{Q-1}}

Gauss's theorem

:_2F_1 (a,b;c;1)=\frac{(c-1)!}{(c-a-1)!} \frac{(c-a-b-1)!}{(c-b-1)!}

gives the simplification

:A_Q=\frac{(K+n-k-Q)!}{(K-Q-1)!}\frac{(k-Q-2)!}{(n-Q-1)!}

so

:\frac{A_Q}{A_{Q-1}}=\frac{K-Q}{K+n-k-Q+1}\frac{n-Q}{k-Q-1}

and

:Q \frac{S_Q}{S_{Q-1}}=\frac{K-Q}1\frac{n-Q}{k-Q-1}

So μ and σ are given by

:\mu=\frac{K-1}1\frac{n-1}{k-2} for k≥3

and

:\frac{\sigma^2}{\mu}+\mu-1=\frac{K-2}1\frac{n-2}{k-3} for k≥4.

and the final result is [http://www.wolframalpha.com/input/?i=1%2B%28K-2%29%28n-2%29%2F%28k-3%29-%28K-1%29%28n-1%29%2F%28k-2%29]

:N\approx \frac{K-1}1\frac{n-1}{k-2}\pm\sqrt{\frac{K-1}1\frac{n-1}{k-2}\frac{K-k+1}{k-2}\frac{n-k+1}{k-3}}
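The closed form can be packaged as a short function, which reproduces the table in the Example section below (the function name is mine; the formulas are exactly the ones just derived):

```python
def mark_recapture_estimate(K, n, k):
    """Closed-form mean and standard deviation of N:
    mu = (K-1)(n-1)/(k-2), valid for k >= 3;
    sigma from sigma^2/mu + mu - 1 = (K-2)(n-2)/(k-3), valid for k >= 4."""
    if k < 4:
        raise ValueError("closed-form sigma requires k >= 4")
    mu = (K - 1) * (n - 1) / (k - 2)
    sigma2 = mu * ((K - 2) * (n - 2) / (k - 3) - mu + 1)
    return mu, sigma2 ** 0.5

# K = 11 marked, n = 10 captured, k recaptured:
results = {k: mark_recapture_estimate(11, 10, k) for k in (4, 5, 6)}
```

For k=4 this gives μ = 45 and σ = √1260 ≈ 35.50, matching the J output below.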

Bo Jacoby (talk) 09:51, 25 July 2014 (UTC).

Lately I abandoned fraction bars and square root signs, using negative exponents and fractional exponents instead.

:N\approx A B C^{-1}(1\pm (1-A^{-1}C)^{2^{-1}}(1-B^{-1}C)^{2^{-1}}(C-1)^{-2^{-1}})

where A=K-1 and B=n-1 and C=k-2.

Bo Jacoby (talk) 20:45, 10 July 2017 (UTC).

= Example =

   ]a=.11 10 4, 11 10 5,: 11 10 6
 11 10 4
 11 10 5
 11 10 6
   MR=.[:({.,.[:%:{.*[:>:-~/)[:(*`%`:3"1)0 1-~/1 1 2-~"1/]
   MR a
 45 35.4965
 30 14.4914
 22.5 7.5

This calculation for K = 11 and n = 10 shows that if k = 4 then N ≈ 45 ± 35.5. If k=5 then N ≈ 30 ± 14.5. If k=6 then N ≈ 22.5 ± 7.5.

Bo Jacoby (talk) 20:07, 8 July 2014 (UTC).

Thanks for this; I think I mostly followed what you have here. The material I've been looking at uses frequentist MLE approximations, so seeing it done from a Bayesian perspective is very interesting. Have you seen any references, or is it purely your own work? Jamesmcmahon0 (talk) 11:18, 4 March 2014 (UTC)

:Hi James.

:I recently wrote this paper on Mark and Recapture.

:https://www.dropbox.com/scl/fi/1ch767qo87qr2qzazsfkl/duction.pdf?rlkey=opm0x4byn6xqg6yoj6mden6j7&dl=0

:There is also a text in Open Office format, such that you may play with the spreadsheets.

:https://www.dropbox.com/scl/fi/h1kpdzfmjp0qi17dxi6et/duction.odt?rlkey=0iei7sc1df7jdth8sejqo3jpf&dl=0

:The method is much simpler than what I did 10 years ago, but the results are not the same.

:Now I get: for K = 11 and n = 10 , if k = 4 then N ≈ 35 ± 18. If k=5 then N ≈ 26 ± 10. If k=6 then N ≈ 21 ± 6. I am not sure which result is most wrong.

:Bo Jacoby (talk) 06:25, 29 May 2024 (UTC)

Reference to 2013 paper in The American Statistician ([http://arxiv.org/abs/1205.1150 arXiv]) for a Bayesian analysis, which demonstrates the above. — Preceding unsigned comment added by 194.81.223.66 (talk) 13:11, 15 December 2014 (UTC)

Tagging may alert predators

Just read [http://www.gizmag.com/tagging-fish-predators-whereabouts/34852/ this]; the linked article will probably have contributions for this article. I don't have time for the foreseeable future unfortunately. 86.179.58.222 (talk) 18:54, 20 November 2014 (UTC)

Confidence Interval

The formula

K + n - k + \frac{(K-k+0.5)(n-k+0.5)}{(k+0.5)}\exp(\pm z_{\alpha/2}\hat{\sigma}_{0.5})

should I think be

K + n - k -0.5 + \frac{(K-k+0.5)(n-k+0.5)}{(k+0.5)}\exp(\pm z_{\alpha/2}\hat{\sigma}_{0.5})

See (6) in {{Cite journal |title = Transformed Logit Confidence Intervals for Small Populations in Single Capture–Recapture Estimation |journal = Communications in Statistics - Simulation and Computation |date = 2009-10-01 |issn = 0361-0918 |pages = 1909–1924 |volume = 38 |issue = 9 |doi = 10.1080/03610910903168595 |first = Mauricio |last = Sadinle }}

I have made this change.

Nick Mulgan (talk) 01:53, 1 February 2019 (UTC)
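For checking such interval values, here is a sketch of the corrected formula in Python. Note the variance term σ̂₀.₅ is not given in this thread; the expression below is my assumption of the form used in the main article (following Sadinle 2009), so treat it accordingly:

```python
from math import exp, sqrt

def transformed_logit_ci(K, n, k, z=1.96):
    """Transformed logit confidence interval for N, per the corrected
    formula: K + n - k - 0.5 + spread * exp(+/- z * sigma_05).
    The sigma_05 expression is an assumption, not quoted from this thread."""
    sigma_05 = sqrt(1 / (k + 0.5) + 1 / (K - k + 0.5)
                    + 1 / (n - k + 0.5)
                    + (k + 0.5) / ((K - k + 0.5) * (n - k + 0.5)))
    base = K + n - k - 0.5
    spread = (K - k + 0.5) * (n - k + 0.5) / (k + 0.5)
    return (base + spread * exp(-z * sigma_05),
            base + spread * exp(z * sigma_05))

lo, hi = transformed_logit_ci(11, 10, 4)
```

Since exp(−zσ̂) < exp(+zσ̂), the lower endpoint always sits below the upper one, and both exceed K + n − k − 0.5.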

:Maybe because of this change, but when I work out the example confidence interval I get 23 to 51, not the values shown on the web page. 2601:47:477E:43C0:53E:E0C2:D4C7:8CAC (talk) 15:22, 9 September 2024 (UTC)