Fisher transformation

{{short description|Statistical transformation}}

{{Redirect-distinguish|Fisher z-transformation|Fisher's z-distribution}}

Image:fisher transformation.svg

In statistics, the Fisher transformation (or Fisher z-transformation) of a Pearson correlation coefficient is its inverse hyperbolic tangent (artanh).

When the sample correlation coefficient r is near 1 or -1, its distribution is highly skewed, which makes it difficult to estimate confidence intervals and apply tests of significance for the population correlation coefficient ρ.{{cite journal| last=Fisher | first= R. A. | year=1915 | title= Frequency distribution of the values of the correlation coefficient in samples of an indefinitely large population | journal=Biometrika | volume=10 | pages=507–521 | jstor=2331838| issue=4| doi=10.2307/2331838| hdl= 2440/15166 | hdl-access=free }}{{cite journal|authorlink=Ronald Fisher | last=Fisher | first= R. A. | year=1921 | title=On the 'probable error' of a coefficient of correlation deduced from a small sample | journal=Metron | volume=1 | pages=3–32|url=http://digital.library.adelaide.edu.au/dspace/bitstream/2440/15169/1/14.pdf}}Rick Wicklin. Fisher's transformation of the correlation coefficient. September 20, 2017. https://blogs.sas.com/content/iml/2017/09/20/fishers-transformation-correlation.html. Accessed Feb 15, 2022.

The Fisher transformation solves this problem by yielding a variable that is approximately normally distributed, with a variance that is nearly constant across different values of ρ.

Definition

Given a set of N bivariate sample pairs (X_i, Y_i), i = 1, ..., N, the sample correlation coefficient r is given by

:r = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{\sum ^N _{i=1}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum ^N _{i=1}(X_i - \bar{X})^2} \sqrt{\sum ^N _{i=1}(Y_i - \bar{Y})^2}}.

Here \operatorname{cov}(X,Y) stands for the covariance between the variables X and Y and \sigma stands for the standard deviation of the respective variable. Fisher's z-transformation of r is defined as

:z = {1 \over 2}\ln\left({1+r \over 1-r}\right) = \operatorname{artanh}(r),

where "ln" is the natural logarithm function and "artanh" is the inverse hyperbolic tangent function.
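As an illustration, the definition can be evaluated directly. The following Python sketch uses only the standard library; the function names `sample_correlation` and `fisher_z` and the five data pairs are hypothetical, chosen only for this example.

```python
import math

def sample_correlation(xs, ys):
    """Pearson sample correlation r of paired observations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares about the sample means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r):
    """Fisher transformation z = (1/2) ln((1 + r) / (1 - r)), for -1 < r < 1."""
    return 0.5 * math.log((1 + r) / (1 - r))

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = sample_correlation(xs, ys)  # 0.8 for this data
z = fisher_z(r)                 # (1/2) ln 9, about 1.0986
```

The logarithmic form agrees with `math.atanh(r)` up to rounding, and `math.tanh(z)` recovers r.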

If (X, Y) has a bivariate normal distribution with correlation ρ and the pairs (X_i, Y_i) are independent and identically distributed, then z is approximately normally distributed with mean

:{1 \over 2}\ln\left({{1+\rho} \over {1-\rho}}\right),

and a standard deviation that does not depend on the value of the correlation ρ (i.e., the transformation is variance-stabilizing)

:{1 \over \sqrt{N-3}},

where N is the sample size, and ρ is the true correlation coefficient.

This transformation, and its inverse

:r = \frac{\exp(2z)-1}{\exp(2z)+1} = \operatorname{tanh}(z),

can be used to construct a large-sample confidence interval for ρ using standard normal theory: the interval is computed for artanh(ρ) on the z scale, and its endpoints are then mapped back with tanh. See also application to partial correlation.
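The approximate standard deviation 1/√(N − 3) stated above can be checked with a small Monte Carlo experiment. The sketch below is illustrative only: the function name `simulate_z_sd`, the sample size, the number of trials, and the seed are arbitrary choices made for this example, and bivariate normal pairs are generated as X = U, Y = ρU + √(1 − ρ²)V from independent standard normal U and V.

```python
import math
import random

def simulate_z_sd(rho, n, trials, seed=0):
    """Empirical standard deviation of z = artanh(r) over simulated samples."""
    rng = random.Random(seed)
    c = math.sqrt(1 - rho * rho)
    zs = []
    for _ in range(trials):
        xs, ys = [], []
        for _ in range(n):
            u = rng.gauss(0, 1)
            v = rng.gauss(0, 1)
            xs.append(u)
            ys.append(rho * u + c * v)  # corr(X, Y) = rho by construction
        mx = sum(xs) / n
        my = sum(ys) / n
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx = sum((x - mx) ** 2 for x in xs)
        syy = sum((y - my) ** 2 for y in ys)
        zs.append(math.atanh(sxy / math.sqrt(sxx * syy)))
    mean_z = sum(zs) / trials
    var_z = sum((z - mean_z) ** 2 for z in zs) / (trials - 1)
    return math.sqrt(var_z)

n = 20
predicted_sd = 1 / math.sqrt(n - 3)
# The empirical values should all lie close to predicted_sd,
# regardless of the population correlation rho.
empirical_sd = {rho: simulate_z_sd(rho, n, trials=2000) for rho in (0.0, 0.5, 0.9)}
```

With these settings the three empirical standard deviations should differ from `predicted_sd` by only a few percent, whereas the standard deviation of r itself would shrink markedly as ρ approaches 1.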

Derivation

{{cleanup|reason=the steps of the derivation are not laid out completely.|date=July 2021}}

File:Fisher Transformation.png

Hotelling gives a concise derivation of the Fisher transformation.{{Cite journal|last=Hotelling|first=Harold|date=1953|title=New Light on the Correlation Coefficient and its Transforms|url=http://dx.doi.org/10.1111/j.2517-6161.1953.tb00135.x|journal=Journal of the Royal Statistical Society, Series B (Methodological)|volume=15|issue=2|pages=193–225|doi=10.1111/j.2517-6161.1953.tb00135.x|issn=0035-9246}}

To derive the Fisher transformation, one starts by considering an arbitrary increasing, twice-differentiable function of r, say G(r). Finding the first term in the large-N expansion of the corresponding skewness \kappa_3 results{{Cite journal|last=Winterbottom|first=Alan|date=1979|title=A Note on the Derivation of Fisher's Transformation of the Correlation Coefficient|url=http://dx.doi.org/10.2307/2683819|journal=The American Statistician|volume=33|issue=3|pages=142–143|doi=10.2307/2683819|jstor=2683819 |issn=0003-1305}} in

:\kappa_3=\frac{6\rho -3(1-\rho ^{2})G^{\prime \prime }(\rho )/G^{\prime }(\rho )}{\sqrt{N}}+O(N^{-3/2}).

Setting \kappa_3=0 and solving the corresponding differential equation for G yields the inverse hyperbolic tangent G(\rho)=\operatorname{artanh}(\rho) function.
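The intermediate steps can be sketched as follows; this is a reconstruction from the expansion above, with C and D denoting arbitrary integration constants introduced here. Setting the leading term of the skewness to zero gives

:6\rho - 3(1-\rho^{2})\frac{G^{\prime\prime}(\rho)}{G^{\prime}(\rho)} = 0 \quad\Longrightarrow\quad \frac{G^{\prime\prime}(\rho)}{G^{\prime}(\rho)} = \frac{2\rho}{1-\rho^{2}}.

Integrating once,

:\ln G^{\prime}(\rho) = -\ln(1-\rho^{2}) + \text{const} \quad\Longrightarrow\quad G^{\prime}(\rho) = \frac{C}{1-\rho^{2}},

and integrating again by partial fractions,

:G(\rho) = \frac{C}{2}\ln\left(\frac{1+\rho}{1-\rho}\right) + D = C\operatorname{artanh}(\rho) + D.

Choosing C = 1 and D = 0 gives the Fisher transformation. Any other choice with C > 0 is an increasing affine function of it and removes the leading skewness term equally well.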

Similarly expanding the mean m and variance v of \operatorname{artanh}(r), one gets

:m = \operatorname{artanh}(\rho )+\frac{\rho }{2N}+O(N^{-2})

and

:v = \frac{1}{N}+\frac{6-\rho ^{2}}{2N^{2}}+O(N^{-3})

respectively.

The extra terms are not part of the usual Fisher transformation. For large values of \rho and small values of N they represent a large improvement of accuracy at minimal cost, although they greatly complicate the computation of the inverse – a closed-form expression is not available. The near-constant variance of the transformation is the result of removing its skewness – the actual improvement is achieved by the latter, not by the extra terms. Including the extra terms, i.e., computing (z − m)/v^{1/2}, yields:

:\frac{z-\operatorname{artanh}(\rho )-\frac{\rho }{2N}}{\sqrt{\frac{1}{N}+\frac{6-\rho ^{2}}{2N^{2}}}}

which has, to an excellent approximation, a standard normal distribution.{{cite journal |last1=Vrbik |first1=Jan |title=Population moments of sampling distributions |journal=Computational Statistics |date=December 2005 |volume=20 |issue=4 |pages=611–621 |doi=10.1007/BF02741318|s2cid=120592303 }}
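For comparison, both standardizations can be written as short functions. This is a minimal sketch that transcribes the expansions for m and v given above; the function names `usual_statistic` and `corrected_statistic` are hypothetical.

```python
import math

def usual_statistic(r, rho, n):
    """Standardize z with mean artanh(rho) and standard deviation 1/sqrt(n - 3)."""
    return (math.atanh(r) - math.atanh(rho)) * math.sqrt(n - 3)

def corrected_statistic(r, rho, n):
    """Standardize z with the expanded mean m and variance v from the text."""
    m = math.atanh(rho) + rho / (2 * n)
    v = 1 / n + (6 - rho ** 2) / (2 * n ** 2)
    return (math.atanh(r) - m) / math.sqrt(v)

# When r equals rho the usual statistic is exactly zero, while the corrected
# one is slightly negative because of the rho / (2N) term in the mean.
u = usual_statistic(0.5, 0.5, 25)
w = corrected_statistic(0.5, 0.5, 25)
```

Both functions take a hypothesized ρ as input; as noted above, inverting the corrected statistic to obtain bounds on ρ has no closed form and would require a numerical root finder.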

File:rsquared.png

Application

Fisher's transformation can be applied with a software calculator, as shown in the figure. Suppose that the observed r-squared value is 0.80, that the sample contains 30 data points, and that a 90% confidence interval is used. Then r = √0.80 ≈ 0.894, and the r-squared value in another random sample from the same population may range from 0.656 to 0.888. When r-squared falls outside this range, the sample is considered to come from a different population.
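The figures in this example can be reproduced with the large-sample interval based on the standard deviation 1/√(N − 3). The sketch below is one way to do the calculation, not necessarily the calculator shown in the figure. The function name `r_squared_interval` is hypothetical, and the code assumes a positive correlation and an interval for r that does not cross zero, so that squaring the endpoints is meaningful.

```python
import math
from statistics import NormalDist

def r_squared_interval(r_squared, n, level):
    """Interval for r-squared obtained via the Fisher transformation of r."""
    r = math.sqrt(r_squared)                  # assumes r > 0
    z = math.atanh(r)                         # Fisher transformation
    se = 1 / math.sqrt(n - 3)                 # approximate standard deviation of z
    q = NormalDist().inv_cdf(0.5 + level / 2) # two-sided normal quantile
    lower_r = math.tanh(z - q * se)           # map endpoints back to the r scale
    upper_r = math.tanh(z + q * se)
    return lower_r ** 2, upper_r ** 2

lower, upper = r_squared_interval(0.80, n=30, level=0.90)
print(round(lower, 3), round(upper, 3))  # 0.656 0.888
```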

Discussion

The Fisher transformation is an approximate variance-stabilizing transformation for r when X and Y follow a bivariate normal distribution. This means that the variance of z is approximately constant for all values of the population correlation coefficient ρ. Without the Fisher transformation, the variance of r grows smaller as |ρ| gets closer to 1. Since the Fisher transformation is approximately the identity function when |r| < 1/2, it is sometimes useful to remember that the variance of r is well approximated by 1/N as long as |ρ| is not too large and N is not too small. This is related to the fact that, for bivariate normal data, the asymptotic variance of √N·r is (1 − ρ²)², which is close to 1 when |ρ| is small.

The behavior of this transform has been extensively studied since Fisher introduced it in 1915. Fisher himself found the exact distribution of z for data from a bivariate normal distribution in 1921; Gayen in 1951{{cite journal | last=Gayen | first=A. K. |title=The Frequency Distribution of the Product-Moment Correlation Coefficient in Random Samples of Any Size Drawn from Non-Normal Universes | volume=38 | year=1951 | pages=219–247 | journal=Biometrika | jstor=2332329 | issue=1/2 | doi=10.1093/biomet/38.1-2.219}} determined the exact distribution of z for data from a bivariate Type A Edgeworth distribution. Hotelling in 1953 calculated the Taylor series expressions for the moments of z and several related statistics{{cite journal |authorlink=Harold Hotelling | last=Hotelling | first=H | year=1953 | title=New light on the correlation coefficient and its transforms | journal=Journal of the Royal Statistical Society, Series B | volume=15 | pages=193–225 | jstor=2983768 |issue=2 }} and Hawkins in 1989 discovered the asymptotic distribution of z for data from a distribution with bounded fourth moments.{{cite journal | last=Hawkins | first=D. L. | year=1989 | title=Using U statistics to derive the asymptotic distribution of Fisher's Z statistic | journal=The American Statistician | volume=43 | pages=235–237 | doi=10.2307/2685369 | issue=4 | jstor=2685369| title-link=u-statistic }}

An alternative to the Fisher transformation is to use the exact confidence distribution density for ρ given by{{Cite journal|last=Taraldsen|first=Gunnar|date=2021|title=The Confidence Density for Correlation|url=https://doi.org/10.1007/s13171-021-00267-y|journal=Sankhya A|language=en|doi=10.1007/s13171-021-00267-y|s2cid=244594067 |issn=0976-8378|doi-access=free|hdl=11250/3133125|hdl-access=free}}{{Cite journal|last=Taraldsen|first=Gunnar|date=2020|title=Confidence in Correlation|url=http://rgdoi.net/10.13140/RG.2.2.23673.49769| language=en|doi=10.13140/RG.2.2.23673.49769}}

:\pi (\rho | r) = \frac{\Gamma(\nu+1)}{\sqrt{2\pi}\Gamma(\nu + \frac{1}{2})} (1 - r^2)^{\frac{\nu - 1}{2}} \cdot (1 - \rho^2)^{\frac{\nu - 2}{2}} \cdot (1 - r \rho )^{\frac{1-2\nu}{2}} F\!\left(\frac{3}{2},-\frac{1}{2}; \nu + \frac{1}{2}; \frac{1 + r \rho}{2}\right)

where F is the Gaussian hypergeometric function and \nu = N-1 > 1 .
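The density can be evaluated numerically. The following sketch uses only the Python standard library and sums the hypergeometric series directly, which converges here because the last argument (1 + rρ)/2 lies strictly between 0 and 1. The function names `hyp2f1_series`, `confidence_density`, and `integrate_density` are hypothetical; a library routine for the hypergeometric function could be substituted where available.

```python
import math

def hyp2f1_series(a, b, c, x, tol=1e-15, max_terms=10000):
    """Gauss hypergeometric series sum_k (a)_k (b)_k / ((c)_k k!) x^k, 0 <= x < 1."""
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        # Ratio of consecutive series terms.
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * x
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def confidence_density(rho, r, n):
    """Confidence density pi(rho | r) for sample size n, with nu = n - 1 > 1."""
    nu = n - 1
    # Work on the log scale to avoid overflow in the gamma functions.
    log_const = (math.lgamma(nu + 1) - 0.5 * math.log(2 * math.pi)
                 - math.lgamma(nu + 0.5))
    log_kernel = ((nu - 1) / 2 * math.log(1 - r * r)
                  + (nu - 2) / 2 * math.log(1 - rho * rho)
                  + (1 - 2 * nu) / 2 * math.log(1 - r * rho))
    f = hyp2f1_series(1.5, -0.5, nu + 0.5, (1 + r * rho) / 2)
    return math.exp(log_const + log_kernel) * f

def integrate_density(r, n, steps=4000):
    """Midpoint-rule integral of the density over -1 < rho < 1."""
    h = 2.0 / steps
    return h * sum(confidence_density(-1 + (i + 0.5) * h, r, n)
                   for i in range(steps))

total_mass = integrate_density(r=0.5, n=30)  # should be close to 1
```

As a sanity check, for N = 3 and r = 0 the formula reduces to a constant density of 1/2 on (−1, 1), and the numerical integral for other inputs should be close to 1.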

Other uses

While the Fisher transformation is mainly associated with the Pearson product-moment correlation coefficient for bivariate normal observations, it can also be applied to Spearman's rank correlation coefficient in more general cases.{{cite encyclopedia|last=Zar | first= Jerrold H. | year=2005| title=Spearman Rank Correlation: Overview | encyclopedia=Encyclopedia of Biostatistics| doi= 10.1002/9781118445112.stat05964 | isbn= 9781118445112 }} A similar result for the asymptotic distribution applies, but with a minor adjustment factor: see the cited article for details.

See also

References