Goodman and Kruskal's gamma
{{short description|Statistic for rank correlation}}
In statistics, Goodman and Kruskal's gamma is a measure of rank correlation, i.e., the similarity of the orderings of the data when ranked by each of the quantities. It measures the strength of association in cross-tabulated data when both variables are measured at the ordinal level. It makes no adjustment for either table size or ties. Values range from −1 (100% negative association, or perfect inversion) to +1 (100% positive association, or perfect agreement). A value of zero indicates the absence of association.
This statistic (which is distinct from Goodman and Kruskal's lambda) is named after Leo Goodman and William Kruskal, who proposed it in a series of papers from 1954 to 1972.{{cite journal
|title=Measures of Association for Cross Classifications
|first1=Leo A. |last1=Goodman
|first2=William H. |last2=Kruskal |authorlink2=William Kruskal
|journal=Journal of the American Statistical Association
|volume=49 |issue=268 |year=1954 |pages=732–764
|jstor=2281536 |doi=10.2307/2281536
}}{{cite journal
|title=Measures of Association for Cross Classifications. II: Further Discussion and References
|first1=Leo A. |last1=Goodman
|first2=William H. |last2=Kruskal |authorlink2=William Kruskal
|journal=Journal of the American Statistical Association
|volume=54 |issue=285 |year=1959 |pages=123–163
|jstor=2282143 |doi=10.1080/01621459.1959.10501503
}}{{cite journal
|title=Measures of Association for Cross Classifications III: Approximate Sampling Theory
|first1=Leo A. |last1=Goodman
|first2=William H. |last2=Kruskal |authorlink2=William Kruskal
|journal=Journal of the American Statistical Association
|volume=58 |issue=302 |year=1963 |pages=310–364
|jstor=2283271 |doi=10.1080/01621459.1963.10500850
}}{{cite journal
|title=Measures of Association for Cross Classifications, IV: Simplification of Asymptotic Variances
|first1=Leo A. |last1=Goodman
|first2=William H. |last2=Kruskal |authorlink2=William Kruskal
|journal=Journal of the American Statistical Association
|volume=67 |issue=338 |year=1972 |pages=415–421
|jstor=2284396 |doi=10.1080/01621459.1972.10482401
}}
Definition
The estimate of gamma, G, depends on two quantities:
:*Ns, the number of pairs of cases ranked in the same order on both variables (number of concordant pairs),
:*Nd, the number of pairs of cases ranked in opposite order on the two variables (number of discordant, or reversed, pairs),
where "ties" (pairs in which the two cases have equal values on either variable) are dropped.
Then
:<math>G=\frac{N_s-N_d}{N_s+N_d}.</math>
This statistic can be regarded as the maximum likelihood estimator for the theoretical quantity <math>\gamma</math>, where
:<math>\gamma=\frac{P_s-P_d}{P_s+P_d},</math>
and where Ps and Pd are the probabilities that a randomly selected pair of observations will place in the same or opposite order respectively, when ranked by both variables.
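The estimate G defined above can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the function name `goodman_kruskal_gamma` and the sample data are hypothetical, and the ordinal levels are assumed to be coded as numbers so that differences between them carry the correct sign.

```python
from itertools import combinations


def goodman_kruskal_gamma(x, y):
    """Return (G, Ns, Nd) for two equal-length sequences of ordinal scores.

    Assumes the ordinal levels are coded as numbers (e.g. 1 < 2 < 3).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    ns = nd = 0
    # Visit every unordered pair of cases exactly once.
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            ns += 1  # same order on both variables: concordant
        elif product < 0:
            nd += 1  # opposite order: discordant
        # product == 0 means a tie on at least one variable: dropped
    if ns + nd == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (ns - nd) / (ns + nd), ns, nd


# Hypothetical data: five cases scored on two ordinal variables.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]
g, ns, nd = goodman_kruskal_gamma(x, y)
print(g, ns, nd)  # 0.6 8 2
```

The pairwise loop takes time proportional to the square of the number of cases, which is adequate for small samples; for large cross tabulations the same counts can be accumulated from the cell frequencies instead.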
Critical values for the gamma statistic are sometimes found by using an approximation, whereby a transformed value, {{mvar|t}}, of the statistic is referred to Student's {{mvar|t}}-distribution, where{{Citation needed|date=August 2010}}
:<math>t \approx G \sqrt{\frac{N_s+N_d}{n(1-G^2)}},</math>
and where {{mvar|n}} is the number of observations (not the number of pairs):
:<math>n \ne N_s+N_d.</math>
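As a rough sketch of this approximation, the snippet below evaluates the transformed value under the assumption that it takes the form t ≈ G·√((Ns + Nd) / (n(1 − G²))). The function name `gamma_t_statistic` and the input values are hypothetical. Because the text does not state the degrees of freedom of the reference distribution, the sketch stops at computing t and does not look up a critical value.

```python
import math


def gamma_t_statistic(g, ns, nd, n):
    """Approximate t for G, assuming t = G * sqrt((Ns + Nd) / (n * (1 - G**2))).

    g  -- estimated gamma
    ns -- number of concordant pairs
    nd -- number of discordant pairs
    n  -- number of observations (not the number of pairs)
    """
    if abs(g) >= 1:
        raise ValueError("the transformation is undefined when |G| = 1")
    return g * math.sqrt((ns + nd) / (n * (1 - g ** 2)))


# Hypothetical values: G = 0.6 from 8 concordant and 2 discordant pairs
# among n = 5 observations.
t = gamma_t_statistic(0.6, 8, 2, 5)
print(round(t, 3))
```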
Yule's Q
A special case of Goodman and Kruskal's gamma is Yule's Q, also known as the Yule coefficient of association,{{cite journal
|title=On the methods of measuring association between two attributes
|first=G U. |last=Yule
|journal=Journal of the Royal Statistical Society
|volume=49 |issue=6 |year=1912 |pages=579–652
|doi=10.2307/2340126 |jstor=2340126
|url=https://zenodo.org/record/1449482}} which is specific to 2×2 matrices. Consider the following contingency table of events, where each value is a count of an event's frequency:
{| class="wikitable"
|-
! !! Yes !! No !! Totals
|-
! Positive
| {{mvar|a}} || {{mvar|b}} || align="center" | {{math|a+b}}
|-
! Negative
| align="right" | {{mvar|c}} || align="right" | {{mvar|d}} || align="center" | {{math|c+d}}
|-
! Totals
| {{math|a+c}} || {{math|b+d}} || align="center" | {{mvar|n}}
|}
Yule's Q is given by:
:<math>Q = \frac{ad-bc}{ad+bc}.</math>
Although computed in the same fashion as Goodman and Kruskal's gamma, it has a slightly broader interpretation because the distinction between nominal and ordinal scales becomes a matter of arbitrary labeling for dichotomous distinctions. Thus, whether Q is positive or negative depends merely on which pairings the analyst considers to be concordant, but is otherwise symmetric.
Q varies from −1 to +1. −1 reflects total negative association, +1 reflects perfect positive association and 0 reflects no association at all. The sign depends on which pairings the analyst initially considered to be concordant, but this choice does not affect the magnitude.
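For illustration, Q = (ad − bc)/(ad + bc) can be computed directly from the four cell counts. In a 2×2 table, ad is the number of concordant pairs (one case from cell a paired with one from cell d) and bc is the number of discordant pairs, which is why Q coincides with gamma. The function name `yules_q` and the counts below are hypothetical.

```python
def yules_q(a, b, c, d):
    """Yule's Q = (ad - bc) / (ad + bc) for a 2x2 table of counts."""
    concordant = a * d  # pairs taken from the main-diagonal cells
    discordant = b * c  # pairs taken from the off-diagonal cells
    if concordant + discordant == 0:
        raise ValueError("Q is undefined when ad + bc = 0")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical counts, laid out as in the table above:
# Positive row: a = 30, b = 10; Negative row: c = 15, d = 45.
print(yules_q(30, 10, 15, 45))  # 0.8

# Swapping the Yes/No columns relabels which pairs count as concordant:
# the sign flips while the magnitude is unchanged.
print(yules_q(10, 30, 45, 15))  # -0.8
```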
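In terms of the odds ratio OR, Yule's Q is given by
:<math>Q = \frac{\mathit{OR}-1}{\mathit{OR}+1},</math>
and so Yule's Q and Yule's Y, <math>Y = \frac{\sqrt{\mathit{OR}}-1}{\sqrt{\mathit{OR}}+1}</math>, are related by
:<math>Q = \frac{2Y}{1+Y^2},</math>
:<math>Y = \frac{1-\sqrt{1-Q^2}}{Q}.</math>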
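These relations can be checked numerically. The sketch below takes the odds ratio as OR = ad/(bc) and Yule's Y as (√OR − 1)/(√OR + 1), then confirms that Q = (OR − 1)/(OR + 1) satisfies Q = 2Y/(1 + Y²) and Y = (1 − √(1 − Q²))/Q. The function names and the cell counts are hypothetical.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table, assuming b and c are non-zero."""
    return (a * d) / (b * c)


def q_from_odds_ratio(odds):
    """Yule's Q expressed through the odds ratio."""
    return (odds - 1) / (odds + 1)


def y_from_odds_ratio(odds):
    """Yule's Y, taken here as (sqrt(OR) - 1) / (sqrt(OR) + 1)."""
    root = math.sqrt(odds)
    return (root - 1) / (root + 1)


# Hypothetical counts: a = 30, b = 10, c = 15, d = 45.
odds = odds_ratio(30, 10, 15, 45)
q = q_from_odds_ratio(odds)
y = y_from_odds_ratio(odds)
print(odds, q, y)  # 9.0 0.8 0.5

# Both conversions between Q and Y hold for these values.
assert math.isclose(q, 2 * y / (1 + y ** 2))
assert math.isclose(y, (1 - math.sqrt(1 - q ** 2)) / q)
```

The second conversion divides by Q, so it applies only when Q is non-zero; when Q = 0 the odds ratio is 1 and Y is 0 as well.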
See also
References
{{Reflist}}
Further reading
- Sheskin, D.J. (2007) The Handbook of Parametric and Nonparametric Statistical Procedures. Chapman & Hall/CRC, {{ISBN|9781584888147}}
{{Statistics}}
{{DEFAULTSORT:Gamma Test (Statistics)}}