Correlation coefficient#Interpreting correlation coefficient values

{{short description|Numerical measure of a statistical relationship between variables}}

{{Merge to|Correlation|discuss=Talk:Correlation#Proposed merge of Correlation coefficient into Correlation|date=February 2025}}

A correlation coefficient is a numerical measure of some type of linear correlation, meaning a statistical relationship between two variables.{{efn|Correlation coefficient: A statistic used to show how the scores from one measure relate to scores on a second measure for the same group of individuals. A high value (approaching +1.00) is a strong direct relationship, values near 0.50 are considered moderate and values below 0.30 are considered to show weak relationship. A low negative value (approaching -1.00) is similarly a strong inverse relationship, and values near 0.00 indicate little, if any, relationship.{{cite web |url=http://www.ncme.org/ncme/NCME/Resource_Center/Glossary/NCME/Resource_Center/Glossary1.aspx?hkey=4bb87415-44dc-4088-9ed9-e8515326a061#anchorC |title=correlation coefficient |author= |website=NCME.org |publisher=National Council on Measurement in Education |access-date=April 17, 2014 |archive-url=https://web.archive.org/web/20170722194028/http://www.ncme.org/ncme/NCME/Resource_Center/Glossary/NCME/Resource_Center/Glossary1.aspx?hkey=4bb87415-44dc-4088-9ed9-e8515326a061#anchorC |archive-date=July 22, 2017 |url-status=dead}}}} The variables may be two columns of a given data set of observations, often called a sample, or two components of a multivariate random variable with a known distribution.{{citation needed|date=July 2019}}

Several types of correlation coefficient exist, each with their own definition and own range of usability and characteristics. They all assume values in the range from −1 to +1, where ±1 indicates the strongest possible correlation and 0 indicates no correlation.{{cite book |last1=Taylor |first1=John R. |title=An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements |date=1997 |publisher=University Science Books |location=Sausalito, CA |isbn=0-935702-75-X |page=217 |edition=2nd |url=http://faculty.kfupm.edu.sa/phys/aanaqvi/Taylor-An%20Introduction%20to%20Error%20Analysis.pdf |access-date=14 February 2019 |archive-url=https://web.archive.org/web/20190215050550/http://faculty.kfupm.edu.sa/phys/aanaqvi/Taylor-An%20Introduction%20to%20Error%20Analysis.pdf |archive-date=15 February 2019 |url-status=dead }} As tools of analysis, correlation coefficients present certain problems, including the propensity of some types to be distorted by outliers and the possibility of incorrectly being used to infer a causal relationship between the variables (for more, see Correlation does not imply causation).{{cite book |last1=Boddy |first1=Richard |last2=Smith |first2=Gordon |title=Statistical Methods in Practice: For scientists and technologists |date=2009 |publisher=Wiley |location=Chichester, U.K. |isbn=978-0-470-74664-6 |pages=95–96}}

Types

There are several different measures for the degree of correlation in data, depending on the kind of data: principally whether the data is a measurement, ordinal, or categorical.

= Pearson =

The Pearson product-moment correlation coefficient, also known as {{mvar|r}}, {{mvar|R}}, or Pearson's {{mvar|r}}, is a measure of the strength and direction of the linear relationship between two variables that is defined as the covariance of the variables divided by the product of their standard deviations.{{Cite web|last=Weisstein|first=Eric W.|title=Statistical Correlation|url=https://mathworld.wolfram.com/StatisticalCorrelation.html|access-date=2020-08-22|website=mathworld.wolfram.com|language=en}} This is the best-known and most commonly used type of correlation coefficient. When the term "correlation coefficient" is used without further qualification, it usually refers to the Pearson product-moment correlation coefficient.

= Intra-class =

Intraclass correlation (ICC) is a descriptive statistic that can be used, when quantitative measurements are made on units that are organized into groups; it describes how strongly units in the same group resemble each other.

= Rank =

Rank correlation is a measure of the relationship between the rankings of two variables, or two rankings of the same variable:

Spearman's rank correlation coefficient is a measure of how well the relationship between two variables can be described by a monotonic function.
The Kendall tau rank correlation coefficient is a measure of the portion of ranks that match between two data sets.
Goodman and Kruskal's gamma is a measure of the strength of association of the cross tabulated data when both variables are measured at the ordinal level.

= Tetrachoric and polychoric =

The polychoric correlation coefficient measures association between two ordered-categorical variables. It's technically defined as the estimate of the Pearson correlation coefficient one would obtain if:

The two variables were measured on a continuous scale, instead of as ordered-category variables.
The two continuous variables followed a bivariate normal distribution.

When both variables are dichotomous instead of ordered-categorical, the polychoric correlation coefficient is called the tetrachoric correlation coefficient.

=Interpreting correlation coefficient values=

The correlation between two variables have different associations that are measured in values such as {{mvar|r}} or {{mvar|R}}. Correlation values range from −1 to +1, where ±1 indicates the strongest possible correlation and 0 indicates no correlation between variables.{{cite book |last1=Taylor |first1=John R. |title=An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements |date=1997 |publisher=University Science Books |location=Sausalito, CA |isbn=0-935702-75-X |page=217 |edition=2nd |url=http://faculty.kfupm.edu.sa/phys/aanaqvi/Taylor-An%20Introduction%20to%20Error%20Analysis.pdf |access-date=14 February 2019 |archive-url=https://web.archive.org/web/20190215050550/http://faculty.kfupm.edu.sa/phys/aanaqvi/Taylor-An%20Introduction%20to%20Error%20Analysis.pdf |archive-date=15 February 2019 |url-status=dead }}

{{mvar\|r}} or {{mvar\|R}}	{{mvar\|r}} or {{mvar\|R}}	Strength or weakness of association between variables{{cite web \|title=The Correlation Coefficient (r) \|url=https://sphweb.bumc.bu.edu/otlt/MPH-Modules/PH717-QuantCore/PH717-Module9-Correlation-Regression/PH717-Module9-Correlation-Regression4.html \|website=Boston University}}
class="wikitable"
+1.0 to +0.8	-1.0 to -0.8	Perfect or very strong association
+0.8 to +0.6	-0.8 to -0.6	Strong association
+0.6 to +0.4	-0.6 to -0.4	Moderate association
+0.4 to +0.2	-0.4 to -0.2	Weak association
+0.2 to 0.0	-0.2 to 0.0	Very weak or no association

Notes

References

Category:Correlation indicators

Category:Mathematical terminology