Getis–Ord statistics

{{Short description|Spatial autocorrelation statistic}}

Getis–Ord statistics, also known as Gi*, are used in spatial analysis to measure local and global spatial autocorrelation. Developed by statisticians Arthur Getis and J. Keith Ord, they are commonly used for hot spot analysis{{cite web | url=https://rpubs.com/heatherleeleary/hotspot_getisOrd_tut | title=RPubs - R Tutorial: Hotspot Analysis Using Getis Ord Gi }}{{cite web | url=https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/hot-spot-analysis.htm | title=Hot Spot Analysis (Getis-Ord Gi*) (Spatial Statistics)—ArcGIS Pro | Documentation }} to identify where features with high or low values are spatially clustered in a statistically significant way. Getis–Ord statistics are available in a number of software packages and libraries, such as CrimeStat, GeoDa, ArcGIS, PySAL (https://pysal.org/) and R.{{cite web | url=https://github.com/r-spatial/spdep/ | title=R-spatial/Spdep | website=GitHub }}{{Cite journal

| doi = 10.1007/s11749-018-0599-x

| first1 = R.S.

| last1 = Bivand

| first2 = D.W.

| last2 = Wong

| year = 2018

| title = Comparing implementations of global and local indicators of spatial association.

| journal = Test

| volume = 27

| issue = 3

| pages = 716–748

| hdl = 11250/2565494

| hdl-access = free

}}

Local statistics

File:USA Contiguous Unemployment Rate 2020.jpg

There are two versions of the statistic, depending on whether the data point at the target location i is included:{{cite web | url=https://geodacenter.github.io/workbook/6b_local_adv/lab6b.html | title=Local Spatial Autocorrelation (2) }}

: G_i = \frac{ \sum_{j \neq i} w_{ij} x_j}{\sum_{j \neq i} x_j}

: G_i^* = \frac{ \sum_{j} w_{ij} x_j}{\sum_{j} x_j}

Here x_i is the value observed at the i^{th} spatial site and w_{ij} is an element of the spatial weight matrix, which specifies which sites are connected to one another. For G_i^* the denominator is constant across all observations.
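The two ratio forms above can be sketched in a few lines of NumPy. This is only an illustration: the function name local_g, the four-site example and the choice of self-weight w_{ii} = 1 for G_i^* are assumptions made here, not part of the cited sources.

```python
import numpy as np

def local_g(x, w, star=False):
    """Ratio-form local Getis-Ord statistics (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float).copy()
    if star:
        # G_i^*: sums run over all j, including j = i
        num = w @ x
        den = np.full_like(x, x.sum())
    else:
        # G_i: exclude the target site from numerator and denominator
        np.fill_diagonal(w, 0.0)
        num = w @ x
        den = x.sum() - x
    return num / den

# Assumed toy data: four sites on a line with binary adjacency weights
x = np.array([1.0, 2.0, 3.0, 4.0])
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

g = local_g(x, w)
# For G_i^* the self-weight is set to 1 here (an assumed convention)
g_star = local_g(x, w + np.eye(4), star=True)
```

In this toy example the largest values of both statistics occur at the third site, whose neighbourhood contains the highest observations.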

A value larger (or smaller) than the mean suggests a hot (or cold) spot corresponding to a high-high (or low-low) cluster. Statistical significance can be estimated using analytical approximations as in the original work{{Cite journal

| doi = 10.1111/j.1538-4632.1992.tb00261.x

| first1 = A.

| last1 = Getis

| first2 = J.K.

| last2 = Ord

| year = 1992

| title = The analysis of spatial association by use of distance statistics

| journal = Geographical Analysis

| volume = 24

| issue = 3

| pages = 189–206

}}{{Cite journal

| doi = 10.1111/j.1538-4632.1995.tb00912.x

| first2 = A.

| last2 = Getis

| first1 = J.K.

| last1 = Ord

| year = 1995

| title = Local spatial autocorrelation statistics: distributional issues and an application

| journal = Geographical Analysis

| volume = 27

| issue = 4

| pages = 286–306

}}

However, in practice, permutation testing is used to obtain more reliable estimates of significance.
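A permutation test for G_i can be sketched as follows. This is a minimal illustration of one common scheme (conditional permutation, in which the value at site i is held fixed and the remaining values are shuffled); the function name, the default number of permutations and the pseudo p-value convention are assumptions and do not reproduce any particular software package.

```python
import numpy as np

def local_g_permutation_pvalues(x, w, n_perm=999, seed=0):
    """Conditional permutation pseudo p-values for G_i (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float).copy()
    np.fill_diagonal(w, 0.0)               # G_i excludes the target site
    rng = np.random.default_rng(seed)
    n = x.size
    observed = (w @ x) / (x.sum() - x)
    p = np.empty(n)
    for i in range(n):
        others = np.delete(x, i)            # values at all sites except i
        w_i = np.delete(w[i], i)            # weights from i to those sites
        den = others.sum()                  # unchanged by shuffling
        sims = np.empty(n_perm)
        for k in range(n_perm):
            sims[k] = (w_i @ rng.permutation(others)) / den
        # pseudo p-value in the nearer tail of the reference distribution
        n_high = np.sum(sims >= observed[i])
        n_low = np.sum(sims <= observed[i])
        p[i] = (min(n_high, n_low) + 1) / (n_perm + 1)
    return observed, p

# Assumed toy data: six sites on a line, high values at one end
x = np.array([9.0, 8.0, 1.0, 2.0, 1.0, 2.0])
w = np.zeros((6, 6))
for i in range(5):
    w[i, i + 1] = w[i + 1, i] = 1.0

observed, p = local_g_permutation_pvalues(x, w, n_perm=199)
```

With so few sites the reference distribution is coarse, so the resulting p-values are illustrative only.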

Global statistics

The Getis–Ord statistics of overall spatial association are{{cite web | url=https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/h-how-high-low-clustering-getis-ord-general-g-spat.htm | title=How High/Low Clustering (Getis-Ord General G) works—ArcGIS Pro | Documentation }}

: G = \frac{ \sum_{ij, i\neq j} w_{ij} x_i x_j}{\sum_{ij, i\neq j} x_i x_j}

: G^* = \frac{ \sum_{ij} w_{ij} x_i x_j}{\sum_{ij} x_i x_j}

The local and global G^* statistics are related through the weighted average

: \frac{ \sum_i x_i G^*_i }{ \sum_i x_i } = \frac{ \sum_{ij} x_i w_{ij} x_j }{ \sum_i x_i \sum_j x_j } = G^*

The relationship of the G and G_i statistics is more complicated due to the dependence of the denominator of G_i on i.
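The global statistics and the weighted-average relation for G^* can be checked numerically. The sketch below uses assumed toy data and hypothetical function names; the self-weight w_{ii} = 1 used for the starred statistics is an assumed convention.

```python
import numpy as np

def global_g(x, w):
    """Global G: sums restricted to pairs with i != j (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    xx = np.outer(x, x)                      # matrix of products x_i x_j
    off_diag = ~np.eye(x.size, dtype=bool)   # mask selecting i != j
    return (w * xx)[off_diag].sum() / xx[off_diag].sum()

def global_g_star(x, w):
    """Global G^*: sums over all pairs, including i == j (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    xx = np.outer(x, x)
    return (w * xx).sum() / xx.sum()

# Assumed toy data: four sites on a line with binary adjacency weights
x = np.array([1.0, 2.0, 3.0, 4.0])
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
w_star = w + np.eye(4)                       # assumed self-weight w_ii = 1

g = global_g(x, w)
g_star = global_g_star(x, w_star)

# Local G_i^* values and their x-weighted average
local_star = (w_star @ x) / x.sum()
weighted_avg = (x * local_star).sum() / x.sum()
```

For this example the weighted average of the local G_i^* values and the global G^* both equal 0.7.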

Relation to Moran's I

Moran's I is another commonly used measure of spatial association defined by

: I = \frac{N}{W} \frac{\sum_{ij} w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i} (x_i - \bar{x})^2}

where N is the number of spatial sites and W = \sum_{ij} w_{ij} is the sum of the entries in the spatial weight matrix. Getis and Ord show that

: I = (K_2/K_1) G - K_2 \bar{x} \sum_i (w_{i \cdot} + w_{\cdot i}) x_i + K_2 \bar{x}^2 W

where w_{i \cdot} = \sum_j w_{ij}, w_{\cdot i} = \sum_j w_{ji}, K_1 = \left( \sum_{ij, i\neq j} x_i x_j \right)^{-1} and K_2 = \frac{N}{W}\left(\sum_{i} (x_i - \bar{x})^2\right)^{-1}, assuming w_{ii} = 0. They are equal if w_{ij}=w is constant, but not in general.
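The relation rests on expanding the cross-product in the numerator of Moran's I: \sum_{ij} w_{ij}(x_i - \bar{x})(x_j - \bar{x}) = \sum_{ij} w_{ij} x_i x_j - \bar{x} \sum_i (w_{i \cdot} + w_{\cdot i}) x_i + \bar{x}^2 W. The sketch below computes Moran's I as defined above and checks this expansion on assumed toy data; the function name and the data are illustrative only.

```python
import numpy as np

def morans_i(x, w):
    """Moran's I as defined above (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()                          # deviations from the mean
    return (x.size / w.sum()) * (z @ w @ z) / (z @ z)

# Assumed toy data: four sites on a line with binary adjacency weights
x = np.array([1.0, 2.0, 3.0, 4.0])
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

xbar = x.mean()
row_sums = w.sum(axis=1)                      # w_{i.}
col_sums = w.sum(axis=0)                      # w_{.i}
big_w = w.sum()                               # W

# Left side: centred cross-product; right side: its expansion
lhs = (x - xbar) @ w @ (x - xbar)
rhs = x @ w @ x - xbar * ((row_sums + col_sums) @ x) + xbar**2 * big_w

moran = morans_i(x, w)
```

Both sides of the expansion evaluate to 2.5 for this example, and Moran's I is 1/3.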

Ord and Getis also show that Moran's I can be written in terms of G_i^*

: I = \frac{1}{W} \left( \sum_i z_i V_i G_i^* - N\right)

where z_i = (x_i - \bar{x})/s, s is the standard deviation of x and

: V_i^2 = \frac{1}{N-1}\sum_j \left( w_{ij} - \frac{1}{N} \sum_k w_{ik}\right)^2

is an estimate of the variance of w_{ij}.

References

{{reflist}}