Getis–Ord statistics

Getis–Ord statistics, also known as $G_i^*$, are used in spatial analysis to measure local and global spatial autocorrelation. Developed by statisticians Arthur Getis and J. Keith Ord, they are commonly used for hot spot analysis{{cite web | url=https://rpubs.com/heatherleeleary/hotspot_getisOrd_tut | title=RPubs - R Tutorial: Hotspot Analysis Using Getis Ord Gi }}{{cite web | url=https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/hot-spot-analysis.htm | title=Hot Spot Analysis (Getis-Ord Gi*) (Spatial Statistics)—ArcGIS Pro | Documentation }} to identify where features with high or low values are spatially clustered in a statistically significant way. Getis–Ord statistics are available in a number of software packages and libraries, such as CrimeStat, GeoDa, ArcGIS, PySAL (https://pysal.org/) and R.{{cite web | url=https://github.com/r-spatial/spdep/ | title=R-spatial/Spdep | website=GitHub }}{{Cite journal

| doi = 10.1007/s11749-018-0599-x

| first1 = R.S.

| last1 = Bivand

| first2 = D.W.

| last2 = Wong

| year = 2018

| title = Comparing implementations of global and local indicators of spatial association.

| journal = Test

| volume = 27

| issue = 3

| pages = 716–748

| hdl = 11250/2565494

| hdl-access = free

}}

Local statistics

File:USA Contiguous Unemployment Rate 2020.jpg

There are two versions of the local statistic, depending on whether the data point at the target location $i$ is included:{{cite web | url=https://geodacenter.github.io/workbook/6b_local_adv/lab6b.html | title=Local Spatial Autocorrelation (2) }}

: $G_i = \frac{ \sum_{j \neq i} w_{ij} x_j}{\sum_{j \neq i} x_j}$

: $G_i^* = \frac{ \sum_{j} w_{ij} x_j}{\sum_{j} x_j}$

Here $x_i$ is the value observed at the $i$-th spatial site and $w_{ij}$ is an element of the spatial weight matrix, which specifies which sites are connected to one another. For $G_i^*$ the denominator is constant across all observations.
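
The two definitions can be illustrated with a minimal sketch in plain Python. The function name `local_g`, the four-site data and the binary contiguity weights are hypothetical and chosen only for illustration; setting $w_{ii} = 1$ for $G_i^*$ is one common convention, assumed here rather than taken from the text.

```python
def local_g(x, w):
    """Return (G_i, G_i^*) for every site.

    x: list of non-negative observed values.
    w: square spatial weight matrix as a list of lists.
    """
    n = len(x)
    total = sum(x)
    g, g_star = [], []
    for i in range(n):
        # G_i: site i is excluded from both numerator and denominator
        num = sum(w[i][j] * x[j] for j in range(n) if j != i)
        g.append(num / (total - x[i]))
        # G_i^*: site i is included; the denominator is the same for all i
        num_star = sum(w[i][j] * x[j] for j in range(n))
        g_star.append(num_star / total)
    return g, g_star


# Hypothetical example: four sites on a line, binary contiguity weights
x = [1.0, 2.0, 8.0, 9.0]
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# Assumed convention for G_i^*: each site counts as its own neighbour (w_ii = 1)
w_star = [[1 if i == j else w[i][j] for j in range(4)] for i in range(4)]

g, _ = local_g(x, w)
_, g_star = local_g(x, w_star)
print(g)
print(g_star)
```

In this example the two high-valued sites at the end of the line receive the largest $G_i^*$ values, which is the pattern a hot spot would produce.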

A value larger (or smaller) than the mean suggests a hot (or cold) spot corresponding to a high-high (or low-low) cluster. Statistical significance can be estimated using analytical approximations as in the original work{{Cite journal

| doi = 10.1111/j.1538-4632.1992.tb00261.x

| first1 = A.

| last1 = Getis

| first2 = J.K.

| last2 = Ord

| year = 1992

| title = The analysis of spatial association by use of distance statistics

| journal = Geographical Analysis

| volume = 24

| issue = 3

| pages = 189–206

}}{{Cite journal

| doi = 10.1111/j.1538-4632.1995.tb00912.x

| first2 = A.

| last2 = Getis

| first1 = J.K.

| last1 = Ord

| year = 1995

| title = Local spatial autocorrelation statistics: distributional issues and an application

| journal = Geographical Analysis

| volume = 27

| issue = 4

| pages = 286–306

}}

In practice, however, permutation testing is used to obtain more reliable estimates of significance for statistical inference.
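
A permutation test for a single site can be sketched as follows. The text does not say which permutation scheme is meant; this sketch assumes conditional permutation, in which the value at site $i$ is held fixed while the remaining values are reshuffled over the other sites. The function names, the number of permutations and the seed are illustrative choices, not part of any particular library.

```python
import random


def g_star_i(x, w_row):
    """G_i^* for one site, given that site's row of the weight matrix."""
    return sum(wij * xj for wij, xj in zip(w_row, x)) / sum(x)


def permutation_pvalue(x, w, i, n_perm=999, seed=0):
    """One-sided pseudo p-value for a hot spot at site i.

    Assumes conditional permutation: x[i] stays at site i and the
    other values are shuffled among the remaining sites.
    """
    rng = random.Random(seed)
    observed = g_star_i(x, w[i])
    others = [x[j] for j in range(len(x)) if j != i]
    count = 0
    for _ in range(n_perm):
        rng.shuffle(others)
        # Re-insert the fixed value at position i
        permuted = others[:i] + [x[i]] + others[i:]
        if g_star_i(permuted, w[i]) >= observed:
            count += 1
    # The observed arrangement counts as one of the permutations
    return (count + 1) / (n_perm + 1)


# Hypothetical data: four sites on a line, weights with w_ii = 1
x = [1.0, 2.0, 8.0, 9.0]
w_star = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
p = permutation_pvalue(x, w_star, i=2)
print(p)
```

With only four sites the reference distribution is very coarse, so the example shows the mechanics rather than a meaningful significance level.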

Global statistics

The Getis–Ord statistics of overall spatial association are:{{cite web | url=https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/h-how-high-low-clustering-getis-ord-general-g-spat.htm | title=How High/Low Clustering (Getis-Ord General G) works—ArcGIS Pro | Documentation }}

: $G = \frac{ \sum_{ij, i\neq j} w_{ij} x_i x_j}{\sum_{ij, i\neq j} x_i x_j}$

: $G^* = \frac{ \sum_{ij} w_{ij} x_i x_j}{\sum_{ij} x_i x_j}$

The local and global $G^*$ statistics are related through the weighted average

: $\frac{ \sum_i x_i G^*_i }{ \sum_i x_i } = \frac{ \sum_{ij} x_i w_{ij} x_j }{ \sum_i x_i \sum_j x_j } = G^*$

The relationship of the $G$ and $G_i$ statistics is more complicated due to the dependence of the denominator of $G_i$ on $i$.
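
The weighted-average relation between the local and global $G^*$ statistics can be checked numerically. The sketch below uses plain Python; the helper `global_g`, its `include_self` flag and the four-site data are hypothetical and serve only to illustrate the identity.

```python
def global_g(x, w, include_self=False):
    """Global G (i != j terms only) or G^* (all i, j terms)."""
    n = len(x)
    pairs = [(i, j) for i in range(n) for j in range(n)
             if include_self or i != j]
    num = sum(w[i][j] * x[i] * x[j] for i, j in pairs)
    den = sum(x[i] * x[j] for i, j in pairs)
    return num / den


# Hypothetical data: four sites on a line, weights with w_ii = 1
x = [1.0, 2.0, 8.0, 9.0]
w_star = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
n = len(x)
total = sum(x)

# Local G_i^* for every site
g_star_local = [sum(w_star[i][j] * x[j] for j in range(n)) / total
                for i in range(n)]

# Weighted average of the local statistics, with weights x_i
weighted_avg = sum(xi * gi for xi, gi in zip(x, g_star_local)) / total

g_star_global = global_g(x, w_star, include_self=True)
print(weighted_avg, g_star_global)
```

Both quantities agree because the denominator of $G^*$ factorises as $\sum_i x_i \sum_j x_j$.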

Relation to Moran's I

Moran's I is another commonly used measure of spatial association defined by

: $I = \frac{N}{W} \frac{\sum_{ij} w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i} (x_i - \bar{x})^2}$

where $N$ is the number of spatial sites and $W = \sum_{ij} w_{ij}$ is the sum of the entries in the spatial weight matrix. Getis and Ord show that

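: $I = (K_2/K_1) G - K_2 \bar{x} \sum_i (w_{i \cdot} + w_{\cdot i}) x_i + K_2 \bar{x}^2 W$

where $w_{i \cdot} = \sum_j w_{ij}$, $w_{\cdot i} = \sum_j w_{ji}$, $K_1 = \left( \sum_{ij, i\neq j} x_i x_j \right)^{-1}$ and $K_2 = \frac{N}{W}\left(\sum_{i} (x_i - \bar{x})^2\right)^{-1}$. They are equal if $w_{ij}=w$ is constant, but not in general. The identity itself follows from expanding the product $(x_i - \bar{x})(x_j - \bar{x})$ in the numerator of $I$ and substituting $\sum_{ij, i\neq j} w_{ij} x_i x_j = G/K_1$, assuming $w_{ii} = 0$.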
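
The decomposition of Moran's I in terms of $G$ can be verified numerically. The sketch below takes $K_1 = \left( \sum_{ij, i\neq j} x_i x_j \right)^{-1}$ and $K_2 = \frac{N}{W}\left(\sum_{i} (x_i - \bar{x})^2\right)^{-1}$, uses the coefficient $K_2/K_1$ on $G$, and assumes a weight matrix with zero diagonal. The function name and the four-site data are hypothetical.

```python
def morans_i(x, w):
    """Moran's I computed directly from its definition."""
    n = len(x)
    mean = sum(x) / n
    d = [xi - mean for xi in x]
    w_sum = sum(sum(row) for row in w)
    num = sum(w[i][j] * d[i] * d[j] for i in range(n) for j in range(n))
    return (n / w_sum) * num / sum(di * di for di in d)


# Hypothetical data: four sites on a line, binary contiguity, w_ii = 0
x = [1.0, 2.0, 8.0, 9.0]
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
n = len(x)
mean = sum(x) / n
w_sum = sum(sum(row) for row in w)  # W

# Global G over pairs with i != j
cross = sum(x[i] * x[j] for i in range(n) for j in range(n) if i != j)
g = sum(w[i][j] * x[i] * x[j]
        for i in range(n) for j in range(n) if i != j) / cross

k1 = 1.0 / cross
k2 = (n / w_sum) / sum((xi - mean) ** 2 for xi in x)

row_sums = [sum(w[i]) for i in range(n)]                       # w_{i.}
col_sums = [sum(w[j][i] for j in range(n)) for i in range(n)]  # w_{.i}

lhs = morans_i(x, w)
rhs = ((k2 / k1) * g
       - k2 * mean * sum((row_sums[i] + col_sums[i]) * x[i] for i in range(n))
       + k2 * mean ** 2 * w_sum)
print(lhs, rhs)
```

The two sides agree to floating-point precision for this example, which is consistent with the expansion of the numerator of $I$.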

Ord and Getis also show that Moran's I can be written in terms of $G_i^*$

: $I = \frac{1}{W} \left( \sum_i z_i V_i G_i^* - N\right)$

where $z_i = (x_i - \bar{x})/s$, $s$ is the standard deviation of $x$, and

: $V_i^2 = \frac{1}{N-1}\sum_j \left( w_{ij} - \frac{1}{N} \sum_k w_{ik}\right)^2$

is an estimate of the variance of $w_{ij}$.
