Euclidean distance matrix
In mathematics, a Euclidean distance matrix is an {{math|n×n}} matrix representing the spacing of a set of {{mvar|n}} points in Euclidean space.
For points <math>x_1,x_2,\ldots,x_n</math> in {{mvar|k}}-dimensional space {{math|ℝ<sup>''k''</sup>}}, the elements of their Euclidean distance matrix {{mvar|A}} are given by squares of distances between them.
That is
:<math>\begin{align}
A & = (a_{ij}); \\
a_{ij} & = d_{ij}^2 \;=\; \lVert x_i - x_j\rVert^2
\end{align}</math>
where <math>\|\cdot\|</math> denotes the Euclidean norm on {{math|ℝ<sup>''k''</sup>}}.
:<math>A = \begin{bmatrix}
0 & d_{12}^2 & d_{13}^2 & \dots & d_{1n}^2 \\
d_{21}^2 & 0 & d_{23}^2 & \dots & d_{2n}^2 \\
d_{31}^2 & d_{32}^2 & 0 & \dots & d_{3n}^2 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
d_{n1}^2 & d_{n2}^2 & d_{n3}^2 & \dots & 0 \\
\end{bmatrix}</math>
In the context of (not necessarily Euclidean) distance matrices, the entries are usually defined directly as distances, not their squares.
However, in the Euclidean case, squares of distances are used to avoid computing square roots and to simplify relevant theorems and algorithms.
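The definition can be illustrated with a short sketch. It assumes NumPy and stores the points as the columns of a {{math|k×n}} array; the function name <code>edm</code> is chosen here for illustration and does not come from any particular library.

```python
import numpy as np

def edm(X):
    """Squared Euclidean distance matrix of the columns of X (shape k x n)."""
    X = np.asarray(X, dtype=float)
    # Broadcasting gives diff[:, i, j] = x_i - x_j
    diff = X[:, :, None] - X[:, None, :]
    # Summing squared coordinates yields a_ij = ||x_i - x_j||^2, no square root needed
    return np.sum(diff ** 2, axis=0)

# Three points in the plane: (0, 0), (3, 0) and (0, 4)
X = np.array([[0.0, 3.0, 0.0],
              [0.0, 0.0, 4.0]])
A = edm(X)
print(A)
```

For this 3-4-5 right triangle the off-diagonal entries are the squared side lengths 9, 16 and 25.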
Euclidean distance matrices are closely related to Gram matrices (matrices of dot products, describing norms of vectors and angles between them).
The latter are easily analyzed using methods of linear algebra.
This makes it possible to characterize Euclidean distance matrices and to recover the points that realize them.
A realization, if it exists, is unique up to rigid transformations, i.e. distance-preserving transformations of Euclidean space (rotations, reflections, translations).
In practical applications, distances are noisy measurements or come from arbitrary dissimilarity estimates (not necessarily metric).
The goal may be to visualize such data by points in Euclidean space whose distance matrix approximates a given dissimilarity matrix as well as possible — this is known as multidimensional scaling.
Alternatively, given two sets of data already represented by points in Euclidean space, one may ask how similar they are in shape, that is, how closely they can be related by a distance-preserving transformation; this is Procrustes analysis.
Some of the distances may also be missing or come unlabelled (as an unordered set or multiset instead of a matrix), leading to more complex algorithmic tasks, such as the graph realization problem or the turnpike problem (for points on a line).{{harvtxt|Dokmanic|Parhizkar|Ranieri|Vetterli|2015}}{{harvtxt|So|2007}}
Properties
Because Euclidean distance is a metric, the matrix {{mvar|A}} has the following properties.
- All elements on the diagonal of {{mvar|A}} are zero (i.e. it is a hollow matrix); hence the trace of {{mvar|A}} is zero.
- {{mvar|A}} is symmetric (i.e. <math>a_{ij} = a_{ji}</math>).
- <math>\sqrt{a_{ij}} \le \sqrt{a_{ik}} + \sqrt{a_{kj}}</math> (by the triangle inequality)
In dimension {{mvar|k}}, a Euclidean distance matrix has rank less than or equal to {{math|k+2}}. If the points are in general position, the rank is exactly {{math|min(n, k + 2).}}
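These properties can be checked numerically. The following is a minimal sketch, assuming NumPy and randomly drawn points (which are in general position with probability 1); the explicit tolerance passed to <code>matrix_rank</code> is an arbitrary choice made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 2, 6
X = rng.standard_normal((k, n))            # n random points in R^k, one per column
diff = X[:, :, None] - X[:, None, :]       # diff[:, i, j] = x_i - x_j
A = np.sum(diff ** 2, axis=0)              # squared distances a_ij

is_hollow = np.allclose(np.diag(A), 0.0)   # zero diagonal
is_symmetric = np.allclose(A, A.T)

D = np.sqrt(A)                             # plain distances d_ij
# Triangle inequality d_ij <= d_il + d_lj for all i, l, j (small slack for rounding)
triangle_ok = bool(np.all(D[:, None, :] <= D[:, :, None] + D[None, :, :] + 1e-12))

# Numerical rank, to be compared with min(n, k + 2)
rank = np.linalg.matrix_rank(A, tol=1e-9)
print(is_hollow, is_symmetric, triangle_ok, rank, min(n, k + 2))
```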
Distances can be shrunk by any power to obtain another Euclidean distance matrix. That is, if <math>A=(a_{ij})</math> is a Euclidean distance matrix, then <math>({a_{ij}}^s)</math> is a Euclidean distance matrix for every {{math|0<s<1}}.{{Cite journal |last=Maehara |first=Hiroshi |date=2013 |title=Euclidean embeddings of finite metric spaces |journal=Discrete Mathematics |language=en |volume=313 |issue=23 |pages=2848–2856 |doi=10.1016/j.disc.2013.08.029 |issn=0012-365X|doi-access=free }} Theorem 2.6
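A small example of the power property, sketched with NumPy under the assumption that points are stored as columns (the helper <code>edm</code> is illustrative): three collinear points with the exponent 1/2 yield the distance matrix of a right triangle. In this particular example the realization needs one more dimension than the original points.

```python
import numpy as np

def edm(X):
    """Squared Euclidean distance matrix of the columns of X (shape k x n)."""
    X = np.asarray(X, dtype=float)
    diff = X[:, :, None] - X[:, None, :]
    return np.sum(diff ** 2, axis=0)

line = np.array([[0.0, 1.0, 2.0]])       # points 0, 1, 2 on the real line
A = edm(line)                            # squared distances 1, 1 and 4
B = A ** 0.5                             # entrywise power with s = 1/2

# The points (0, 0), (1, 0), (1, 1) form a right triangle in the plane
triangle = np.array([[0.0, 1.0, 1.0],
                     [0.0, 0.0, 1.0]])
matches = np.allclose(B, edm(triangle))
print(matches)
```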
Relation to Gram matrix
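The Gram matrix of a sequence of points <math>x_1,x_2,\ldots,x_n</math> in {{mvar|k}}-dimensional space {{math|ℝ<sup>''k''</sup>}}
is the {{math|n×n}} matrix <math>G = (g_{ij})</math> of their dot products (here a point <math>x_i</math> is thought of as a vector from 0 to that point):
:<math>g_{ij} = x_i \cdot x_j = \|x_i\| \|x_j\| \cos \theta</math>, where <math>\theta</math> is the angle between the vectors <math>x_i</math> and <math>x_j</math>.
In particular
:<math>g_{ii} = \|x_i\|^2</math> is the square of the distance of <math>x_i</math> from 0.
Thus the Gram matrix describes norms and angles of vectors (from 0 to) <math>x_1,x_2,\ldots,x_n</math>.
Let <math>X</math> be the {{math|k×n}} matrix containing <math>x_1,x_2,\ldots,x_n</math> as columns.
Then
:<math>G = X^\textsf{T} X</math>, because <math>g_{ij} = x_i^\textsf{T} x_j</math> (seeing <math>x_i</math> as a column vector).
Matrices that can be decomposed as <math>X^\textsf{T}X</math>, that is, Gram matrices of some sequence of vectors (columns of <math>X</math>), are well understood: these are precisely the positive semidefinite matrices.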
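To relate the Euclidean distance matrix to the Gram matrix <math>G = (g_{ij})</math>, observe that
:<math>d_{ij}^2 = \|x_i - x_j\|^2 = (x_i - x_j)^\textsf{T} (x_i - x_j) = x_i^\textsf{T} x_i - 2x_i^\textsf{T} x_j + x_j^\textsf{T} x_j = g_{ii} - 2g_{ij} + g_{jj}</math>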
That is, the norms and angles determine the distances.
Note that the Gram matrix contains additional information: distances from 0.
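Conversely, distances <math>d_{ij}</math> between pairs of {{math|n+1}} points <math>x_0,x_1,\ldots,x_n</math> determine dot products between {{mvar|n}} vectors <math>x_i-x_0</math> ({{math|1≤i≤n}}):
:<math>g_{ij} = (x_i-x_0) \cdot (x_j-x_0) = \frac{1}{2}\left(\|x_i-x_0\|^2 + \|x_j-x_0\|^2 - \|x_i - x_j\|^2 \right) = \frac{1}{2}(d_{0i}^2 + d_{0j}^2 - d_{ij}^2)</math>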
(this is known as the polarization identity).
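Both directions of this relation can be written in a few lines. The sketch below assumes NumPy; the function names are illustrative, and the first point of the distance matrix plays the role of the base point <math>x_0</math>.

```python
import numpy as np

def edm_from_gram(G):
    """Squared distances from a Gram matrix: a_ij = g_ii - 2 g_ij + g_jj."""
    g = np.diag(G)
    return g[:, None] - 2.0 * G + g[None, :]

def gram_from_edm(A):
    """Gram matrix of the vectors x_i - x_0, taking the first point as x_0.

    Uses the polarization identity g_ij = (a_0i + a_0j - a_ij) / 2.
    """
    A = np.asarray(A, dtype=float)
    return 0.5 * (A[0, 1:, None] + A[0, None, 1:] - A[1:, 1:])

# Four points in the plane, stored as columns
X = np.array([[1.0, 3.0, 0.0, 1.0],
              [2.0, 0.0, 4.0, 1.0]])
G = X.T @ X                      # Gram matrix of the points themselves
A = edm_from_gram(G)             # their squared distance matrix

# Going back recovers the Gram matrix of the points translated by -x_0
Z = X[:, 1:] - X[:, [0]]
print(np.allclose(gram_from_edm(A), Z.T @ Z))
```

The round trip does not return <code>G</code> itself, because the distances carry no information about the position of the origin.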
Characterizations
For an {{math|n×n}} matrix {{mvar|A}}, a sequence of points <math>x_1,x_2,\ldots,x_n</math> in {{mvar|k}}-dimensional Euclidean space {{math|ℝ<sup>''k''</sup>}}
is called a realization of {{mvar|A}} in {{math|ℝ<sup>''k''</sup>}} if {{mvar|A}} is their Euclidean distance matrix.
One can assume without loss of generality that <math>x_1 = \mathbf{0}</math> (because translating by <math>-x_1</math> preserves distances).
{{Math theorem|name=Theorem{{harvtxt|So|2007}}, Theorem 3.3.1, p. 40
independently shown by Young & Householder{{Cite journal |last1=Young |first1=Gale |last2=Householder |first2=A. S. |s2cid=122400126 |date=1938-03-01 |title=Discussion of a set of points in terms of their mutual distances |journal=Psychometrika |language=en |volume=3 |issue=1 |pages=19–22 |doi=10.1007/BF02287916 |issn=1860-0980}}
|
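A symmetric hollow {{math|n×n}} matrix {{mvar|A}} with real entries admits a realization in {{math|ℝ<sup>''k''</sup>}} if and only if the {{math|(n-1)×(n-1)}} matrix <math>G = (g_{ij})_{2\le i,j\le n}</math> defined by
:<math>g_{ij} = \frac{1}{2}(a_{1i} + a_{1j} - a_{ij})</math>
is positive semidefinite and has rank at most {{mvar|k}}.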
}}
This follows from the previous discussion because {{mvar|G}} is positive semidefinite of rank at most {{mvar|k}} if and only if it can be decomposed as <math>G = X^\textsf{T} X</math> where {{mvar|X}} is a {{math|k×(n-1)}} matrix.{{harvtxt|So|2007}}, Theorem 2.2.1, p. 10
Moreover, the columns of {{mvar|X}}, together with <math>x_1 = \mathbf{0}</math>, give a realization in {{math|ℝ<sup>''k''</sup>}}.
Therefore, any method to decompose {{mvar|G}} yields a realization.
The two main approaches are variants of Cholesky decomposition or using spectral decompositions to find the principal square root of {{mvar|G}}, see Definite matrix#Decomposition.
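The spectral approach can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: it uses NumPy, assumes at least two points, places the first point at the origin, and the names <code>realize</code>, <code>edm</code> and the tolerance <code>tol</code> are choices made for this example.

```python
import numpy as np

def edm(X):
    """Squared Euclidean distance matrix of the columns of X (shape k x n)."""
    diff = X[:, :, None] - X[:, None, :]
    return np.sum(diff ** 2, axis=0)

def realize(A, k, tol=1e-9):
    """Return a k x n array of points whose squared distance matrix is A.

    Raises ValueError if the criterion of the theorem fails within tolerance.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    # Gram matrix of x_2 - x_1, ..., x_n - x_1 built from the distances
    G = 0.5 * (A[0, 1:, None] + A[0, None, 1:] - A[1:, 1:])
    w, V = np.linalg.eigh(G)              # eigenvalues in ascending order
    w, V = w[::-1], V[:, ::-1]            # reorder to descending
    scale = max(1.0, float(np.abs(w).max()))
    if w[-1] < -tol * scale:
        raise ValueError("G is not positive semidefinite")
    if np.any(w[k:] > tol * scale):
        raise ValueError("rank of G exceeds k")
    m = min(k, n - 1)
    Y = np.zeros((k, n - 1))
    # Rows sqrt(w_i) * v_i^T give a factor Y with G = Y^T Y
    Y[:m] = (V[:, :m] * np.sqrt(np.clip(w[:m], 0.0, None))).T
    # The first point sits at the origin
    return np.hstack([np.zeros((k, 1)), Y])

rng = np.random.default_rng(1)
X = rng.standard_normal((2, 5))           # five points in the plane
A = edm(X)
P = realize(A, 2)
print(np.allclose(edm(P), A))
```

The recovered points <code>P</code> generally differ from <code>X</code> by a rigid transformation; only their mutual distances are guaranteed to agree.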
The statement of the theorem distinguishes the first point <math>x_1</math>. A more symmetric variant of the same theorem is the following:
{{Math theorem|name=Corollary{{harvtxt|So|2007}}, Corollary 3.3.3, p. 42|
A symmetric hollow {{math|n×n}} matrix {{mvar|A}} with real entries admits a realization if and only if {{mvar|A}}
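is negative semidefinite on the hyperplane <math>H = \{ v\in \mathbb{R}^n \colon \textstyle\sum_{i=1}^n v_i = 0 \}</math>, that is
:<math>v^\textsf{T} A v \le 0</math> for all <math>v \in \mathbb{R}^n</math> such that <math>\textstyle\sum_{i=1}^n v_i = 0</math>.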
}}
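The condition of the corollary can be tested numerically by projecting onto the hyperplane. The sketch below assumes NumPy and uses the centering matrix <math>J = I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^\textsf{T}</math>, which is the orthogonal projection onto vectors with zero coordinate sum, so that {{mvar|A}} is negative semidefinite on the hyperplane exactly when <math>JAJ</math> is negative semidefinite. The function name and the tolerance are illustrative.

```python
import numpy as np

def is_edm(A, tol=1e-9):
    """Check whether A is a (squared) Euclidean distance matrix, up to tolerance."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if not (np.allclose(A, A.T) and np.allclose(np.diag(A), 0.0)):
        return False                       # must be symmetric and hollow
    J = np.eye(n) - np.ones((n, n)) / n    # projection onto {v : sum(v) = 0}
    w = np.linalg.eigvalsh(J @ A @ J)      # eigenvalues of the projected form
    return bool(w.max() <= tol * max(1.0, float(np.abs(A).max())))

# Squared side lengths of a 3-4-5 triangle: realizable
good = np.array([[0.0, 9.0, 16.0],
                 [9.0, 0.0, 25.0],
                 [16.0, 25.0, 0.0]])
# Distances 1, 1, 3 violate the triangle inequality: not realizable
bad = np.array([[0.0, 1.0, 9.0],
                [1.0, 0.0, 1.0],
                [9.0, 1.0, 0.0]])
print(is_edm(good), is_edm(bad))
```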
Other characterizations involve Cayley–Menger determinants.
In particular, these can be used to show that a symmetric hollow {{math|n×n}} matrix is realizable in {{math|ℝ<sup>''k''</sup>}} if and only if every {{math|(k+3)×(k+3)}} principal submatrix is.
In other words, a semimetric on finitely many points is isometrically embeddable in {{math|ℝ<sup>''k''</sup>}} if and only if every {{math|k+3}} points are.
{{Cite journal |last=Menger |first=Karl |date=1931 |title=New Foundation of Euclidean Geometry |journal=American Journal of Mathematics |volume=53 |issue=4 |pages=721–745 |doi=10.2307/2371222|jstor=2371222 }}
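As an illustration of the Cayley–Menger determinant, the following sketch (assuming NumPy; the function name is illustrative) borders the squared distance matrix with a row and column of ones. It relies on a standard fact not stated above: for four points the determinant equals <math>288\,V^2</math>, where {{mvar|V}} is the volume of the tetrahedron they span, so it vanishes when the four points are coplanar.

```python
import numpy as np

def cayley_menger_det(A):
    """Determinant of the squared distance matrix A bordered by ones."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.ones((n + 1, n + 1))
    M[0, 0] = 0.0
    M[1:, 1:] = A
    return np.linalg.det(M)

def edm(X):
    """Squared Euclidean distance matrix of the columns of X (shape k x n)."""
    diff = X[:, :, None] - X[:, None, :]
    return np.sum(diff ** 2, axis=0)

# Tetrahedron with vertices 0, e1, e2, e3: volume 1/6, so 288 V^2 = 8
tetra = np.array([[0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])
# Corners of the unit square: four coplanar points, zero volume
square = np.array([[0.0, 1.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0, 1.0]])
det_tetra = cayley_menger_det(edm(tetra))
det_square = cayley_menger_det(edm(square))
print(det_tetra, det_square)
```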
In practice, the definiteness or rank conditions may fail due to numerical errors, noise in measurements, or due to the data not coming from actual Euclidean distances.
Points that realize optimally similar distances can then be found by semidefinite approximation (and low rank approximation, if desired) using linear algebraic tools such as singular value decomposition or semidefinite programming.
This is known as multidimensional scaling.
Variants of these methods can also deal with incomplete distance data.
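A minimal sketch of the spectral variant, usually called classical multidimensional scaling, is given below. It assumes NumPy and {{math|k ≤ n}}, centers the points at their centroid rather than at the first point, and clips negative eigenvalues to zero as a simple semidefinite approximation; the function names and the noise level are illustrative choices.

```python
import numpy as np

def edm(X):
    """Squared Euclidean distance matrix of the columns of X (shape k x n)."""
    diff = X[:, :, None] - X[:, None, :]
    return np.sum(diff ** 2, axis=0)

def classical_mds(A, k):
    """Return k x n points whose squared distances approximate A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ A @ J                       # Gram matrix of centered points
    w, V = np.linalg.eigh((B + B.T) / 2.0)     # symmetrize against rounding
    idx = np.argsort(w)[::-1][:k]              # k largest eigenvalues
    w_top = np.clip(w[idx], 0.0, None)         # drop negative parts (PSD approximation)
    return (V[:, idx] * np.sqrt(w_top)).T      # low-rank factor, shape k x n

rng = np.random.default_rng(2)
X = rng.standard_normal((2, 8))
A = edm(X)
P = classical_mds(A, 2)                        # exact data: distances are reproduced

# Perturb the data symmetrically, keeping the diagonal at zero
E = 0.01 * rng.standard_normal((8, 8))
A_noisy = A + (E + E.T) / 2.0
np.fill_diagonal(A_noisy, 0.0)
P_noisy = classical_mds(A_noisy, 2)
print(np.abs(edm(P) - A).max(), np.abs(edm(P_noisy) - A).max())
```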
Unlabeled data, that is, a set or multiset of distances not assigned to particular pairs, is much more difficult to deal with.
Such data arises, for example, in DNA sequencing (specifically, genome recovery from partial digest) or phase retrieval.
Two sets of points are called homometric if they have the same multiset of distances (but are not necessarily related by a rigid transformation).
Deciding whether a given multiset of {{math|n(n-1)/2}} distances can be realized in a given dimension {{mvar|k}} is strongly NP-hard.
In one dimension this is known as the turnpike problem; it is an open question whether it can be solved in polynomial time.
When the multiset of distances is given with error bars, even the one-dimensional case is NP-hard.
Nevertheless, practical algorithms exist for many cases, e.g. random points.{{Cite book |title=Discrete and Computational Geometry |last1=Lemke |first1=Paul |chapter=Reconstructing Sets From Interpoint Distances |last2=Skiena |first2=Steven S. |last3=Smith |first3=Warren D. |date=2003 |publisher=Springer Berlin Heidelberg |isbn=978-3-642-62442-1 |editor-last=Aronov |editor-first=Boris |volume=25 |location=Berlin, Heidelberg |pages=597–631 |doi=10.1007/978-3-642-55566-4_27 |editor-last2=Basu |editor-first2=Saugata |editor-last3=Pach |editor-first3=János |editor-last4=Sharir |editor-first4=Micha}}{{Cite journal |arxiv=1804.02465 |title=Reconstructing Point Sets from Distance Distributions |first1=Shuai |last1=Huang |first2=Ivan |last2=Dokmanić |journal=IEEE Transactions on Signal Processing |year=2021|volume=69 |pages=1811–1827 |doi=10.1109/TSP.2021.3063458 |s2cid=4746784 }}{{Cite arXiv |eprint=1212.2386 |title=Reconstruction of Integers from Pairwise Distances |first1=Kishore |last1=Jaganathan |first2=Babak |last2=Hassibi|year=2012 |class=cs.DM }}
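The notion of homometric sets mentioned above can be illustrated with a pair of point sets on a line. The sketch uses only the Python standard library; the helper names are illustrative, and the two sets are a small example chosen for this purpose.

```python
from collections import Counter
from itertools import combinations

def distance_multiset(points):
    """Multiset of pairwise distances of points on a line."""
    return Counter(abs(p - q) for p, q in combinations(points, 2))

def normalize(points):
    """Canonical form of a point set up to translation and reflection."""
    pts = sorted(points)
    shifted = tuple(p - pts[0] for p in pts)
    mirrored = tuple(sorted(pts[-1] - p for p in pts))
    return min(shifted, mirrored)

S = [0, 1, 4, 10, 12, 17]
T = [0, 1, 8, 11, 13, 17]

same_distances = distance_multiset(S) == distance_multiset(T)
congruent = normalize(S) == normalize(T)
print(same_distances, congruent)
```

The two sets share the same fifteen distances but are not related by a translation or reflection, so the unlabeled distances do not determine the points.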
Uniqueness of representations
Given a Euclidean distance matrix, the sequence of points that realize it is unique up to rigid transformations – these are isometries of Euclidean space: rotations, reflections, translations, and their compositions.
{{Math theorem|name=Theorem|
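Let <math>x_1,x_2,\ldots,x_n</math> and <math>y_1,y_2,\ldots,y_n</math> be two sequences of points in {{mvar|k}}-dimensional Euclidean space {{math|ℝ<sup>''k''</sup>}}.
The distances <math>\|x_i-x_j\|</math> and <math>\|y_i-y_j\|</math> are equal (for all {{math|1≤i,j≤n}}) if and only if there is a rigid transformation of {{math|ℝ<sup>''k''</sup>}} mapping <math>x_i</math> to <math>y_i</math> (for all {{math|1≤i≤n}}).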
}}
{{Collapse top|title=Proof}}
Rigid transformations preserve distances so one direction is clear.
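Suppose the distances <math>\|x_i-x_j\|</math> and <math>\|y_i-y_j\|</math> are equal.
Without loss of generality we can assume <math>x_1 = y_1 = \mathbf{0}</math> by translating the points by <math>-x_1</math> and <math>-y_1</math>, respectively.
Then, since dot products are determined by distances through the polarization identity, the {{math|(n-1)×(n-1)}} Gram matrix of the remaining vectors <math>x_i</math> is identical to the Gram matrix of the vectors <math>y_i</math> ({{math|2≤i≤n}}).
That is, <math>X^\textsf{T} X = Y^\textsf{T} Y</math>, where {{mvar|X}} and {{mvar|Y}} are the {{math|k×(n-1)}} matrices containing the respective vectors as columns.
This implies there exists an orthogonal {{math|k×k}} matrix {{mvar|Q}} such that {{math|QX{{=}}Y}}, see Definite symmetric matrix#Uniqueness up to unitary transformations.
{{mvar|Q}} describes an orthogonal transformation of {{math|ℝ<sup>''k''</sup>}} (a composition of rotations and reflections, without translations) which maps <math>x_i</math> to <math>y_i</math> (and 0 to 0).
The final rigid transformation is described by <math>T(x) = Q(x-x_1)+y_1</math>, where <math>x_1</math> and <math>y_1</math> denote the original first points.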
{{Collapse bottom}}
In applications, when distances do not match exactly, Procrustes analysis aims to relate two point sets as closely as possible via rigid transformations, usually using singular value decomposition.
The ordinary Euclidean case is known as the orthogonal Procrustes problem or Wahba's problem (when observations are weighted to account for varying uncertainties).
Examples of applications include determining orientations of satellites, comparing molecule structure (in cheminformatics), protein structure (structural alignment in bioinformatics), or bone structure (statistical shape analysis in biology).
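The singular value decomposition approach to the orthogonal Procrustes problem can be sketched as follows. The sketch assumes NumPy, unweighted observations, and points stored as columns; it allows reflections as well as rotations, and the function name is illustrative. Both point sets are first translated to their centroids, and the orthogonal factor is then read off from the SVD of the cross-covariance matrix.

```python
import numpy as np

def procrustes_rigid(X, Y):
    """Find orthogonal Q and translation t minimizing ||Q X + t - Y|| (Frobenius)."""
    cx = X.mean(axis=1, keepdims=True)         # centroid of the x points
    cy = Y.mean(axis=1, keepdims=True)         # centroid of the y points
    # SVD of the cross-covariance of the centered point sets
    U, _, Vt = np.linalg.svd((Y - cy) @ (X - cx).T)
    Q = U @ Vt                                 # optimal orthogonal matrix
    t = cy - Q @ cx                            # translation matching the centroids
    return Q, t

rng = np.random.default_rng(3)
X = rng.standard_normal((3, 7))                # seven points in R^3
Q0, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # a random orthogonal matrix
t0 = rng.standard_normal((3, 1))
Y = Q0 @ X + t0                                # rigidly transformed copy of X

Q, t = procrustes_rigid(X, Y)
print(np.allclose(Q @ X + t, Y))
```

When the distances match exactly, as in this example, the transformation of the theorem is recovered; with noisy data the same formula gives a least-squares fit.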
See also
- Adjacency matrix
- Coplanarity
- Distance geometry
- Hollow matrix
- Distance matrix
- Euclidean random matrix
- Classical multidimensional scaling, a visualization technique that approximates an arbitrary dissimilarity matrix by a Euclidean distance matrix
- Cayley–Menger determinant
- Semidefinite embedding
Notes
{{Reflist|40em}}
References
- {{Cite journal |last1=Dokmanic |first1=Ivan |last2=Parhizkar |first2=Reza |last3=Ranieri |first3=Juri |last4=Vetterli |first4=Martin |s2cid=8603398 |date=2015 |title=Euclidean Distance Matrices: Essential theory, algorithms, and applications |journal=IEEE Signal Processing Magazine |volume=32 |issue=6 |pages=12–30 |doi=10.1109/MSP.2015.2398954 |issn=1558-0792 |arxiv=1502.07541}}
- {{cite book | author=James E. Gentle | title=Matrix Algebra: Theory, Computations, and Applications in Statistics | publisher=Springer-Verlag | date=2007 | isbn=978-0-387-70872-0 | page=299|url=https://books.google.com/books?id=PDjIV0iWa2cC}}
- {{Cite thesis |last=So |first=Anthony Man-Cho |title=A Semidefinite Programming Approach to the Graph Realization Problem: Theory, Applications and Extensions |date=2007 |url=http://www.se.cuhk.edu.hk/~manchoso/papers/thesis.pdf |language=en |type=PhD}}
- {{Cite journal |last1=Liberti |first1=Leo |last2=Lavor |first2=Carlile |last3=Maculan |first3=Nelson |last4=Mucherino |first4=Antonio |s2cid=15472897 |date=2014 |title=Euclidean Distance Geometry and Applications |journal=SIAM Review |language=en |volume=56 |issue=1 |pages=3–69 |doi=10.1137/120875909 |issn=0036-1445 |arxiv=1205.0349}}
- {{Cite book |last=Alfakih |first=Abdo Y. |title=Euclidean Distance Matrices and Their Applications in Rigidity Theory |date=2018 |publisher=Springer International Publishing |isbn=978-3-319-97845-1 |location=Cham |language=en |doi=10.1007/978-3-319-97846-8}}
{{Matrix classes}}