Sammon mapping
{{Short description|Machine learning algorithm}}
Sammon mapping or Sammon projection is an algorithm that maps a high-dimensional space to a space of lower dimensionality (see multidimensional scaling) while trying to preserve the structure of inter-point distances from the high-dimensional space in the lower-dimensional projection.{{cite web|first=Nivash|last=Jeevanandam|date=2021-09-13
|title=Underrated But Fascinating ML Concepts #5 – CST, PBWM, SARSA, & Sammon Mapping
|url=https://analyticsindiamag.com/underrated-but-fascinating-ml-concepts-5-cst-pbwm-sarsa-sammon-mapping/
|access-date=2021-12-05
|website=Analytics India Magazine
|language=en}}
It is particularly suited for use in exploratory data analysis.
The method was proposed by John W. Sammon in 1969.{{cite journal|author=Sammon JW|title=A nonlinear mapping for data structure analysis|journal=IEEE Transactions on Computers| volume=18|issue=5|year=1969|pages=401,402 (missing in PDF),403–409|url=http://theoval.cmp.uea.ac.uk/~gcc/matlab/sammon/sammon.pdf|doi=10.1109/t-c.1969.222678 |s2cid=43151050 }}
It is considered a non-linear approach because the mapping cannot be represented as a linear combination of the original variables, as is possible in techniques such as principal component analysis; this also makes it more difficult to use for classification applications.{{cite journal|author=Lerner, B; Hugo Guterman, Mayer Aladjem, Itshak Dinstein, Yitzhak Romem|title= On pattern classification with Sammon's nonlinear mapping an experimental study|journal= Pattern Recognition|volume=31|issue=4|year=1998|pages=371–381| doi=10.1016/S0031-3203(97)00064-2|bibcode= 1998PatRe..31..371L}}
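Denote the distance between ith and jth objects in the original space by <math>d^{*}_{ij}</math>, and the distance between their projections by <math>d_{ij}</math>.
Sammon's mapping aims to minimize the following error function, which is often referred to as Sammon's stress or Sammon's error:
:<math>E = \frac{1}{\sum\limits_{i<j} d^{*}_{ij}} \sum_{i<j} \frac{(d^{*}_{ij} - d_{ij})^2}{d^{*}_{ij}}.</math>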
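As an illustration, the stress can be evaluated directly from the two sets of pairwise distances. The following is a minimal Python sketch, assuming Euclidean distances, distinct input points, and the usual form of the stress: the sum over pairs i < j of the squared difference between the original and projected distances, divided by the original distance, normalised by the sum of the original distances. The function name <code>sammon_stress</code> is chosen here for illustration and is not taken from any library.

```python
import math

def sammon_stress(X, Y):
    """Sammon's stress between original points X and projected points Y.

    Assumes Euclidean distances and that no two points in X coincide
    (otherwise an original distance of zero would cause a division by zero).
    """
    n = len(X)
    total_original = 0.0   # sum of original distances over pairs i < j
    weighted_error = 0.0   # sum of (d* - d)^2 / d* over pairs i < j
    for i in range(n):
        for j in range(i + 1, n):
            d_star = math.dist(X[i], X[j])   # distance in the original space
            d = math.dist(Y[i], Y[j])        # distance in the projection
            total_original += d_star
            weighted_error += (d_star - d) ** 2 / d_star
    return weighted_error / total_original

# Usage: a 3-4-5 triangle projected onto its first coordinate only.
X = [(0, 0), (3, 0), (0, 4)]
Y = [(0,), (3,), (0,)]
print(round(sammon_stress(X, Y), 6))  # 0.4
print(sammon_stress(X, X))            # 0.0, distances are preserved exactly
```

Because each squared error is divided by the original distance, a given absolute error contributes more to the stress when the two points were close in the original space.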
The minimization can be performed either by gradient descent, as proposed initially, or by other means, usually involving iterative methods.
The number of iterations must be determined experimentally, and convergence is not guaranteed.
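The iterative minimization can be sketched as follows. This is a simplified illustration using plain gradient descent with a fixed step size and a random starting configuration; it is not the update rule from the original paper. The function name <code>sammon_gradient_descent</code> and the parameters <code>iters</code>, <code>lr</code>, <code>seed</code> and <code>eps</code> are assumptions made for this example, and the input points are assumed to be distinct.

```python
import math
import random

def sammon_gradient_descent(X, dim=2, iters=300, lr=0.2, seed=0, eps=1e-12):
    """Plain fixed-step gradient descent on Sammon's stress (illustrative only)."""
    n = len(X)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # Original-space distances; assumed strictly positive (no duplicate points).
    d_star = {p: math.dist(X[p[0]], X[p[1]]) for p in pairs}
    c = sum(d_star.values())  # normalising constant of the stress

    def stress(Y):
        return sum((d_star[i, j] - math.dist(Y[i], Y[j])) ** 2 / d_star[i, j]
                   for i, j in pairs) / c

    # Random starting configuration in the low-dimensional space.
    rng = random.Random(seed)
    Y = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]
    history = [stress(Y)]
    for _ in range(iters):
        grad = [[0.0] * dim for _ in range(n)]
        for i, j in pairs:
            d = max(math.dist(Y[i], Y[j]), eps)  # guard against division by zero
            coef = -2.0 * (d_star[i, j] - d) / (c * d_star[i, j] * d)
            for k in range(dim):
                g = coef * (Y[i][k] - Y[j][k])
                grad[i][k] += g  # contribution to dE/dy_i
                grad[j][k] -= g  # contribution to dE/dy_j
        for i in range(n):
            for k in range(dim):
                Y[i][k] -= lr * grad[i][k]
        history.append(stress(Y))  # record the stress after each step
    return Y, history

# Usage: four corners of a unit square embedded in three dimensions.
X = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
Y, history = sammon_gradient_descent(X)
print(history[0], history[-1])  # stress before and after the iterations
```

Recording the stress at every step makes it possible to inspect whether the chosen number of iterations and step size were adequate; the result may still be a local minimum that depends on the starting configuration.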
Many implementations use the first principal components as the starting configuration.{{cite journal|journal=Pattern Analysis and Applications|volume=3|issue=2|pages=61–68|doi=10.1007/s100440050006|title= On the Initialisation of Sammon's Nonlinear Mapping|author=Lerner, B; H. Guterman, M. Aladjem and I. Dinstein|year=2000|citeseerx=10.1.1.579.8935|s2cid=2055054}}
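A starting configuration of this kind can be obtained by projecting the centred data onto its leading principal directions. The sketch below assumes NumPy is available and computes the directions from a singular value decomposition of the centred data; the function name <code>pca_init</code> is hypothetical and does not refer to any particular implementation.

```python
import numpy as np

def pca_init(X, dim=2):
    """Project X onto its first `dim` principal components (illustrative only)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each variable
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:dim].T   # coordinates along the leading directions

# Usage: 20 random points in five dimensions reduced to a 2-D starting layout.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
Y0 = pca_init(X, dim=2)
print(Y0.shape)  # (20, 2)
```

The resulting array can then be used as the initial low-dimensional configuration in place of a random one.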
The Sammon mapping has been one of the most successful nonlinear metric multidimensional scaling methods since its advent in 1969, but effort has been focused on algorithm improvement rather than on the form of the stress function.
The performance of the Sammon mapping has been improved by extending its stress function using left Bregman divergence
{{cite journal|journal=Pattern Recognition|volume=44|issue=5|pages=1137–1154|title= Extending metric multidimensional scaling with Bregman divergences|author=J. Sun, M. Crowe, C. Fyfe|date=May 2011|doi=10.1016/j.patcog.2010.11.013|bibcode=2011PatRe..44.1137S }} and right Bregman divergence.{{cite journal|journal=Information Sciences|title= Extending Sammon mapping with Bregman divergences|author=J. Sun, C. Fyfe, M. Crowe|year=2011|doi= 10.1016/j.ins.2011.10.013|volume=187|pages=72–92}}
References
{{reflist|2}}
External links
- [http://hisee.sourceforge.net/ HiSee – an open-source visualizer for high dimensional data]
- [http://www.codeproject.com/KB/recipes/SammonProjection.aspx A C# based program with code on CodeProject].
- [http://theoval.cmp.uea.ac.uk/~gcc/matlab/default.html#sammon Matlab code and method introduction]
Category:Functions and mappings
{{Statistics-stub}}