Relevance vector machine

{{Short description|Machine learning technique}}

{{Machine learning|Supervised learning}}

In machine learning, a '''relevance vector machine''' ('''RVM''') is a technique that uses Bayesian inference to obtain parsimonious solutions for regression and probabilistic classification.<ref>{{cite journal | last=Tipping | first=Michael E. |title=Sparse Bayesian Learning and the Relevance Vector Machine |year=2001 |journal = Journal of Machine Learning Research |volume=1 |pages=211–244 |url=http://jmlr.csail.mit.edu/papers/v1/tipping01a.html }}</ref> A greedy optimisation procedure, and thus a faster version of the method, was subsequently developed.<ref>{{cite journal |last1=Tipping |first1=Michael |last2=Faul |first2=Anita |title=Fast Marginal Likelihood Maximisation for Sparse Bayesian Models |journal=Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics |date=2003 |pages=276–283 |url=https://proceedings.mlr.press/r4/tipping03a.html |access-date=21 November 2024}}</ref><ref>{{cite journal |last1=Faul |first1=Anita |last2=Tipping |first2=Michael |title=Analysis of Sparse Bayesian Learning |journal=Advances in Neural Information Processing Systems |date=2001 |url=https://proceedings.neurips.cc/paper_files/paper/2001/file/02b1be0d48924c327124732726097157-Paper.pdf |access-date=21 November 2024}}</ref>

The RVM has an identical functional form to the support vector machine, but provides probabilistic classification.

It is actually equivalent to a Gaussian process model with covariance function:

:<math>k(\mathbf{x},\mathbf{x'}) = \sum_{j=1}^N \frac{1}{\alpha_j} \varphi(\mathbf{x},\mathbf{x}_j)\varphi(\mathbf{x}',\mathbf{x}_j)</math>

where <math>\varphi</math> is the kernel function (usually Gaussian), <math>\alpha_j</math> are the precisions (inverse variances) of the prior on the weight vector <math>w \sim N(0,\alpha^{-1}I)</math>, and <math>\mathbf{x}_1,\ldots,\mathbf{x}_N</math> are the input vectors of the training set.<ref>{{cite thesis |type=Ph.D. |last=Candela |first=Joaquin Quiñonero |date=2004 |title=Learning with Uncertainty - Gaussian Processes and Relevance Vector Machines |publisher=Technical University of Denmark |url=http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/3237/pdf/imm3237.pdf |chapter=Sparse Probabilistic Linear Models and the RVM |access-date=April 22, 2016}}</ref>
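The covariance function above can be evaluated directly from the training inputs and prior precisions. The following sketch (function and parameter names are illustrative, and a Gaussian basis function with an assumed width parameter <code>gamma</code> is used for <math>\varphi</math>) builds the RVM-equivalent Gaussian process covariance matrix:

```python
import numpy as np

def rvm_gp_kernel(X1, X2, X_train, alphas, gamma=1.0):
    """Covariance of the Gaussian process equivalent to an RVM:
    k(x, x') = sum_j (1/alpha_j) * phi(x, x_j) * phi(x', x_j),
    with phi a Gaussian basis function centred on each training
    input x_j.  Illustrative sketch, not a reference implementation."""
    def phi(A, B):
        # Gaussian basis: exp(-gamma * ||a - b||^2) for every pair (a, b)
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    Phi1 = phi(X1, X_train)          # shape (n1, N)
    Phi2 = phi(X2, X_train)          # shape (n2, N)
    # Weight each basis function by its prior variance 1/alpha_j
    return (Phi1 / alphas) @ Phi2.T  # shape (n1, n2)
```

Because the kernel is a sum of outer products with positive weights, the resulting Gram matrix is symmetric and positive semi-definite, as required of a Gaussian process covariance.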

Compared with that of support vector machines (SVMs), the Bayesian formulation of the RVM avoids the SVM's set of free parameters (which usually require cross-validation-based post-optimization). However, RVMs use an expectation–maximization (EM)-like learning method and are therefore at risk of converging to a local minimum. This is unlike the standard sequential minimal optimization (SMO)-based algorithms employed by SVMs, which are guaranteed to find a global optimum (of the convex problem).
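The EM-like learning method alternates between computing the posterior over the weights and re-estimating the hyperparameters by type-II maximum likelihood. A minimal sketch of these updates for RVM regression, following Tipping (2001), is shown below; it makes simplifying assumptions (no pruning of basis functions, a fixed iteration count, and small constants added for numerical safety):

```python
import numpy as np

def fit_rvm_regression(Phi, t, n_iter=100, alpha_max=1e9):
    """Iterative hyperparameter re-estimation for RVM regression
    (illustrative sketch of the EM-like updates).
    Phi: (N, M) design matrix of basis-function outputs; t: (N,) targets."""
    N, M = Phi.shape
    alpha = np.ones(M)   # prior precisions of the weights
    beta = 1.0           # noise precision
    for _ in range(n_iter):
        # Posterior over the weights given current hyperparameters
        Sigma = np.linalg.inv(beta * Phi.T @ Phi + np.diag(alpha))
        mu = beta * Sigma @ Phi.T @ t
        # gamma_j measures how "well determined" weight j is by the data
        gamma = 1.0 - alpha * np.diag(Sigma)
        # Re-estimate hyperparameters (type-II maximum likelihood)
        alpha = np.clip(gamma / (mu**2 + 1e-12), 0.0, alpha_max)
        beta = (N - gamma.sum()) / (np.sum((t - Phi @ mu)**2) + 1e-12)
    return mu, alpha, beta
```

In practice, the precisions <math>\alpha_j</math> of irrelevant basis functions diverge during training, so the corresponding weights are driven to zero; the few inputs whose basis functions survive are the "relevance vectors" that give the method its name.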

The relevance vector machine was patented in the United States by Microsoft (patent expired September 4, 2019).<ref>{{cite patent |country=US |number=6633857 |title=Relevance vector machine |inventor=Michael E. Tipping}}</ref>

==See also==

==References==

{{reflist}}

==Software==
* [http://dlib.net dlib] C++ Library
* [http://www.terborg.net/research/kml/ The Kernel-Machine Library]
* [http://www.maths.bris.ac.uk/R/web/packages/rvmbinary/index.html rvmbinary]: R package for binary classification
* [https://github.com/JamesRitchie/scikit-rvm scikit-rvm]
* [https://github.com/AmazaspShumik/sklearn-bayes/blob/master/skbayes/rvm_ard_models/fast_rvm.py fast-scikit-rvm], [https://github.com/AmazaspShumik/sklearn-bayes/blob/master/ipython_notebooks_tutorials/rvm_ard/rvm_demo.ipynb rvm tutorial]