non-negative least squares

{{Short description|Constrained least squares problem}}

{{Regression bar}}

In mathematical optimization, the problem of '''non-negative least squares''' ('''NNLS''') is a type of constrained least squares problem where the coefficients are not allowed to become negative. That is, given a matrix {{math|A}} and a (column) vector of response variables {{math|y}}, the goal is to find

:\operatorname{arg\,min}\limits_\mathbf{x} \|\mathbf{Ax} - \mathbf{y}\|_2^2 subject to {{math|x ≥ 0}}.

Here {{math|x ≥ 0}} means that each component of the vector {{math|x}} should be non-negative, and {{math|‖·‖<sub>2</sub>}} denotes the Euclidean norm.
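
For example, for

:\mathbf{A} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad \mathbf{y} = \begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix},

the unconstrained least-squares solution is {{math|x {{=}} (3, −1)}}, whereas the non-negative least-squares solution is {{math|x {{=}} (2, 0)}}; note that the latter is not simply the unconstrained solution with its negative component set to zero.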

Non-negative least squares problems turn up as subproblems in matrix decomposition, e.g. in algorithms for PARAFAC and non-negative matrix/tensor factorization.{{Cite journal | last1 = Lin | first1 = Chih-Jen| title = Projected Gradient Methods for Nonnegative Matrix Factorization | doi = 10.1162/neco.2007.19.10.2756 | journal = Neural Computation| volume = 19 | issue = 10 | pages = 2756–2779 | year = 2007 | pmid = 17716011| url = http://www.csie.ntu.edu.tw/~cjlin/papers/pgradnmf.pdf| citeseerx = 10.1.1.308.9135}}{{cite journal |title=Random projections for the nonnegative least-squares problem |first1=Christos |last1=Boutsidis |first2=Petros |last2=Drineas |journal=Linear Algebra and Its Applications |volume=431 |issue=5–7 |year=2009 |pages=760–771 |doi=10.1016/j.laa.2009.03.026|arxiv=0812.4547 }} The latter can be considered a generalization of NNLS.

Another generalization of NNLS is bounded-variable least squares (BVLS), with simultaneous upper and lower bounds {{math|α<sub>i</sub> ≤ x<sub>i</sub> ≤ β<sub>i</sub>}}.{{r|lawson}}{{rp|291}}{{cite journal |last1=Stark |first1=Philip B. |first2=Robert L. |last2=Parker |title=Bounded-variable least-squares: an algorithm and applications |journal=Computational Statistics |volume=10 |year=1995 |pages=129 |url=http://digitalassets.lib.berkeley.edu/sdtr/ucb/text/394.pdf}}
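
As an illustration (a minimal sketch assuming SciPy is available), a BVLS problem can be solved with SciPy's {{mono|optimize.lsq_linear}} routine, which accepts elementwise lower and upper bounds and, when {{mono|method{{=}}'bvls'}} is selected, uses the Stark–Parker algorithm:

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import lsq_linear

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [0.0, 0.0]])
y = np.array([2.0, -1.0, 0.0])

# Elementwise bounds alpha_i <= x_i <= beta_i; here 0 <= x_i <= 1 for both variables.
lower = np.zeros(2)
upper = np.ones(2)
res = lsq_linear(A, y, bounds=(lower, upper), method='bvls')
x = res.x  # bounded-variable least-squares solution
</syntaxhighlight>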

== Quadratic programming version ==

The NNLS problem is equivalent to a quadratic programming problem

:\operatorname{arg\,min}\limits_\mathbf{x \ge 0} \left(\frac{1}{2} \mathbf{x}^\mathsf{T} \mathbf{Q}\mathbf{x} + \mathbf{c}^\mathsf{T} \mathbf{x}\right),

where {{math|Q}} = {{math|A<sup>T</sup>A}} and {{math|c}} = {{math|−A<sup>T</sup>y}}. This problem is convex, as {{math|Q}} is positive semidefinite and the non-negativity constraints form a convex feasible set.{{cite book|doi=10.1007/11556121_50|volume=3691|pages=407–414|series=Lecture Notes in Computer Science|year=2005|last1=Franc|first1=Vojtěch|last2=Hlaváč|first2=Václav|last3=Navara|first3=Mirko|title=Computer Analysis of Images and Patterns |chapter=Sequential Coordinate-Wise Algorithm for the Non-negative Least Squares Problem |isbn=978-3-540-28969-2}}
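
The equivalence can be seen by expanding the squared norm,

:\|\mathbf{Ax} - \mathbf{y}\|_2^2 = \mathbf{x}^\mathsf{T} \mathbf{A}^\mathsf{T}\mathbf{A}\, \mathbf{x} - 2\,(\mathbf{A}^\mathsf{T}\mathbf{y})^\mathsf{T} \mathbf{x} + \mathbf{y}^\mathsf{T}\mathbf{y},

and noting that the term {{math|y<sup>T</sup>y}} does not depend on {{math|x}}; dropping it and halving the remaining terms (which only rescales the objective) leaves the quadratic program above.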

== Algorithms ==

The first widely used algorithm for solving this problem is an active-set method published by Lawson and Hanson in their 1974 book ''Solving Least Squares Problems''.{{cite book |last1=Lawson |first1=Charles L. |last2=Hanson |first2=Richard J. |title=Solving Least Squares Problems |year=1995 |publisher=SIAM |url=https://epubs.siam.org/doi/abs/10.1137/1.9781611971217.ch23 |chapter=23. Linear Least Squares with Linear Inequality Constraints |pages=161|doi=10.1137/1.9781611971217.ch23 |isbn=978-0-89871-356-5 }}{{rp|291}} In pseudocode, this algorithm looks as follows:{{r|chen}}{{Cite journal |doi=10.1002/(SICI)1099-128X(199709/10)11:5<393::AID-CEM483>3.0.CO;2-L |title=A fast non-negativity-constrained least squares algorithm |journal=Journal of Chemometrics |volume=11 |issue=5 |pages=393 |year=1997 |last1=Bro |first1=Rasmus |last2=De Jong |first2=Sijmen}}

{{framebox|blue}}

* Inputs:
** a real-valued matrix {{mvar|A}} of dimension {{math|m × n}},
** a real-valued vector {{math|y}} of dimension {{mvar|m}},
** a real value {{mvar|ε}}, the tolerance for the stopping criterion.
* Initialize:
** Set {{math|P {{=}} ∅}}.
** Set {{math|R {{=}} {1, ..., n}}}.
** Set {{math|x}} to an all-zero vector of dimension {{mvar|n}}.
** Set {{math|w {{=}} A<sup>T</sup>(y − Ax)}}.
** Let {{math|w<sup>R</sup>}} denote the sub-vector with indexes from {{mvar|R}}.
* Main loop: while {{math|R ≠ ∅}} and {{math|max(w<sup>R</sup>) > ε}}:
** Let {{mvar|j}} in {{mvar|R}} be the index of {{math|max(w<sup>R</sup>)}} in {{math|w}}.
** Add {{mvar|j}} to {{mvar|P}}.
** Remove {{mvar|j}} from {{mvar|R}}.
** Let {{math|A<sup>P</sup>}} be {{mvar|A}} restricted to the variables included in {{mvar|P}}.
** Let {{mvar|s}} be a vector of the same length as {{math|x}}. Let {{math|s<sup>P</sup>}} denote the sub-vector with indexes from {{mvar|P}}, and let {{math|s<sup>R</sup>}} denote the sub-vector with indexes from {{mvar|R}}.
** Set {{math|s<sup>P</sup> {{=}} ((A<sup>P</sup>)<sup>T</sup> A<sup>P</sup>)<sup>−1</sup> (A<sup>P</sup>)<sup>T</sup>y}}.
** Set {{math|s<sup>R</sup>}} to zero.
** While {{math|min(s<sup>P</sup>) ≤ 0}}:
*** Let {{math|α {{=}} min {{sfrac|x<sub>i</sub>|x<sub>i</sub> − s<sub>i</sub>}} for i in P where s<sub>i</sub> ≤ 0}}.
*** Set {{math|x}} to {{math|x + α(s − x)}}.
*** Move to {{mvar|R}} all indices {{mvar|j}} in {{mvar|P}} such that {{math|x<sub>j</sub> ≤ 0}}.
*** Set {{math|s<sup>P</sup> {{=}} ((A<sup>P</sup>)<sup>T</sup> A<sup>P</sup>)<sup>−1</sup> (A<sup>P</sup>)<sup>T</sup>y}}.
*** Set {{math|s<sup>R</sup>}} to zero.
** Set {{math|x}} to {{mvar|s}}.
** Set {{math|w}} to {{math|A<sup>T</sup>(y − Ax)}}.
* Output: {{mvar|x}}

{{frame-footer}}
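
The following is a compact Python sketch of the pseudocode above. It is an illustrative translation rather than an optimized implementation: the function name is arbitrary, {{mono|numpy.linalg.lstsq}} is used for the least-squares solve on the passive set instead of forming the explicit inverse, and an iteration cap is added as a safeguard that is not part of the original algorithm.

<syntaxhighlight lang="python">
import numpy as np

def nnls_active_set(A, y, eps=1e-10, max_iter=None):
    """Illustrative translation of the Lawson-Hanson pseudocode above."""
    m, n = A.shape
    if max_iter is None:
        max_iter = 3 * n                     # safeguard, not part of the pseudocode
    P = np.zeros(n, dtype=bool)              # passive set (True means the index is in P)
    x = np.zeros(n)
    w = A.T @ (y - A @ x)
    for _ in range(max_iter):
        if P.all() or w[~P].max() <= eps:    # R empty or max(w_R) <= eps
            break
        # Index in R where w is largest.
        j = np.argmax(np.where(~P, w, -np.inf))
        P[j] = True
        # Least-squares solve restricted to the passive set (s_R stays zero).
        s = np.zeros(n)
        s[P] = np.linalg.lstsq(A[:, P], y, rcond=None)[0]
        # Inner loop: step back towards x until the passive part is positive.
        while P.any() and s[P].min() <= 0:
            mask = P & (s <= 0)
            alpha = np.min(x[mask] / (x[mask] - s[mask]))
            x = x + alpha * (s - x)
            P &= x > 0                       # move zeroed variables back to R
            s = np.zeros(n)
            s[P] = np.linalg.lstsq(A[:, P], y, rcond=None)[0]
        x = s
        w = A.T @ (y - A @ x)
    return x
</syntaxhighlight>

In practice, the library routines described below should be preferred over such a sketch.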

This algorithm takes a finite number of steps to reach a solution and smoothly improves its candidate solution as it goes (so it can find good approximate solutions when cut off at a reasonable number of iterations), but is very slow in practice, owing largely to the computation of the pseudoinverse {{math|((A<sup>P</sup>)<sup>T</sup> A<sup>P</sup>)<sup>−1</sup>}}.{{r|chen}} Variants of this algorithm are available in MATLAB as the routine {{mono|lsqnonneg}}{{cite web |url=https://www.mathworks.com/help/matlab/ref/lsqnonneg.html |title=lsqnonneg |website=MATLAB Documentation |access-date=October 28, 2022}}{{cite conference |author-link2=Robert J. Plemmons |last1=Chen |first1=Donghui |first2=Robert J. |last2=Plemmons |title=Nonnegativity constraints in numerical analysis |conference=Symposium on the Birth of Numerical Analysis |date=2009 |citeseerx = 10.1.1.157.9203}} and in SciPy as {{mono|optimize.nnls}}.{{cite web |url=http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.nnls.html |title=scipy.optimize.nnls |website=SciPy v0.13.0 Reference Guide |access-date=25 January 2014}}
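
As a minimal usage sketch (assuming NumPy and SciPy are installed), the SciPy routine is called with the matrix and the response vector and returns the solution together with the residual norm; the data here reuse the small example from the introduction:

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import nnls

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [0.0, 0.0]])
y = np.array([2.0, -1.0, 0.0])

x, rnorm = nnls(A, y)
# x is approximately [2., 0.]; rnorm is the residual norm ||Ax - y||_2, here 1.0.
</syntaxhighlight>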

Many improved algorithms have been suggested since 1974.{{r|chen}} Fast NNLS (FNNLS) is an optimized version of the Lawson–Hanson algorithm.{{r|bro}} Other algorithms include variants of Landweber's gradient descent method,{{Cite journal | doi = 10.1016/j.mcm.2005.12.010| title = The application of an oblique-projected Landweber method to a model of supervised learning| journal = Mathematical and Computer Modelling| volume = 43| issue = 7–8| pages = 892| year = 2006| last1 = Johansson | first1 = B. R. | last2 = Elfving | first2 = T. | last3 = Kozlov | first3 = V. | last4 = Censor | first4 = Y. | last5 = Forssén | first5 = P. E. | last6 = Granlund | first6 = G. S. | doi-access = free }} coordinate-wise optimization based on the quadratic programming problem above,{{r|sca}} and an active-set method called TNT-NN.{{cite journal |last1=Myre |first1=J.M. |last2=Frahm |first2=E. |last3=Lilja |first3=D.J. |last4=Saar |first4=M.O. |date=2017 |title=TNT-NN: A Fast Active Set Method for Solving Large Non-Negative Least Squares Problems |url=https://www.sciencedirect.com/science/article/pii/S1877050917307858 |journal=Procedia Computer Science |volume=108 |pages=755–764 |doi=10.1016/j.procs.2017.05.194 |access-date=February 13, 2025 |issn=1877-0509|doi-access=free }}

== See also ==

== References ==