Proximal operator

{{Short description|Function in mathematical optimization}}

In mathematical optimization, the proximal operator is an operator associated with a proper,An (extended) real-valued function f on a Hilbert space is said to be proper if it is not identically equal to +\infty, and -\infty is not in its image. lower semi-continuous convex function f from a Hilbert space \mathcal{X}

to [-\infty,+\infty], and is defined by: {{cite journal|url=https://web.stanford.edu/~boyd/papers/pdf/prox_algs.pdf |title=Proximal Algorithms |author=Neal Parikh and Stephen Boyd |journal= Foundations and Trends in Optimization |volume=1 |issue=3 |year=2013 |pages=123–231 |access-date=2019-01-29}}

::\operatorname{prox}_f(v) = \arg \min_{x\in\mathcal{X}} \left(f(x) + \frac 1 2 \|x - v\|_\mathcal{X}^2\right).

For any function in this class, the minimizer of the right-hand side above is unique, hence making the proximal operator well-defined. The proximal operator is used in proximal gradient methods, which is frequently used in optimization algorithms associated with non-differentiable optimization problems such as total variation denoising.

Properties

The \text{prox} of a proper, lower semi-continuous convex function f enjoys several useful properties for optimization.

  • Fixed points of \text{prox}_f are minimizers of f: \{x\in \mathcal{X}\ |\ \text{prox}_fx = x\} = \arg \min f.
  • Global convergence to a minimizer is defined as follows: If \arg \min f \neq \varnothing, then for any initial point x_0 \in \mathcal{X}, the recursion (\forall n \in \mathbb{N})\quad x_{n+1} = \text{prox}_f x_n yields convergence x_n \to x \in \arg \min f as n \to +\infty. This convergence may be weak if \mathcal{X} is infinite dimensional.{{Cite book |last=Bauschke |first=Heinz H. |title=Convex Analysis and Monotone Operator Theory in Hilbert Spaces |last2=Combettes |first2=Patrick L. |publisher=Springer |year=2017 |isbn=978-3-319-48310-8 |series=CMS Books in Mathematics |location=New York |doi=10.1007/978-3-319-48311-5}}
  • The proximal operator can be seen as a generalization of the projection operator. Indeed, in the specific case where f is the 0-\infty characteristic function \iota_C of a nonempty, closed, convex set C we have that

:

\begin{align}

\operatorname{prox}_{\iota_C}(x)

&= \operatorname{argmin}\limits_y

\begin{cases}

\frac{1}{2} \left\| x-y \right\|_2^2 & \text{if } y \in C \\

+ \infty & \text{if } y \notin C

\end{cases} \\

&=\operatorname{argmin}\limits_{y \in C} \frac{1}{2} \left\| x-y \right\|_2^2

\end{align}

: showing that the proximity operator is indeed a generalisation of the projection operator.

  • A function is firmly non-expansive if (\forall (x,y) \in \mathcal{X}^2) \quad \|\text{prox}_fx - \text{prox}_fy\|^2 \leq \langle x-y\ , \text{prox}_fx - \text{prox}_fy\rangle.
  • The proximal operator of a function is related to the gradient of the Moreau envelope M_{\lambda f} of a function \lambda f by the following identity: \nabla M_{\lambda f}(x) = \frac{1}{\lambda} (x - \mathrm{prox}_{\lambda f}(x)).
  • The proximity operator of f is characterized by inclusion p=\operatorname{prox}_f(x) \Leftrightarrow x-p \in \partial f(p)

, where \partial f is the subdifferential of f, given by

:

\partial f(x) = \{ u \in \mathbb{R}^N \mid \forall y \in \mathbb{R}^N, (y-x)^\mathrm{T}u+f(x) \leq f(y)\}

In particular, If f is differentiable then the above equation reduces to p=\operatorname{prox}_f(x) \Leftrightarrow x-p = \nabla f(p)

.

Notes

{{reflist|group=note}}

References

{{Reflist}}

See also