least trimmed squares

{{More footnotes|date=July 2019}}

Least trimmed squares (LTS), or least trimmed sum of squares, is a robust statistical method that fits a function to a set of data whilst not being unduly affected by the presence of outliers.{{cite book | last=Fox | first=John | title=Applied Regression Analysis and Generalized Linear Models | edition=3rd | year=2015 | location=Thousand Oaks | chapter=19 }} It is one of a number of methods for robust regression.

Description of method

Instead of the standard least squares method, which minimises the sum of squared residuals over all n points, the LTS method minimises the sum of squared residuals over a subset of k of those points. The remaining n − k points do not influence the fit.

In a standard least squares problem, the estimated parameter values β are defined to be those values that minimise the objective function S(β) of squared residuals:

:S(\beta) = \sum_{i=1}^n r_i(\beta)^2,

where the residuals are defined as the differences between the values of the dependent variables (observations) and the model values:

:r_i(\beta) = y_i - f(x_i, \beta),

and where n is the overall number of data points. For a least trimmed squares analysis, this objective function is replaced by one constructed in the following way. For a fixed value of β, let r_{(j)}(\beta) denote the residuals ordered by increasing absolute value. In this notation, the standard sum of squares function is

:S(\beta) = \sum_{j=1}^n r_{(j)}(\beta)^2,

while the objective function for LTS is

:S_k(\beta) = \sum_{j=1}^k r_{(j)}(\beta)^2.
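The trimmed objective above is straightforward to evaluate for a given β: compute all n residuals, square them, and sum only the k smallest. A minimal sketch in Python (using NumPy; the toy data, the linear model, and the function name are illustrative assumptions, not from the article):

```python
import numpy as np

def lts_objective(beta, x, y, k, model):
    """Sum of the k smallest squared residuals for parameters beta."""
    residuals = y - model(x, beta)          # r_i(beta) = y_i - f(x_i, beta)
    squared = np.sort(residuals**2)         # squared residuals, ascending
    return squared[:k].sum()                # trim: keep only the k smallest

# Illustrative example: a straight line f(x, beta) = beta[0] + beta[1] * x
linear = lambda x, beta: beta[0] + beta[1] * x
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 1.0, 2.1, 2.9, 50.0])   # last point is a gross outlier

# With k = 4, the outlier's residual is trimmed away entirely;
# the sum of the 4 smallest squared residuals is about 0.03.
print(lts_objective(np.array([0.0, 1.0]), x, y, k=4, model=linear))
```

Note that which points are trimmed depends on β: the ordering of the residuals changes as the parameters change, which is what makes the overall minimisation combinatorial rather than a fixed weighted least squares problem.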

Computational considerations

Because the method is combinatorial, in that each point is either included or excluded, no closed-form solution exists. As a result, methods for finding the LTS solution sift through combinations of the data, attempting to find the subset of k points whose fit yields the lowest sum of squared residuals. Exact methods exist for small n; as n grows, however, the number of subsets grows combinatorially, so practical methods seek approximate (but generally sufficient) solutions.

References

{{Reflist}}

  • {{cite journal |last=Rousseeuw |first=P. J. |author-link=Peter Rousseeuw |year=1984 |title=Least Median of Squares Regression |journal=Journal of the American Statistical Association |volume=79 |issue= 388|pages=871–880 |jstor=2288718 |doi=10.1080/01621459.1984.10477105}}
  • {{cite book |last1=Rousseeuw |first1=P. J. |last2=Leroy |first2=A. M. |orig-year=1987 |title=Robust Regression and Outlier Detection |title-link= Robust Regression and Outlier Detection |publisher=Wiley |isbn=978-0-471-85233-9 |year=2005 |doi= 10.1002/0471725382 }}
  • {{cite journal |last=Li |first=L. M. |year=2005 |title=An algorithm for computing exact least-trimmed squares estimate of simple linear regression with constraints |journal=Computational Statistics & Data Analysis |volume=48 |issue=4 |pages=717–734 |doi=10.1016/j.csda.2004.04.003 }}
  • {{cite journal |last1=Atkinson |first1=A. C. |last2=Cheng |first2=T.-C. |year=1999 |title=Computing least trimmed squares regression with the forward search |journal=Statistics and Computing |volume=9 |issue=4 |pages=251–263 |doi=10.1023/A:1008942604045 }}
  • {{cite journal |last=Jung |first=Kang-Mo |year=2007 |title=Least Trimmed Squares Estimator in the Errors-in-Variables Model |journal=Journal of Applied Statistics |volume=34 |issue=3 |pages=331–338 |doi=10.1080/02664760601004973 |bibcode=2007JApSt..34..331J }}

Category:Robust statistics

Category:Robust regression