Logistic model tree
{{machine learning bar}}
In computer science, a logistic model tree (LMT) is a classification model with an associated supervised training algorithm that combines logistic regression (LR) and decision tree learning.{{cite conference |author1=Niels Landwehr |author2=Mark Hall |author3=Eibe Frank |url=http://www.cs.waikato.ac.nz/~ml/publications/2003/landwehr-etal.pdf |title=Logistic model trees |year=2003 |conference=ECML PKDD}}{{Cite journal | last1 = Landwehr | first1 = N.| last2 = Hall | first2 = M.| last3 = Frank | first3 = E.| title = Logistic Model Trees | doi = 10.1007/s10994-005-0466-3 | journal = Machine Learning | volume = 59 | pages = 161–205 | year = 2005 | issue = 1–2| url = http://www.cs.waikato.ac.nz/~eibe/pubs/LMT.pdf| doi-access = free }}
Logistic model trees are based on the earlier idea of a model tree: a decision tree with linear regression models at its leaves, yielding a piecewise linear regression model (whereas an ordinary decision tree with constants at its leaves yields a piecewise constant model). In the logistic variant, the LogitBoost algorithm is used to produce an LR model at every node in the tree; the node is then split using the C4.5 criterion. Each LogitBoost invocation is warm-started: the run at a child node begins from the committee of regression functions already fitted at the parent node, rather than from scratch. Finally, the tree is pruned.{{cite conference |author1=Sumner, Marc |author2=Eibe Frank |author3=Mark Hall |title=Speeding up logistic model tree induction |conference=PKDD |publisher=Springer |year=2005 |pages=675–683 |url=http://www.cs.waikato.ac.nz/~ml/publications/2005/SumnerFrankHallCameraReady.pdf}}
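The LogitBoost step used at each node can be sketched as follows. This is a minimal, hypothetical illustration of two-class LogitBoost with simple (one-attribute) weighted regression as the base learner, not the authors' implementation; all function names are made up for this sketch.

```python
import numpy as np

def fit_simple_regression(X, z, w):
    """Weighted simple linear regression on each attribute; keep the one
    with the smallest weighted squared error. Returns (j, a, b) so that
    the fitted function is f(x) = a * x[j] + b."""
    best = None
    for j in range(X.shape[1]):
        x = X[:, j]
        xbar = np.average(x, weights=w)
        zbar = np.average(z, weights=w)
        denom = np.sum(w * (x - xbar) ** 2)
        a = np.sum(w * (x - xbar) * (z - zbar)) / denom if denom > 0 else 0.0
        b = zbar - a * xbar
        err = np.sum(w * (z - a * x - b) ** 2)
        if best is None or err < best[0]:
            best = (err, j, a, b)
    return best[1:]

def logitboost(X, y, n_iter=30):
    """Two-class LogitBoost (y in {0, 1}): each iteration fits a simple
    regression function to the working response and adds it to the
    additive model F, whose sign gives the classification."""
    n = len(y)
    F = np.zeros(n)
    p = np.full(n, 0.5)
    committee = []
    for _ in range(n_iter):
        w = np.clip(p * (1 - p), 1e-10, None)       # case weights
        z = np.clip((y - p) / w, -4.0, 4.0)          # working response
        j, a, b = fit_simple_regression(X, z, w)
        committee.append((j, a, b))
        F += 0.5 * (a * X[:, j] + b)
        p = 1.0 / (1.0 + np.exp(-2.0 * F))           # update probabilities
    return committee

def predict(committee, X):
    F = sum(0.5 * (a * X[:, j] + b) for j, a, b in committee)
    return (F > 0).astype(int)
```

Warm-starting a child node then amounts to initialising `committee` (and the corresponding `F` and `p`) with the parent's result instead of starting empty.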
The basic LMT induction algorithm uses cross-validation to find a number of LogitBoost iterations that does not overfit the training data. A faster variant has been proposed that instead uses the Akaike information criterion (AIC) to decide when to stop LogitBoost, avoiding the cross-validation.
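As a minimal sketch of AIC-based stopping: given the training log-likelihood after each boosting iteration, one keeps the iteration count that minimises the AIC. The log-likelihood values below are hypothetical, and the parameter count (two per simple regression: slope and intercept) is an assumption of this sketch.

```python
import numpy as np

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical log-likelihoods after each LogitBoost iteration;
# each iteration adds one simple regression (2 parameters).
log_liks = [-120.0, -80.0, -60.0, -52.0, -49.0, -48.5, -48.4]
scores = [aic(ll, 2 * (m + 1)) for m, ll in enumerate(log_liks)]
best_m = int(np.argmin(scores)) + 1  # number of iterations to keep
```

Here the later iterations improve the fit too little to justify their extra parameters, so the AIC stops the boosting early.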
References
{{reflist}}