Scott Grimshaw
Department of Statistics
Brigham Young University
Regression Trees
Given a data set consisting of n observations on p independent variables and
a single dependent variable, a general tree-structured regression model
consists of a binary tree with a parametric model at each leaf. Each
internal node of the tree holds an inequality condition on one of the
independent variables. The tree structure is generated from training data by
a recursive partitioning algorithm. CART models are the special case in which
each leaf model is a constant: a class probability for classification or a
mean for regression. Alexander and Grimshaw (1996)
proposed treed regression by placing the best simple linear regression at
each leaf. Treed regression models are typically more parsimonious than CART
models because a linear leaf can capture a trend that CART would need several
splits to approximate. Additional topics such as monotonicity in the tree
structure and stacked treed regression for prediction will be discussed.
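
The recursive partitioning idea described above can be illustrated with a minimal sketch. This is not the algorithm of Alexander and Grimshaw (1996); it is a simple greedy version written under stated assumptions: splits are chosen to minimize the summed residual sum of squares of the two child fits, candidate thresholds are midpoints between consecutive observed values, and the names `best_simple_regression`, `grow`, `predict`, `min_leaf`, `max_depth`, and `min_gain` are hypothetical.

```python
import numpy as np

def best_simple_regression(X, y):
    """Fit y on each single predictor by least squares; keep the best one.

    Returns (sse, predictor_index, intercept, slope).
    """
    best = None
    for j in range(X.shape[1]):
        A = np.column_stack([np.ones(len(y)), X[:, j]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        sse = float(resid @ resid)
        if best is None or sse < best[0]:
            best = (sse, j, float(coef[0]), float(coef[1]))
    return best

def grow(X, y, min_leaf=5, max_depth=2, min_gain=1e-8, depth=0):
    """Recursively partition the data (assumed stopping rules, see lead-in)."""
    leaf_sse, j, a, b = best_simple_regression(X, y)
    leaf = {"leaf": True, "var": j, "intercept": a, "slope": b}
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return leaf

    best = None
    for k in range(X.shape[1]):
        values = np.unique(X[:, k])
        # Candidate thresholds: midpoints between consecutive observed values.
        for c in (values[:-1] + values[1:]) / 2:
            left = X[:, k] <= c
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            sse = (best_simple_regression(X[left], y[left])[0]
                   + best_simple_regression(X[~left], y[~left])[0])
            if best is None or sse < best[0]:
                best = (sse, k, float(c), left)

    # Keep the node as a leaf if no admissible split reduces the SSE enough.
    if best is None or leaf_sse - best[0] <= min_gain:
        return leaf
    _, k, c, left = best
    return {
        "leaf": False, "split_var": k, "threshold": c,
        "left": grow(X[left], y[left], min_leaf, max_depth, min_gain, depth + 1),
        "right": grow(X[~left], y[~left], min_leaf, max_depth, min_gain, depth + 1),
    }

def predict(node, x):
    """Follow the inequality conditions to a leaf, then apply its linear model."""
    while not node["leaf"]:
        go_left = x[node["split_var"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["intercept"] + node["slope"] * x[node["var"]]
```

On data that follow one line below a threshold and a different line above it, a single split with a linear model in each leaf can fit the data exactly, whereas a tree with constant leaves would need many splits to approximate each slope. This is the sense in which treed regression can be more parsimonious.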