Scott Grimshaw

Department of Statistics

Brigham Young University


Regression Trees

Given a data set of n observations on p independent variables and a single dependent variable, a general tree-structured regression model consists of a binary tree with a parametric model at each leaf. Each internal node of the tree holds an inequality condition on one of the independent variables, which routes an observation to one of its two child nodes. The tree structure is generated from training data by a recursive partitioning algorithm. CART models are a special case in which each leaf model is a probability (classification) or a mean (regression). Alexander and Grimshaw (1996) proposed treed regression, which places the best simple linear regression at each leaf. Treed regression models are more parsimonious than CART models because they need fewer splits. Additional topics, such as monotonicity in the tree structure and stacked treed regression for prediction, will be discussed.
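
As a rough illustration of the idea of a split with a simple linear regression at each leaf, the following Python sketch fits a one-split tree to synthetic data. It is a minimal sketch under stated assumptions, not the algorithm of Alexander and Grimshaw (1996): the function names (fit_line, best_leaf_model, best_split), the exhaustive threshold search, the sum-of-squared-errors criterion, and the min_leaf parameter are all hypothetical choices made for the example.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares simple linear regression of y on one predictor.
    Returns (intercept, slope, sum of squared errors)."""
    A = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(coef[0]), float(coef[1]), float(resid @ resid)

def best_leaf_model(X, y):
    """Choose the single predictor whose simple linear regression
    has the smallest SSE (an assumed reading of 'best')."""
    fits = [(fit_line(X[:, j], y), j) for j in range(X.shape[1])]
    (b0, b1, sse), j = min(fits, key=lambda t: t[0][2])
    return {"var": j, "intercept": b0, "slope": b1, "sse": sse}

def best_split(X, y, min_leaf=5):
    """Exhaustive search over conditions x_j <= c, keeping the split
    that minimizes the total SSE of the two leaf regressions."""
    best = None
    for j in range(X.shape[1]):
        for c in np.unique(X[:, j])[:-1]:  # every observed value except the max
            left = X[:, j] <= c
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            l = best_leaf_model(X[left], y[left])
            r = best_leaf_model(X[~left], y[~left])
            total = l["sse"] + r["sse"]
            if best is None or total < best["sse"]:
                best = {"var": j, "threshold": float(c),
                        "left": l, "right": r, "sse": total}
    return best

# Synthetic, noise-free data: the response is piecewise linear in
# variable 0 with a break at 0.5; variable 1 is irrelevant.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(60, 2))
y = np.where(X[:, 0] <= 0.5, 1.0 + 2.0 * X[:, 0], 4.0 - 3.0 * X[:, 0])

split = best_split(X, y)
single = best_leaf_model(X, y)  # one regression line with no split, for comparison
print(split["var"], split["threshold"], split["sse"], single["sse"])
```

On data like these, a single split with a line at each leaf can capture a relationship that a tree with constant leaves would need many splits to approximate, which is the sense in which fewer splits may suffice.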

