Edsel Peña

Department of Statistics

University of South Carolina


Global Validation of Linear Model Assumptions

One of the most widely used statistical models in many scientific areas is the general linear model, which relates a vector of responses Y to a matrix of fixed covariates X according to the equation Y = Xb + se, where e is a vector of unobserved errors. The model parameters are the regression coefficient vector b and the error standard deviation s. The validity of this model relies on four assumptions: (1) linearity of the relationship; (2) homoscedastic (equal) variances at each x; (3) normal error distribution; and (4) uncorrelated errors. The breakdown of any of these assumptions invalidates many inferential methods pertaining to this model, such as point estimation, construction of confidence intervals, and tests of hypotheses about the model parameters. Many existing methods for validating these model assumptions, especially those taught in our undergraduate and graduate-level statistics courses, are ad hoc and graphical, and usually do not take into account the synergy of effects inherent in simultaneous violations. In this talk I will describe a global procedure for simultaneously validating these model assumptions, and indicate how specific model assumption violations can be pinpointed. The procedure will be demonstrated using real data sets. It is hoped that the procedure will help to partially eliminate the subjectivity inherent in existing graphical procedures.
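To make the model and its four assumptions concrete, the following is a minimal Python sketch that simulates data from Y = Xb + se, estimates b and s by least squares, and computes one simple residual summary per assumption. It is not the global procedure described in this talk; the summaries are exactly the kind of separate, ad hoc checks the abstract contrasts it with. The function names, the simulated design, and the choice of summaries are all assumptions made for illustration.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares estimates of b and s in the model Y = Xb + se."""
    n, p = X.shape
    b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b_hat
    s_hat = np.sqrt(resid @ resid / (n - p))  # unbiased variance estimate, then root
    return b_hat, s_hat, resid

def residual_summaries(X, b_hat, s_hat, resid):
    """One illustrative (hypothetical) summary per assumption; all should be
    near zero when the four assumptions hold."""
    fitted = X @ b_hat
    z = resid / s_hat  # standardized residuals
    return {
        # (1) linearity: leftover curvature in the residuals
        "corr_resid_fitted_sq": np.corrcoef(z, fitted**2)[0, 1],
        # (2) homoscedasticity: residual spread changing with the fitted value
        "corr_sq_resid_fitted": np.corrcoef(z**2, fitted)[0, 1],
        # (3) normality: skewness and excess kurtosis of the residuals
        "skewness": np.mean(z**3),
        "excess_kurtosis": np.mean(z**4) - 3.0,
        # (4) uncorrelated errors: lag-1 autocorrelation in observation order
        "lag1_autocorr": np.corrcoef(z[:-1], z[1:])[0, 1],
    }

# Simulated data that satisfies all four assumptions (assumed values).
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, size=n)
X = np.column_stack([np.ones(n), x])  # intercept plus one covariate
b_true = np.array([1.0, 2.0])
s_true = 0.5
y = X @ b_true + s_true * rng.standard_normal(n)

b_hat, s_hat, resid = fit_ols(X, y)
summaries = residual_summaries(X, b_hat, s_hat, resid)
for name, value in summaries.items():
    print(f"{name}: {value:.3f}")
```

Each summary looks at one assumption in isolation, so a violation of one (for example, curvature) can distort the others (for example, the apparent variance pattern). That interaction is what motivates a single global test rather than a collection of separate checks.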

This is joint research work with Prof. Elizabeth Slate of the Medical University of South Carolina.

