Deborah Hurley
(Department of Epidemiology and Biostatistics, University of South Carolina)
An Evaluation of Splines in Linear Regression
Spline modeling may provide a better fit to data than polynomial or categorical models. There is no single best approach, however: depending on the data, some modeling methods produce better results for predicted values (e.g., narrower confidence intervals) than others. To address this, a simulation study was undertaken. Data were simulated for five different structures (patterns), using one dependent variable and one independent variable. Each of these five structures was generated with three different sample sizes and two different standard deviations, for a total of 30 scenarios. Six different regression models were evaluated for each of the 30 scenarios: simple linear regression (SLR), polynomial regression (quadratic and cubic), and spline regression (linear, quadratic and cubic). Selected measures were compared for each scenario to assess model "fit" versus model simplicity: confidence interval coverage on the predicted values, confidence interval widths, mean squared error (MSE), and the PRESS and R² statistics. Results indicated that the choice of a modeling method should take into account preliminary plots and the estimated standard deviation. Splines are most appropriate when plots of the data clearly indicate that they are needed (i.e., when the standard deviation is small and knots and changes in structure can be detected). This is especially true of the linear spline, which is clearly best suited to piecewise linear data structures. When the plots do not show much detail (i.e., when the standard deviation is large), it is generally better to use a simpler model (e.g., polynomial). Results also reinforce the need to look at a plot of the predicted values for the fitted model, as some of the usual selection criteria (MSE, PRESS, R²) can give similar results for various models while the coverage for these models differs considerably.
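As a rough illustration of the kind of comparison described in this abstract, the Python sketch below simulates one piecewise linear scenario and fits both a simple linear regression and a linear spline, reporting MSE, PRESS and R² for each. The data-generating values (intercept, slopes, a single knot at 5, standard deviation 0.5, n = 100) and the helper names `linear_spline_design` and `fit_summary` are assumptions made for illustration only; they are not taken from the study, and the knot is treated as known.

```python
import numpy as np

def linear_spline_design(x, knots):
    """Design matrix: intercept, x, and one truncated term (x - k)_+ per knot."""
    cols = [np.ones_like(x), x]
    cols += [np.clip(x - k, 0.0, None) for k in knots]
    return np.column_stack(cols)

def fit_summary(X, y):
    """Ordinary least squares fit; returns coefficients, MSE, PRESS and R^2."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, p = X.shape
    # Leverages h_ii are the squared row norms of Q from a reduced QR of X.
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q ** 2, axis=1)
    sse = np.sum(resid ** 2)
    mse = sse / (n - p)                        # residual mean square
    press = np.sum((resid / (1.0 - h)) ** 2)   # leave-one-out prediction SS
    r2 = 1.0 - sse / np.sum((y - y.mean()) ** 2)
    return beta, mse, press, r2

# Hypothetical scenario: piecewise linear mean with a slope change at x = 5.
rng = np.random.default_rng(0)
n, sigma, knot = 100, 0.5, 5.0
x = np.sort(rng.uniform(0.0, 10.0, n))
y = 1.0 + 0.5 * x + 1.5 * np.clip(x - knot, 0.0, None) + rng.normal(0.0, sigma, n)

X_slr = np.column_stack([np.ones_like(x), x])
X_spline = linear_spline_design(x, [knot])

beta_l, mse_l, press_l, r2_l = fit_summary(X_slr, y)
beta_s, mse_s, press_s, r2_s = fit_summary(X_spline, y)

print(f"SLR:           MSE={mse_l:.3f}  PRESS={press_l:.2f}  R2={r2_l:.3f}")
print(f"Linear spline: MSE={mse_s:.3f}  PRESS={press_s:.2f}  R2={r2_s:.3f}")
```

Increasing `sigma` in this sketch is one way to see how the apparent advantage of the spline shrinks as the structure becomes harder to detect; a full study would repeat the simulation many times and also track confidence interval coverage and width.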
Francisco Vera
(Department of Statistics, University of South Carolina)
Iterated Total Time on Test Transformations: An Illustration Using Mixed Gamma Distributions
Iterated Total Time on Test (TTT) transforms can be used to characterize certain stochastic orderings. In particular, orderings with respect to expectations of functions that are convex with respect to polynomials are equivalent to higher order TTT domination under certain conditions. This ordering gives rise to a k-mart structure, which gives a meaningful representation for the relationship of two jointly distributed random variables. Here we illustrate these ideas applied to mixtures of gamma distributions.
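For readers unfamiliar with the underlying quantity, the sketch below computes the standard empirical scaled TTT transform of a sample drawn from a two-component gamma mixture. It covers only the ordinary (first-order) transform, not the iterated version discussed in the talk; the function name `scaled_ttt` and the mixture parameters (mixing weight 0.3, shapes 2 and 5, scales 1 and 2) are assumptions chosen for illustration.

```python
import numpy as np

def scaled_ttt(sample):
    """Empirical scaled TTT transform of a nonnegative sample.

    With order statistics x_(1) <= ... <= x_(n) and x_(0) = 0, the total
    time on test at the i-th failure is
        T_i = sum_{j=1}^{i} (n - j + 1) * (x_(j) - x_(j-1)),
    and the scaled transform is T_i / T_n evaluated at u = i / n.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    gaps = np.diff(np.concatenate(([0.0], x)))   # x_(j) - x_(j-1)
    at_risk = n - np.arange(n)                   # n, n-1, ..., 1
    ttt = np.cumsum(at_risk * gaps)
    return np.arange(1, n + 1) / n, ttt / ttt[-1]

# Hypothetical mixture: 30% Gamma(shape=2, scale=1), 70% Gamma(shape=5, scale=2).
rng = np.random.default_rng(1)
n = 500
from_first = rng.random(n) < 0.3
sample = np.where(from_first, rng.gamma(2.0, 1.0, n), rng.gamma(5.0, 2.0, n))

u, phi = scaled_ttt(sample)
```

Plotting `phi` against `u` gives the usual TTT plot, which lies close to the diagonal for exponential data.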
Phil Yates
(Department of Statistics, University of South Carolina)
Estimation of a Mixture of Two Three-Parameter Gammas
This talk will focus on techniques for finding maximum likelihood estimates of a mixture of two three-parameter gammas via the ECM algorithm, and on ways to accelerate that algorithm using the methods of Louis (1982) and Meilijson (1989).