Sunil Rao

Department of Biostatistics

University of Miami


Spike and Slab Regression for High Dimensional Data Analysis: A Review and Some New Directions

I will present a type of statistical regularization known as spike and slab regression, which has been used effectively to analyze high dimensional data such as genomic data. Computationally, this involves using a hierarchical model with a particular prior specification that results in what is known as selective shrinkage of traditional OLS estimates. Early applications included ANOVA-based multigroup designs; however, the methodology can also be framed to handle problems such as clustering of high dimensional profiles. All of these fall under an orthogonal-type setting. Under more general settings, other interesting uses of spike and slab models are possible. Spike and slab regression is closely connected with generalized ridge regression (GRR), which proves very useful in studying another related problem in high dimensional data analysis: the so-called large p, small n problem, where collinearity abounds. I will show some recent work on mixing generalized ridge estimators for the purpose of improved prediction. Implementation in practice can be achieved efficiently by using a Bayesian computational strategy based on a spike and slab model. This is joint work with Hemant Ishwaran of the University of Miami.
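
As a rough illustration of the selective shrinkage idea mentioned in the abstract, the sketch below computes a generalized ridge estimator, which solves (X'X + D) b = X'y with a separate penalty for each coefficient on the diagonal of D. It is not the speaker's method: the function name generalized_ridge, the simulated data, and the hand-picked penalties are all assumptions made for illustration, whereas a spike and slab model would in effect arrive at coefficient-specific shrinkage through its prior.

```python
import numpy as np

def generalized_ridge(X, y, penalties):
    """Generalized ridge estimate: solve (X'X + diag(penalties)) b = X'y.

    `penalties` holds one non-negative penalty per coefficient. A penalty
    of 0 leaves that coefficient unshrunk; a large penalty pulls it
    toward zero. (Hypothetical helper, for illustration only.)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    D = np.diag(np.asarray(penalties, dtype=float))
    return np.linalg.solve(X.T @ X + D, X.T @ y)

# Simulated data (an assumption): two active and two inactive predictors.
rng = np.random.default_rng(0)
n, p = 50, 4
X = rng.standard_normal((n, p))
beta_true = np.array([2.0, 0.0, -1.5, 0.0])
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# Zero penalties reproduce the OLS solution.
ols = generalized_ridge(X, y, np.zeros(p))

# Heavy penalties on coefficients 1 and 3 only: selective shrinkage.
selective = generalized_ridge(X, y, np.array([0.0, 1e6, 0.0, 1e6]))

print("OLS:      ", ols)
print("Selective:", selective)
```

With all penalties equal to zero the estimate coincides with OLS, while the heavily penalized coefficients are driven close to zero and the unpenalized ones are left largely intact.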

