Sunil Rao
Department of Biostatistics
University of Miami
Spike and Slab Regression For High Dimensional Data Analysis: A Review and Some New Directions
I will present a type of statistical regularization known as spike and slab
regression, which has been used effectively to analyze high dimensional data
such as genomic data. Computationally, this involves using a hierarchical
model with a particular prior specification that
results in what is known as selective shrinkage of traditional OLS estimates. Early
applications included ANOVA-based multigroup designs; however, the methodology can
also be framed to handle problems such as clustering of high dimensional
profiles. All of these fall under an orthogonal-type setting.
Under more general settings, other interesting uses of spike and slab models are possible.
Spike and slab regression is closely connected with generalized ridge regression (GRR),
which proves very useful in studying another related problem in high dimensional data analysis:
the so-called large p, small n problem, where collinearity abounds. I will show some recent work
related to mixing generalized ridge estimators for the purpose of improved prediction.
Implementation in practice can be achieved efficiently by using a Bayesian computational
strategy based on a spike and slab model.
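
As a rough illustration of the selective shrinkage idea mentioned above, the sketch below computes a generalized ridge estimate, in which each coefficient has its own penalty. It is a minimal sketch and not the method presented in the talk: the function name generalized_ridge and the penalty values are hypothetical, and the penalties are fixed by hand here, whereas a spike and slab model would determine them from the data.

```python
import numpy as np

def generalized_ridge(X, y, penalties):
    """Solve (X'X + diag(penalties)) b = X'y for b.

    `penalties` holds one non-negative ridge penalty per coefficient
    (hypothetical values chosen by hand for this illustration).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    lam = np.asarray(penalties, dtype=float)
    return np.linalg.solve(X.T @ X + np.diag(lam), X.T @ y)

# Orthogonal design (X'X = I): the OLS estimate equals y, and each
# coefficient is shrunk by its own factor 1 / (1 + penalty).
X = np.eye(3)
y = np.array([1.0, 2.0, 3.0])
penalties = np.array([0.0, 1.0, 3.0])  # no, moderate, and heavy shrinkage

beta_hat = generalized_ridge(X, y, penalties)
print(beta_hat)  # the coefficients 1, 2, 3 become 1, 1, 0.75
```

In the orthogonal case a zero penalty leaves the OLS estimate untouched while a large penalty pulls it toward zero, which is the sense in which the shrinkage is selective.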
This is joint work with Hemant Ishwaran of the University of Miami.