Sounak Chakraborty
Department of Statistics
University of Florida
Multiclass Cancer Diagnosis using Bayesian Kernel Machine Models
Precise classification of tumors is critical for cancer diagnosis and
treatment. In recent years, several works showed successful classification
of tumor types using gene expression patterns. Thus, gene expression data
is proving to be a very promising tool in cancer diagnosis. However, the
simultaneous classification across a heterogeneous set of tumor types has
not been well studied yet. Usually, these multicategory classification
problems are solved by using binary classifiers, which may fail in a
variety of circumstances. We tackle the problem of cancer classification in
the context of multiple tumor types. We develop a fully probabilistic
model-based approach, specifically a probabilistic relevance vector machine
(RVM) as well as support vector machines (SVM), for multicategory
classification. A hierarchical model is also proposed where the unknown
smoothing parameter is interpreted as a shrinkage parameter. We assign a
prior distribution to it and obtain its posterior distribution using
Bayesian computation. In this way, we obtain not only the point
predictors, but also the associated measures of uncertainty.
We also propose a Bayesian variable selection method for selecting
the differentially expressed genes, integrated with our RVM and SVM models
for improved classification. Our method makes use of mixture priors and
Markov chain Monte Carlo techniques to identify the important predictors
(genes) and classify samples simultaneously. We have applied our methods to
two different microarray datasets to identify differentially expressed genes
and compared their classification performance with that of existing methods.