Sounak Chakraborty

Department of Statistics

University of Florida


Multiclass Cancer Diagnosis using Bayesian Kernel Machine Models

Precise classification of tumors is critical for cancer diagnosis and treatment. In recent years, several studies have successfully classified tumor types using gene expression patterns. Thus, gene expression data are proving to be a very promising tool in cancer diagnosis. However, the simultaneous classification across a heterogeneous set of tumor types has not yet been well studied. Usually, such multicategory classification problems are solved using binary classifiers, which may fail in a variety of circumstances. We tackle the problem of cancer classification in the context of multiple tumor types. We develop a fully probabilistic model-based approach, specifically a probabilistic relevance vector machine (RVM), as well as support vector machines (SVM), for multicategory classification. A hierarchical model is also proposed in which the unknown smoothing parameter is interpreted as a shrinkage parameter. We assign a prior distribution to it and obtain its posterior distribution using Bayesian computation. In this way, we obtain not only point predictors but also the associated measures of uncertainty.

We also propose a Bayesian variable selection method for selecting differentially expressed genes, integrated with our RVM and SVM models for improved classification. Our method uses mixture priors and a Markov chain Monte Carlo technique to identify the important predictors (genes) and classify samples simultaneously. We have applied our methods to two different microarray datasets to identify differentially expressed genes and compared their classification performance with that of existing methods.
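
For readers unfamiliar with mixture priors in variable selection, one common generic form is sketched below. This is an illustration under stated assumptions, not necessarily the specification used in the talk: the inclusion indicators \gamma_j, the inclusion probability \pi, the coefficients \beta_j, and the slab variance \tau^2 are all hypothetical notation introduced here.

```latex
% Generic spike-and-slab mixture prior (illustrative; notation assumed, not from the abstract).
% gamma_j = 1 means gene j is included as a predictor; gamma_j = 0 excludes it.
\begin{align*}
  \gamma_j \mid \pi &\sim \mathrm{Bernoulli}(\pi), \qquad j = 1, \dots, p, \\
  \beta_j \mid \gamma_j &\sim (1 - \gamma_j)\, \delta_0 + \gamma_j\, \mathcal{N}(0, \tau^2),
\end{align*}
% delta_0 is a point mass at zero (the "spike"); N(0, tau^2) is the "slab".
```

Under a prior of this kind, a Markov chain Monte Carlo sampler would typically update the indicators together with the remaining model parameters, and the posterior frequency with which \gamma_j = 1 can be read as a measure of how important gene j is.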

