Wenxuan Zhong

Department of Statistics

Harvard University


Variable Selection Using Single Index Models for Motif Discovery

Information for regulating a gene's transcription is contained in conserved patterns (motifs) in the upstream/downstream DNA sequence (promoter region) close to the target gene. By combining the information contained in both gene expression measurements and genes' promoter sequences, I proposed a novel procedure for identifying functionally active motifs under certain stimuli. A nonlinear regression model, the single index model, was used to associate the promoter sequence information of a gene with its mRNA expression measurements. Single index models postulate that the response variable y depends on a single linear combination of the predictors X through an unknown link function f, that is, y = f(Xβ, ε), where β is the index vector and ε represents measurement error. In this talk, I will describe computationally efficient variable selection procedures and criteria, which we developed under a profile likelihood framework for the single index model. I will also demonstrate the advantages of these methods both theoretically and empirically. Compared with existing methods, our proposed procedures can greatly improve the sensitivity and specificity of variable selection.
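
To make the model form y = f(Xβ, ε) concrete, the following is a minimal simulation sketch, not the procedure presented in the talk. It assumes an additive-noise special case, y = f(Xβ) + ε, with Gaussian predictors, a sparse index vector, and an exponential link; all of these choices, along with the function names simulate_single_index and ols_direction, are hypothetical and chosen only for illustration. The direction of β is then estimated by ordinary least squares, a swapped-in baseline that is known to recover the index direction up to scale when the predictors are Gaussian. It does not perform variable selection and does not use a profile likelihood.

```python
import numpy as np


def simulate_single_index(n, beta, link, noise_sd, rng):
    """Draw (X, y) from y = link(X @ beta) + eps with standard Gaussian X.

    Additive noise is an assumption made for this sketch; the abstract's
    model y = f(X beta, eps) is more general.
    """
    X = rng.standard_normal((n, beta.size))
    index = X @ beta                      # the single linear combination
    eps = noise_sd * rng.standard_normal(n)
    return X, link(index) + eps


def ols_direction(X, y):
    """Estimate the index direction (unit norm) by least squares.

    This is a simple baseline, not the talk's method: the link is never
    estimated, so only the direction of beta is identifiable.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    coef, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    return coef / np.linalg.norm(coef)


rng = np.random.default_rng(0)

# Hypothetical sparse index vector: 3 active predictors out of 10.
beta = np.array([1.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
beta /= np.linalg.norm(beta)

X, y = simulate_single_index(
    n=5000, beta=beta, link=lambda t: np.exp(0.5 * t), noise_sd=0.1, rng=rng
)
beta_hat = ols_direction(X, y)

# Cosine similarity between the estimated and true index directions.
cosine = float(beta_hat @ beta)
print(f"cosine similarity: {cosine:.4f}")
```

In this setup the estimated direction should align closely with the true one even though the link is nonlinear and never estimated. Deciding which entries of β are exactly zero is the variable selection problem the talk addresses, and this sketch leaves it open.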

