David B. Hitchcock

Department of Statistics

University of Florida


Smoothing Functional Data for Cluster Analysis

Cluster analysis is an important exploratory tool for analyzing many types of data. We explore the problem of clustering functional data, which arise as curves and are characteristically observed as part of a continuous process. We examine the effect of smoothing such data on dissimilarity estimation and cluster analysis. We prove that a shrinkage method of smoothing yields a better estimator of the dissimilarities among a set of noisy curves. We also give strong empirical evidence that smoothing functional data before clustering produces a more accurate grouping than clustering the observed data without smoothing. An example involving yeast gene expression data illustrates the technique.
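The effect of smoothing on dissimilarity estimation can be illustrated with a small simulation. The sketch below is not the shrinkage smoother discussed in the talk: it substitutes a simple centered moving average, and the signal shapes, noise level, grid size, and function names are all hypothetical choices made for illustration. It compares how far the pairwise squared Euclidean dissimilarities of noisy and of smoothed curves fall from those of the underlying noise-free curves.

```python
import math
import random


def moving_average(y, half_width=2):
    """Smooth one curve with a centered moving average.

    Stand-in smoother (an assumption, not the talk's shrinkage method).
    The window is truncated at the two ends of the curve.
    """
    n = len(y)
    out = []
    for k in range(n):
        lo, hi = max(0, k - half_width), min(n, k + half_width + 1)
        out.append(sum(y[lo:hi]) / (hi - lo))
    return out


def sq_dist(a, b):
    """Squared Euclidean dissimilarity between two discretized curves."""
    return sum((x - z) ** 2 for x, z in zip(a, b))


def mean_abs_error(curves, truth):
    """Average absolute gap between estimated and true pairwise dissimilarities."""
    m = len(curves)
    errs = [
        abs(sq_dist(curves[i], curves[j]) - sq_dist(truth[i], truth[j]))
        for i in range(m)
        for j in range(i + 1, m)
    ]
    return sum(errs) / len(errs)


def simulate(seed=0, n_points=50, per_group=5, sigma=0.5):
    """Two hypothetical groups of curves (sine and cosine) plus Gaussian noise."""
    rng = random.Random(seed)
    grid = [k / (n_points - 1) for k in range(n_points)]
    signals = [
        lambda t: math.sin(2 * math.pi * t),
        lambda t: math.cos(2 * math.pi * t),
    ]
    truth = [[f(t) for t in grid] for f in signals for _ in range(per_group)]
    noisy = [[v + rng.gauss(0, sigma) for v in curve] for curve in truth]
    smoothed = [moving_average(curve) for curve in noisy]
    return mean_abs_error(noisy, truth), mean_abs_error(smoothed, truth)


raw_err, smooth_err = simulate()
print(f"raw: {raw_err:.2f}  smoothed: {smooth_err:.2f}")
```

In this setup the dissimilarities computed from the raw curves are inflated by the accumulated noise at every grid point, while averaging neighboring points removes much of that noise at the cost of a small distortion of the signal. The smoothed dissimilarities could then be passed to any standard clustering routine.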

