Webster West

Department of Statistics

University of South Carolina


The Data Image: A Tool for Exploring High Dimensional Data Sets

The color histogram was first introduced as a tool for visualizing high-dimensional data by Wegman (1990). A new version of this concept, called a data image, is discussed. Each variable is transformed into a greyscale or color range so that a high-dimensional data set may be viewed as an image, with observations on one axis and variables on the other. The rows and columns of the image may be rearranged to highlight relationships between variables. New ways of displaying the image based on various linear orderings of a data set are discussed. The ability of the data image to show relationships between variables and clusters in higher dimensions is explored. A Java applet, which may be used to construct an interactive data image via the World Wide Web, serves as the basis for this exploration.
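
As a rough illustration of the idea described above, the following Python sketch maps each variable to grey levels and reorders the rows of the resulting image. It is not the applet from the talk: the function names to_grey_levels and reorder_rows, the min-max scaling, and the choice of sorting by a single variable as the linear ordering are all assumptions made for this example.

```python
def to_grey_levels(data, levels=256):
    """Scale each column (variable) to integer grey levels 0..levels-1.

    `data` is a list of rows (observations); each row is a list of numbers.
    Min-max scaling per variable is an assumption of this sketch.
    """
    n_obs, n_vars = len(data), len(data[0])
    image = [[0] * n_vars for _ in range(n_obs)]
    for j in range(n_vars):
        column = [row[j] for row in data]
        lo, hi = min(column), max(column)
        span = hi - lo
        for i, value in enumerate(column):
            # A constant variable carries no contrast, so it stays at level 0.
            if span != 0:
                image[i][j] = round((value - lo) / span * (levels - 1))
    return image


def reorder_rows(image, key_column):
    """Return the image rows sorted by the grey level of one variable.

    Sorting on a single variable is one simple linear ordering of the
    observations; other orderings could be substituted here.
    """
    return sorted(image, key=lambda row: row[key_column])


if __name__ == "__main__":
    observations = [[0, 5], [255, 5], [51, 5]]
    image = to_grey_levels(observations)
    for row in reorder_rows(image, key_column=0):
        print(row)
```

Each row of the returned list corresponds to one observation and each column to one variable, so the result can be passed to any image-plotting routine to display the data image.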

