Skip to Content

Matthew Dannenberg

Picture of Matthew Dannenberg.


Pattern Recognition in High-Dimensional Data

Weiqing Gu
Second Reader(s)
Satyan L. Devadoss (Williams College)


Vast amounts of data are produced all the time. Yet this data does not easily equate to useful information: extracting information from large amounts of high dimensional data is nontrivial. People are simply drowning in data. A recent and growing source of high-dimensional data is hyperspectral imaging. Hyperspectral images allow for massive amounts of spectral information to be contained in a single image. In this thesis, a robust supervised machine learning algorithm is developed to efficiently perform binary object classification on hyperspectral image data by making use of the geometry of Grassmann manifolds. This algorithm can consistently distinguish between a large range of even very similar materials, returning very accurate classification results with very little training data. When distinguishing between dissimilar locations like crop fields and forests, this algorithm consistently classifies more than 95 percent of points correctly. On more similar materials, more than 80 percent of points are classified correctly. This algorithm will allow for very accurate information to be extracted from these large and complicated hyperspectral images.

Additional Materials


Data or Source Code