Metabolite-based clustering and visualization of mass spectrometry data using one-dimensional self-organizing maps
1 Department of Bioinformatics, Institute of Microbiology and Genetics, University of Göttingen, Göttingen, Germany
2 Department of Developmental Biochemistry, Institute for Biochemistry and Molecular Cell Biology, University of Göttingen, Göttingen, Germany
3 Department for Plant Biochemistry, Albrecht-von-Haller-Institute for Plant Sciences, University of Göttingen, Göttingen, Germany
4 Molecular Phytopathology and Mycotoxin Research Unit, University of Göttingen, Göttingen, Germany
Algorithms for Molecular Biology 2008, 3:9 doi:10.1186/1748-7188-3-9
Published: 26 June 2008
One of the goals of global metabolomic analysis is to identify metabolic markers that are hidden within a large background of data originating from high-throughput analytical measurements. Metabolite-based clustering is an unsupervised approach to marker identification that groups putative metabolites with similar concentration profiles. A major problem of this approach is that, in general, no prior information is available about an adequate number of clusters.
We present an approach for data mining on metabolite intensity profiles as obtained from mass spectrometry measurements. We propose one-dimensional self-organizing maps for metabolite-based clustering and visualization of marker candidates. In a case study on the wound response of Arabidopsis thaliana, based on metabolite intensity profiles from eight different experimental conditions, we show how the clustering and visualization capabilities can be used to identify relevant groups of markers.
Our specialized realization of self-organizing maps is well suited to gaining insight into complex pattern variation in a large set of metabolite profiles. In comparison to other methods, our visualization approach facilitates the identification of interesting groups of metabolites by providing a convenient overview of relevant intensity patterns. In particular, the visualization effectively supports researchers in analyzing many putative clusters when the true number of biologically meaningful groups is unknown.
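To make the idea of a one-dimensional self-organizing map concrete, the following is a minimal sketch of a generic 1D SOM in plain Python. It is not the authors' implementation: the function names (`train_som_1d`, `best_matching_unit`), the Gaussian neighborhood, the linear decay schedules, and the toy profiles over eight conditions are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(x, prototypes):
    """Return the index of the prototype closest to profile x (squared Euclidean)."""
    def sqdist(p):
        return sum((a - b) ** 2 for a, b in zip(x, p))
    return min(range(len(prototypes)), key=lambda j: sqdist(prototypes[j]))


def train_som_1d(profiles, n_prototypes=4, n_epochs=50, seed=0):
    """Train a chain of prototypes (a 1D SOM) on a list of intensity profiles."""
    rng = random.Random(seed)
    dim = len(profiles[0])
    # Assumption: initialise every prototype from a randomly chosen profile.
    prototypes = [list(rng.choice(profiles)) for _ in range(n_prototypes)]
    total_steps = n_epochs * len(profiles)
    step = 0
    for _ in range(n_epochs):
        order = list(range(len(profiles)))
        rng.shuffle(order)
        for i in order:
            x = profiles[i]
            frac = step / total_steps
            # Assumed schedules: learning rate and neighborhood width shrink linearly.
            lr = 0.5 * (1.0 - frac) + 0.01
            sigma = max(n_prototypes / 2.0 * (1.0 - frac), 0.3)
            winner = best_matching_unit(x, prototypes)
            for j, proto in enumerate(prototypes):
                # Neighbors of the winner on the 1D chain are pulled along with it.
                h = math.exp(-((j - winner) ** 2) / (2.0 * sigma ** 2))
                for d in range(dim):
                    proto[d] += lr * h * (x[d] - proto[d])
            step += 1
    return prototypes


# Hypothetical toy data: six intensity profiles over eight conditions.
profiles = [
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
    [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
    [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1],
    [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2],
    [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
    [0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4],
]

prototypes = train_som_1d(profiles, n_prototypes=4, n_epochs=50, seed=0)
# Each profile is assigned to the cluster of its closest prototype.
assignments = [best_matching_unit(x, prototypes) for x in profiles]
```

Because the prototypes form an ordered chain, listing them in index order (for example as rows of a heat map) gives a linear overview of the intensity patterns in which neighboring rows tend to look similar. The number of prototypes can be set generously and the resulting array inspected visually, which is one way to work without knowing the number of clusters in advance.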