An automated data analysis pipeline for GC-TOF-MS metabonomics studies

J Proteome Res. 2010 Nov 5;9(11):5974-81. doi: 10.1021/pr1007703. Epub 2010 Sep 29.

Abstract

Recent technological advances have made it possible to carry out high-throughput metabonomics studies using gas chromatography coupled with time-of-flight mass spectrometry. These studies produce large volumes of data, and there is a pressing need for algorithms that can process and analyze the data efficiently in a similarly high-throughput fashion. We present an Automated Data Analysis Pipeline (ADAP) that has been developed for this purpose. ADAP consists of peak detection, deconvolution, peak alignment, and library search. It allows data to flow seamlessly through the analysis steps without any human intervention and features two novel algorithms. Specifically, clustering is applied in deconvolution to resolve coeluting compounds, which are very common in complex samples, and a two-phase alignment process has been implemented to enhance alignment accuracy. ADAP is written in standard C++ and R and uses parallel computing via Message Passing Interface for fast peak detection and deconvolution. ADAP has been applied to both mixed-standards samples and serum samples, where it successfully identified and quantified metabolites. ADAP is available at http://www.du-lab.org.
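
The idea of using clustering in deconvolution can be illustrated with a toy example. The sketch below is not ADAP's algorithm, which the abstract does not specify. It assumes, purely for illustration, that fragment-ion peaks from the same compound share nearly the same apex retention time, and it groups peaks with a simple one-dimensional gap-based clustering. The function name `group_by_apex`, the `max_gap` threshold, and the sample peak values are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical peak representation: (m/z, apex retention time in seconds).
Peak = Tuple[float, float]


def group_by_apex(peaks: List[Peak], max_gap: float = 0.5) -> List[List[Peak]]:
    """Group peaks by apex retention time.

    Peaks are sorted by retention time; a new group starts whenever the gap
    to the previous peak exceeds max_gap (single-linkage clustering in 1D).
    """
    groups: List[List[Peak]] = []
    for peak in sorted(peaks, key=lambda p: p[1]):
        if groups and peak[1] - groups[-1][-1][1] <= max_gap:
            groups[-1].append(peak)  # close to the previous peak: same component
        else:
            groups.append([peak])    # gap too large: start a new component
    return groups


# Made-up peaks: two putative compounds eluting about 1.7 s apart.
peaks = [(73.0, 300.1), (147.0, 300.2), (205.0, 300.3), (117.0, 302.0), (129.0, 302.1)]
components = group_by_apex(peaks)
for i, component in enumerate(components):
    print(i, [mz for mz, _ in component])
```

Each resulting group stands in for one candidate compound whose member m/z values would form the spectrum passed on to library search. A real pipeline would also need to consider peak shape and ions shared between coeluting compounds, which this sketch ignores.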

MeSH terms

  • Algorithms*
  • Automation
  • Cluster Analysis
  • Data Interpretation, Statistical*
  • Gas Chromatography-Mass Spectrometry / instrumentation
  • Gas Chromatography-Mass Spectrometry / methods*
  • Metabolomics / instrumentation*
  • Metabolomics / methods
  • Neural Networks, Computer