IPO: a tool for automated optimization of XCMS parameters

G Libiseller, M Dvorzak, U Kleb, E Gander, T Eisenberg, F Madeo, S Neumann
BMC Bioinformatics, 2015 - Springer
Background
Untargeted metabolomics generates a huge amount of data, and software packages for automated data processing are crucial to process these data successfully. A variety of such packages exist, but the outcome of data processing strongly depends on algorithm parameter settings: poorly chosen settings can easily lead to biased results, so the settings themselves require optimization. Several parameter optimization approaches have already been proposed, but a software package for parameter optimization that is fast, widely applicable and free of intricate experimental labeling steps is still missing.
Results
We implemented the software package IPO (‘Isotopologue Parameter Optimization’), which is fast, free of labeling steps, and applicable to data from different kinds of samples, different methods of liquid chromatography coupled to high resolution mass spectrometry, and different instruments.
IPO optimizes XCMS peak picking parameters by using natural, stable 13C isotopic peaks to calculate a peak picking score. Retention time correction is optimized by minimizing relative retention time differences within peak groups. Grouping parameters are optimized by maximizing the number of peak groups that show one peak from each injection of a pooled sample. Candidate parameter settings are generated by design of experiments, and the resulting scores are evaluated using response surface models. IPO was tested on three different data sets, each consisting of a training set and a test set. IPO increased reliable groups (146% - 361%), decreased non-reliable groups (3% - 8%) and reduced the retention time deviation to one third.
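The two grouping-related criteria described above can be illustrated with a minimal Python sketch. It is not IPO's R implementation: the peak representation (dicts with `sample` and `rt` keys), the function names, and the exact deviation formula (mean absolute deviation from the group mean, divided by that mean) are assumptions made for illustration, as is reading "one peak from each injection" as exactly one.

```python
from statistics import mean

def count_reliable_groups(groups, n_injections):
    """Count peak groups that hold exactly one peak from every injection.

    Assumption: each peak is a dict with a 'sample' index (the injection)
    and an 'rt' retention time in seconds. This layout is hypothetical.
    """
    reliable = 0
    for group in groups:
        samples = [peak["sample"] for peak in group]
        # One peak per injection means: no duplicated and no missing injection.
        if len(samples) == n_injections and len(set(samples)) == n_injections:
            reliable += 1
    return reliable

def mean_relative_rt_deviation(groups):
    """Average relative retention time spread over all peak groups.

    Assumption: the spread of a group is its mean absolute deviation from
    the group mean, divided by the group mean. Smaller is better.
    """
    spreads = []
    for group in groups:
        rts = [peak["rt"] for peak in group]
        centre = mean(rts)
        spreads.append(mean(abs(rt - centre) for rt in rts) / centre)
    return mean(spreads)

# Toy peak groups from three injections of a pooled sample (made-up values).
groups = [
    # one peak from each of the three injections: reliable
    [{"sample": 0, "rt": 100.0}, {"sample": 1, "rt": 102.0}, {"sample": 2, "rt": 98.0}],
    # injection 0 appears twice and injection 1 is missing: not reliable
    [{"sample": 0, "rt": 200.0}, {"sample": 0, "rt": 201.0}, {"sample": 2, "rt": 199.0}],
    # injection 0 is missing: not reliable
    [{"sample": 1, "rt": 300.0}, {"sample": 2, "rt": 300.0}],
]

print(count_reliable_groups(groups, n_injections=3))
print(mean_relative_rt_deviation(groups))
```

In an optimization loop of the kind the abstract describes, each candidate parameter setting would yield its own set of groups; a setting with more reliable groups and a smaller relative deviation would score better.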
Conclusions
IPO was successfully applied to data derived from liquid chromatography coupled to high resolution mass spectrometry from three studies with different sample types and different chromatographic methods and devices. We were also able to show the potential of IPO to increase the reliability of metabolomics data.
The source code is implemented in R, has been tested on Linux and Windows, and is freely available for download at https://github.com/glibiseller/IPO . The training sets and test sets can be downloaded from https://health.joanneum.at/IPO .