Semi-supervised learning for peptide identification from shotgun proteomics datasets

L Käll, JD Canterbury, J Weston, WS Noble… - Nature …, 2007 - nature.com
Nature Methods, 2007
Abstract
Shotgun proteomics uses liquid chromatography–tandem mass spectrometry to identify proteins in complex biological samples. We describe an algorithm, called Percolator, for improving the rate of confident peptide identifications from a collection of tandem mass spectra. Percolator uses semi-supervised machine learning to discriminate between correct and decoy spectrum identifications, correctly assigning peptides to 17% more spectra from a tryptic Saccharomyces cerevisiae dataset, and up to 77% more spectra from non-tryptic digests, relative to a fully supervised approach.
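The abstract names the idea (semi-supervised learning that separates correct from decoy spectrum identifications) but gives no procedure. The sketch below is a generic self-training loop in that spirit, not Percolator's actual implementation. Everything in it is an assumption made for illustration: the function names, the two-feature synthetic data, the use of logistic regression as the classifier, the choice of the first feature as the starting score, and the fixed `top_fraction` that stands in for whatever confidence threshold a real tool would apply. Decoy matches serve as the known negatives, while target matches are unlabeled and only the currently best-scoring ones are treated as positives in each round.

```python
import math
import random


def train_logistic(X, y, epochs=200, lr=0.1):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() in range
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def self_train_rescore(features, is_decoy, rounds=3, top_fraction=0.5):
    """Iteratively rescore matches (hypothetical helper, not Percolator's API).

    Decoys are fixed negatives. In each round the top-scoring targets are
    used as provisional positives, a classifier is trained, and every
    match is rescored with it.
    """
    # Assumption: the first feature is a primary search score to start from.
    scores = [f[0] for f in features]
    targets = [i for i, d in enumerate(is_decoy) if not d]
    decoys = [i for i, d in enumerate(is_decoy) if d]
    for _ in range(rounds):
        ranked = sorted(targets, key=lambda i: scores[i], reverse=True)
        positives = ranked[:max(1, int(len(ranked) * top_fraction))]
        X = [features[i] for i in positives + decoys]
        y = [1.0] * len(positives) + [0.0] * len(decoys)
        w, b = train_logistic(X, y)
        scores = [b + sum(wj * xj for wj, xj in zip(w, f)) for f in features]
    return scores


def make_demo_data(seed=0, n=50):
    """Synthetic two-feature matches; purely illustrative, not real spectra."""
    rng = random.Random(seed)

    def draw(center):
        return [rng.gauss(center, 1.0), rng.gauss(center, 1.0)]

    features = [draw(4.0) for _ in range(n)]       # correct target matches
    features += [draw(0.0) for _ in range(n)]      # incorrect target matches
    features += [draw(0.0) for _ in range(2 * n)]  # decoy matches
    is_decoy = [False] * (2 * n) + [True] * (2 * n)
    is_correct = [True] * n + [False] * (3 * n)
    return features, is_decoy, is_correct


features, is_decoy, is_correct = make_demo_data()
scores = self_train_rescore(features, is_decoy)
```

In this toy setup, the learned score combines both features, so correct target matches should end up ranked above decoys on average. Real data would need real features and a principled cutoff for choosing positives.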