What does it mean to identify a protein in proteomics?

J Rappsilber, M Mann - Trends in biochemical sciences, 2002 - cell.com
Abstract
The annotation of the human genome indicates the surprisingly low number of ∼40 000 genes. However, the estimated number of proteins encoded by these genes is two to three orders of magnitude higher. The ability to unambiguously identify the proteins is a prerequisite for their functional investigation. As proteins derived from the same gene can be largely identical, and might differ only in small but functionally relevant details, protein identification tools must not only identify a large number of proteins but also be able to differentiate between close relatives. This information can be generated by mass spectrometry, an approach that identifies proteins by partial analysis of their digestion-derived peptides. Information gleaned from databases fills in the missing sequence information. Because both sequence databases and experimental data are limited, a certain ambiguity often remains concerning which sequence variant(s) and modification(s) are present. As the common denominator of all the isoforms is a gene, in our opinion, it would be more accurate to state that a product of this particular gene rather than a certain protein has been identified by mass spectrometry.
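
The ambiguity described in the abstract can be illustrated with a minimal sketch. It assumes a simplified trypsin rule (cleavage after K or R unless the next residue is P) and uses two invented isoform sequences that differ by a single residue. The sequences, the names `ISOFORMS`, `tryptic_peptides` and `candidate_isoforms`, and the length cutoff are all hypothetical and are not taken from the article.

```python
# Toy illustration: peptides shared between isoforms cannot tell them apart.
# The sequences below are invented; they stand in for two products of one gene.
ISOFORMS = {
    "isoform_a": "MSTAGKLLEDVRAAPTEGKWQNDFLR",
    "isoform_b": "MSTAGKLLEDVRAAPSEGKWQNDFLR",  # differs at one residue (T -> S)
}


def tryptic_peptides(sequence, min_length=5):
    """Digest a sequence in silico with a simplified trypsin rule.

    Cleaves after K or R unless the following residue is P, and keeps
    only peptides of at least min_length residues (an assumed cutoff).
    """
    peptides = set()
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            if i + 1 - start >= min_length:
                peptides.add(sequence[start:i + 1])
            start = i + 1
    # Keep the C-terminal remainder if the sequence does not end in K or R.
    if len(sequence) - start >= min_length:
        peptides.add(sequence[start:])
    return peptides


def candidate_isoforms(observed, database=ISOFORMS):
    """Return the names of all database entries consistent with the observed peptides."""
    return sorted(
        name
        for name, sequence in database.items()
        if set(observed) <= tryptic_peptides(sequence)
    )


if __name__ == "__main__":
    # Partial coverage: only peptides common to both isoforms were observed.
    print(candidate_isoforms({"MSTAGK", "WQNDFLR"}))
    # Adding the one peptide that spans the variant site narrows the match.
    print(candidate_isoforms({"MSTAGK", "WQNDFLR", "AAPTEGK"}))
```

In this toy case, the first query matches both isoforms and only the second, which includes the peptide covering the variant residue, singles out one of them. With partial sequence coverage the safe conclusion is therefore "a product of this gene", which is the point the abstract makes.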