Gapped BLAST and PSI-BLAST: a new generation of protein database search programs

SF Altschul, TL Madden, AA Schäffer… - Nucleic acids …, 1997 - academic.oup.com
Nucleic Acids Research, 1997 - academic.oup.com
Abstract
The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.
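The idea of turning a set of alignments into a position-specific score matrix and then searching with that matrix can be illustrated with a minimal Python sketch. It is not the procedure used by PSI-BLAST: it assumes an ungapped alignment of equal-length sequences, a uniform background distribution, and a flat pseudocount, and it scans a subject sequence without gaps. The names `build_pssm` and `best_ungapped_hit` are hypothetical helpers chosen for this illustration.

```python
import math
from collections import Counter

# The 20 standard amino acids (one-letter codes).
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, pseudocount=1.0):
    """Build a log-odds matrix from equal-length, ungapped aligned sequences.

    Assumptions (not from the paper): uniform background frequencies and a
    flat pseudocount added to every residue in every column.
    Returns one dict per column mapping residue -> log2 odds score.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    background = 1.0 / len(ALPHABET)
    pssm = []
    for col in range(length):
        # Count only standard residues observed in this column.
        counts = Counter(seq[col] for seq in aligned if seq[col] in ALPHABET)
        total = sum(counts.values()) + pseudocount * len(ALPHABET)
        column = {}
        for aa in ALPHABET:
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def best_ungapped_hit(pssm, subject):
    """Slide the matrix along `subject`; return (best_score, start_index).

    Residues outside ALPHABET contribute a score of 0. If the subject is
    shorter than the matrix, returns (-inf, -1).
    """
    width = len(pssm)
    best_score, best_start = float("-inf"), -1
    for start in range(len(subject) - width + 1):
        score = sum(
            pssm[i].get(subject[start + i], 0.0) for i in range(width)
        )
        if score > best_score:
            best_score, best_start = score, start
    return best_score, best_start


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    matrix = build_pssm(["ACDE", "ACDE", "ACDF"])
    print(best_ungapped_hit(matrix, "GGACDEGG"))
```

In an iterated search, the sequences that score well against the matrix would be added to the alignment and the matrix rebuilt before the next pass; the sketch shows a single pass only.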
Oxford University Press