Identification of novel transcripts in annotated genomes using RNA-Seq

A Roberts, H Pimentel, C Trapnell, L Pachter - Bioinformatics, 2011 - academic.oup.com
Abstract
Summary: We describe a new ‘reference annotation based transcript assembly’ problem for RNA-Seq data that involves assembling novel transcripts in the context of an existing annotation. This problem arises in the analysis of expression in model organisms, where it is desirable to leverage existing annotations for discovering novel transcripts. We present an algorithm for reference annotation-based transcript assembly and show how it can be used to rapidly investigate novel transcripts revealed by RNA-Seq in comparison with a reference annotation.
Availability: The methods described in this article are implemented in the Cufflinks suite of software for RNA-Seq, freely available from http://bio.math.berkeley.edu/cufflinks. The software is released under the BOOST license.
Contact: cole@broadinstitute.org; lpachter@math.berkeley.edu
Supplementary Information: Supplementary data are available at Bioinformatics online.
Oxford University Press