The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants

PJA Cock, CJ Fields, N Goto, ML Heuer, PM Rice

Nucleic Acids Research, 2010 • academic.oup.com

Abstract

FASTQ has emerged as a common file format for sharing sequencing read data combining both the sequence and an associated per base quality score, despite lacking any formal definition to date, and existing in at least three incompatible variants. This article defines the FASTQ format, covering the original Sanger standard, the Solexa/Illumina variants and conversion between them, based on publicly available information such as the MAQ documentation and conventions recently agreed by the Open Bioinformatics Foundation projects Biopython, BioPerl, BioRuby, BioJava and EMBOSS. Being an open access publication, it is hoped that this description, with the example files provided as Supplementary Data, will serve in future as a reference for this important file format.
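To illustrate why the variants are incompatible and what conversion between them involves, here is a minimal Python sketch. The abstract itself does not give the encodings, so the ASCII offsets (33 for Sanger, 64 for Solexa and Illumina 1.3+) and the Solexa-to-PHRED formula are assumptions based on the commonly documented conventions. The function names and the `VARIANTS` table are hypothetical and are not taken from any of the named libraries.

```python
import math

# Assumed conventions (not stated in the abstract):
#   variant name -> (ASCII offset, score type)
VARIANTS = {
    "sanger": (33, "phred"),    # PHRED scores, ASCII offset 33
    "solexa": (64, "solexa"),   # Solexa scores, ASCII offset 64
    "illumina": (64, "phred"),  # Illumina 1.3+: PHRED scores, ASCII offset 64
}


def solexa_to_phred(q_solexa):
    """Map a Solexa score to the equivalent PHRED score (float)."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def decode_quality(quality_string, variant):
    """Decode a FASTQ quality string into integer PHRED scores."""
    offset, kind = VARIANTS[variant]
    scores = [ord(ch) - offset for ch in quality_string]
    if kind == "solexa":
        # Solexa scores use a different scale and must be converted.
        return [round(solexa_to_phred(q)) for q in scores]
    return scores


def encode_sanger(phred_scores):
    """Encode PHRED scores as a Sanger quality string (capped at 93)."""
    return "".join(chr(min(q, 93) + 33) for q in phred_scores)


if __name__ == "__main__":
    # The same character means different things in different variants.
    print(decode_quality("h", "illumina"))
    print(decode_quality("@;", "solexa"))
    print(encode_sanger(decode_quality("h@", "illumina")))
```

Under these assumptions, the same quality character decodes to different scores depending on the variant, which is why a file's variant has to be known before its qualities can be interpreted or converted.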

Oxford University Press

Cited by 2151
