Eoulsan: a cloud computing-based framework facilitating high throughput sequencing analyses

L Jourdren, M Bernard, MA Dillies, S Le Crom - Bioinformatics, 2012 - academic.oup.com
Abstract
Summary: We developed Eoulsan, a modular and scalable framework dedicated to high-throughput sequencing data analysis and based on the Hadoop implementation of the MapReduce algorithm. Eoulsan allows users to easily set up a cloud computing cluster and automate the analysis of several samples at once using the various software solutions available. Our tests with Amazon Web Services demonstrated that the computation cost is linear with the number of instances booked, and that the running time is linear with the amount of data.
Availability and implementation: Eoulsan is implemented in Java, supported on Linux systems and distributed under the LGPL License at: http://transcriptome.ens.fr/eoulsan/
Contact: eoulsan@biologie.ens.fr
Supplementary information: Supplementary data are available at Bioinformatics online.
Oxford University Press