Pedro Seoane, Sara Ocaña, Rosario Carmona, Rocío Bautista, Eva Madrid, Ana M. Torres and M. Gonzalo Claros* Pages 440-450 (11)
The use of workflows to automate routine tasks is an absolute requirement in many bioinformatics fields. Current workflow management systems usually trade off a user-friendly interface against the ability to construct complex, scalable pipelines. We present AutoFlow, a Ruby-based workflow engine without a graphical interface or tool repository, which can be used on most computer systems and meets most workflow requirements in any scientific field. It accepts any local or remote command-line software and converts a workflow into a series of independent tasks. It provides control patterns that allow tasks to be iterated, supports static and dynamic variables for decision-making or for chaining workflows, and includes debugging utilities that cover graphs, file searching, functional consistency and timing. Two proof-of-concept cases illustrate AutoFlow's capabilities. In addition, a use case illustrates the automated construction of the best transcriptome for a non-model species (Vicia faba). Several combinations of Illumina reads and Sanger sequences were analyzed with different assemblers and different parameters in a complex, repetitive workflow that used branching and convergent tasks and made internal, automated decisions. The workflow finally produced an optimal transcriptome of 118,188 transcripts, of which 38,004 were annotated, 10,516 coded for a complete protein, 3,314 were putatively new faba-specific transcripts, and 23,727 were considered the representative transcriptome of V. faba.
Workflow, pipeline, automation, Ruby, bash, transcriptome, assembly, non-model species.
Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, E-29071 Málaga, Spain.