CRS4

An Automated Infrastructure to Support High-throughput Bioinformatics

Gianmauro Cuccuru, Simone Leo, Luca Lianas, Michele Muggiri, Andrea Pinna, Luca Pireddu, Paolo Uva, Andrea Angius, Giorgio Fotia, Gianluigi Zanetti
Proc. IEEE International Conference on High Performance Computing & Simulation (HPCS 2014), pages 600-607, July 2014
Download the publication: hpcs_2014_bologna_global_view_automation.pdf [775 KB]
The number of domains affected by the big data phenomenon is constantly increasing, both in science and industry, with high-throughput DNA sequencers being among the most massive data producers. Building analysis frameworks that can keep up with such a high production rate, however, is only part of the problem: current challenges include dealing with articulated data repositories where objects are connected by multiple relationships, managing complex processing pipelines where each step depends on a large number of configuration parameters, and ensuring reproducibility, error control and usability by non-technical staff. Here we describe an automated infrastructure built to address the above issues in the context of the analysis of the data produced by the CRS4 next-generation sequencing facility. The system integrates open source tools, either written by us or publicly available, into a framework that can handle the whole data transformation process, from raw sequencer output to primary analysis results.

BibTeX reference

@InProceedings{CLLMPPUAFZ14,
  author       = {Cuccuru, G. and Leo, S. and Lianas, L. and Muggiri, M. and Pinna, A. and Pireddu, L. and Uva, P. and Angius, A. and Fotia, G. and Zanetti, G.},
  title        = {An Automated Infrastructure to Support High-throughput Bioinformatics},
  booktitle    = {Proc. IEEE International Conference on High Performance Computing \& Simulation (HPCS 2014)},
  pages        = {600--607},
  month        = {July},
  year         = {2014},
  editor       = {Smari, Waleed W. and Zeljkovic, Vesna},
  publisher    = {IEEE},
  note         = {IEEE Catalog Number: CFP1478H-CDR},
  keywords     = {Genomics,Muscles,Simple object access protocol,Bioinformatics,MapReduce,NGS},
  doi          = {10.1109/HPCSim.2014.6903742},
  isbn         = {978-1-4799-5313-4},
  url          = {https://publications.crs4.it/pubdocs/2014/CLLMPPUAFZ14},
}
