Data handling strategies for high throughput pyrosequencers.

Gabriele A Trombetti (1), Raoul J P Bonnal, Ermanno Rizzi

  • (1) Institute for Biomedical Technologies-National Research Council, via Fratelli Cervi 93, 20090 Segrate, MI, Italy. gabriele.trombetti@itb.cnr.it

BMC Bioinformatics | April 14, 2007
Summary
This summary is machine-generated.

High-throughput DNA sequencing generates large datasets. We developed a computational pipeline and used the European Grid to analyze mutations in human samples, enabling efficient data handling and analysis.

Area of Science:

  • Genomics
  • Bioinformatics
  • Computational Biology

Background:

  • New high-throughput pyrosequencers (e.g., 454 Life Sciences GS 20) offer increased DNA sequencing output but present challenges in data handling, analysis, and computational power requirements.
  • These sequencers have different error profiles and shorter reads compared to traditional Sanger sequencing, necessitating new analytical approaches.

Purpose of the Study:

  • To develop an automated computational pipeline integrated with a database for efficient storage, handling, indexing, and searching of high-throughput sequencing data.
  • To leverage the European Grid for cost-effective, high-performance computing to manage the increased data output from pyrosequencers.
  • To analyze sequenced amplicons from human samples for point mutations.

Main Methods:

  • Developed an automated, multi-step computational pipeline with database storage for managing sequencing output, analysis projects, and results.
  • Ported the pipeline to the European Grid, a community of clusters that is load-balanced as a whole, to gain additional computing power.
  • Created Vnas, a framework for Grid job submission, virtual sandbox management, and job callback, to facilitate Grid porting.
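
The job submission, sandbox, and callback pattern named in the methods above can be illustrated with a small local sketch. This is not the Vnas API, which the summary does not describe: a thread pool stands in for the Grid, a temporary directory stands in for the virtual sandbox, and the names `run_in_sandbox`, `submit_jobs`, and `count_bases` are hypothetical.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def count_bases(sandbox: Path, reads: list[str]) -> dict:
    """Toy pipeline step (hypothetical): stage inputs in the sandbox, then summarize them."""
    input_file = sandbox / "reads.txt"
    input_file.write_text("\n".join(reads))
    lines = input_file.read_text().splitlines()
    return {"reads": len(lines), "bases": sum(len(line) for line in lines)}


def run_in_sandbox(step, payload):
    """Run one step inside a throwaway directory that is deleted afterwards."""
    with tempfile.TemporaryDirectory(prefix="sandbox_") as tmp:
        return step(Path(tmp), payload)


def submit_jobs(step, payloads, on_done, max_workers=4):
    """Submit one job per payload; call on_done(job_id, result) as each finishes."""
    # Assumption: a local thread pool replaces real Grid submission here.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(run_in_sandbox, step, payload): job_id
            for job_id, payload in enumerate(payloads)
        }
        for future, job_id in futures.items():
            future.add_done_callback(
                lambda f, job_id=job_id: on_done(job_id, f.result())
            )
    # Leaving the "with" block waits for every job, and its callback, to finish.


results = {}
submit_jobs(
    count_bases,
    [["ACGT", "GGC"], ["TTAGA"]],
    on_done=lambda job_id, result: results.__setitem__(job_id, result),
)
print(results)
```

The callback keeps the submitting code free of polling: each job reports its own result when it completes, which is the role a job callback would play for work running on remote Grid nodes.
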
Main Results:

  • Successfully analyzed 273 sequenced amplicons from a human cancerous sample using the developed pipeline and European Grid infrastructure.
  • Identified point mutations, which were confirmed by Sanger resequencing or by NCBI dbSNP, demonstrating the pipeline's accuracy.
  • The system allowed storage and searching of raw data, analysis projects, and intermediate and final computation results, ensuring repeatability.

Conclusions:

  • An automated computational pipeline coupled with database storage and the European Grid effectively addresses the challenges of high-throughput pyrosequencer data.
  • The European Grid provides a cost-effective solution for handling uneven scientific workloads.
  • The implemented infrastructure successfully analyzed human amplicons for mutations, with future analyses planned.
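
The idea of storing raw data and per-step results so that an analysis can be searched and repeated, as described in the results, can be sketched with an in-memory SQLite database. The schema, table names, and helper functions below are assumptions made for illustration; the summary does not specify the authors' database design.

```python
import json
import sqlite3

# Hypothetical schema: one table for raw reads, one for per-step results.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE reads (
        read_id  TEXT PRIMARY KEY,
        amplicon TEXT NOT NULL,
        sequence TEXT NOT NULL
    );
    CREATE INDEX idx_reads_amplicon ON reads (amplicon);

    CREATE TABLE step_results (
        project TEXT NOT NULL,
        step    TEXT NOT NULL,
        params  TEXT NOT NULL,
        output  TEXT NOT NULL,
        PRIMARY KEY (project, step)
    );
    """
)


def store_reads(conn, rows):
    """Insert (read_id, amplicon, sequence) tuples in one transaction."""
    with conn:
        conn.executemany("INSERT INTO reads VALUES (?, ?, ?)", rows)


def reads_for_amplicon(conn, amplicon):
    """Indexed search: all reads assigned to one amplicon, ordered by id."""
    cur = conn.execute(
        "SELECT read_id, sequence FROM reads WHERE amplicon = ? ORDER BY read_id",
        (amplicon,),
    )
    return cur.fetchall()


def save_step(conn, project, step, params, output):
    """Record a step's parameters and output so the step can be audited or rerun."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO step_results VALUES (?, ?, ?, ?)",
            (project, step, json.dumps(params, sort_keys=True), json.dumps(output)),
        )


def load_step(conn, project, step):
    """Return (params, output) for a stored step, or None if it was never run."""
    row = conn.execute(
        "SELECT params, output FROM step_results WHERE project = ? AND step = ?",
        (project, step),
    ).fetchone()
    return None if row is None else (json.loads(row[0]), json.loads(row[1]))


store_reads(conn, [("r1", "amp_A", "ACGT"), ("r2", "amp_A", "ACGA"), ("r3", "amp_B", "TTGC")])
hits = reads_for_amplicon(conn, "amp_A")
save_step(conn, "demo_project", "count_reads", {"amplicon": "amp_A"}, {"n_reads": len(hits)})
saved = load_step(conn, "demo_project", "count_reads")
```

Keeping each step's parameters next to its output is what makes a rerun checkable: the same parameters applied to the same stored reads should reproduce the stored output.
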