Updated: Aug 22, 2025

SparkEC: speeding up alignment-based DNA error correction tools.

Roberto R Expósito¹, Marco Martínez-Sánchez², Juan Touriño²

  • ¹Universidade da Coruña, CITIC, Computer Architecture Group, Campus de Elviña, 15071, A Coruña, Spain. roberto.rey.exposito@udc.es

BMC Bioinformatics | November 7, 2022
Summary
This summary is machine-generated.

SparkEC is a parallel software tool, built on Apache Spark, that speeds up error correction of Next Generation Sequencing (NGS) data. It distributes the work across cluster nodes to handle large genomic datasets and runs, on average, 4.9x faster than CloudEC, the tool whose algorithms it optimizes.

Keywords: Apache Spark, Big data, Distributed processing, Error correction

Area of Science:

  • Genomics
  • Bioinformatics
  • Computational Biology

Background:

  • Next Generation Sequencing (NGS) generates large genomic datasets with inherent DNA sequencing errors.
  • These errors can compromise the accuracy of downstream genomic analyses.
  • Current error correction methods often demand substantial computational resources and time, hindering the analysis of massive datasets.

Purpose of the Study:

  • To develop a high-performance, scalable software tool for correcting errors in Next Generation Sequencing (NGS) data.
  • To optimize existing error correction algorithms for faster processing of large genomic datasets.
  • To provide an efficient solution that addresses the computational bottlenecks in NGS data preprocessing.

Main Methods:

  • Developed SparkEC, a parallel tool leveraging the Apache Spark framework for distributed computation.
  • Optimized CloudEC algorithms, incorporating memory-efficient data structures and eliminating input preprocessing.
  • Implemented a scalable architecture designed to run on clusters of nodes.
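
The general shape of alignment-based correction on partitioned data can be sketched in plain Python. This is a toy illustration only, not SparkEC's or CloudEC's actual algorithm: it assumes reads are already aligned, equal in length, and grouped by a hypothetical locus key, and it corrects each base by majority vote within its column. The names `correct_group`, `correct_all`, and `min_support` are made up for this example.

```python
from collections import Counter


def correct_group(reads, min_support=0.6):
    """Correct one group of pre-aligned, equal-length reads.

    A base is replaced by the most common base in its column only when
    that base is supported by at least `min_support` of the reads.
    """
    n = len(reads)
    # One Counter per alignment column.
    columns = [Counter(col) for col in zip(*reads)]
    corrected = []
    for read in reads:
        fixed = []
        for base, counts in zip(read, columns):
            top, top_count = counts.most_common(1)[0]
            # Keep the original base when the column has no clear majority.
            fixed.append(top if top_count / n >= min_support else base)
        corrected.append("".join(fixed))
    return corrected


def correct_all(groups):
    """Correct every group independently, as a distributed map step would."""
    return {key: correct_group(reads) for key, reads in groups.items()}


# Hypothetical input: reads grouped by the locus they align to.
groups = {
    "locus_a": ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAC"],
    "locus_b": ["TTGACA", "TTGACA", "TTGACA"],
}
result = correct_all(groups)
```

Because each group is corrected without reference to any other, the groups could in principle be processed on different nodes, which is the property a framework such as Apache Spark exploits.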

Main Results:

  • SparkEC required significantly less computational time than CloudEC.
  • It achieved average and maximum speedups of 4.9x and 11.9x, respectively, over CloudEC.
  • Performance improvements were consistent across various datasets and evaluation scenarios.
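
As a reminder of how speedup figures of this kind are usually derived, the sketch below computes per-dataset speedup as baseline runtime divided by optimized runtime, then takes the average and the maximum. The timings are invented placeholder values, not measurements from the study, and using the arithmetic mean for the average is an assumption.

```python
from statistics import mean


def speedup(baseline_seconds, optimized_seconds):
    """Speedup of the optimized tool relative to the baseline."""
    return baseline_seconds / optimized_seconds


# Placeholder (baseline, optimized) runtimes in seconds; NOT data from the paper.
timings = {
    "dataset_1": (1200.0, 300.0),
    "dataset_2": (900.0, 150.0),
    "dataset_3": (500.0, 250.0),
}

speedups = {name: speedup(b, o) for name, (b, o) in timings.items()}
average_speedup = mean(speedups.values())
maximum_speedup = max(speedups.values())
```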

Conclusions:

  • SparkEC offers a scalable and computationally efficient solution for correcting errors in large NGS datasets.
  • Its distributed nature allows for performance scaling with the number of cluster nodes.
  • The software is freely available, open-source (GPLv3), and cross-platform compatible (Linux, Windows, macOS).
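
Scaling with the number of cluster nodes is commonly summarized by speedup relative to a single node and by parallel efficiency (speedup divided by node count). The following minimal sketch shows that calculation; the runtimes are illustrative placeholders, not results reported for SparkEC, and `scaling_report` is a hypothetical helper name.

```python
def scaling_report(runtimes_by_nodes):
    """Return {nodes: (speedup, efficiency)} relative to the smallest node count."""
    base_nodes = min(runtimes_by_nodes)
    base_time = runtimes_by_nodes[base_nodes]
    report = {}
    for nodes, seconds in sorted(runtimes_by_nodes.items()):
        s = base_time / seconds
        # Efficiency of 1.0 means perfectly linear scaling.
        e = s / (nodes / base_nodes)
        report[nodes] = (s, e)
    return report


# Placeholder runtimes in seconds for 1, 2, 4, and 8 nodes; NOT measured values.
report = scaling_report({1: 800.0, 2: 400.0, 4: 250.0, 8: 200.0})
```

An efficiency that falls as nodes are added, as in these made-up numbers, indicates that communication or serial work is starting to limit the benefit of extra nodes.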