BigFiRSt: A Software Program Using Big Data Technique for Mining Simple Sequence Repeats From Large-Scale Sequencing

Jinxiang Chen¹, Fuyi Li²,³,⁴, Miao Wang¹

¹Department of Software Engineering, College of Information Engineering, Northwest A&F University, Yangling, China.

Frontiers in Big Data | February 4, 2022

Summary

This summary is machine-generated.

BigFiRSt is a new Hadoop-based tool that efficiently merges short DNA sequence reads and identifies Simple Sequence Repeats (SSRs) using parallel processing. This accelerates analysis for non-model species in the era of big biological data.

Keywords:

Hadoop; Simple Sequence Repeats (SSR); big data; next-generation sequencing; read pairs

Area of Science:

Genomics
Bioinformatics
Computational Biology

Background:

Simple Sequence Repeats (SSRs), short nucleotide motifs repeated in tandem, are widely used genetic markers that have been associated with human diseases.
Identifying SSRs traditionally requires complete genomes, which are often unavailable for non-model species.
Next-generation sequencing (NGS) generates vast amounts of data, posing big data challenges for SSR analysis.
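
To make the notion of an SSR concrete, the following is a minimal Python sketch that finds perfect tandem repeats in a single sequence with regular expressions. It is an illustration only, not the algorithm used by BigPERF or BigFiRSt; the function name `find_ssrs` and the thresholds `max_motif_len=6` and `min_tract_len=12` are assumptions chosen for the example.

```python
import re
from typing import List, NamedTuple


class SSR(NamedTuple):
    start: int   # 0-based start of the repeat tract
    motif: str   # repeating unit, e.g. "AC"
    copies: int  # number of tandem copies of the motif


def find_ssrs(seq: str, max_motif_len: int = 6, min_tract_len: int = 12) -> List[SSR]:
    """Return perfect tandem repeats of motifs 1..max_motif_len bases long.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif_len + 1):
        # ([ACGT]{k}) captures a k-base motif; \1+ requires at least one more copy.
        pattern = re.compile(r"([ACGT]{%d})\1+" % k)
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit ("ACAC" is just "AC" twice).
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            tract = m.group(0)
            if len(tract) >= min_tract_len:
                hits.append(SSR(m.start(), motif, len(tract) // k))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGT" + "AC" * 6 + "TTG" + "A" * 12 + "C"
    for ssr in find_ssrs(example):
        print(ssr)
```

Scanning one motif length at a time keeps the sketch short, but it reports only perfect repeats and can trim a tract that directly follows a homopolymer run; a production tool would handle such cases more carefully.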

Purpose of the Study:

To develop a novel big data software solution for efficient SSR identification from large-scale NGS data.
To address the limitations of traditional tools in handling massive datasets and merging short DNA read pairs.

Main Methods:

Developed BigFiRSt, a Hadoop-based software program utilizing parallel and distributed computing.
Integrated BigFLASH for merging overlapping short paired-end reads and BigPERF for SSR mining.
Leveraged big data technologies to enhance processing speed and scalability.
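
The read-merging step can be pictured with a small, single-machine Python sketch: reverse-complement the second read of a pair, then look for the overlap between the end of the first read and the start of that mate. This is a simplified stand-in and not the scoring used by FLASH or BigFLASH; the names `merge_pair`, `min_overlap`, and `max_mismatch_ratio`, and their default values, are assumptions made for illustration.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T, N."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(read1: str, read2: str, min_overlap: int = 10,
               max_mismatch_ratio: float = 0.25) -> Optional[str]:
    """Merge a forward read with the reverse complement of its mate.

    Returns the merged fragment, or None when no overlap of at least
    min_overlap bases stays within max_mismatch_ratio (both illustrative).
    """
    mate = reverse_complement(read2)
    best = None      # (mismatch_ratio, -overlap_len) of the best overlap so far
    best_len = 0
    # Try every overlap where a suffix of read1 aligns to a prefix of the mate.
    for olen in range(min_overlap, min(len(read1), len(mate)) + 1):
        mismatches = sum(x != y for x, y in zip(read1[-olen:], mate[:olen]))
        ratio = mismatches / olen
        # Prefer fewer mismatches per base; break ties in favor of longer overlaps.
        if ratio <= max_mismatch_ratio and (best is None or (ratio, -olen) < best):
            best, best_len = (ratio, -olen), olen
    if best is None:
        return None
    # Keep read1 in full and append the part of the mate beyond the overlap.
    return read1 + mate[best_len:]


if __name__ == "__main__":
    fragment = "ATGGCGTACCTGAAGTCCGATTACGGCATCAGTTGC"
    r1 = fragment[:25]
    r2 = reverse_complement(fragment[11:])
    print(merge_pair(r1, r2) == fragment)
```

In the overlapping region the sketch simply keeps the bases of the first read; real mergers typically use base quality scores to resolve disagreements.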

Main Results:

BigFiRSt significantly reduces execution times for read merging and SSR mining.
Demonstrated dramatic performance improvements on very large-scale DNA sequence datasets.
The software effectively handles the big data challenges inherent in NGS analysis.

Conclusions:

BigFiRSt leverages Hadoop technology for parallel and distributed processing of NGS data.
The tool is anticipated to be valuable for biological big data analysis, particularly for non-model organisms.
BigFiRSt enables efficient SSR discovery, which supports genetic research and disease association studies.
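
The parallel, distributed processing mentioned above follows the general MapReduce pattern that Hadoop implements. The sketch below imitates that pattern in plain Python on one machine, counting SSR motifs across a handful of reads. It does not use the Hadoop API and is not BigFiRSt's code; the function names and the simplified SSR pattern (a 2 to 3 base motif repeated at least four times) are assumptions for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical, simplified SSR pattern: a 2-3 base motif repeated at least 4 times.
_SSR = re.compile(r"([ACGT]{2,3})\1{3,}")


def map_read(read: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (motif, 1) for every SSR tract found in one read."""
    for match in _SSR.finditer(read.upper()):
        yield match.group(1), 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group emitted values by key, as the framework would."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce step: sum the values collected for each motif."""
    return {motif: sum(values) for motif, values in grouped.items()}


def count_motifs(reads: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on an iterable of reads."""
    pairs = (pair for read in reads for pair in map_read(read))
    return reduce_counts(shuffle(pairs))


if __name__ == "__main__":
    reads = ["TTACACACACGG", "GGACACACACACTT", "CCACGACGACGACGTT", "ACGTACGT"]
    print(count_motifs(reads))
```

Because each read is mapped independently, the map step can be spread across many workers, and only the small (motif, count) pairs need to be moved between machines before the reduce step.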