



A novel procedure on next generation sequencing data analysis using text mining algorithm.

Weizhong Zhao1,2, James J Chen1, Roger Perkins1

  • 1Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, 3900 NCTR Road, HFT-20, Jefferson, AR, 72079, USA.

BMC Bioinformatics | May 15, 2016
Summary
This summary is machine-generated.

Topic modeling offers a novel approach for analyzing next-generation sequencing (NGS) data, revealing genetic diversity in Salmonella strains. This method effectively classifies serotypes, paving the way for identifying gene-phenotype relationships and biomarkers in big data biology.

Keywords:
Biomarker; Data mining; Genetic diversity; Next-generation sequencing (NGS); Topic modeling


Area of Science:

  • Bioinformatics
  • Computational Biology
  • Machine Learning

Background:

  • Next-generation sequencing (NGS) generates vast amounts of biological and biomedical data, which calls for efficient data mining strategies in comparative and evolutionary studies.
  • Topic modeling, a machine learning technique, is increasingly used for structuring large text corpora in data mining.

Purpose of the Study:

  • To introduce a novel procedure for analyzing NGS data using topic modeling.
  • To demonstrate the application of this procedure using Salmonella enterica strain data.
  • To optimize the procedure through perplexity and convergence efficiency analysis.

Main Methods:

  • A four-step procedure: NGS data retrieval, preprocessing, topic modeling with Latent Dirichlet Allocation (LDA), and data mining.
  • Application of LDA to Salmonella enterica NGS data.
  • Evaluation of topic model performance using perplexity and Gibbs sampling convergence.
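The methods above can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the summary does not say how sequences were turned into "words", so the overlapping k-mer tokenization here is an assumption, as are the function names (`kmers`, `lda_gibbs`, `perplexity`), the hyperparameters `alpha` and `beta`, and the toy sequences. The sketch fits LDA by collapsed Gibbs sampling and reports perplexity, the two evaluation ideas named in the list.

```python
import math
import random
from collections import Counter


def kmers(seq, k=3):
    """Split a sequence into overlapping k-mers (an assumed tokenization)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    v = len(vocab)
    ndk = [[0] * n_topics for _ in docs]        # document-topic counts
    nkw = [Counter() for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                         # tokens assigned to each topic
    z = []                                      # topic of every token
    for d, doc in enumerate(docs):              # random initial assignment
        zd = []
        for w in doc:
            t = rng.randrange(n_topics)
            zd.append(t)
            ndk[d][t] += 1
            nkw[t][w] += 1
            nk[t] += 1
        z.append(zd)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]                     # remove the token's current topic
                ndk[d][t] -= 1
                nkw[t][w] -= 1
                nk[t] -= 1
                # Full conditional p(z = k | all other assignments), unnormalized.
                weights = [
                    (ndk[d][k] + alpha) * (nkw[k][w] + beta) / (nk[k] + v * beta)
                    for k in range(n_topics)
                ]
                t = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = t
                ndk[d][t] += 1
                nkw[t][w] += 1
                nk[t] += 1
    # Smoothed point estimates of document-topic and topic-word distributions.
    theta = [
        [(ndk[d][k] + alpha) / (len(doc) + n_topics * alpha) for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        {w: (nkw[k][w] + beta) / (nk[k] + v * beta) for w in vocab}
        for k in range(n_topics)
    ]
    return theta, phi


def perplexity(docs, theta, phi):
    """exp of the negative mean per-token log-likelihood (lower is better)."""
    log_lik, n = 0.0, 0
    for d, doc in enumerate(docs):
        for w in doc:
            log_lik += math.log(sum(theta[d][k] * phi[k][w] for k in range(len(phi))))
            n += 1
    return math.exp(-log_lik / n)


if __name__ == "__main__":
    # Toy sequences invented for illustration; they are not Salmonella data.
    toy = ["ACGTACGTACGT", "ACGTACGAACGT", "TTTTGGGGTTTT", "TTTTGGGCTTTT"]
    docs = [kmers(s) for s in toy]
    for n_topics in (1, 2, 3):
        theta, phi = lda_gibbs(docs, n_topics)
        print(n_topics, round(perplexity(docs, theta, phi), 3))
```

Comparing perplexity across several topic counts, as the loop at the end does, is one common way to choose the number of topics. Tracking the same quantity across Gibbs iterations would be one way to judge convergence.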

Main Results:

  • LDA-derived topics accurately characterized the genetic diversity of the fliC gene across Salmonella serotypes.
  • Hierarchical clustering and data matrix analysis of LDA outputs successfully classified Salmonella serotypes.
  • The approach demonstrated potential for elucidating genetic information and identifying gene-phenotype relationships.
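The clustering step mentioned in the results can be sketched as follows. The summary does not specify the distance measure or linkage, so Euclidean distance with average linkage is an assumption, and the `theta` values below are hypothetical document-topic proportions made up for illustration, not results from the study.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(vectors, n_clusters):
    """Agglomerative clustering: merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                total = sum(
                    euclidean(vectors[i], vectors[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                dist = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Hypothetical topic proportions for four strains (rows) over two topics.
    theta = [[0.90, 0.10], [0.85, 0.15], [0.10, 0.90], [0.20, 0.80]]
    print(average_linkage(theta, n_clusters=2))
```

Strains whose topic proportions are similar end up in the same cluster, which is the sense in which topic-model outputs can be used to group serotypes.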

Conclusions:

  • Topic modeling provides a novel method for NGS data analysis, enhancing genetic information extraction.
  • This approach facilitates the identification of gene-phenotype relationships and biomarkers in the era of big biological data.