



Enhanced regulatory sequence prediction using gapped k-mer features.

Mahmoud Ghandi1, Dongwon Lee1, Morteza Mohammad-Noori2

  • 1Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America.

PLoS Computational Biology | July 18, 2014
Summary
This summary is machine-generated.

This study introduces gapped k-mers as sequence features, together with a classifier built on them, gkm-SVM, to improve DNA and protein sequence analysis. Gkm-SVM predicts genomic regulatory elements more accurately than the original k-mer-SVM.


Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • K-mers are widely used for DNA and protein sequence analysis.
  • Increasing k to resolve longer features makes each specific k-mer rare, so k-mer counts become sparse and their frequencies noisy.
  • Statistical learning methods that use k-mers as features are therefore unreliable at large k, which limits their use in genomic applications.
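
The sparsity problem described above can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes a uniform random DNA model and a 1,000 bp sequence (both assumptions, not values from the study), and the function names are hypothetical. It compares the number of possible k-mers, 4^k, with the number of windows available in the sequence.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every contiguous (overlapping) k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def expected_count_per_kmer(seq_len, k, alphabet_size=4):
    """Expected count of one specific k-mer under a uniform random model.

    Assumption: every position is drawn independently and uniformly,
    which real genomes do not satisfy; this is only a rough guide.
    """
    windows = max(seq_len - k + 1, 0)
    return windows / alphabet_size ** k


# Assumed sequence length of 1,000 bp, chosen only for illustration.
for k in (4, 6, 8, 10):
    print(k, 4 ** k, expected_count_per_kmer(1000, k))
```

As k grows, the expected count per k-mer falls well below one, so most k-mers are absent and the few that occur are seen once. That is the regime in which k-mer frequencies become noisy.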

Purpose of the Study:

  • To develop a robust method for sequence analysis that overcomes k-mer limitations.
  • To introduce gapped k-mers (gkm) and a new classifier, gkm-SVM, for improved feature representation.
  • To enhance the accuracy of predicting functional genomic regulatory elements and enhancers.

Main Methods:

  • Introduced gapped k-mers (gkm) as an alternative feature set.
  • Developed the gkm-SVM classifier for sequence classification.
  • Implemented an efficient tree data structure for kernel matrix computation for large-scale applications.
  • Introduced a general method for robust estimation of k-mer frequencies.
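
As a rough illustration of the gapped k-mer idea, the sketch below enumerates gapped k-mers naively and compares two sequences by a normalized inner product of their counts. It is not the tree-based algorithm from the study. The wildcard symbol `N`, the function names, and the parameter names `l` (window length) and `k` (number of informative positions) are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations
from math import comb, sqrt

GAP = "N"  # wildcard symbol, an arbitrary choice for this sketch


def gapped_kmers(lmer, k):
    """Yield every gapped k-mer of one l-mer: keep k positions, gap the rest."""
    for keep in combinations(range(len(lmer)), k):
        keep_set = set(keep)
        yield "".join(c if i in keep_set else GAP for i, c in enumerate(lmer))


def gkm_features(seq, l, k):
    """Count gapped k-mers over all overlapping windows of length l."""
    counts = Counter()
    for i in range(len(seq) - l + 1):
        counts.update(gapped_kmers(seq[i:i + l], k))
    return counts


def normalized_kernel(a, b):
    """Cosine-style similarity between two gapped k-mer count vectors."""
    dot = sum(v * b[key] for key, v in a.items() if key in b)
    norm = sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Two 6-mers differing at one position share comb(6 - 1, 4) gapped 4-mers.
shared = set(gapped_kmers("ACGTAC", 4)) & set(gapped_kmers("ACGTAT", 4))
print(len(shared), comb(5, 4))
```

The final check shows why gaps help: two windows that differ by a few mismatches still share many features, whereas they would share no full-length k-mer at all. Enumerating every gapped k-mer like this grows quickly with `l`, which is the cost an efficient data structure is meant to avoid.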

Main Results:

  • Gkm-SVM significantly improves the accuracy of predicting functional genomic regulatory elements and tissue-specific enhancers, with precision increased by up to a factor of two.
  • Gkm-SVM consistently outperforms the original k-mer-SVM on human ENCODE ChIP-seq datasets.
  • Demonstrated the general utility of the approach beyond SVMs by applying it with a Naïve-Bayes classifier.
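
For readers unfamiliar with the metric, precision is the fraction of predicted positives that are true positives, TP / (TP + FP). The sketch below computes it on toy labels that are invented purely for illustration and are not data from the study; the function and variable names are hypothetical.

```python
def precision(y_true, y_pred):
    """Fraction of predicted positives (1) that are actually positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 0)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy labels, made up for this example only (1 = regulatory element).
y_true = [1, 1, 0, 0, 1, 0, 0, 1]
baseline_pred = [1, 0, 1, 1, 1, 1, 0, 0]
improved_pred = [1, 1, 0, 0, 1, 1, 0, 1]

print(precision(y_true, baseline_pred), precision(y_true, improved_pred))
```

A twofold gain in precision means that, among the sequences a classifier calls positive, twice as large a fraction are correct.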

Conclusions:

  • Gapped k-mers and gkm-SVM offer a more robust and accurate approach to sequence analysis compared to traditional k-mers.
  • The developed methods are efficient and scalable for large-scale genome-wide applications.
  • The gkm-SVM approach is broadly applicable to various sequence classification problems beyond regulatory element analysis.