Robust k-mer frequency estimation using gapped k-mers.

Mahmoud Ghandi¹, Morteza Mohammad-Noori, Michael A Beer

¹Department of Biomedical Engineering and McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, MD, 21205, USA, ghandi@jhmi.edu.

Journal of Mathematical Biology | July 18, 2013

Summary

This summary is machine-generated.

This study introduces a novel method using gapped k-mer counts to accurately estimate DNA k-mer frequencies. This approach improves statistical learning for analyzing DNA sequences, particularly for identifying functional elements like CTCF binding sites.

Area of Science:

Genomics and Bioinformatics
Computational Biology
Statistical Genetics

Background:

K-mers (oligomers of fixed length k) are fundamental for describing DNA sequence features and constructing complex descriptors.
While k-mers offer a complete and unbiased feature set, increasing k leads to sparse frequency counts, causing noisy estimations in statistical learning.
Molecular DNA interactions have limited spatial extent, making gapped k-mers potentially more informative for capturing biological signals.
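
To make the terms above concrete, here is a minimal Python sketch of counting ungapped and gapped k-mers in a short sequence. The function names, the toy sequence, and the use of `-` to mark a gap position are illustrative assumptions, not notation taken from the study.

```python
from collections import Counter
from itertools import combinations


def count_kmers(seq, k):
    """Count every contiguous length-k substring of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def count_gapped_kmers(seq, l, k):
    """Count gapped k-mers: each length-l window keeps k informative
    positions and replaces the other l - k positions with '-'."""
    counts = Counter()
    for i in range(len(seq) - l + 1):
        window = seq[i:i + l]
        for keep in combinations(range(l), k):
            pattern = "".join(
                base if j in keep else "-" for j, base in enumerate(window)
            )
            counts[pattern] += 1
    return counts


seq = "ACGTACGTAC"  # hypothetical toy sequence
kmers = count_kmers(seq, 4)
gapped = count_gapped_kmers(seq, 4, 2)

# There are 4**k possible k-mers but only len(seq) - k + 1 windows,
# so most k-mers are never observed once k grows.
print(len(kmers), "distinct 4-mers observed out of", 4 ** 4)
print(len(gapped), "distinct gapped patterns observed")
```

Each window contributes to several gapped patterns at once, so gapped counts accumulate more observations per feature than ungapped counts do. That is the intuition behind using them when ungapped counts are sparse.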

Purpose of the Study:

To develop a robust method for estimating ungapped k-mer frequencies using gapped k-mer counts.
To address the challenge of sparse k-mer frequency estimation in statistical learning for DNA sequence analysis.
To improve the accuracy of DNA sequence feature description, especially for large k values.

Main Methods:

Derived a minimum norm estimate equation for k-mer frequencies based on observed gapped k-mer frequencies.
Applied the method to estimate k-mer frequencies in biological sequences.
Utilized a sample of CTCF binding sites in the human genome for validation.
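
The idea of a minimum norm estimate can be illustrated with a generic numerical sketch. The code below is not the equation derived in the study. It assumes a 0/1 matrix `A` that maps a vector of full-length word counts to gapped k-mer counts, and it uses NumPy's Moore-Penrose pseudoinverse to pick the smallest-norm vector consistent with those gapped counts. The names `gapped_design_matrix`, `x_true`, and `x_hat`, and the simulated Poisson counts, are all hypothetical.

```python
from itertools import combinations, product

import numpy as np

ALPHABET = "ACGT"


def gapped_design_matrix(l, k):
    """Build a 0/1 matrix A with one column per length-l word and one row
    per gapped pattern; A[r, c] = 1 when word c matches pattern r."""
    lmers = ["".join(p) for p in product(ALPHABET, repeat=l)]
    index = {}
    entries = []
    for col, lmer in enumerate(lmers):
        for keep in combinations(range(l), k):
            pattern = "".join(
                base if j in keep else "-" for j, base in enumerate(lmer)
            )
            row = index.setdefault(pattern, len(index))
            entries.append((row, col))
    A = np.zeros((len(index), len(lmers)))
    for row, col in entries:
        A[row, col] = 1.0
    return lmers, A


lmers, A = gapped_design_matrix(l=3, k=2)

# Simulated "true" word counts (an assumption for illustration only).
rng = np.random.default_rng(0)
x_true = rng.poisson(2.0, size=len(lmers)).astype(float)

# Gapped counts implied by those word counts.
y = A @ x_true

# Minimum-norm vector that reproduces the gapped counts.
x_hat = np.linalg.pinv(A) @ y

print("A shape:", A.shape)
print("reproduces gapped counts:", np.allclose(A @ x_hat, y))
```

Because there are fewer gapped patterns than full-length words, many count vectors are consistent with the same gapped counts. The pseudoinverse selects the one with the smallest Euclidean norm, which is what "minimum norm" refers to in this sketch.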

Main Results:

The proposed method provides a more accurate estimation of k-mer frequencies compared to direct counting.
Demonstrated improved accuracy in real biological sequences, specifically within human CTCF binding sites.
Successfully leveraged gapped k-mer information to mitigate the sparsity issue of ungapped k-mers.

Conclusions:

Gapped k-mer counts offer a robust way to improve the estimation of ungapped k-mer frequencies.
This approach enhances the reliability of statistical learning methods applied to DNA sequence analysis.
The findings have implications for accurately identifying DNA sequence features and functional elements.