Word organization in coding DNA: a mathematical model.

Indranil Mukhopadhyay¹, Anup Som, Satyabrata Sahoo

  • ¹Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA-15261, USA. imukhopadhyay@hgen.pitt.edu

Theory in Biosciences = Theorie in Den Biowissenschaften | October 19, 2006
Summary
This summary is machine-generated.

Heaps' law does not accurately model vocabulary growth in coding DNA sequences (CDSs). A new "equation of word organization" indicates that CDSs have unique, structured nucleotide patterns, unlike natural languages or random sequences.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • Heaps' law describes vocabulary growth in natural human languages.
  • Coding DNA sequences (CDSs) exhibit unique sequence characteristics.
  • Existing models for natural language may not apply to biological sequences.
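
For reference, Heaps' law is usually written as V(n) = K · n^β, where V is the number of distinct words seen after n word tokens and K and β are fitted constants. The following is a minimal sketch of how such a fit could be checked on a vocabulary-growth curve, using an ordinary least-squares line in log-log space. The function name `fit_heaps` and the synthetic data are illustrative assumptions, not part of the study.

```python
import math

def fit_heaps(lengths, vocab_sizes):
    """Least-squares fit of log V = log K + beta * log n; returns (K, beta)."""
    xs = [math.log(n) for n in lengths]
    ys = [math.log(v) for v in vocab_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                          # slope in log-log space
    K = math.exp(mean_y - beta * mean_x)      # intercept, back-transformed
    return K, beta

# Synthetic points generated from V = 3 * n**0.5 (made-up values, not real data)
lengths = [10, 100, 1000, 10000]
vocab = [3 * n ** 0.5 for n in lengths]
K, beta = fit_heaps(lengths, vocab)
```

On data that truly follow a power law, the fit recovers K and β. Systematic curvature of a real curve away from the fitted line in log-log space is one way a poor Heaps' law fit would show up.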

Purpose of the Study:

  • To investigate the relationship between vocabulary size and sequence length in CDSs.
  • To develop a novel mathematical model for CDS vocabulary organization.
  • To compare CDS nucleotide organization with natural languages and random sequences.

Main Methods:

  • Defining "words" as non-overlapping nucleotide strings of a fixed length within CDSs.
  • Applying a hyperbolic tangent (tanh) function to model vocabulary saturation.
  • Formulating a new mathematical model termed the "equation of word organization".
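
The word definition above can be sketched in a few lines. With four nucleotides there are at most 4^k distinct words of length k, so the vocabulary must level off, which is one reason a saturating curve is a plausible choice. The summary does not state the actual "equation of word organization", so the form `v_max * tanh(n / scale)` below is only a hypothetical illustration of a tanh-based saturating curve. The function names and the toy sequence are assumptions as well.

```python
import math

def vocabulary_growth(seq, k):
    """Return [(words_read, distinct_words)] for non-overlapping k-letter words."""
    seen = set()
    curve = []
    for i in range(0, len(seq) - k + 1, k):   # step by k: words do not overlap
        seen.add(seq[i:i + k])
        curve.append((i // k + 1, len(seen)))
    return curve

def tanh_saturation(n, v_max, scale):
    """Hypothetical saturating form (an assumption, not the paper's equation)."""
    return v_max * math.tanh(n / scale)

seq = "ATGGCCATGGCCTAA"                 # toy sequence, not real data
curve = vocabulary_growth(seq, 3)       # words: ATG, GCC, ATG, GCC, TAA
ceiling = 4 ** 3                        # at most 64 distinct 3-letter words
```

Here `curve` is `[(1, 1), (2, 2), (3, 2), (4, 2), (5, 3)]`: the vocabulary grows only when a new word appears, and `tanh_saturation(n, ceiling, scale)` approaches `ceiling` as n grows.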

Main Results:

  • Heaps' law was found to be inadequate for modeling vocabulary in CDSs.
  • The developed "equation of word organization" demonstrates unique nucleotide organization in CDSs.
  • CDSs show distinct patterns compared to both natural languages and random sequences.
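
A comparison against random sequences could be set up roughly as follows. The summary does not say how the random sequences were generated, so the uniform, independent nucleotide model here is an assumption. The helper names and the toy sequence are also hypothetical.

```python
import random

def distinct_words(seq, k):
    """Number of distinct non-overlapping k-letter words in seq."""
    return len({seq[i:i + k] for i in range(0, len(seq) - k + 1, k)})

def random_sequence(length, seed=0):
    """Uniform i.i.d. nucleotides: a deliberately simple null model (assumption)."""
    rng = random.Random(seed)            # seeded for reproducibility
    return "".join(rng.choice("ACGT") for _ in range(length))

k = 3
cds = "ATGGCCATGGCCTAA"                  # toy stand-in, not real data
baseline = random_sequence(len(cds), seed=0)
observed = distinct_words(cds, k)        # distinct words in the toy sequence
expected_random = distinct_words(baseline, k)
```

In practice the comparison would be run over many sequence lengths and many random draws. A null model that preserves base composition, such as shuffling the CDS itself, is another option.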

Conclusions:

  • Coding DNA sequences possess a unique and structured nucleotide organization.
  • This organization is distinct from natural human languages and random sequences.
  • The findings suggest a specific biological function underlying CDS nucleotide patterns.