Updated: Jun 6, 2026

Predicting genome-wide redundancy using machine learning.

Huang-Wen Chen¹, Sunayan Bandyopadhyay, Dennis E Shasha

¹Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY 10003, USA.

BMC Evolutionary Biology

November 20, 2010

Summary

This summary is machine-generated.

Machine learning classifiers identify genetic redundancy among gene family members and predict that most Arabidopsis genes have a functionally redundant paralog. The approach can make genetic analysis more efficient and offers insight into how duplicated genes evolve.

Area of Science:

Genomics
Evolutionary Biology
Bioinformatics

Background:

Gene duplication can create genetic redundancy, which masks the function of mutated genes in genetic analyses.
Improved methods for detecting genetic redundancy can boost reverse genetics efficiency.
Understanding gene duplication's evolutionary impact is crucial.

Purpose of the Study:

To apply machine learning for classifying gene family members into redundant and non-redundant pairs.
To enhance sensitivity in identifying genetic redundancy in model organisms like Arabidopsis thaliana.
To provide insights into the evolutionary outcomes of gene duplication.

Main Methods:

Trained machine learning classifiers, including Support Vector Machines, that combine multiple gene pair attributes.
Compared multi-attribute classifiers against single-attribute classifiers such as BLAST E-values and expression correlation.
Employed withholding analysis, in which gene pairs are held out of training, to assess classifier precision in predicting genetic redundancy.
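
The comparison described in the methods above can be illustrated with a minimal sketch. It is not the authors' pipeline: the gene pairs are synthetic, the two attributes are stand-ins for sequence similarity and expression correlation, and the classifier is a small linear SVM trained with Pegasos-style subgradient descent. All function names, the 70/30 split, and the parameter values are assumptions made for illustration only.

```python
import random

def make_pairs(n, rng):
    """Synthetic gene pairs as (attributes, label); label +1 = redundant, -1 = not.
    The two attributes are made-up stand-ins for sequence similarity and
    expression correlation; none of these values come from the study."""
    pairs = []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else -1
        center = 0.7 if label == 1 else 0.4
        pairs.append(((rng.gauss(center, 0.15), rng.gauss(center, 0.15)), label))
    return pairs

def train_linear_svm(data, rng, epochs=30, lam=0.01):
    """Pegasos-style stochastic subgradient descent on the regularized hinge loss.
    A constant 1.0 is appended to each attribute vector to act as a bias term."""
    data = [(x + (1.0,), y) for x, y in data]
    w = [0.0] * len(data[0][0])
    t = 0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            w = [wi * (1.0 - eta * lam) for wi in w]  # shrink (regularization)
            if margin < 1.0:  # hinge loss is active for this example
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

def svm_predict(w, x):
    score = sum(wi * xi for wi, xi in zip(w, x + (1.0,)))
    return 1 if score > 0 else -1

def best_threshold(data, index):
    """Single-attribute baseline: pick the cutoff with the highest training accuracy."""
    candidates = sorted(x[index] for x, _ in data)
    def n_correct(cut):
        return sum((1 if x[index] > cut else -1) == y for x, y in data)
    return max(candidates, key=n_correct)

def precision(predictions, labels):
    """Fraction of 'redundant' calls that are truly redundant: TP / (TP + FP)."""
    called = [y for p, y in zip(predictions, labels) if p == 1]
    if not called:
        return 0.0
    return sum(y == 1 for y in called) / len(called)

def withholding_analysis(seed=0):
    rng = random.Random(seed)
    pairs = make_pairs(500, rng)
    train, held_out = pairs[:350], pairs[350:]  # withhold 30% of labeled pairs
    labels = [y for _, y in held_out]

    w = train_linear_svm(train, rng)
    svm_calls = [svm_predict(w, x) for x, _ in held_out]

    cut = best_threshold(train, index=0)  # baseline uses the first attribute only
    single_calls = [1 if x[0] > cut else -1 for x, _ in held_out]

    return precision(svm_calls, labels), precision(single_calls, labels)

if __name__ == "__main__":
    svm_p, single_p = withholding_analysis()
    print(f"multi-attribute SVM precision on withheld pairs: {svm_p:.2f}")
    print(f"single-attribute precision on withheld pairs:    {single_p:.2f}")
```

Because the data are simulated, the printed precisions say nothing about Arabidopsis; the sketch only shows the shape of the evaluation, in which both classifiers are fit on the training pairs and scored solely on pairs they never saw.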

Main Results:

Combined machine learning attributes significantly improved redundancy prediction accuracy over single traits.
Support Vector Machines demonstrated twofold higher precision than single-attribute classifiers, and the majority of their redundant calls were correctly labeled.
Machine learning predicts that approximately half of Arabidopsis genes exhibit redundancy with 1-3 other family members.
A substantial portion of predicted redundant gene pairs represent ancient duplications (Ks > 1), indicating stable redundancy over evolutionary time.
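
As a small illustration of the last result, the share of predicted redundant pairs that are ancient duplications could be computed by filtering on Ks, the synonymous substitution rate, with the Ks > 1 cutoff stated above. The record layout, field names, gene labels, and Ks values below are hypothetical and are not taken from the study's data.

```python
def fraction_ancient(pairs, ks_cutoff=1.0):
    """Share of predicted redundant pairs whose Ks exceeds the cutoff."""
    redundant = [p for p in pairs if p["predicted_redundant"]]
    if not redundant:
        return 0.0
    ancient = [p for p in redundant if p["ks"] > ks_cutoff]
    return len(ancient) / len(redundant)

# Made-up example records; gene labels and Ks values are placeholders.
example_pairs = [
    {"pair": ("geneA", "geneB"), "ks": 0.4, "predicted_redundant": True},
    {"pair": ("geneC", "geneD"), "ks": 1.6, "predicted_redundant": True},
    {"pair": ("geneE", "geneF"), "ks": 2.3, "predicted_redundant": True},
    {"pair": ("geneG", "geneH"), "ks": 1.9, "predicted_redundant": False},
]

if __name__ == "__main__":
    share = fraction_ancient(example_pairs)
    print(f"{share:.2f} of predicted redundant pairs have Ks > 1")
```

Pairs that are not predicted redundant are excluded before the cutoff is applied, so the denominator matches the "predicted redundant gene pairs" named in the result.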

Conclusions:

Machine learning predicts that most genes possess a functionally redundant paralog but are redundant with relatively few members of their gene family.
The generated predictions and gene pair attributes for Arabidopsis serve as a valuable resource for genetics and genome evolution research.
The developed machine learning techniques are applicable to other species for studying gene redundancy and evolution.