Updated: Nov 7, 2025

Transcriptome prediction performance across machine learning models and diverse ancestries.

Paul C Okoro¹, Ryan Schubert², Xiuqing Guo³

¹Program in Bioinformatics, Loyola University Chicago, Chicago, IL, USA.

May 3, 2021

Summary

This summary is machine-generated.

This study explored machine learning (ML) for transcriptome prediction across diverse ancestries. Non-linear models like random forest (RF) showed promise for imputation, potentially improving complex trait mapping in global populations.

Area of Science:

Genetics and Genomics
Computational Biology
Biostatistics

Background:

Transcriptome prediction methods like PrediXcan and FUSION are vital for complex trait mapping.
Current models predominantly rely on linear methods (e.g., elastic net) trained on European-ancestry populations.
This limits imputation performance across diverse global ancestries.

Purpose of the Study:

To optimize transcriptome imputation performance across global populations.
To evaluate non-linear machine learning (ML) algorithms against traditional linear models.
To assess the impact of ancestry matching on prediction accuracy.

Main Methods:

Trained transcriptome prediction models using genotype and transcriptome data from the Multi-Ethnic Study of Atherosclerosis (MESA) across African, Hispanic, and European ancestries.
Employed a linear algorithm (elastic net, EN) and non-linear ML algorithms (random forest, RF; support vector regression, SVR; k-nearest neighbors, KNN).
Tested model performance in African-ancestry data from the Modeling the Epidemiologic Transition Study (METS) and applied the models to a high-density lipoprotein (HDL) phenotype.
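
To make the train-and-evaluate workflow in the methods above concrete, here is a minimal sketch of one of the non-linear approaches named there (k-nearest neighbors regression). It is scored by the Pearson correlation between predicted and observed expression under leave-one-out cross-validation. Everything in it is an assumption for illustration: the toy genotype dosages and expression values are made up, the function names are hypothetical, and the choice of k = 2, Euclidean distance, leave-one-out splitting, and Pearson correlation as the score is not taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant vector
    return sxy / math.sqrt(sxx * syy)


def knn_predict(train_X, train_y, query, k):
    """Predict expression as the mean of the k nearest training samples."""
    by_distance = sorted(
        (math.dist(row, query), i) for i, row in enumerate(train_X)
    )
    nearest = [train_y[i] for _, i in by_distance[:k]]
    return sum(nearest) / len(nearest)


def loo_cv_predictions(X, y, k):
    """Leave-one-out cross-validation: predict each sample from the rest."""
    predictions = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        predictions.append(knn_predict(train_X, train_y, X[i], k))
    return predictions


# Hypothetical toy data: rows are individuals, columns are SNP dosages (0/1/2)
# near one gene; `expression` is that gene's measured expression level.
genotypes = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 2],
    [1, 2, 2],
    [2, 2, 0],
    [2, 1, 0],
    [0, 0, 2],
    [2, 2, 1],
]
expression = [0.1, 0.4, 1.0, 1.3, 1.6, 1.2, 0.2, 1.8]

predicted = loo_cv_predictions(genotypes, expression, k=2)
r = pearson_r(predicted, expression)
print(f"cross-validated Pearson r for the toy gene: {r:.3f}")
```

In a real comparison, the same cross-validated score would be computed per gene for each algorithm (EN, RF, SVR, KNN), which is what allows a per-gene choice of the best-performing model.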

Main Results:

Prediction performance was highest when training and testing populations shared similar ancestries.
While EN generally outperformed other ML models, RF showed superior performance for specific genes, especially between disparate ancestries.
RF imputation demonstrated potential robustness and reduced variability across global populations.
Integrating RF models into PrediXcan identified potential gene associations for HDL phenotypes missed by EN models.
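
The association step behind the last result can be sketched as follows: in a PrediXcan-style analysis, predicted expression for a gene is tested for association with the phenotype, and different prediction models can yield different predicted values and hence different association signals. The snippet below is only an illustration under stated assumptions. It uses ordinary least squares with a single predictor, the predicted expression and HDL values are made-up placeholders, and the function name `simple_linear_regression` is hypothetical rather than part of any PrediXcan release.

```python
import math


def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x; return (slope, intercept, t_statistic)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and standard error of the slope (n - 2 d.f.).
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    if se_slope > 0:
        t_statistic = slope / se_slope
    else:
        t_statistic = math.copysign(math.inf, slope)  # perfect fit
    return slope, intercept, t_statistic


# Hypothetical placeholder values for one gene in eight individuals:
# expression predicted by two different models, plus a measured HDL phenotype.
predicted_expression = {
    "EN": [0.20, 0.35, 0.90, 1.10, 1.40, 1.25, 0.30, 1.60],
    "RF": [0.25, 0.50, 0.80, 1.30, 1.50, 1.00, 0.20, 1.70],
}
hdl = [42.0, 45.0, 51.0, 55.0, 60.0, 52.0, 41.0, 63.0]

results = {}
for model_name, expr in predicted_expression.items():
    slope, intercept, t_stat = simple_linear_regression(expr, hdl)
    results[model_name] = (slope, t_stat)
    print(f"{model_name}: slope={slope:.2f}, t={t_stat:.2f}")
```

A full analysis would repeat this test for every gene with a usable prediction model, adjust for covariates, and correct for multiple testing. None of that is shown here.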

Conclusions:

Non-linear ML models, particularly RF, offer complementary imputation strategies for transcriptome prediction.
Diversifying training populations and incorporating various ML models can enhance the discovery of genes associated with complex traits.
Improved imputation across diverse ancestries is crucial for advancing complex trait mapping in global health research.