



AIME: Autoencoder-based integrative multi-omics data embedding that allows for confounder adjustments.

Tianwei Yu1,2,3

  • 1School of Data Science, The Chinese University of Hong Kong-Shenzhen, Shenzhen, Guangdong, China.

PLoS Computational Biology | January 26, 2022
PubMed
Summary
This summary is machine-generated.

Researchers developed a new deep learning tool called AIME to combine different types of biological data. This method identifies hidden patterns while removing unwanted background noise from clinical variables. It helps scientists discover meaningful connections between distinct molecular datasets more effectively than older linear techniques.

Keywords:
deep learning, genomics, dimensionality reduction, bioinformatics software



Area of Science:

  • Computational biology, with a focus on autoencoder-based integrative multi-omics data embedding
  • Bioinformatics and statistical genomics

Background:

Biological research often struggles to identify complex, nonlinear relationships between diverse molecular datasets, and traditional linear approaches frequently fail to capture the dependencies inherent in large-scale genomic data. Prior research has shown that deep learning architectures excel at uncovering hidden patterns in high-dimensional data. However, existing models often cannot account for external clinical variables that might bias results, and how to embed multi-omics information while controlling for such confounding factors had remained unresolved. This study addresses these limitations by introducing a deep learning method that performs integration and confounder adjustment simultaneously.

Purpose Of The Study:

The primary aim of this study is to introduce a deep learning framework for integrating diverse omics datasets. The author seeks to extract data representations that reflect the complex, often nonlinear relationships between different molecular data types, which traditional linear methods tend to miss. A second aim is to provide a mechanism for adjusting for clinical confounding factors, so that biological signal is separated from variation driven by those factors. The framework combines these capabilities with feature ranking in a single, accessible deep learning architecture.

Main Methods:

The research team designed a deep learning architecture to integrate disparate molecular datasets. The approach uses an autoencoder framework to compress high-dimensional information into a lower-dimensional latent space, which allows the model to capture nonlinear dependencies that linear techniques often overlook. The investigators incorporated a mechanism to include clinical variables directly in the training process for confounder adjustment. They used the Keras library with a TensorFlow backend to construct and run the neural network models. The evaluation involved testing the method on both simulated data and real-world microRNA-gene expression datasets. The simulations allowed the team to evaluate the accuracy of feature extraction under controlled conditions. Finally, the authors provided an open-source software package to support the reproducibility of their computational workflow.
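To make the architecture described above more concrete, here is a minimal sketch of an encoder-decoder that maps one data matrix to another through a low-dimensional latent layer. It is written in plain NumPy rather than Keras, and it is not the published AIME code. Feeding the confounders into the decoder next to the latent layer is an assumption made for illustration, since the summary only says that clinical variables are included in training. The function name `train_embedding`, the layer sizes, and the simulated data are all hypothetical.

```python
import numpy as np

def train_embedding(X, Y, C, latent_dim=2, lr=0.05, epochs=500, seed=0):
    """Fit X -> tanh latent Z -> [Z, C] -> Y by full-batch gradient descent.

    X: (n, p) input data, Y: (n, q) output data, C: (n, k) confounders.
    Returns encoder weights, decoder weights, and the per-epoch MSE losses.
    """
    rng = np.random.default_rng(seed)
    p, q, k = X.shape[1], Y.shape[1], C.shape[1]
    W_enc = rng.normal(scale=0.1, size=(p, latent_dim))
    W_dec = rng.normal(scale=0.1, size=(latent_dim + k, q))
    losses = []
    for _ in range(epochs):
        Z = np.tanh(X @ W_enc)               # nonlinear low-dimensional embedding
        H = np.hstack([Z, C])                # assumption: confounders join at the latent layer
        err = H @ W_dec - Y                  # prediction error on the output data
        losses.append(float(np.mean(err ** 2)))
        grad_out = 2.0 * err / err.size      # derivative of the MSE w.r.t. predictions
        grad_dec = H.T @ grad_out
        grad_Z = grad_out @ W_dec[:latent_dim].T
        grad_enc = X.T @ (grad_Z * (1.0 - Z ** 2))  # chain rule through tanh
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec, losses

# Simulated stand-ins for two omics matrices and one clinical variable.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
C = rng.normal(size=(200, 1))
Y = np.sin(X[:, :4]) + 0.5 * C + 0.1 * rng.normal(size=(200, 4))

W_enc, W_dec, losses = train_embedding(X, Y, C)
embedding = np.tanh(X @ W_enc)               # one 2-D point per sample
print(f"MSE: first epoch {losses[0]:.3f}, last epoch {losses[-1]:.3f}")
```

Because the decoder sees the confounder directly, the latent layer has less incentive to encode it. A real implementation would add hidden layers, regularization, and held-out validation.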

Main Results:

The study reports that the proposed deep learning method effectively extracts the major contributing features linking disparate data types. In simulation tests, the model identified significant associations more effectively than baseline approaches. The researchers show that the system successfully excludes the influence of clinical confounders in real-world applications. By applying the tool to microRNA and gene expression data, they uncovered novel information that appears biologically plausible. The model provides a systematic way to rank features by their relative contributions to the integrative embedding. It also identifies specific pairs of related features across the two molecular datasets. These results indicate that the architecture handles complex, nonlinear data structures more effectively than traditional linear methods, and that its performance remains stable when clinical variables are included in the input.

Conclusions:

The authors demonstrate that their deep learning framework successfully extracts meaningful representations from complex multi-omics datasets. This approach effectively removes unwanted clinical influences, ensuring that identified patterns reflect true biological signals. The researchers propose that their system outperforms traditional linear methods when dealing with nonlinear data dependencies. By ranking feature contributions, the model provides clear insights into the drivers of observed molecular relationships. The study confirms that the tool identifies biologically plausible information in real-world microRNA and gene expression datasets. The authors suggest that their method offers a flexible alternative for integrative analyses across various biological contexts. Their findings indicate that the software package provides a reliable resource for researchers managing confounded genomic data. The team concludes that their architecture represents a significant advancement in the computational analysis of multifaceted molecular information.

AIME uses a deep learning architecture to map distinct data types into a shared, low-dimensional space. This mechanism allows the model to isolate nonlinear relationships while simultaneously filtering out the effects of specified clinical confounding variables.

The tool relies on the Keras library with a TensorFlow backend to implement its autoencoder structure. This software configuration handles the matrix operations required for nonlinear dimensionality reduction and feature contribution ranking.

The authors state that adjusting for confounders is necessary to prevent bias from external clinical factors. Without this correction, the model might incorrectly attribute variations in the data to biological interactions rather than to technical or environmental influences.
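The following simulation illustrates why such a correction matters. Two variables that have no direct link look strongly correlated because both depend on a shared confounder, and the correlation vanishes once the confounder's effect is removed. The adjustment used here is ordinary least-squares residualization, a simple linear stand-in chosen for illustration and not the mechanism AIME uses. The variable names (`age`, `mirna`, `gene`) and effect sizes are hypothetical.

```python
import numpy as np

def residualize(values, confounders):
    """Remove the least-squares linear effect of confounders from values."""
    design = np.column_stack([np.ones(len(confounders)), confounders])
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return values - design @ coef

rng = np.random.default_rng(0)
age = rng.normal(size=500)                            # hypothetical confounder
mirna = 0.8 * age + rng.normal(scale=0.5, size=500)   # driven by age only
gene = 0.8 * age + rng.normal(scale=0.5, size=500)    # driven by age only

raw_r = np.corrcoef(mirna, gene)[0, 1]
adj_r = np.corrcoef(residualize(mirna, age), residualize(gene, age))[0, 1]
print(f"raw correlation = {raw_r:.2f}, adjusted correlation = {adj_r:.2f}")
```

The raw correlation is large even though neither variable influences the other, which is exactly the kind of spurious association that confounder adjustment is meant to prevent.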

The model uses the real-world microRNA and gene expression datasets to demonstrate its ability to extract biologically relevant information. While the first dataset contains clinical confounders, the second serves as a control to validate that the method functions accurately in both noisy and clean environments.

The researchers measure the effectiveness of their approach by its capacity to rank features based on their contributions. They compare this performance against traditional linear methods, noting that their deep learning setup excels at capturing complex, nonlinear associations between the two data types.
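One generic way to rank input features by their contribution to an embedding is permutation importance: shuffle one feature at a time and measure how far the embedding moves. The sketch below uses that technique as an illustration. The summary does not say how AIME computes its rankings, so this should not be read as the authors' procedure. The toy encoder `embed` and its weights are hypothetical.

```python
import numpy as np

def rank_features(embed, X, seed=0):
    """Score each column of X by how much shuffling it changes the embedding."""
    rng = np.random.default_rng(seed)
    baseline = embed(X)
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        X_perm = X.copy()
        X_perm[:, j] = rng.permutation(X[:, j])   # break only feature j
        scores[j] = np.mean((embed(X_perm) - baseline) ** 2)
    order = np.argsort(scores)[::-1]              # most influential first
    return order, scores

# Hypothetical fixed encoder: feature 0 matters most, feature 2 is ignored.
W = np.array([[2.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0],
              [0.1, 0.0]])

def embed(X):
    return np.tanh(X @ W)

X = np.random.default_rng(1).normal(size=(300, 4))
order, scores = rank_features(embed, X)
print("features ranked by contribution:", order.tolist())
```

Because the procedure only needs a function from inputs to embeddings, it works the same way for a trained nonlinear network as for this toy encoder.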

The authors propose that their software package provides a robust solution for integrative analyses. They claim that this tool allows scientists to uncover novel, biologically plausible insights that might remain hidden when using standard linear correlation techniques.