



Gaussian Embedding for Large-scale Gene Set Analysis.

Sheng Wang (1), Emily R Flynn (2), Russ B Altman (1, 2, 3)

  • (1) Department of Bioengineering, Stanford University, Stanford, CA 94305, USA.

Nature Machine Intelligence | September 24, 2020
Summary
This summary is machine-generated.

Set2Gaussian embeds gene sets using protein-protein interaction networks, improving biological discovery and identifying novel cancer subnetworks. This computational method enhances machine learning compatibility for gene set analysis.


Area of Science:

  • Computational biology
  • Bioinformatics
  • Machine learning in biology

Background:

  • High-throughput biological data has led to a proliferation of gene sets (e.g., protein complexes, signaling pathways).
  • Extracting biological insights from gene sets requires computational representations that machine learning models can take as input.
  • Existing methods often represent gene sets as single points, potentially losing network information.
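
The limitation of single-point representations can be illustrated with a small sketch. The 2-D gene embeddings below are made-up values, and `point_embedding` and `spread` are hypothetical helper names, not part of any published tool. The sketch shows only that two gene sets with very different dispersion can collapse to the same point when each set is summarized by its mean alone.

```python
import numpy as np

# Hypothetical 2-D gene embeddings (values invented for illustration only).
tight_set = np.array([[0.9, 1.0], [1.0, 1.1], [1.1, 0.9]])      # genes close together
diffuse_set = np.array([[-1.0, 1.0], [1.0, 3.0], [3.0, -1.0]])  # genes far apart

def point_embedding(genes):
    """Summarize a gene set as a single point: the mean of its member vectors."""
    return genes.mean(axis=0)

def spread(genes):
    """Per-dimension variance of the member vectors, which a single point discards."""
    return genes.var(axis=0)

tight_point = point_embedding(tight_set)
diffuse_point = point_embedding(diffuse_set)

# Both sets map to the same point, although their spreads differ greatly.
print("point embeddings:", tight_point, diffuse_point)
print("spreads:", spread(tight_set), spread(diffuse_set))
```

A representation that also carries a measure of spread, such as a Gaussian with a mean and a covariance, can keep the two sets apart.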

Purpose of the Study:

  • To develop a novel network-based gene set embedding approach compatible with machine learning models.
  • To represent gene sets as multivariate Gaussian distributions reflecting gene proximity in protein-protein interaction networks.
  • To demonstrate the utility of this approach for biological discovery and clinical applications.

Main Methods:

  • Introduced Set2Gaussian, a network-based embedding method.
  • Represented gene sets as multivariate Gaussian distributions based on gene proximity in protein-protein interaction networks.
  • Evaluated Set2Gaussian on gene set member identification, tumor stratification, and gene set enrichment analysis.
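
As a rough illustration of treating a gene set as a Gaussian distribution, the sketch below fits a diagonal-covariance Gaussian to the embedding vectors of a set's members and scores candidate genes by log-density. This is not the Set2Gaussian training procedure, which the summary does not describe. The gene embeddings are simulated, the diagonal covariance is a simplifying assumption, and `fit_diagonal_gaussian` and `log_density` are hypothetical names.

```python
import numpy as np

def fit_diagonal_gaussian(member_vectors, eps=1e-6):
    """Estimate a mean and per-dimension variance from member gene vectors.

    eps keeps variances strictly positive (an assumption of this sketch).
    """
    mean = member_vectors.mean(axis=0)
    var = member_vectors.var(axis=0) + eps
    return mean, var

def log_density(x, mean, var):
    """Log-density of x under a Gaussian with diagonal covariance."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)

# Simulated embeddings for 20 member genes (stand-ins for network-derived vectors).
rng = np.random.default_rng(0)
members = rng.normal(loc=[2.0, -1.0], scale=0.3, size=(20, 2))
mean, var = fit_diagonal_gaussian(members)

# Two candidate genes: one near the set, one far from it.
near = np.array([2.1, -0.9])
far = np.array([-3.0, 4.0])
scores = log_density(np.stack([near, far]), mean, var)
print("log-density (near, far):", scores)
```

Ranking candidates by such a score is one simple way a distribution-valued embedding could support gene set member identification: the nearby candidate receives a higher log-density than the distant one.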

Main Results:

  • Set2Gaussian improved gene set member identification.
  • The method accurately stratified tumors.
  • It identified concise gene sets for enrichment analysis.
  • Set2Gaussian uncovered a novel NEFM-centric clinical prognostic and predictive subnetwork in sarcoma.
  • This subnetwork was validated in independent cohorts.

Conclusions:

  • Set2Gaussian provides a powerful new method for embedding gene sets into machine learning frameworks.
  • The approach enhances biological discovery by leveraging network information.
  • Set2Gaussian has potential clinical applications, as demonstrated by the identification and validation of a novel sarcoma subnetwork.