GEMINI: a computationally-efficient search engine for large gene expression datasets.

Timothy DeFreitas1,2, Hachem Saddiki3, Patrick Flaherty4,5,6

  • 1Computer Science Department, Worcester Polytechnic Institute, 100 Institute Rd, Worcester, 01609, USA. tmdefreitas@wpi.edu.

BMC Bioinformatics | February 26, 2016
Summary
This summary is machine-generated.

GEMINI is a novel search engine that enables users to query massive genomic databases using a genomic profile. This tool significantly speeds up the identification of similar genomic profiles, improving data accessibility.

Area of Science:

  • Genomics
  • Bioinformatics
  • Computational Biology

Background:

  • Advancements in low-cost DNA sequencing generate vast genomic datasets.
  • Current genomic data retrieval relies on text-based metadata queries, which are poorly matched to data that consist of genomic profiles.
  • Existing search methods limit efficient exploration of large-scale genomic resources.

Purpose of the Study:

  • To develop a fast and efficient search engine for large genomic databases.
  • To enable similarity searches using genomic profiles as queries.
  • To overcome the limitations of text-based metadata searches in genomics.

Main Methods:

  • Developed GEMINI, a search engine utilizing a genomic profile as a query.
  • Implemented a nearest-neighbor search algorithm with a vantage-point tree.
  • Tested GEMINI on gene expression data from The Cancer Genome Atlas (TCGA).
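The nearest-neighbor search with a vantage-point tree named above can be sketched as follows. This is a minimal illustration, not GEMINI's implementation: the summary does not state the distance metric or how vantage points are chosen, so Euclidean distance, random vantage-point selection, and the names `build_vp_tree` and `nearest` are all assumptions, and the random 5-dimensional "profiles" stand in for real expression data.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point = point      # vantage point stored at this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree of points closer than radius
        self.outside = outside  # subtree of points at or beyond radius


def build_vp_tree(points, rng):
    """Recursively split points by distance to a randomly chosen vantage point."""
    if not points:
        return None
    points = list(points)
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0.0, None, None)
    dists = [euclidean(vantage, p) for p in points]
    radius = sorted(dists)[len(dists) // 2]  # median distance
    inside = [p for p, d in zip(points, dists) if d < radius]
    outside = [p for p, d in zip(points, dists) if d >= radius]
    return VPNode(vantage, radius,
                  build_vp_tree(inside, rng), build_vp_tree(outside, rng))


def nearest(node, query, best=None):
    """Return (distance, point) of the nearest neighbor of query.

    The triangle inequality lets the search skip a subtree whenever no point
    in it can be closer than the best match found so far.
    """
    if node is None:
        return best
    d = euclidean(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    if d < node.radius:
        best = nearest(node.inside, query, best)
        if d + best[0] >= node.radius:   # outside region may still hold a closer point
            best = nearest(node.outside, query, best)
    else:
        best = nearest(node.outside, query, best)
        if d - best[0] <= node.radius:   # inside region may still hold a closer point
            best = nearest(node.inside, query, best)
    return best


# Toy data: 200 random 5-dimensional profiles (hypothetical, not TCGA data).
rng = random.Random(0)
profiles = [tuple(rng.random() for _ in range(5)) for _ in range(200)]
tree = build_vp_tree(profiles, rng)
query = tuple(rng.random() for _ in range(5))
dist, match = nearest(tree, query)
```

The query is itself a profile rather than a text label, which is the point of the approach: the search returns the stored sample closest to the query under the chosen metric, regardless of how that sample is annotated.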

Main Results:

  • GEMINI achieves a query time that scales logarithmically with database size.
  • Nearest-neighbor search in a database of 10^5 samples took 0.05 seconds.
  • Brute-force search took 0.6 seconds in the same setting, making GEMINI about 12 times faster.
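To make the scaling claim concrete, the short sketch below compares how the work of a linear scan and of an idealized balanced tree search grows with database size, and computes the speedup implied by the two reported timings. The function names are hypothetical, and the tree depth is a best-case figure: a real vantage-point tree search usually visits more than one root-to-leaf path, especially for high-dimensional data.

```python
import math


def linear_scan_comparisons(n):
    """A brute-force search compares the query with every stored sample."""
    return n


def idealized_tree_depth(n):
    """Depth of a perfectly balanced binary tree over n samples (best case)."""
    return math.ceil(math.log2(n))


# Growth of work with database size: linear versus logarithmic.
for n in (10**3, 10**4, 10**5):
    print(n, linear_scan_comparisons(n), idealized_tree_depth(n))

# Speedup implied by the timings reported in the summary (seconds).
brute_force_seconds = 0.6
gemini_seconds = 0.05
reported_speedup = brute_force_seconds / gemini_seconds
print(f"reported speedup: about {reported_speedup:.0f}x")
```

The measured speedup is far smaller than the ratio between the linear and idealized logarithmic counts, which is unsurprising given imperfect pruning and fixed per-query overhead; the practical takeaway from the summary is the logarithmic trend as the database grows.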

Conclusions:

  • GEMINI provides a fast search capability for massive genomic databases.
  • Enables identification of similar genomic profiles irrespective of sample labels or metadata.
  • Enhances accessibility and utility of large-scale genomic data for research.