



GenAp: a distributed SQL interface for genomic data.

Christos Kozanitis1, David A Patterson2

  • 1Department of Computer Science, University of California Berkeley, Soda Hall, Berkeley, 94720, California, USA. kozanitis@eecs.berkeley.edu.

BMC Bioinformatics | February 6, 2016
Summary
This summary is machine-generated.

Researchers modified Spark SQL, a distributed SQL execution engine, to query large genomic datasets efficiently. Joins keyed on genomic intervals run more than 50x faster than a brute-force approach, which simplifies genomic data analysis for genetic disease research.


Area of Science:

  • Genomics
  • Bioinformatics
  • Computational Biology

Background:

  • Advancements in genome sequencing technology have led to massive data generation for genetic disease research.
  • Handling and accessing terabytes of genomic data presents a significant challenge for researchers.
  • Efficient data retrieval is crucial for understanding disease mechanisms and developing targeted therapies.

Purpose of the Study:

  • To address the challenge of providing on-demand access to large-scale genomic data.
  • To improve the efficiency of querying genomic intervals within distributed databases.
  • To reduce the complexity and development effort required for genomic data analysis.
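To illustrate why a SQL interface can cut development effort, the sketch below expresses an interval-overlap query declaratively. It uses Python's built-in `sqlite3` module, a single-node engine, purely as a stand-in; it is not Spark SQL and not GenAp's interface. The table names (`reads`, `genes`), the column names, and the helper `count_reads_per_gene` are hypothetical, and intervals are assumed to be half-open `[start_pos, end_pos)`.

```python
import sqlite3


def count_reads_per_gene(reads, genes):
    """Count reads overlapping each gene with one declarative SQL join.

    reads: iterable of (chrom, start_pos, end_pos)
    genes: iterable of (name, chrom, start_pos, end_pos)
    Intervals are assumed half-open: [start_pos, end_pos).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reads (chrom TEXT, start_pos INTEGER, end_pos INTEGER)"
    )
    conn.execute(
        "CREATE TABLE genes (name TEXT, chrom TEXT, "
        "start_pos INTEGER, end_pos INTEGER)"
    )
    conn.executemany("INSERT INTO reads VALUES (?, ?, ?)", reads)
    conn.executemany("INSERT INTO genes VALUES (?, ?, ?, ?)", genes)

    # The overlap condition is the whole "algorithm" the analyst writes:
    # same chromosome, and each interval starts before the other ends.
    rows = conn.execute(
        """
        SELECT g.name, COUNT(*) AS n_reads
        FROM reads AS r
        JOIN genes AS g
          ON r.chrom = g.chrom
         AND r.start_pos < g.end_pos
         AND g.start_pos < r.end_pos
        GROUP BY g.name
        ORDER BY g.name
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    reads = [("chr1", 100, 150), ("chr1", 140, 190), ("chr2", 5, 55)]
    genes = [("geneA", "chr1", 120, 160), ("geneB", "chr2", 100, 200)]
    print(count_reads_per_gene(reads, genes))
```

The query states only which rows should be paired; how the join is executed is left to the engine, which is the part a distributed system can optimize.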

Main Methods:

  • Modification of Spark SQL, a distributed SQL execution engine.
  • Implementation of efficient join operations using genomic intervals as keys.
  • Benchmarking performance against existing brute-force and distributed approaches.
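The difference between a brute-force interval join and a more efficient one can be sketched on a single machine. The example below is not the implementation described in the study; it is a generic sort-and-sweep overlap join, shown next to the all-pairs baseline, under the assumptions that intervals are half-open `[start, end)` tuples of `(chromosome, start, end)` and are non-empty. The function names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open


def brute_force_join(left: List[Interval], right: List[Interval]):
    """Baseline: compare every left interval against every right interval."""
    return sorted(
        (a, b)
        for a in left
        for b in right
        if a[0] == b[0] and a[1] < b[2] and b[1] < a[2]
    )


def sweep_join(left: List[Interval], right: List[Interval]):
    """Sort both inputs by (chromosome, start) and sweep once.

    Each interval is paired only with still-open intervals from the
    other side, so distant intervals are never compared.
    """
    # Tag each interval with its side: 0 = left, 1 = right.
    events = sorted(
        [(iv[0], iv[1], 0, iv) for iv in left]
        + [(iv[0], iv[1], 1, iv) for iv in right]
    )
    active = ([], [])  # open intervals per side on the current chromosome
    current_chrom = None
    pairs = []
    for chrom, start, side, iv in events:
        if chrom != current_chrom:
            active = ([], [])  # a new chromosome resets the sweep
            current_chrom = chrom
        other = 1 - side
        # Drop other-side intervals that ended at or before this start.
        active[other][:] = [o for o in active[other] if o[2] > start]
        # Every remaining other-side interval started earlier (or at the
        # same position) and is still open, so it overlaps `iv`.
        for o in active[other]:
            pairs.append((iv, o) if side == 0 else (o, iv))
        active[side].append(iv)
    return sorted(pairs)


if __name__ == "__main__":
    left = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 10, 20)]
    right = [("chr1", 180, 190), ("chr1", 300, 400),
             ("chr2", 15, 30), ("chr3", 1, 5)]
    assert sweep_join(left, right) == brute_force_join(left, right)
    for pair in sweep_join(left, right):
        print(pair)
```

Both functions return the same pairs; the sweep simply avoids most of the comparisons the baseline performs when intervals are short relative to the chromosome.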

Main Results:

  • The modified Spark SQL achieves over 50x speedup for genomic interval joins compared to brute-force methods.
  • The system demonstrates an 8x performance improvement over similar distributed implementations.
  • Querying the data requires an order of magnitude less software code.

Conclusions:

  • Modified Spark SQL offers a highly efficient solution for querying large genomic datasets.
  • This advancement can accelerate genetic disease research by simplifying data access and analysis.
  • The approach has the potential to replace current practices for genomic data retrieval and analysis.