Predicting and Comparing the Performance of Array Management Libraries.

Donghe Kang1, Oliver Rübel2, Suren Byna2

  • 1The Ohio State University.

Proceedings. IPDPS (Conference) | October 11, 2021
Summary
This summary is machine-generated.

New analytical models predict the end-to-end performance of I/O-bound scientific applications that use array libraries such as HDF5 and Zarr. The models capture costs beyond raw I/O, including data transformation and caching, which helps developers choose storage layouts that scale.

Area of Science:

  • High-Performance Computing (HPC)
  • Data Storage and Management
  • Scientific Data Analysis

Background:

  • Many scientific applications are I/O-bound, necessitating performance optimization for scalability.
  • Existing I/O performance models are insufficient for applications using array libraries (e.g., HDF5, Zarr) due to complex data access patterns and storage models.
  • I/O optimization is often ad hoc, performed by domain scientists without deep expertise in the storage hierarchy.
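
To illustrate why access patterns complicate I/O modeling for chunked array storage, the sketch below counts how many chunks a rectangular selection touches and how much more data is fetched than requested. It is an illustration only, not taken from the paper: the function names are hypothetical, every chunk is assumed to be read whole, and edge chunks are assumed to be full size.

```python
from math import prod


def chunks_touched(start, stop, chunk_shape):
    """Count the chunks intersected by the half-open selection [start, stop).

    Assumes stop > start in every dimension (hypothetical helper).
    """
    counts = []
    for lo, hi, c in zip(start, stop, chunk_shape):
        first = lo // c          # index of the first chunk touched
        last = (hi - 1) // c     # index of the last chunk touched
        counts.append(last - first + 1)
    return prod(counts)


def read_amplification(start, stop, chunk_shape):
    """Ratio of elements fetched (whole chunks) to elements requested."""
    requested = prod(hi - lo for lo, hi in zip(start, stop))
    fetched = chunks_touched(start, stop, chunk_shape) * prod(chunk_shape)
    return fetched / requested


if __name__ == "__main__":
    # A thin 10 x 1000 slab over 100 x 100 chunks touches 10 chunks
    # and fetches 10 times more elements than it needs.
    print(chunks_touched((0, 0), (10, 1000), (100, 100)))
    print(read_amplification((0, 0), (10, 1000), (100, 100)))
```

The same selection can therefore cost very different amounts depending on the chunk layout, which is one reason a model based on the requested byte count alone can mislead.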

Purpose of the Study:

  • To present an analytical cost model for predicting end-to-end execution time of applications using array management libraries.
  • To evaluate the model's accuracy in capturing performance beyond raw I/O, including data transformation and caching.
  • To compare the performance of different storage libraries, specifically HDF5 and Zarr, using the developed model.

Main Methods:

  • Developed an analytical cost model incorporating I/O time, memory copy costs, and software cache benefits.
  • Focused on HDF5 (single-file storage) and Zarr (multi-file storage) as representative array libraries.
  • Evaluated the model on real-world applications in neuroscience and plasma physics across three HPC clusters.
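
As a rough sketch of how the three cost components listed above might combine, the code below adds an I/O term (latency plus bandwidth, reduced by a cache hit ratio) to a memory-copy term. This is not the authors' model: the class names, fields, and the additive formula are assumptions made for illustration, and real parameters would come from benchmarking the target system.

```python
from dataclasses import dataclass


@dataclass
class AccessProfile:
    """Hypothetical description of one read workload."""
    bytes_read: int          # bytes fetched from storage (whole chunks)
    num_requests: int        # number of I/O requests issued
    cache_hit_ratio: float   # fraction of I/O served by a software cache


@dataclass
class SystemParams:
    """Hypothetical machine parameters, assumed to be measured separately."""
    io_bandwidth: float      # bytes per second from storage
    io_latency: float        # seconds per I/O request
    memcpy_bandwidth: float  # bytes per second for in-memory copies


def predict_time(p: AccessProfile, s: SystemParams) -> float:
    """Estimate end-to-end time as I/O time (after cache hits) plus copy time."""
    miss_ratio = 1.0 - p.cache_hit_ratio
    io_time = miss_ratio * (p.num_requests * s.io_latency
                            + p.bytes_read / s.io_bandwidth)
    copy_time = p.bytes_read / s.memcpy_bandwidth
    return io_time + copy_time


def faster_library(profiles: dict[str, AccessProfile], s: SystemParams) -> str:
    """Return the name of the library with the lowest predicted time."""
    return min(profiles, key=lambda name: predict_time(profiles[name], s))
```

Keeping the copy term separate from the I/O term reflects the point that non-I/O costs can dominate; an I/O-only model would correspond to dropping `copy_time` and the cache factor.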

Main Results:

  • I/O can account for as little as 10% of total execution time, highlighting the inadequacy of I/O-only models.
  • The new model correctly predicts the faster storage library (HDF5 vs. Zarr) in 94% of cases.
  • By comparison, a state-of-the-art I/O model achieves 70% accuracy on the same prediction.
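
One plausible way to score "predicts the faster library" is to compare, for each test case, the library with the lowest predicted time against the library with the lowest measured time. The sketch below assumes that definition; the function names are hypothetical, and the summary does not state how the reported percentages were computed.

```python
def winner(times: dict[str, float]) -> str:
    """Return the library name with the smallest time."""
    return min(times, key=times.get)


def selection_accuracy(predicted: list[dict[str, float]],
                       measured: list[dict[str, float]]) -> float:
    """Fraction of cases where the predicted winner matches the measured winner."""
    if not predicted or len(predicted) != len(measured):
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(winner(p) == winner(m) for p, m in zip(predicted, measured))
    return hits / len(predicted)
```

Under this metric a model can be wrong about absolute times and still score well, as long as it ranks the two libraries correctly.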

Conclusions:

  • End-to-end performance modeling, including data layout transformations and caching, is crucial for applications using array libraries.
  • The analytical model predicts application performance more accurately than traditional I/O models.
  • This work offers a valuable tool for optimizing data storage and access in scientific computing, improving application scalability and developer efficiency.