SeqArray-a storage-efficient high-performance data format for WGS variant calls.

Xiuwen Zheng¹, Stephanie M Gogarten¹, Michael Lawrence²

  • ¹Department of Biostatistics, University of Washington, Seattle, WA, USA.

Bioinformatics (Oxford, England) | March 24, 2017
Summary
This summary is machine-generated.

A new data format, SeqArray, offers efficient storage and rapid analysis of whole-genome sequencing (WGS) variant data. It significantly reduces file sizes and speeds up genotype retrieval and allele frequency calculations compared to VCF and BCF formats.

Area of Science:

  • Genomics
  • Bioinformatics
  • Computational Biology

Background:

  • Whole-genome sequencing (WGS) generates vast amounts of data, necessitating efficient storage and analysis formats.
  • Files in existing formats such as the Variant Call Format (VCF) are large and can be slow to query.
  • There is a need for a flexible, high-performance data format for WGS variant analysis.

Purpose of the Study:

  • To introduce SeqArray, a novel array-oriented data format for WGS variant data.
  • To provide enhanced compression and high-performance data access capabilities.
  • To offer a flexible programming environment for WGS variant analysis within R/Bioconductor.

Main Methods:

  • Implementation of the SeqArray format within the R/Bioconductor package 'SeqArray'.
  • Development of array-oriented storage for variant genotypes and annotations.
  • Integration of high compression options and parallel computing for data access.
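
The idea of array-oriented, compressed genotype storage can be illustrated with a small sketch. This is not the SeqArray on-disk layout or its API; it is a toy Python example using only the standard library, and the data, the dimension constants, and the helper names `compress_block` and `read_variant` are all hypothetical. It stores allele indices as one flat variant-major array, compresses the array as a single block, and reads one variant back by offset.

```python
import zlib
from array import array

# Hypothetical toy data: 4 variants x 3 samples, diploid.
# Each entry is an allele index (0 = reference, 1 = first alternate).
N_VARIANTS, N_SAMPLES, PLOIDY = 4, 3, 2
genotypes = array("B", [
    0, 0,  0, 1,  1, 1,   # variant 0
    0, 0,  0, 0,  0, 1,   # variant 1
    1, 1,  1, 1,  0, 1,   # variant 2
    0, 0,  0, 0,  0, 0,   # variant 3
])


def compress_block(values: array) -> bytes:
    """Compress the flat genotype array as one contiguous block."""
    return zlib.compress(values.tobytes(), level=9)


def read_variant(blob: bytes, variant: int) -> list[int]:
    """Decompress the block and slice out the alleles of one variant."""
    raw = zlib.decompress(blob)
    width = N_SAMPLES * PLOIDY  # number of alleles stored per variant
    return list(raw[variant * width:(variant + 1) * width])


blob = compress_block(genotypes)
variant_2 = read_variant(blob, 2)
print(variant_2)  # [1, 1, 1, 1, 0, 1]
```

On an input this small, compression saves nothing; the point is only that a fixed-width array lets a reader locate a variant by arithmetic on its index rather than by parsing text lines, which is the general motivation for array-oriented formats.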

Main Results:

  • For the 1000 Genomes Phase 3 dataset, the SeqArray file is 2.6 GB, compared with 14.0 GB for VCF and 12.3 GB for BCF.
  • Genotype reading in SeqArray is 2-3 times faster than htslib with BCF files.
  • Allele frequency calculation in SeqArray is over 5 times faster than PLINK v1.9 and over 16 times faster than vcftools.
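
To put the reported file sizes in relative terms, the short Python calculation below divides the VCF and BCF sizes by the SeqArray size. The three numbers are the ones stated above; the dictionary and function names are only illustrative.

```python
# File sizes for the 1000 Genomes Phase 3 dataset, in GB, as reported above.
sizes_gb = {"VCF": 14.0, "BCF": 12.3, "SeqArray": 2.6}


def ratio_to_seqarray(fmt: str) -> float:
    """Return how many times larger the given format is than SeqArray."""
    return sizes_gb[fmt] / sizes_gb["SeqArray"]


for fmt in ("VCF", "BCF"):
    print(f"{fmt} is about {ratio_to_seqarray(fmt):.1f}x the SeqArray size")
# VCF is about 5.4x the SeqArray size
# BCF is about 4.7x the SeqArray size
```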

Conclusions:

  • SeqArray offers a highly compressed and efficient alternative for storing and analyzing WGS variant data.
  • The package provides substantial performance improvements for genotype retrieval and frequency calculations.
  • SeqArray, integrated with R/Bioconductor, delivers a powerful environment for large-scale genomic data analysis.
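
For readers unfamiliar with the allele frequency calculation that the benchmarks refer to, here is a minimal Python sketch of the quantity being computed. It is not the SeqArray implementation and does not reflect how SeqArray, PLINK, or vcftools perform the calculation internally; the function name `ref_allele_freq`, the use of `None` for a missing call, and the sample data are assumptions made for illustration.

```python
from typing import Optional, Sequence


def ref_allele_freq(alleles: Sequence[Optional[int]]) -> Optional[float]:
    """Fraction of called alleles at one variant that are the reference (0).

    Missing calls (None, a hypothetical marker) are excluded.
    Returns None when no allele was called at all.
    """
    called = [a for a in alleles if a is not None]
    if not called:
        return None
    return sum(1 for a in called if a == 0) / len(called)


# Hypothetical data: each row holds the alleles of 3 diploid samples.
variants = [
    [0, 0, 0, 1, 1, 1],                    # 3 of 6 alleles are reference
    [0, 0, 0, 0, None, 1],                 # 4 of 5 called alleles are reference
    [None, None, None, None, None, None],  # nothing called
]
freqs = [ref_allele_freq(v) for v in variants]
print(freqs)  # [0.5, 0.8, None]
```

Because each variant is processed independently, this kind of calculation splits naturally across workers, which is consistent with the parallel data access mentioned in the methods.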