Bioinformatics (Oxford, England)·2026

needLR: Long-read structural variant annotation with population-scale frequency estimation.

Jonas A Gustafson¹,², Jiadong Lin³, Evan E Eichler³,⁴,⁵

¹Department of Molecular and Cellular Biology, University of Washington, Seattle, WA 98195, USA.

December 19, 2025

Summary

This summary is machine-generated.

We developed needLR, a tool to filter and prioritize structural variants (SVs) from long-read sequencing. It uses population allele-frequency data to reduce the number of candidate SVs while retaining known pathogenic ones.

Area of Science:

Genomics
Bioinformatics
Computational Biology

Background:

Long-read sequencing technologies enable comprehensive structural variant (SV) detection.
Accurate annotation and prioritization of pathogenic SVs are crucial for diagnosing genetic disorders.
Existing tools often struggle with the scale and complexity of SV data from long reads.

Purpose of the Study:

To introduce needLR, a novel computational tool for annotating and prioritizing structural variants (SVs) identified through long-read sequencing.
To leverage population allele frequencies, genomic context, and gene-phenotype associations for SV filtering.
To enhance the efficiency of identifying candidate pathogenic SVs in clinical and research settings.

Main Methods:

Developed needLR, an SV annotation tool integrating population allele frequencies, genomic annotations, and gene-phenotype data.
Utilized population data from 500 healthy individuals for variant frequency analysis.
Evaluated needLR performance on nine test cases containing known pathogenic SVs.
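The filtering idea described in the methods above can be illustrated with a minimal Python sketch. This is not needLR's implementation: the `StructuralVariant` record, the `prioritize` function, the gene names, and the 1% frequency cutoff are all hypothetical and chosen only to show how a population allele frequency and a gene overlap might be combined into one filter.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class StructuralVariant:
    """Hypothetical SV record; field names are illustrative, not needLR's."""
    sv_id: str
    sv_type: str                      # e.g. "DEL", "INS", "DUP"
    population_af: Optional[float]    # None if no frequency could be assigned
    genes: Tuple[str, ...] = ()       # genes the SV overlaps, if any


def prioritize(svs, max_af=0.01):
    """Keep SVs that are rare (or unannotated) in the population and genic.

    max_af is an assumed cutoff, not a value taken from the paper.
    SVs without an assigned frequency are kept so nothing is silently lost.
    """
    kept = []
    for sv in svs:
        is_rare = sv.population_af is None or sv.population_af <= max_af
        if is_rare and sv.genes:
            kept.append(sv)
    return kept


# Toy input: made-up calls, not data from the study.
calls = [
    StructuralVariant("sv1", "DEL", 0.42, ("GENE_A",)),   # common: dropped
    StructuralVariant("sv2", "INS", 0.0, ("GENE_B",)),    # novel, genic: kept
    StructuralVariant("sv3", "DEL", 0.0, ()),             # novel, intergenic: dropped
    StructuralVariant("sv4", "DUP", 0.005, ("GENE_C",)),  # rare, genic: kept
]

candidates = prioritize(calls)
print([sv.sv_id for sv in candidates])  # ['sv2', 'sv4']
```

Keeping SVs with no assigned frequency is a conservative choice for this sketch; a real pipeline might instead flag them for separate review.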

Main Results:

needLR successfully assigned allele frequencies to over 97.5% of all detected SVs across test cases.
The tool reduced the number of novel genic SVs to an average of 121 per case.
All known pathogenic SVs were successfully retained within the filtered set, demonstrating high sensitivity.
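The two kinds of figures reported above (the share of SVs that received an allele frequency, and whether known pathogenic SVs survived filtering) could be computed along the following lines. This is a sketch under assumptions: the function names and the toy inputs are hypothetical and do not reproduce the study's data or code.

```python
def annotation_rate(allele_frequencies):
    """Fraction of SVs with an assigned allele frequency (None = unassigned)."""
    if not allele_frequencies:
        return 0.0
    assigned = sum(af is not None for af in allele_frequencies)
    return assigned / len(allele_frequencies)


def pathogenic_retained(filtered_ids, known_pathogenic_ids):
    """True if every known pathogenic SV is still in the filtered set."""
    return set(known_pathogenic_ids) <= set(filtered_ids)


# Toy values for illustration only.
rate = annotation_rate([0.1, None, 0.0, 0.3])
print(rate)                                           # 0.75
print(pathogenic_retained(["sv2", "sv4"], ["sv4"]))   # True
print(pathogenic_retained(["sv2"], ["sv4"]))          # False
```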

Conclusions:

needLR provides an effective method for filtering and prioritizing structural variants from long-read sequencing data.
The tool's ability to integrate diverse data sources improves the accuracy and efficiency of identifying disease-causing SVs.
needLR represents a valuable advancement for genomic analysis in both research and clinical diagnostics.