An efficient classification algorithm for NGS data based on text similarity.

Xiangyu Liao1, Xingyu Liao2, Wufei Zhu3

  • 1Department of Oncology, The First College of Clinical Medical Science, China Three Gorges University, Yichang Central People's Hospital, Yichang, Hubei 443000, P.R. China.

Genetics Research | September 18, 2018
Summary
This summary is machine-generated.

High-performance short sequence classification (HSC) efficiently clusters next-generation sequencing data. This method significantly reduces memory usage and processing time, making large datasets manageable.

Keywords:
NGS sequences data, clustering, text similarity

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • High-throughput sequencing generates massive datasets, challenging current data processing and storage capabilities.
  • Data redundancy in sequencing data increases computational demands and can introduce noise.
  • Efficient data clustering is vital for managing large sequencing datasets and improving downstream analysis.

Purpose of the Study:

  • To introduce a high-performance short sequence classification algorithm (HSC) for next-generation sequencing (NGS) data.
  • To address the computational challenges posed by the increasing volume of sequencing data.
  • To improve the efficiency of data processing and reduce resource consumption in genomic analyses.

Main Methods:

  • The HSC algorithm uses k-mers and a hash table to organize the data efficiently.
  • It identifies and merges duplicated and reverse complementary k-mers to create unique sets.
  • Text similarity is employed to iteratively merge read clusters until convergence.
  • The final clusters represent groups of similar short sequences.
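
The steps above can be illustrated with a minimal Python sketch. It is not the authors' HSC implementation. It assumes reads contain only the uppercase letters A, C, G, and T, and it uses Jaccard similarity over k-mer sets as a stand-in for the "text similarity" measure, which this summary does not specify. The function names, the value of k, and the similarity threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict

# Translation table for complementing DNA bases (assumes uppercase A/C/G/T only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmers(read, k):
    """Return the set of canonical k-mers of a read.

    A k-mer and its reverse complement collapse to the lexicographically
    smaller of the two, so duplicates and reverse complements are merged.
    """
    kmers = set()
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed stand-in for text similarity)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_reads(reads, k=4, threshold=0.5):
    """Merge read clusters one pair at a time until no pair is similar enough."""
    # Each cluster is (list of read indices, set of canonical k-mers).
    clusters = [([i], canonical_kmers(r, k)) for i, r in enumerate(reads)]
    merged = True
    while merged:
        merged = False
        # Hash table: canonical k-mer -> positions of clusters containing it.
        index = defaultdict(list)
        for pos, (_, kmers) in enumerate(clusters):
            for kmer in kmers:
                index[kmer].append(pos)
        for pos, (members, kmers) in enumerate(clusters):
            # Only clusters sharing at least one k-mer are worth comparing.
            candidates = {c for kmer in kmers for c in index[kmer] if c > pos}
            for c in sorted(candidates):
                if jaccard(kmers, clusters[c][1]) >= threshold:
                    clusters[pos] = (members + clusters[c][0],
                                     kmers | clusters[c][1])
                    del clusters[c]
                    merged = True
                    break
            if merged:
                break  # Rebuild the index after every merge.
    return [sorted(members) for members, _ in clusters]


reads = ["ACGTACGTAC", "GTACGTACGT", "TTTTTTTTTT"]
print(cluster_reads(reads))  # → [[0, 1], [2]]
```

In this toy input the second read is the reverse complement of the first, so both map to the same canonical k-mer set and end up in one cluster, while the third read shares no k-mers with them. Rebuilding the index after every merge keeps the sketch short; a tool designed for 100 million reads would need a far more economical update scheme.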

Main Results:

  • HSC successfully clustered 100 million short reads in under 2 hours.
  • The algorithm demonstrated significant reductions in memory consumption compared to existing methods.
  • HSC proved to be substantially faster than other tools for handling large-scale sequencing data.
  • When used as a preprocessing step, HSC greatly reduced memory and time for assembly tools.

Conclusions:

  • HSC offers a highly efficient solution for clustering large volumes of NGS data.
  • The algorithm effectively reduces computational resource requirements for sequence data processing.
  • HSC enhances the performance of downstream genomic assembly tasks, leading to improved assembly metrics.