Artificial-cell-type aware cell-type classification in CITE-seq.

Qiuyu Lian1,2, Hongyi Xin2,3, Jianzhu Ma4

  • 1MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing 100084, China.

Bioinformatics (Oxford, England) | July 14, 2020
PubMed
Summary
This summary is machine-generated.

CITE-seq combines protein and gene expression data to identify cell types, but errors called multiplets often create fake cell types that confuse analysis. Researchers developed a new tool called CITE-sort that recognizes and separates these fake clusters from real biological cells. This method uses a tree-based structure to make the results easier for scientists to interpret and label correctly. Testing shows it outperforms standard clustering techniques in accuracy and reliability.

Keywords:
multiplet detection, surface marker clustering, bioinformatics tools, cellular phenotyping

Area of Science:

  • Computational biology and CITE-sort methodology within bioinformatics
  • Single-cell genomics and transcriptomics research

Background:

Single-cell sequencing technologies have revolutionized our understanding of cellular heterogeneity across diverse biological systems. However, the integration of surface protein markers with transcriptomic data introduces unique computational challenges for researchers. Multiplets, which occur when multiple cells are encapsulated in a single droplet, frequently distort downstream analysis. These events generate artificial cell types that obscure the identification of genuine biological populations. No prior work had fully resolved the impact of these artifacts on automated phenotyping workflows. Existing clustering algorithms often struggle to distinguish between true biological signals and these technical noise sources. That uncertainty drove the development of specialized approaches to improve data resolution. This paper addresses the persistent difficulty of accurately classifying cells in the presence of such technical interference.

Purpose Of The Study:

The aim of this research is to introduce a new clustering method designed specifically for multi-modal sequencing data. The authors seek to address the persistent problem of artificial cell types caused by multiplet formation. These technical artifacts frequently complicate automated surface-marker phenotyping in large-scale experiments. The researchers intend to create a tool that remains robust in the presence of these common artifacts. They want to improve the accuracy of identifying real biological populations within complex datasets. The study also focuses on making clustering results more interpretable through a structured, hierarchical approach. By organizing the process into a binary tree, they hope to make verification easier for end users. This work is motivated by the need for more reliable computational pipelines in single-cell genomics.

Main Methods:

The investigators developed a novel clustering framework designed to handle the specific challenges of multi-modal sequencing data. Their approach involves a systematic evaluation using both empirical and synthetic datasets to ensure robustness. They implemented a binary tree structure to organize the hierarchical partitioning of cellular populations. This design choice allows for the clear separation of technical noise from authentic biological signals. The team compared their results against standard clustering techniques to establish performance benchmarks. They utilized surface marker information to guide the partitioning of droplets into distinct groups. The software architecture focuses on identifying and isolating artificial cell types during the initial processing stages. This computational strategy provides a structured pathway for verifying the resulting clusters against known biological markers.
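The hierarchical partitioning described above can be illustrated with a small sketch. This is not the authors' implementation: the split rule (a one-dimensional cut that maximizes explained variance, accepted only when the group means differ by at least `min_gap`), the function names, and the toy marker values are all assumptions chosen to keep the example short.

```python
from statistics import mean

def best_split(values):
    """Best two-way cut of 1-D values: (threshold, gap between means, variance explained)."""
    xs = sorted(values)
    grand = mean(xs)
    total = sum((x - grand) ** 2 for x in xs)
    best = (None, 0.0, 0.0)
    if total == 0:
        return best
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue  # a cut must fall between two distinct values
        left, right = xs[:i], xs[i:]
        ml, mr = mean(left), mean(right)
        within = sum((x - ml) ** 2 for x in left) + sum((x - mr) ** 2 for x in right)
        score = 1 - within / total
        if score > best[2]:
            best = ((xs[i - 1] + xs[i]) / 2, mr - ml, score)
    return best

def build_tree(cells, markers, min_gap=1.0, min_size=4):
    """Recursively bisect droplets on the marker that separates them best."""
    if len(cells) >= min_size:
        splits = [(m,) + best_split([c[m] for c in cells]) for m in markers]
        usable = [s for s in splits if s[1] is not None and s[2] >= min_gap]
        if usable:
            marker, threshold, _, _ = max(usable, key=lambda s: s[3])
            low = [c for c in cells if c[marker] <= threshold]
            high = [c for c in cells if c[marker] > threshold]
            return {"marker": marker, "threshold": threshold,
                    "low": build_tree(low, markers, min_gap, min_size),
                    "high": build_tree(high, markers, min_gap, min_size)}
    return {"leaf": cells}  # no convincing split: treat as one population

def leaves(tree):
    """Collect the leaf populations of the tree."""
    if "leaf" in tree:
        return [tree["leaf"]]
    return leaves(tree["low"]) + leaves(tree["high"])

def make(cd3, cd19, n):
    # Hypothetical droplets with a small deterministic jitter around each level.
    return [{"CD3": cd3 + 0.1 * (i % 3), "CD19": cd19 + 0.1 * (i % 2)} for i in range(n)]

# Toy data: T-like cells, B-like cells, and droplets positive for both markers.
cells = make(5.0, 0.0, 6) + make(0.0, 5.0, 6) + make(5.0, 5.0, 4)
tree = build_tree(cells, ["CD3", "CD19"])
print(sorted(len(group) for group in leaves(tree)))
```

Each internal node records the marker and threshold used, so a reader can trace why a droplet landed in a given leaf, which is the interpretability benefit the summary attributes to the tree.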

Main Results:

The study demonstrates that the proposed method achieves the highest clustering performance across all tested datasets. It consistently separates multiplet-induced artificial clusters from true biological populations with high reliability. The framework successfully identifies genuine cell types while effectively mitigating the impact of technical artifacts. Quantitative comparisons show that this approach outperforms canonical clustering methods in accuracy and stability. The binary tree organization provides a clear, interpretable representation of the data structure for users. By isolating artificial droplets, the tool prevents the misclassification of cells that often occurs with standard algorithms. The results confirm that the method is robust to the presence of multiplets in complex sequencing samples. These findings highlight the effectiveness of integrating artificial-cell-type awareness into the clustering workflow.

Conclusions:

The authors propose that their method offers superior performance compared to traditional clustering approaches for single-cell datasets. Their analysis demonstrates that the tool reliably separates technical artifacts from genuine biological populations. It also suggests that organizing clusters in a binary tree makes complex clustering outputs easier for researchers to interpret. The study indicates that the approach simplifies the annotation process by letting users bring domain knowledge into the workflow. The findings imply that robust handling of multiplets is necessary for accurate surface marker phenotyping. The researchers conclude that their framework consistently identifies biological cell types while minimizing the influence of artificial clusters. This work provides a practical solution for improving the quality of single-cell protein and transcriptomic integration. The evidence supports the utility of this method for refining automated cell classification in high-throughput sequencing experiments.

The researchers propose a binary tree-based clustering approach that explicitly models and separates multiplet-induced artificial clusters from genuine biological populations. This mechanism ensures that technical noise does not contaminate the identification of true cellular phenotypes during the analysis process.

The tool utilizes surface marker protein data alongside mRNA sequencing information to perform its clustering. This dual-modality input allows the algorithm to leverage both protein and transcriptomic signatures for more precise cell identification compared to using gene expression alone.

A binary tree structure organizes the clustering process, which makes the results easier to interpret. This hierarchical arrangement allows users to verify cluster assignments and apply domain knowledge more effectively than with flat, non-hierarchical clustering techniques.

The algorithm treats multiplet-induced droplets as a distinct category of artificial cell types. By identifying these specific droplet clusters, the software prevents them from being misclassified as real biological cells, thereby increasing the overall accuracy of the final cell-type annotation.
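One way to picture why a multiplet cluster looks like a new cell type is to compare marker signatures: a droplet holding two different cells tends to be positive for both cells' markers. The sketch below flags clusters whose positive-marker set is exactly the union of two other clusters' sets. It is a simplified heuristic for illustration, not the CITE-sort procedure; the `cutoff`, the cluster names, and the expression values are hypothetical.

```python
from itertools import combinations

def positive_markers(profile, cutoff=1.0):
    """Set of markers whose mean expression in a cluster is at or above the cutoff."""
    return frozenset(m for m, value in profile.items() if value >= cutoff)

def flag_artificial(cluster_profiles, cutoff=1.0):
    """Map each suspicious cluster to the pair of clusters whose signatures it combines."""
    sigs = {name: positive_markers(p, cutoff) for name, p in cluster_profiles.items()}
    flagged = {}
    for name, sig in sigs.items():
        others = [n for n in sigs if n != name]
        for a, b in combinations(others, 2):
            # Both parents must be non-empty proper subsets that together cover the signature.
            if sigs[a] and sigs[b] and sigs[a] < sig and sigs[b] < sig \
                    and sigs[a] | sigs[b] == sig:
                flagged[name] = (a, b)
                break
    return flagged

# Hypothetical mean surface-marker levels per cluster (arbitrary units).
profiles = {
    "T":    {"CD3": 5.0, "CD4": 4.5, "CD19": 0.1, "CD14": 0.0},
    "B":    {"CD3": 0.1, "CD4": 0.0, "CD19": 5.0, "CD14": 0.1},
    "Mono": {"CD3": 0.0, "CD4": 0.2, "CD19": 0.0, "CD14": 5.0},
    "C4":   {"CD3": 5.0, "CD4": 4.0, "CD19": 5.0, "CD14": 0.0},
}
flagged = flag_artificial(profiles)
print(flagged)
```

A genuine population that co-expresses both marker sets would be flagged by this rule as well, so a signature match alone is only a hint and would need further evidence before a cluster is discarded.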

The researchers measured clustering performance by comparing their method against canonical algorithms using both real and simulated datasets. This benchmarking demonstrated that their approach achieved superior results in separating true biological populations from technical artifacts across all tested scenarios.
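The summary does not say which score was used for the comparison. As an assumed stand-in, the sketch below computes the adjusted Rand index (ARI), a common measure of agreement between a clustering and reference labels, on hypothetical toy labels. It contrasts a clustering that keeps doublets in their own group with one that merges them into a real population.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(truth, predicted):
    """Adjusted Rand index between two labelings of the same items."""
    total = comb(len(truth), 2)
    if total == 0:
        return 1.0  # fewer than two items: nothing to disagree about
    # Pairs of items that share a group in both labelings, in truth, and in predicted.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, predicted)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(predicted).values())
    expected = sum_rows * sum_cols / total
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0
    return (sum_cells - expected) / (max_index - expected)

# Hypothetical ground truth for eight droplets, including two doublets.
truth = ["T", "T", "T", "B", "B", "B", "doublet", "doublet"]
aware = [0, 0, 0, 1, 1, 1, 2, 2]   # doublets kept as their own cluster
naive = [0, 0, 0, 1, 1, 1, 0, 0]   # doublets absorbed into the T cluster

ari_aware = adjusted_rand_index(truth, aware)
ari_naive = adjusted_rand_index(truth, naive)
print(round(ari_aware, 3), round(ari_naive, 3))
```

In this toy case the clustering that isolates the doublets scores higher, which mirrors the kind of comparison described above without reproducing the paper's actual benchmark.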

The authors claim that their method simplifies cell-type annotation by providing a transparent, interpretable output. They suggest that this clarity allows scientists to apply their existing biological expertise to verify and label clusters with greater confidence than with standard black-box clustering tools.