Model-Based Clustering With Data Correction For Removing Artifacts In Gene Expression Data.

William Chad Young1, Adrian E Raftery1, Ka Yee Yeung2

  • 1Department of Statistics, University of Washington, Box 354322, Seattle, WA 98195.

The Annals of Applied Statistics | February 12, 2019
Summary
This summary is machine-generated.

A new method, model-based clustering with data correction (MCDC), identifies and corrects artifacts in gene expression data from the NIH Library of Integrated Network-based Cellular Signatures (LINCS) dataset. This improves data accuracy and downstream analysis results.

Keywords:
Gene regulatory network; LINCS; MCDC; Model-based clustering

Area of Science:

  • Genomics
  • Bioinformatics
  • Systems Biology

Background:

  • The NIH Library of Integrated Network-based Cellular Signatures (LINCS) dataset contains extensive gene expression data.
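  • Expression is measured with Luminex bead technology; because fewer bead colors are available than landmark genes, genes that share a color must be separated by deconvolution.
  • When the raw data are inadequate for reliable deconvolution, the processed data can contain artifacts: flipped expression levels, identical expression values, and erroneous clusters.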

Purpose of the Study:

  • To introduce a novel method, model-based clustering with data correction (MCDC), for identifying and rectifying gene expression data artifacts.
  • To enhance the reliability and accuracy of the LINCS gene expression dataset.

Main Methods:

  • Development of the model-based clustering with data correction (MCDC) algorithm.
  • Simultaneous identification and correction of three specific data artifacts: flipped expression levels, identical expression values, and erroneous clusters.
  • Validation of MCDC against external benchmarks and subsequent analyses.
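
To make the "flipped expression levels" artifact concrete, here is a minimal sketch of detecting and undoing swaps for one pair of genes. It is not the MCDC algorithm: MCDC is model-based, whereas this illustration uses a hard rule, assumes most measurements are unflipped, and assumes the two genes have clearly different typical levels. The function name `correct_flips` and the toy values are hypothetical.

```python
from statistics import median


def correct_flips(pairs, n_iter=10):
    """Swap (x, y) whenever the swapped point lies closer to the robust
    centre (coordinate-wise median) of all points.

    Hypothetical helper for illustration only; a hard-assignment stand-in
    for a model-based treatment of flipped gene pairs.
    """
    pts = [tuple(p) for p in pairs]
    flipped = [False] * len(pts)
    for _ in range(n_iter):
        cx = median(p[0] for p in pts)
        cy = median(p[1] for p in pts)
        changed = False
        for i, (x, y) in enumerate(pts):
            d_keep = (x - cx) ** 2 + (y - cy) ** 2
            d_swap = (y - cx) ** 2 + (x - cy) ** 2
            if d_swap < d_keep:
                pts[i] = (y, x)
                flipped[i] = not flipped[i]
                changed = True
        if not changed:  # centre and assignments are stable
            break
    return pts, flipped


# Toy values (made up): gene A near 10, gene B near 4, two samples swapped.
data = [(10.1, 4.0), (9.8, 4.2), (10.3, 3.9), (4.1, 9.9),
        (10.0, 4.1), (3.8, 10.2), (9.9, 3.7)]
corrected, flipped = correct_flips(data)
print(flipped)  # [False, False, False, True, False, True, False]
```

A mixture-model formulation would replace the hard distance comparison with per-point probabilities of being flipped, and could also account for the other two artifact types, which this sketch ignores.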

Main Results:

  • MCDC effectively identifies and corrects artifacts in gene expression data.
  • The corrected data show improved agreement with external baseline measurements.
  • Downstream analyses run on MCDC-corrected data give improved results.
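
One simple way to quantify "agreement with external baseline measurements" is an error metric such as root-mean-square error (RMSE) between the processed values and the baseline. The summary does not say which metric the authors used, so the choice of RMSE, the helper name `rmse`, and the toy numbers below are assumptions made for illustration.

```python
from math import sqrt


def rmse(values, baseline):
    """Root-mean-square error between two equal-length sequences.

    Hypothetical helper; the metric is an assumption, not the paper's.
    """
    if len(values) != len(baseline) or not values:
        raise ValueError("sequences must be non-empty and of equal length")
    return sqrt(sum((v - b) ** 2 for v, b in zip(values, baseline)) / len(values))


# Toy values (made up): the first two entries are flipped in the raw data.
baseline = [10.0, 4.0, 7.0]
raw = [4.0, 10.0, 7.0]
corrected = [10.0, 4.0, 7.0]

print(rmse(raw, baseline) > rmse(corrected, baseline))  # True
```

A lower value after correction indicates closer agreement with the baseline under this metric.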

Conclusions:

  • MCDC offers a robust solution for improving the quality of LINCS gene expression data.
  • The method enhances data reliability, leading to more accurate biological insights.
  • MCDC is a valuable tool for researchers working with complex gene expression datasets.