


Updated: Feb 12, 2026


Conditional Data Synthesis Augmentation.

Xinyu Tian¹, Xiaotong Shen²

  • ¹ Xinyu Tian is with the School of Statistics, University of Minnesota, MN, 55455 USA.

Journal of the American Statistical Association | February 11, 2026
Summary
This summary is machine-generated.

Conditional Data Synthesis Augmentation (CoDSA) creates realistic synthetic data to address underrepresentation in machine learning datasets. This novel framework improves model performance and generalization across various data types.

Keywords:
Data augmentation, Generative models, Multimodality, Natural language processing, Transfer learning, Unstructured data


Area of Science:

  • Machine Learning
  • Data Science
  • Artificial Intelligence

Background:

  • Reliable machine learning and statistical analysis require diverse, well-distributed training data.
  • Real-world datasets often suffer from limited size and underrepresentation of key subpopulations, leading to biased predictions and reduced model performance, especially in supervised learning tasks like classification.

Purpose of the Study:

  • To introduce Conditional Data Synthesis Augmentation (CoDSA), a novel framework designed to synthesize high-fidelity data for enhancing model performance.
  • To address challenges posed by limited and imbalanced datasets in multimodal domains (tabular, textual, image).

Main Methods:

  • Leveraging generative models, specifically diffusion models, to synthesize high-fidelity data.
  • Fine-tuning pre-trained generative models via transfer learning to enhance synthetic data realism and increase sample density in sparse areas.
  • Developing a theoretical framework to quantify statistical accuracy improvements based on synthetic sample volume and targeted region allocation.
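
The idea of steering synthetic samples toward sparse regions can be illustrated with a small sketch. This is not the CoDSA implementation: the summary does not give the allocation rule or the generator, so both are assumptions here. The hypothetical `allocate_synthetic_budget` splits a fixed budget in proportion to each group's deficit relative to the largest group, and a per-group Gaussian (`fit_conditional_gaussian`, `synthesize`) stands in for the fine-tuned diffusion model as the conditional generator.

```python
import random
from collections import Counter


def allocate_synthetic_budget(labels, budget):
    """Split `budget` synthetic samples across groups, favoring sparse ones.

    Assumed rule (not taken from the paper): a group's share is proportional
    to its deficit relative to the largest group.
    """
    counts = Counter(labels)
    largest = max(counts.values())
    deficits = {g: largest - n for g, n in counts.items()}
    total_deficit = sum(deficits.values())
    if total_deficit == 0:
        # Already balanced: spread the budget as evenly as possible.
        base, rem = divmod(budget, len(counts))
        return {g: base + (1 if i < rem else 0)
                for i, g in enumerate(sorted(counts))}
    # Largest-remainder rounding so the allocation sums exactly to `budget`.
    raw = {g: budget * d / total_deficit for g, d in deficits.items()}
    alloc = {g: int(r) for g, r in raw.items()}
    leftover = budget - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda g: raw[g] - alloc[g], reverse=True)
    for g in by_remainder[:leftover]:
        alloc[g] += 1
    return alloc


def fit_conditional_gaussian(xs, labels):
    """Estimate (mean, std) of a 1-D feature per group.

    Stand-in for a conditional generative model; a real pipeline would
    fine-tune a pre-trained generator instead.
    """
    groups = {}
    for x, g in zip(xs, labels):
        groups.setdefault(g, []).append(x)
    params = {}
    for g, vals in groups.items():
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / max(len(vals) - 1, 1)
        params[g] = (mean, var ** 0.5)
    return params


def synthesize(params, alloc, rng):
    """Draw alloc[g] synthetic (x, g) pairs from each group's fitted Gaussian."""
    return [(rng.gauss(*params[g]), g)
            for g, n in alloc.items() for _ in range(n)]


# Toy imbalanced dataset: group "c" is heavily under-sampled.
rng = random.Random(0)
labels = ["a"] * 90 + ["b"] * 8 + ["c"] * 2
centers = {"a": 0.0, "b": 3.0, "c": 6.0}
xs = [rng.gauss(centers[g], 1.0) for g in labels]

alloc = allocate_synthetic_budget(labels, budget=100)
synthetic = synthesize(fit_conditional_gaussian(xs, labels), alloc, rng)
augmented_counts = Counter(labels) + Counter(g for _, g in synthetic)
```

In this toy run the whole budget goes to groups "b" and "c", so the augmented counts are far closer to balanced than the original 90/8/2 split. A uniform (non-adaptive) allocation would instead spend a third of the budget on the already dominant group.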

Main Results:

  • CoDSA generates synthetic samples that accurately capture conditional distributions, focusing on under-sampled regions.
  • The framework preserves inter-modal relationships, mitigates data imbalance, improves domain adaptation, and enhances generalization.
  • Extensive experiments show CoDSA consistently outperforms non-adaptive augmentation strategies and state-of-the-art baselines in both supervised and unsupervised settings.

Conclusions:

  • CoDSA offers a robust solution for data augmentation, effectively addressing data limitations and improving machine learning model performance.
  • The proposed framework provides formal guarantees of effectiveness and demonstrates superior performance across diverse data modalities and learning tasks.