A Combinatorial Approach to Synthetic Data Generation for Machine Learning.

Krishna Khadka¹, Jaganmohan Chandrasekaran², Yu Lei¹

  • ¹Department of Computer Science and Engineering, The University of Texas at Arlington, Arlington, TX 76019 USA.

SN Computer Science | January 12, 2026
Summary
This summary is machine-generated.

This study introduces a combinatorial sampling method for generating synthetic data. Compared with random sampling, it needs fewer synthetic samples to reach comparable machine learning model performance, and it loses less performance when differential privacy is applied.

Keywords:

  • Combinatorial testing
  • Differential privacy
  • Synthetic data generation
  • Variational autoencoder

Area of Science:

  • Machine Learning
  • Data Privacy
  • Synthetic Data Generation

Background:

  • Machine learning datasets frequently contain sensitive personal health and financial information, posing privacy risks.
  • Existing synthetic data generation methods often require many synthetic samples, which makes downstream tasks less efficient.
  • Current techniques involve encoding data, random sampling in latent space, and decoding to generate synthetic data.
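
The encode, sample, decode pipeline described in the last point can be sketched as follows. This is a minimal illustration only, not the study's implementation: a linear (PCA-style) projection stands in for a trained variational autoencoder, the data are random toy values, and all function names (`fit_linear_autoencoder`, `encode`, `decode`, `random_latent_samples`) are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(X, latent_dim):
    """Stand-in for a trained VAE (assumption): a PCA projection found via SVD."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:latent_dim]  # shape: (latent_dim, n_features)
    return mean, components

def encode(X, mean, components):
    """Map data rows to latent coordinates."""
    return (X - mean) @ components.T

def decode(Z, mean, components):
    """Map latent coordinates back to data space."""
    return Z @ components + mean

def random_latent_samples(Z, n_samples, rng):
    """Random sampling: draw from an independent Gaussian fitted to each latent dimension."""
    mu, sigma = Z.mean(axis=0), Z.std(axis=0)
    return rng.normal(mu, sigma, size=(n_samples, Z.shape[1]))

rng = np.random.default_rng(0)
X_real = rng.normal(size=(200, 5))          # toy stand-in for a sensitive dataset
mean, comps = fit_linear_autoencoder(X_real, latent_dim=3)
Z_real = encode(X_real, mean, comps)        # 1. encode
Z_new = random_latent_samples(Z_real, n_samples=50, rng=rng)  # 2. sample latent space
X_synth = decode(Z_new, mean, comps)        # 3. decode into synthetic rows
```

In this baseline the number of synthetic rows is a free parameter (`n_samples`), and nothing guarantees that any particular combination of latent values is ever drawn.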

Purpose of the Study:

  • To develop an efficient synthetic data generation technique that minimizes sample requirements.
  • To enhance the privacy preservation capabilities of synthetic data generation methods.
  • To improve the performance of machine learning models using synthetic data.

Main Methods:

  • A combinatorial approach to sampling the latent space is proposed, focusing on t-way interactions among latent dimensions.
  • This method is motivated by findings that model predictions are often driven by interactions between a limited number of features.
  • Synthetic samples are generated from latent points chosen to cover these interactions, rather than from randomly drawn latent points.
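
One way to picture t-way sampling of a latent space is a covering array: a set of rows in which every combination of values for every group of t dimensions appears at least once. The sketch below is an assumption-laden illustration, not the authors' algorithm or tooling. It assumes each latent dimension has been discretized into a few levels, uses a simple greedy random-candidate construction, and the names `interactions_of`, `t_way_covering_rows`, and `level_values` are hypothetical.

```python
from itertools import combinations, product
import random

def interactions_of(row, t):
    """All t-way (dimensions, levels) interactions that one row covers."""
    return {(dims, tuple(row[d] for d in dims))
            for dims in combinations(range(len(row)), t)}

def t_way_covering_rows(levels_per_dim, t=2, candidates_per_row=50, seed=0):
    """Greedy covering array: every t-way combination of levels appears in some row."""
    rng = random.Random(seed)
    n_dims = len(levels_per_dim)
    # Every interaction that still has to appear in at least one row.
    uncovered = {(dims, vals)
                 for dims in combinations(range(n_dims), t)
                 for vals in product(*(range(levels_per_dim[d]) for d in dims))}
    rows = []
    while uncovered:
        # Force one uncovered interaction into each candidate so every row makes progress.
        seed_dims, seed_vals = next(iter(uncovered))
        best_row, best_new = None, set()
        for _ in range(candidates_per_row):
            row = [rng.randrange(k) for k in levels_per_dim]
            for d, v in zip(seed_dims, seed_vals):
                row[d] = v
            new = interactions_of(row, t) & uncovered
            if len(new) > len(best_new):
                best_row, best_new = row, new
        rows.append(best_row)
        uncovered -= best_new
    return rows

# Four latent dimensions, three levels each, pairwise (t = 2) coverage.
rows = t_way_covering_rows([3, 3, 3, 3], t=2)

# Hypothetical representative latent value for each level of each dimension.
level_values = [[-1.0, 0.0, 1.0]] * 4
latent_points = [[level_values[d][lv] for d, lv in enumerate(row)] for row in rows]
```

Each entry of `latent_points` would then be passed to a decoder to produce one synthetic sample. For this toy configuration the exhaustive grid has 3^4 = 81 points, while a pairwise covering array needs at least 9 rows; the greedy construction lands somewhere in between.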

Main Results:

  • The combinatorial sampling approach requires fewer synthetic samples than traditional random sampling to achieve similar model performance.
  • When combined with differential privacy, this method shows a smaller performance degradation than random sampling.
  • Empirical results demonstrate the effectiveness of leveraging feature interactions for efficient synthetic data generation.

Conclusions:

  • The proposed combinatorial sampling method offers a more efficient alternative for generating high-quality synthetic data.
  • This technique improves the trade-off between data utility and privacy preservation in machine learning.
  • The findings suggest that targeted sampling based on feature interactions can significantly enhance synthetic data generation processes.