Synthetic Data Generation and Nonparametric Techniques for Assessing Multivariate Similarity to Address Small-Sample

John Heine¹, Erin Fowler¹, Steven Eschrich²

  • ¹Department of Cancer Epidemiology, H. Lee Moffitt Cancer Center & Research Institute, 12902 Magnolia Drive, Tampa, Florida 33612.

bioRxiv: the preprint server for biology | May 18, 2026
Summary
This summary is machine-generated.

This study evaluates a previously proposed synthetic data generator for biomedical research. The generator produces high-fidelity synthetic data when the absence of bivariate correlation approximates independence, which helps address small-sample data challenges.

Keywords:
Bernoulli trials, synthetic data, binomial distribution, nonparametric multivariate similarity, normal distribution, random projection test, small-sample size

Area of Science:

  • Biomedical data science
  • Statistical modeling
  • Computational biology

Background:

  • Biomedical research frequently faces small-sample data limitations, impacting study outcomes.
  • Synthetic data generation offers a potential solution, but its fidelity requires rigorous evaluation.
  • Existing methods struggle to generate adequate synthetic data for high-dimensional, low-sample scenarios.

Purpose of the Study:

  • To evaluate a previously proposed synthetic data generator for biomedical applications.
  • To develop and apply a nonparametric method for assessing multivariate similarity using the Cramér-Wold theorem and random projection testing.
  • To investigate the conditions under which the absence of bivariate correlation approximates independence in non-normal distributions, and to evaluate data compression artifacts.
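
For reference, the Cramér-Wold theorem named in the second aim can be stated as follows. The symbols are notation chosen here for illustration, not taken from the paper: X and Y are d-dimensional random vectors and u is a projection direction.

```latex
X \overset{d}{=} Y
\quad\Longleftrightarrow\quad
u^{\top} X \overset{d}{=} u^{\top} Y
\quad \text{for all } u \in \mathbb{R}^{d}
```

This equivalence is what allows a multivariate comparison to be reduced to many one-dimensional comparisons. Any practical test can only sample finitely many directions u, so it approximates the "for all" condition rather than verifying it.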

Main Methods:

  • Developed a nonparametric multivariate similarity assessment based on the Cramér-Wold theorem and random projection testing.
  • Investigated how well the absence of bivariate correlation approximates independence in non-normal settings.
  • Evaluated artifacts introduced by data compression during synthetic data generation.
  • Established a formal testing framework using Bernoulli trials, aggregated outcomes, and a standardized normal test statistic.
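
The methods above can be sketched in Python under several loud assumptions. The summary does not say which univariate test, significance level, or number of projections the authors used, so this sketch uses the two-sample Kolmogorov-Smirnov test with `alpha=0.05` and `n_proj=200` as illustrative choices. The function name `random_projection_test` is hypothetical. It also treats the per-projection outcomes as independent Bernoulli trials, which is a simplification because every projection reuses the same data.

```python
import numpy as np
from scipy.stats import ks_2samp, norm


def random_projection_test(real, synth, n_proj=200, alpha=0.05, seed=0):
    """Compare two multivariate samples via 1-D tests along random directions.

    Hypothetical helper: the KS test, alpha, and n_proj are illustrative
    choices, not the settings reported by the authors.
    """
    rng = np.random.default_rng(seed)
    d = real.shape[1]
    rejections = 0
    for _ in range(n_proj):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)  # random unit direction
        p = ks_2samp(real @ u, synth @ u).pvalue
        rejections += int(p < alpha)  # one Bernoulli trial per projection
    # Under the null, and assuming independent trials, the rejection count is
    # Binomial(n_proj, alpha); standardize it with the normal approximation.
    z = (rejections - n_proj * alpha) / np.sqrt(n_proj * alpha * (1 - alpha))
    p_value = norm.sf(z)  # one-sided: many rejections suggest dissimilarity
    return rejections, z, p_value


# Usage on simulated data (not data from the study).
rng = np.random.default_rng(1)
a = rng.standard_normal((300, 5))
b = rng.standard_normal((300, 5))        # same distribution as a
c = rng.standard_normal((300, 5)) + 1.0  # mean shifted in every coordinate

same = random_projection_test(a, b)
diff = random_projection_test(a, c)
print("same distribution:", same)
print("shifted distribution:", diff)
```

The shifted sample should trigger far more rejections than the matched one, which pushes the standardized statistic upward. Because the projections share data, the normal approximation here is a rough guide rather than a calibrated test.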

Main Results:

  • The synthetic generator produced high-fidelity multivariate synthetic data when the absence of bivariate correlation approximated independence in non-normal settings.
  • In highly compressed data, residual modes were best modeled as normally distributed, irrespective of their intrinsic form.
  • The developed projection framework effectively evaluated the full multivariate covariance structure.
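
The first result hinges on the absence of correlation standing in for independence. That equivalence is guaranteed for jointly normal variables but not in general. The following minimal example, built on simulated data chosen for illustration rather than anything from the study, shows a non-normal pair that is uncorrelated yet fully dependent.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)
y = x**2  # y is a deterministic function of x, so the pair is dependent

# Pearson correlation only measures linear association.
corr_xy = np.corrcoef(x, y)[0, 1]

# Correlating |x| with y exposes the dependence that corr_xy misses.
corr_abs = np.corrcoef(np.abs(x), y)[0, 1]

print(f"corr(x, y)   = {corr_xy:.3f}")
print(f"corr(|x|, y) = {corr_abs:.3f}")
```

The first correlation is close to zero while the second is large, so a check based only on pairwise correlation would wrongly treat x and y as independent. This is the kind of case in which the approximation studied in the paper would not hold.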

Conclusions:

  • The evaluated synthetic data generator shows promise for generating high-fidelity data in specific non-normal, low-sample regimes.
  • The nonparametric method for assessing multivariate similarity is scalable and adaptable for evaluating synthetic data quality.
  • Further research is needed to apply these methods to higher-dimensional and diverse biomedical datasets.