Encoding Dissimilarity Data for Statistical Model Building.

Department of Statistics, University of Wisconsin-Madison.

Journal of Statistical Planning and Inference

September 4, 2010

Summary

This summary is machine-generated.

This study introduces a novel algorithm for embedding objects described by discrete, noisy pairwise dissimilarity data into Euclidean space using convex cone optimization. The resulting coordinates let dissimilarity information enter statistical models such as Support Vector Machines.

Area of Science:

Statistics
Machine Learning
Computational Biology

Background:

Statistical model building often faces challenges with discrete, noisy, incomplete, and scattered pairwise dissimilarity data.
Existing methods may struggle to effectively incorporate such data into complex models, limiting their applicability.

Purpose of the Study:

To review and comment on three papers addressing the use of challenging dissimilarity data in statistical modeling.
To present a new algorithm for embedding new objects into a pre-defined Euclidean space derived from dissimilarity information.
To demonstrate how this embedding facilitates the integration of dissimilarity data into various machine learning models.

Main Methods:

Utilizing convex cone optimization codes to embed objects into a Euclidean space that respects dissimilarity information while controlling dimensionality.
Developing a "newbie" algorithm for embedding new objects into this established space.
Integrating the dissimilarity information into Smoothing Spline ANOVA penalized likelihood models, Support Vector Machines, and other models admitting Reproducing Kernel Hilbert Space components.
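
To make the embedding and "newbie" steps above concrete, here is a minimal Python sketch. It does not implement the paper's convex cone optimization. It swaps in classical multidimensional scaling for the embedding and a least-squares placement for a new object, and it assumes a complete, noise-free dissimilarity matrix. The function names `classical_mds` and `place_new_object` are hypothetical and chosen only for illustration.

```python
import numpy as np


def classical_mds(d, dim):
    """Embed n objects in R^dim from an n x n dissimilarity matrix d.

    Stand-in for the paper's convex cone optimization: classical MDS,
    which assumes d is complete and roughly Euclidean.
    """
    n = d.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    b = -0.5 * j @ (d ** 2) @ j              # Gram matrix implied by squared distances
    vals, vecs = np.linalg.eigh(b)           # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:dim]       # indices of the dim largest eigenvalues
    lam = np.clip(vals[top], 0.0, None)      # negative eigenvalues are set to zero
    return vecs[:, top] * np.sqrt(lam)       # rows are centered coordinates


def place_new_object(x, d_new):
    """Place one new object into an existing centered embedding x.

    d_new holds the dissimilarities from the new object to each embedded
    object. Expanding ||y - x_i||^2 and subtracting the mean over i gives
    a linear system 2 * x @ y = rhs, solved here by least squares.
    """
    sq_norms = np.sum(x ** 2, axis=1)
    sq_d = d_new ** 2
    rhs = (sq_norms - sq_norms.mean()) - (sq_d - sq_d.mean())
    y, *_ = np.linalg.lstsq(2.0 * x, rhs, rcond=None)
    return y


# Illustrative data (made up for this sketch): five points in the plane.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.5]])
dissim = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
coords = classical_mds(dissim, dim=2)

# A new object is placed using only its dissimilarities to the old ones.
newcomer = np.array([0.5, 2.0])
d_to_old = np.linalg.norm(points - newcomer, axis=1)
placed = place_new_object(coords, d_to_old)
print(np.max(np.abs(np.linalg.norm(coords - placed, axis=1) - d_to_old)))
```

The recovered coordinates match the original points only up to rotation, reflection, and translation, so the sketch compares distances rather than coordinates. With noisy or incomplete dissimilarities, which are the setting the paper targets, this simple stand-in would not be adequate.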

Main Results:

Successfully demonstrated a framework for kernel regularization applicable to problems like protein clustering.
Showcased the utility of flexible risk models in examining covariate influences, including familial and genetic factors.
Developed a robust manifold unfolding technique with kernel regularization for complex data structures.

Conclusions:

The presented methods provide a robust framework for incorporating discrete, noisy, and incomplete dissimilarity data into statistical and machine learning models.
The "newbie" algorithm offers a practical solution for extending existing embeddings with new data points.
Future research directions and open questions in this domain are identified.