Data coarse graining can improve model performance.

Alex Nguyen1, David J Schwab2, Vudtiwat Ngampruetikorn3

  • 1Princeton University, Princeton Neuroscience Institute, Princeton, New Jersey 08540, USA.

Physical Review E | March 20, 2026
Summary
This summary is machine-generated.

Lossy data transformations can, surprisingly, improve generalization in machine learning. A high-pass coarse-graining scheme, which removes the less relevant, lower-signal features, can improve model performance by isolating the more predictive signals.

Area of Science:

  • Machine Learning
  • Statistical Physics
  • Data Science

Background:

  • Lossy data transformations, by definition, discard information.
  • Yet techniques such as data pruning and lossy data augmentation can improve generalization in machine learning.
  • Understanding the mechanism behind this paradox is crucial for developing effective ML strategies.

Purpose of the Study:

  • To investigate the paradoxical phenomenon of lossy data transformations improving generalization in machine learning.
  • To analyze the impact of data coarse-graining on prediction risk using a solvable model.
  • To provide an analytical explanation for the effectiveness of data augmentation in machine learning.

Main Methods:

  • Used a solvable model of high-dimensional, ridge-regularized linear regression.
  • Employed coarse-graining schemes inspired by renormalization group methods in statistical physics.
  • Analyzed schemes that systematically discard features according to their relevance to the learning task.
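
For orientation, the quantities involved can be written in generic textbook notation. This is a sketch under stated assumptions, not the paper's own formulation: the symbols X, y, n, λ, S, and R are chosen here for illustration, and coarse-graining is represented simply as restricting the regression to a retained feature subset S.

```latex
% Assumed notation (not taken from the paper):
%   X \in \mathbb{R}^{n \times d}  training features,  y \in \mathbb{R}^{n}  targets,
%   \lambda > 0  ridge penalty,  S \subseteq \{1, \dots, d\}  retained features,
%   X_S, x_S  the columns / entries indexed by S.
\begin{align}
  \hat{w}_\lambda(S)
    &= \arg\min_{w} \; \frac{1}{n} \lVert y - X_S w \rVert_2^2 + \lambda \lVert w \rVert_2^2
     = \left( X_S^\top X_S + n \lambda I \right)^{-1} X_S^\top y, \\
  R(S)
    &= \mathbb{E}_{(x,\,y)} \left[ \left( y - x_S^\top \hat{w}_\lambda(S) \right)^2 \right].
\end{align}
```

In this notation, a coarse-graining scheme is a rule for shrinking S, and the study asks how the prediction risk R(S) changes as more features are discarded.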

Main Results:

  • Discovered a nonmonotonic relationship between the degree of data coarse-graining and prediction risk.
  • Demonstrated that a high-pass filtering scheme, removing low-signal features, improves model generalization.
  • Showed that a low-pass scheme, which discards the more relevant, higher-signal features, is detrimental to performance.
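
The contrast between the two schemes can be explored numerically with a small simulation. The sketch below is illustrative only and is not the authors' solvable model: the Gaussian features, the power-law true weights, the noise level, the fixed (untuned) ridge penalty, and the helper names `ridge_fit`, `coarse_grain`, and `prediction_risk` are all assumptions made here.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge estimator (X^T X + n*lam*I)^{-1} X^T y."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)


def coarse_grain(signal, n_drop, scheme):
    """Return sorted indices of the features kept after discarding n_drop."""
    order = np.argsort(signal)  # ascending: weakest signal first
    if scheme == "high-pass":   # discard the n_drop lowest-signal features
        kept = order[n_drop:]
    elif scheme == "low-pass":  # discard the n_drop highest-signal features
        kept = order[: len(order) - n_drop]
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")
    return np.sort(kept)


def prediction_risk(n_drop, scheme, n_train=60, n_test=2000, d=100,
                    lam=0.1, noise=0.5, seed=0):
    """Held-out mean squared error after coarse-graining, then ridge."""
    rng = np.random.default_rng(seed)
    # Hypothetical data model: feature j carries true weight 1/j, so
    # low-index features are the "high-signal" ones.
    w_true = 1.0 / np.arange(1, d + 1)
    signal = w_true ** 2
    X_tr = rng.standard_normal((n_train, d))
    X_te = rng.standard_normal((n_test, d))
    y_tr = X_tr @ w_true + noise * rng.standard_normal(n_train)
    y_te = X_te @ w_true + noise * rng.standard_normal(n_test)
    kept = coarse_grain(signal, n_drop, scheme)
    w_hat = ridge_fit(X_tr[:, kept], y_tr, lam)
    return float(np.mean((y_te - X_te[:, kept] @ w_hat) ** 2))


if __name__ == "__main__":
    # Sweep the degree of coarse-graining for both schemes.
    for n_drop in (0, 20, 40, 60, 80):
        hp = prediction_risk(n_drop, "high-pass")
        lp = prediction_risk(n_drop, "low-pass")
        print(f"dropped={n_drop:3d}  high-pass={hp:.3f}  low-pass={lp:.3f}")
```

Because the ridge penalty here is fixed rather than optimally tuned, any improvement seen under the high-pass scheme in this toy setup may partly reflect under-regularization; the sketch is meant only to make the two discarding rules concrete.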

Conclusions:

  • Careful data augmentation, by removing less relevant information, can isolate predictive signals and improve generalization.
  • The observed nonmonotonicity in prediction risk is a genuine effect of data coarse-graining, not an artifact of other phenomena such as double descent.
  • Statistical physics principles offer a valuable framework for understanding complex machine learning behaviors.