Data coarse graining can improve model performance.

Alex Nguyen¹, David J Schwab², Vudtiwat Ngampruetikorn³

  • ¹Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08540, USA.

ArXiv | September 26, 2025 | PubMed
Summary
This summary is machine-generated.

Lossy data transformations can surprisingly improve machine learning generalization. A high-pass data coarse-graining scheme, removing less relevant features, enhances model performance by isolating predictive signals.

Area of Science:

  • Machine Learning
  • Statistical Physics
  • Data Science

Background:

  • Lossy data transformations typically discard information.
  • Techniques like data pruning and lossy data augmentation can paradoxically improve machine learning generalization.
  • Understanding the mechanisms behind this phenomenon is crucial for developing more effective ML models.

Purpose of the Study:

  • To investigate the paradox of information loss improving generalization in machine learning.
  • To analyze the impact of data coarse-graining on prediction risk using a solvable model.
  • To provide an analytical explanation for the benefits of certain data augmentation strategies.

Main Methods:

  • Studied high-dimensional, ridge-regularized linear regression under data coarse-graining.
  • Employed schemes inspired by the renormalization group to systematically discard features based on relevance.
  • Analyzed the dependence of prediction risk on the degree of coarse-graining.
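
For orientation, the standard ridge estimator and its prediction risk on a coarse-grained feature set can be written as below. The notation is assumed here for illustration and is not taken from the paper: S is the set of retained features, X_S is the training design matrix restricted to those features, λ is the ridge penalty, and (x_new, y_new) is a fresh test sample. Some conventions scale the penalty by the sample size; this sketch does not.

$$
\hat{\beta}_\lambda(S) = \arg\min_{\beta} \, \lVert y - X_S \beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2 = \left( X_S^\top X_S + \lambda I \right)^{-1} X_S^\top y
$$

$$
R(S) = \mathbb{E}\left[ \left( y_{\text{new}} - x_{\text{new},S}^\top \, \hat{\beta}_\lambda(S) \right)^2 \right]
$$

Coarse-graining then corresponds to shrinking S, and the quantity of interest is how R(S) changes as more features are discarded.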

Main Results:

  • Discovered a nonmonotonic relationship between the degree of data coarse-graining and prediction risk.
  • A high-pass coarse-graining scheme, filtering out low-signal features, improved generalization.
  • A low-pass scheme, integrating out high-signal features, proved detrimental.
  • Demonstrated that this nonmonotonicity is a distinct effect of data coarse-graining, not an artifact of double descent.
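
The comparison between the two schemes can be explored numerically with a minimal sketch like the one below. It is not the authors' code or setup. It assumes Gaussian features with power-law variances, an isotropic random coefficient vector, and "signal" measured by feature variance, so that the high-pass scheme keeps the highest-variance features and the low-pass scheme keeps the lowest-variance ones. The function names, default sizes, and noise level are all hypothetical.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Ridge estimator: solve (X^T X + lam * I) beta = X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def risk_curve(scheme, n_train=100, n_test=2000, p=200,
               lam=1.0, noise=0.5, seed=0):
    """Estimate test risk as a function of the number of retained features.

    scheme="high_pass" keeps the k highest-variance features (assumption:
    variance stands in for signal); scheme="low_pass" keeps the k lowest.
    """
    if scheme not in ("high_pass", "low_pass"):
        raise ValueError(f"unknown scheme: {scheme!r}")

    rng = np.random.default_rng(seed)
    # Assumed power-law feature variances, sorted from highest to lowest.
    variances = 1.0 / np.arange(1, p + 1)
    # Assumed isotropic ground-truth coefficients.
    beta = rng.normal(size=p) / np.sqrt(p)

    def sample(n):
        X = rng.normal(size=(n, p)) * np.sqrt(variances)
        y = X @ beta + noise * rng.normal(size=n)
        return X, y

    X_tr, y_tr = sample(n_train)
    X_te, y_te = sample(n_test)

    kept_counts = np.arange(10, p + 1, 10)
    risks = []
    for k in kept_counts:
        if scheme == "high_pass":
            idx = np.arange(k)          # discard the low-variance tail
        else:
            idx = np.arange(p - k, p)   # discard the high-variance head
        b = ridge_fit(X_tr[:, idx], y_tr, lam)
        # Labels still depend on all p features; the model sees only idx.
        risks.append(np.mean((y_te - X_te[:, idx] @ b) ** 2))
    return kept_counts, np.array(risks)


if __name__ == "__main__":
    for scheme in ("high_pass", "low_pass"):
        counts, risks = risk_curve(scheme)
        print(scheme)
        for k, r in zip(counts, risks):
            print(f"  kept={k:3d}  test_risk={r:.4f}")
```

Plotting the two curves against the number of retained features gives a quick way to look for the kind of dependence described above. Whether a minimum appears, and where, depends on the assumed spectrum, noise level, and penalty, so this toy setup should not be read as reproducing the paper's results.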

Conclusions:

  • Careful data augmentation, by stripping irrelevant degrees of freedom, can isolate more predictive signals and enhance model generalization.
  • The study highlights a complex, nonmonotonic risk landscape influenced by data structure.
  • Statistical physics principles offer a valuable framework for understanding modern machine learning phenomena.