



Dataset Pruning: Reducing Training Data by Examining SGD-Influence.

Shuo Yang, Yucheng Huang, Zeke Xie

    IEEE Transactions on Pattern Analysis and Machine Intelligence | June 15, 2026
    Summary
    This summary is machine-generated.

    Deep learning requires vast amounts of data, which drives up training costs. This study introduces dataset pruning, which identifies the essential training data and yields smaller, more efficient datasets without performance loss.


    Area of Science:

    • Machine Learning
    • Artificial Intelligence
    • Data Science

    Background:

    • Deep learning's success depends on large datasets, leading to significant computational and infrastructure costs.
    • It remains an open question how much each training sample contributes to model performance and generalization.
    • The need for efficient methods to create smaller, representative training sets is critical.

    Purpose of the Study:

    • To develop an optimization-based method for dataset pruning, selecting the most influential training samples.
    • To construct the smallest possible training data subset while maintaining a controlled generalization gap.
    • To efficiently estimate the influence of individual training samples on model generalization.
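
The stated goal, the smallest subset under a controlled generalization gap, can be read as a constrained optimization problem. The formulation below is only a sketch of that reading. All symbols are assumptions introduced here, not notation taken from the paper: $D$ is the full training set, $S$ a candidate subset, $\theta_S$ the parameters obtained by training on $S$, $\mathcal{L}_{\text{test}}$ the expected test loss, and $\epsilon$ the tolerated gap.

```latex
\min_{S \subseteq D} \; |S|
\quad \text{subject to} \quad
\bigl| \mathcal{L}_{\text{test}}(\theta_S) - \mathcal{L}_{\text{test}}(\theta_D) \bigr| \le \epsilon
```

Under this reading, estimating per-sample influence matters because it allows the constraint to be checked approximately without retraining the model on every candidate subset.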

    Main Methods:

    • Dataset pruning: An optimization-based sample selection technique.
    • SGD-Influence method: Tracks parameter changes during stochastic gradient descent to estimate sample influence, bypassing traditional limitations.
    • Distributed discrete optimization: Partitions datasets into manageable segments for efficient processing.
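
The two ideas above, scoring samples while SGD runs and selecting within partitions, can be illustrated with a small sketch. This is not the authors' algorithm: the score is a simple first-order estimate (the predicted change in validation loss from each SGD step), the model is a toy linear regressor, and the per-partition rule is a plain top-k by absolute score standing in for the paper's discrete optimization. The function names `grad`, `sgd_with_influence`, and `prune_by_partition` are hypothetical.

```python
import random

def grad(w, x, y):
    """Gradient of the squared error 0.5 * (w.x - y)^2 with respect to w."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]

def sgd_with_influence(train, val, lr=0.05, epochs=5, seed=0):
    """Run single-sample SGD and accumulate a first-order influence score.

    score[i] sums, over every step that used sample i, the estimated change
    in mean validation loss caused by that step: -lr * <grad_val, grad_i>.
    This is an illustrative simplification, not the paper's estimator.
    """
    rng = random.Random(seed)
    dim = len(train[0][0])
    w = [0.0] * dim
    score = [0.0] * len(train)
    for _ in range(epochs):
        order = list(range(len(train)))
        rng.shuffle(order)
        for i in order:
            g_i = grad(w, *train[i])
            # Mean validation gradient at the current parameters.
            g_val = [0.0] * dim
            for x, y in val:
                for k, g in enumerate(grad(w, x, y)):
                    g_val[k] += g / len(val)
            score[i] += -lr * sum(a * b for a, b in zip(g_val, g_i))
            # Ordinary SGD update using sample i.
            w = [wk - lr * gk for wk, gk in zip(w, g_i)]
    return w, score

def prune_by_partition(score, keep_ratio, n_parts):
    """Split indices into contiguous partitions and keep, in each one,
    the samples with the largest absolute score (hypothetical rule)."""
    n = len(score)
    bounds = [round(p * n / n_parts) for p in range(n_parts + 1)]
    kept = []
    for lo, hi in zip(bounds, bounds[1:]):
        part = sorted(range(lo, hi), key=lambda i: abs(score[i]), reverse=True)
        n_keep = max(1, round(keep_ratio * (hi - lo)))
        kept.extend(sorted(part[:n_keep]))
    return kept
```

Because each partition is processed independently, the selection step could be run in parallel across segments, which is the practical appeal of partitioning a large dataset before selecting from it.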

    Main Results:

    • Dataset pruning effectively identifies influential training samples and constructs minimal proxy datasets.
    • Empirical generalization gaps align with theoretical predictions.
    • The proposed method outperforms state-of-the-art approaches in efficiency and accuracy.
    • The method achieved a 61.26% reduction in computational cost relative to previous work, with improved accuracy.
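
To put the reported 61.26% reduction in perspective, it can be converted into a cost ratio. The symbols $C_{\text{new}}$ and $C_{\text{prev}}$ are assumptions introduced here for the cost of the proposed method and of the previous work. The conversion to a speedup further assumes that computational cost is proportional to running time, which the summary does not state.

```latex
\frac{C_{\text{new}}}{C_{\text{prev}}} = 1 - 0.6126 = 0.3874,
\qquad
\frac{C_{\text{prev}}}{C_{\text{new}}} = \frac{1}{0.3874} \approx 2.58
```

Under that assumption, the proposed method would need a little under two fifths of the earlier cost, or roughly a 2.6-fold saving.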

    Conclusions:

    • Dataset pruning offers a computationally efficient and theoretically grounded approach to data selection in deep learning.
    • The SGD-Influence and distributed optimization methods enable effective sample influence estimation and efficient subset construction.
    • This work provides a pathway to significantly reduce the costs associated with deep learning training while enhancing model performance.