Task Graph Maximum Likelihood Estimation for Procedural Activity Understanding in Egocentric Videos.

Luigi Seminara, Giovanni Maria Farinella, Antonino Furnari

IEEE Transactions on Pattern Analysis and Machine Intelligence

May 4, 2026

Summary

This summary is machine-generated.

This study introduces a novel, differentiable method for learning task graphs from procedural activities, significantly improving accuracy in video understanding and mistake detection tasks.

Area of Science:

Computer Vision
Machine Learning
Robotics

Background:

Procedural activities are goal-oriented sequences with ordering constraints.
Task graphs represent these activities holistically, but extracting them often relies on hand-crafted procedures.
Existing methods for extracting task graphs from videos are therefore limited in accuracy and applicability.
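
As a minimal illustration of what a task graph encodes, the sketch below stores each step together with the steps that must come before it and checks whether a sequence respects those ordering constraints. The recipe steps and the helper name `is_valid_order` are hypothetical and are not taken from the study.

```python
from graphlib import TopologicalSorter

# Hypothetical key-steps for a toy recipe. Each step maps to the set of
# steps that must be completed before it (its prerequisites).
TASK_GRAPH = {
    "crack eggs": set(),
    "whisk eggs": {"crack eggs"},
    "heat pan": set(),
    "pour eggs": {"whisk eggs", "heat pan"},
    "serve": {"pour eggs"},
}


def is_valid_order(graph, sequence):
    """Return True if every step appears after all of its prerequisites."""
    done = set()
    for step in sequence:
        if not graph[step] <= done:
            return False
        done.add(step)
    return True


# One ordering that satisfies every constraint (several may exist).
one_valid_order = list(TopologicalSorter(TASK_GRAPH).static_order())
```

Because the graph only fixes a partial order, "heat pan" may occur before or after the egg preparation steps, and both sequences count as valid.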

Purpose of the Study:

To develop a gradient-based approach for directly estimating task graphs from data.
To improve the accuracy and applicability of task graph learning in video understanding.
To enhance procedural understanding and online mistake detection in egocentric videos.
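
One way a task graph could support online mistake detection is to flag any step that is performed before its prerequisites have been observed. The sketch below assumes that step labels are already available for the incoming video stream. The graph, the step names, and the function `detect_mistakes` are hypothetical, and the study's actual detection procedure may differ.

```python
# Hypothetical assembly task: each step maps to its prerequisite steps.
ASSEMBLY_GRAPH = {
    "mount axle": set(),
    "attach wheel": {"mount axle"},
    "tighten bolt": {"attach wheel"},
}


def detect_mistakes(graph, observed_steps):
    """Yield (index, step, missing) for each step performed too early."""
    done = set()
    for index, step in enumerate(observed_steps):
        missing = graph.get(step, set()) - done
        if missing:
            yield index, step, sorted(missing)
        # Record the step even when it was premature, so that one early
        # error does not cascade into flags on every later step.
        done.add(step)


mistakes = list(
    detect_mistakes(ASSEMBLY_GRAPH, ["attach wheel", "mount axle", "tighten bolt"])
)
```

The function is a generator, so it can consume steps one at a time as they are recognized rather than waiting for the full video.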

Main Methods:

Employed gradient-based maximum likelihood optimization to learn edge weights for task graphs.
Developed a differentiable framework enabling feature-based prediction from embeddings.
Validated the approach on CaptainCook4D, EgoPER, and EgoProceL datasets, and the Ego-Exo4D benchmark.
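
The general idea of learning edge weights by gradient-based maximum likelihood can be sketched with a deliberately simplified model. Here each candidate edge has one logit, the probability that step j precedes step i is its sigmoid, and gradient ascent maximizes a Bernoulli log-likelihood over observed sequences. This toy objective is an assumption made for illustration and is not the loss used in the study. It also assumes each step occurs at most once per sequence, and all names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_edge_weights(sequences, steps, lr=0.5, epochs=200):
    """Gradient ascent on a Bernoulli log-likelihood, one logit per edge.

    theta[i][j] is the logit of "j is performed before i". Toy objective
    for illustration only, not the loss used in the study.
    """
    theta = {i: {j: 0.0 for j in steps if j != i} for i in steps}
    for _ in range(epochs):
        grad = {i: {j: 0.0 for j in theta[i]} for i in steps}
        for seq in sequences:
            position = {step: t for t, step in enumerate(seq)}
            for i in steps:
                for j in theta[i]:
                    if i in position and j in position:
                        y = 1.0 if position[j] < position[i] else 0.0
                        # Gradient of y*log(p) + (1-y)*log(1-p) with
                        # respect to the logit, where p = sigmoid(logit).
                        grad[i][j] += y - sigmoid(theta[i][j])
        for i in steps:
            for j in theta[i]:
                theta[i][j] += lr * grad[i][j] / len(sequences)
    return {i: {j: sigmoid(v) for j, v in row.items()} for i, row in theta.items()}


# Toy data: A always comes first, B precedes C in two of three sequences.
weights = fit_edge_weights(
    [["A", "B", "C"], ["A", "C", "B"], ["A", "B", "C"]],
    steps=["A", "B", "C"],
)
```

In this toy model the learned weight for each pair approaches the empirical frequency with which one step precedes the other, so thresholding the weights yields a precedence graph. Because the weights come from a differentiable function of the logits, the same update could in principle be driven by a neural network that predicts logits from step embeddings.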

Main Results:

Improved F1 score by +14.5%, +10.2%, and +13.6% on the CaptainCook4D, EgoPER, and EgoProceL datasets, respectively.
Demonstrated state-of-the-art performance on the Ego-Exo4D benchmark across five downstream tasks.
Showed substantial gains in online mistake detection (+19.8%, +6.4%) on two challenging datasets.
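
For readers unfamiliar with the metric, the sketch below shows how an F1 score could be computed for a predicted task graph, assuming the comparison is made over sets of directed edges. That evaluation protocol is an assumption, and the function name `edge_f1` and the example edges are hypothetical.

```python
def edge_f1(predicted_edges, true_edges):
    """F1 score between two collections of directed (step, prerequisite) edges."""
    predicted, truth = set(predicted_edges), set(true_edges)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Two of three predicted edges are correct and both true edges are found:
# precision = 2/3, recall = 1.
score = edge_f1(
    predicted_edges=[("B", "A"), ("C", "B"), ("C", "A")],
    true_edges=[("B", "A"), ("C", "B")],
)
```

F1 is the harmonic mean of precision and recall, so a graph that predicts many spurious edges is penalized even if it recovers every true one.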

Conclusions:

The proposed differentiable task graph learning approach outperforms prior methods.
This framework advances video understanding and procedural analysis, particularly in egocentric settings.
The method facilitates more robust and accurate identification of procedural steps and errors.