



OSCaR: Object State Captioning and State Change Representation.

Nguyen Nguyen¹, Jing Bi¹, Ali Vosoughi¹

  • ¹ University of Rochester

Findings of ACL. NAACL | December 3, 2025
Summary
This summary is machine-generated.

This study introduces the Object State Captioning and State Change Representation (OSCaR) dataset and benchmark, which evaluate how well multimodal large language models (MLLMs) understand object state changes in videos. The experiments find that current models show some ability but still fall short of a comprehensive understanding.


Area of Science:

  • Artificial Intelligence
  • Computer Vision
  • Natural Language Processing

Background:

  • Understanding object state changes in dynamic visual environments is key for AI, especially for human-AI interaction.
  • Traditional methods for object captioning and state change detection are limited in scope and expressiveness.
  • Existing language representations for object changes are often restricted to a small set of symbolic words.

Purpose of the Study:

  • To introduce a new dataset and benchmark, Object State Captioning and State Change Representation (OSCaR), for evaluating AI models.
  • To assess the capabilities of Multimodal Large Language Models (MLLMs) in comprehending object state changes in video.
  • To provide a resource for advancing research in multimodal understanding of dynamic environments.

Main Methods:

  • Developed the OSCaR dataset with 14,084 annotated video segments featuring nearly 1,000 unique objects from egocentric videos.
  • Established a benchmark for evaluating MLLMs on object state captioning and state change representation.
  • Conducted experiments using a fine-tuned model to assess current MLLM performance on the OSCaR benchmark.
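
To make the benchmark idea above concrete, the following is a minimal sketch of how an annotated segment and a caption-scoring loop might look. It is illustrative only: the `StateChangeExample` fields, the `token_f1` metric, and the `score_model` helper are all hypothetical names and assumptions, not the OSCaR annotation schema or the evaluation protocol used by the authors.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class StateChangeExample:
    """Hypothetical record for one annotated video segment (NOT the OSCaR schema)."""
    segment_id: str
    object_name: str
    caption_before: str  # free-text description of the object before the action
    caption_after: str   # free-text description of the object after the action


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between two captions; a simple stand-in metric (assumption)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def score_model(examples, predict):
    """Average token F1 of predict(example) against each reference after-state caption."""
    scores = [token_f1(predict(ex), ex.caption_after) for ex in examples]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Toy, made-up example; not taken from the dataset.
    example = StateChangeExample(
        "seg-0001", "egg", "a whole egg in its shell", "a cracked egg in a bowl"
    )
    print(score_model([example], lambda ex: "a cracked egg in a bowl"))
```

Here `predict` stands in for any model that maps a segment to a caption. Free-text captions are used rather than a fixed label set, in keeping with the Background point that existing representations are often restricted to a small set of symbolic words.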

Main Results:

  • Experiments revealed that current MLLMs demonstrate some ability but lack a comprehensive understanding of object state changes.
  • The fine-tuned model showed initial capabilities but requires significant enhancements in accuracy and generalization.
  • The OSCaR benchmark highlights the need for more robust AI models for real-world dynamic scene understanding.

Conclusions:

  • The OSCaR dataset and benchmark provide a critical resource for advancing AI's ability to interpret dynamic visual information.
  • Significant improvements are needed in MLLMs to accurately and reliably understand object state transitions.
  • Future research should focus on enhancing model accuracy and generalization for complex real-world scenarios.