


Lowis3D: Language-Driven Open-World Instance-Level 3D Scene Understanding.

Runyu Ding, Jihan Yang, Chuhui Xue

IEEE Transactions on Pattern Analysis and Machine Intelligence

June 6, 2024

Summary

This summary is machine-generated.

This study introduces a novel method for open-world 3D scene understanding, enabling models to recognize and locate previously unseen objects. The approach leverages vision-language models to generate scene captions, significantly improving 3D object recognition and localization accuracy.


Area of Science:

Computer Vision
Artificial Intelligence
3D Scene Understanding

Background:

Open-world instance-level scene understanding requires localizing and recognizing objects from categories not present in training data.
Existing 2D methods benefit from large image-text datasets, but 3D scenarios lack sufficient 3D-text pairs, hindering progress.
The scarcity of 3D-text data presents a significant challenge for training models capable of understanding novel 3D objects.

Purpose of the Study:

To develop a method for open-world instance-level 3D scene understanding that addresses the lack of 3D-text data.
To enable models to localize and semantically categorize novel 3D objects effectively.
To improve the generalization capabilities of 3D instance grouping for accurate novel object localization.
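
The summary does not spell out how a novel category is assigned at inference time. A common pattern in open-vocabulary recognition, shown here only as a minimal sketch, is to compare an instance embedding with text embeddings of candidate category names and pick the closest one. The function name `classify_open_vocab`, the toy three-dimensional embeddings, and the assumption that both embeddings live in a shared space are illustrative and not taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify_open_vocab(instance_embedding, category_embeddings):
    """Return the best-matching category name and all similarity scores.

    category_embeddings maps a category name to a text embedding
    (hypothetical: in practice these would come from a text encoder).
    New categories can be added to the mapping without retraining.
    """
    scores = {
        name: cosine(instance_embedding, embedding)
        for name, embedding in category_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy usage: "lamp" is added to the vocabulary only at query time.
vocabulary = {"chair": [1.0, 0.0, 0.0], "sofa": [0.0, 1.0, 0.0]}
vocabulary["lamp"] = [0.0, 0.0, 1.0]
label, _ = classify_open_vocab([0.0, 0.2, 0.9], vocabulary)
print(label)
```

Because the vocabulary is just a mapping from names to embeddings, extending it to unseen categories needs no change to the 3D model in this sketch.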

Main Methods:

Harnessing pre-trained vision-language (VL) foundation models to generate captions for multi-view 3D scene images.
Implementing hierarchical point-caption association to learn semantic-aware embeddings from 3D geometry and multi-view images.
Developing debiased instance localization using instance-level pseudo-supervision on unlabeled data for improved object grouping.
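
The point-caption association step can be pictured with a small sketch. It assumes that each view comes with the set of 3D point indices visible in it and one generated caption, and it pairs point sets with captions at two levels (whole scene and single view) before scoring them with an InfoNCE-style contrastive loss. The two-level split, the mean pooling, the loss form, and all function names are assumptions made for illustration; the summary above does not specify the paper's actual hierarchy or objective.

```python
import math


def associate(view_points, view_captions):
    """Build (level, point_index_set, caption) triples.

    view_points: dict mapping view id -> iterable of visible point indices.
    view_captions: dict mapping view id -> caption string for that view.
    """
    pairs = []
    # Scene level: all points seen in any view, paired with all captions joined.
    scene_points = set().union(*view_points.values())
    scene_caption = " ".join(view_captions[v] for v in sorted(view_captions))
    pairs.append(("scene", scene_points, scene_caption))
    # View level: points visible in one view, paired with that view's caption.
    for v in sorted(view_points):
        pairs.append((f"view:{v}", set(view_points[v]), view_captions[v]))
    return pairs


def pool(point_feats, indices):
    """Mean-pool the features of the associated points."""
    idx = sorted(indices)
    dim = len(point_feats[0])
    return [sum(point_feats[i][d] for i in idx) / len(idx) for d in range(dim)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def contrastive_loss(pooled, text, temperature=0.1):
    """InfoNCE-style loss: pooled[i] should match text[i] better than other captions."""
    total = 0.0
    for i, p in enumerate(pooled):
        logits = [cosine(p, t) / temperature for t in text]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(pooled)
```

In a real pipeline the caption embeddings would come from the text encoder of a vision-language model and the point features from a 3D backbone; here both are plain lists so the association logic can be read in isolation.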

Main Results:

Established explicit associations between 3D shapes and semantically rich captions by generating captions for 3D scenes.
Significantly enhanced fine-grained visual-semantic representation learning for object-level categorization.
Achieved substantial performance improvements across 3D semantic, instance, and panoptic segmentation tasks on multiple datasets, outperforming baseline methods.

Conclusions:

The proposed method compensates for the scarcity of 3D-text data by using VL models to caption multi-view images of 3D scenes.
The hierarchical association and debiased localization techniques significantly boost the performance of open-world 3D scene understanding.
The approach demonstrates strong generalization capabilities, paving the way for more comprehensive 3D perception systems.