



LLaVA-Pose: Keypoint-Integrated Instruction Tuning for Human Pose and Action Understanding.

Dewen Zhang¹, Tahir Hussain¹, Wangpeng An²

  • ¹Department of Informatics, Graduate School of Informatics and Engineering, The University of Electro-Communications, Tokyo 182-8585, Japan.

Sensors (Basel, Switzerland) | August 28, 2025
Summary
This summary is machine-generated.

This study introduces keypoint-integrated instruction-following data to improve how vision-language models (VLMs) understand human poses and actions. Fine-tuning LLaVA-1.5-7B on this dataset raises overall performance on the E-HPAUB benchmark by 33.2% over the baseline.

Keywords:
human pose and action understanding; instruction-following data; keypoint-integrated data generation; multimodal instruction tuning; vision–language models


Area of Science:

  • Computer Vision
  • Artificial Intelligence
  • Multimodal Learning

Background:

  • Current vision-language models (VLMs) excel at general visual tasks but struggle with complex human pose and action recognition.
  • This limitation stems from a lack of specialized instruction-following data for human-centric visual understanding.

Purpose of the Study:

  • To develop a method for generating specialized vision-language data integrating human keypoints with traditional visual features.
  • To create a comprehensive dataset for fine-tuning VLMs on human-centric tasks, including conversation, detailed description, and complex reasoning.
  • To establish a benchmark for evaluating model performance in human pose and action understanding.

Main Methods:

  • Integrated human keypoint data with existing visual features like captions and bounding boxes.
  • Constructed a dataset of 200,328 samples focused on human-centric tasks.
  • Established the Extended Human Pose and Action Understanding Benchmark (E-HPAUB).
  • Fine-tuned the LLaVA-1.5-7B model using the generated dataset to create the LLaVA-Pose model.
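As a rough illustration of the first step above, the sketch below merges captions, bounding boxes, and keypoints into a single text context of the kind that could be used to prompt a language model for instruction-following data. It is not the authors' pipeline. The COCO-style 17-keypoint layout, the flat (x, y, visibility) triplet format, and the names `format_keypoints` and `build_context` are all assumptions made for this example.

```python
from typing import List, Sequence

# COCO's 17 keypoint names in annotation order.
# ASSUMPTION: the summary does not say which keypoint scheme was used.
COCO_KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def format_keypoints(flat: Sequence[float]) -> str:
    """Turn a flat [x1, y1, v1, x2, y2, v2, ...] list into readable text.

    Keypoints with visibility 0 (not labeled) are skipped.
    """
    if len(flat) != 3 * len(COCO_KEYPOINT_NAMES):
        raise ValueError("expected 17 (x, y, visibility) triplets")
    parts = []
    for i, name in enumerate(COCO_KEYPOINT_NAMES):
        x, y, v = flat[3 * i: 3 * i + 3]
        if v > 0:
            parts.append(f"{name}: ({x:g}, {y:g})")
    return ", ".join(parts)


def build_context(captions: List[str], people: List[dict]) -> str:
    """Combine captions, bounding boxes, and keypoints into one text context.

    Each entry in `people` is a hypothetical dict with a "bbox" of
    [x, y, width, height] and a flat "keypoints" list.
    """
    lines = ["Captions:"] + [f"- {c}" for c in captions]
    for idx, person in enumerate(people, start=1):
        x, y, w, h = person["bbox"]
        lines.append(f"Person {idx} bbox: [{x:g}, {y:g}, {w:g}, {h:g}]")
        lines.append(
            f"Person {idx} keypoints: {format_keypoints(person['keypoints'])}"
        )
    return "\n".join(lines)
```

A context built this way gives a text-only model explicit joint positions in addition to the caption and box, which is the kind of pose detail that captions and bounding boxes alone do not carry.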

Main Results:

  • The LLaVA-Pose model demonstrated significant improvements on the E-HPAUB benchmark.
  • Achieved an overall performance increase of 33.2% compared to the baseline LLaVA-1.5-7B model.
  • Validated the effectiveness of keypoint-integrated data for enhancing human-centric visual understanding.
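The summary does not say how the 33.2% figure was computed. The minimal sketch below assumes it is a relative increase over the baseline score. The function name `relative_increase` and the two scores in the usage line are hypothetical and are not results from the study.

```python
def relative_increase(baseline: float, tuned: float) -> float:
    """Percentage gain of `tuned` over `baseline`, relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (tuned - baseline) / baseline * 100.0


# HYPOTHETICAL scores chosen only to illustrate the arithmetic.
gain = relative_increase(baseline=50.0, tuned=66.6)
print(f"{gain:.1f}%")
```

Under this reading, a fixed percentage corresponds to a larger absolute gain the higher the baseline score is, so the figure is best interpreted alongside the raw benchmark scores.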

Conclusions:

  • Keypoint-integrated data is crucial for advancing VLMs in understanding complex human poses and actions.
  • The proposed method and dataset effectively improve multimodal model capabilities for human-centric visual tasks.