Pairwise CNN-Transformer Features for Human-Object Interaction Detection.

Hutuo Quan1,2, Huicheng Lai1,2, Guxue Gao1,2

  • 1College of Computer Science and Technology, Xinjiang University, Urumqi 830017, China.

Entropy (Basel, Switzerland) | March 28, 2024
Summary
This summary is machine-generated.

This study introduces the Pairwise CNN-Transformer (PCT), a two-stage model for human-object interaction (HOI) detection. PCT fuses convolutional neural network (CNN) features with Transformer features and achieves competitive results on the HICO-DET and V-COCO benchmark datasets.

Keywords:
computer vision; convolutional neural network; feature fusion; human–object interaction detection; transformer


Area of Science:

  • Computer Vision
  • Artificial Intelligence
  • Machine Learning

Background:

  • Human-object interaction (HOI) detection is crucial for computers to understand scene semantics.
  • Existing two-stage HOI methods benefit from high-quality object-detector features but lack global context.
  • One-stage Transformer-based methods capture global context but forgo the benefits of a dedicated object detector.
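
To make the two-stage idea concrete, the sketch below shows the pairing step that such pipelines generally rely on: an object detector runs first, and every detected human is then paired with every other detected instance as a candidate interaction. This is a minimal illustration, not the authors' code; the `Detection` class, the `enumerate_pairs` function, the `"person"` label, and the 0.5 score threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    """One output of a (hypothetical) first-stage object detector."""
    label: str                               # category name, e.g. "person"
    box: Tuple[float, float, float, float]   # (x1, y1, x2, y2) in pixels
    score: float                             # detector confidence


def enumerate_pairs(
    detections: List[Detection], min_score: float = 0.5
) -> List[Tuple[Detection, Detection]]:
    """Build candidate (human, object) pairs from detector outputs.

    The threshold and the "person" label are illustrative assumptions.
    """
    kept = [d for d in detections if d.score >= min_score]
    humans = [d for d in kept if d.label == "person"]
    # Pair each human with every other kept instance (never with itself).
    return [(h, o) for h, o in product(humans, kept) if h is not o]


detections = [
    Detection("person", (10, 20, 110, 220), 0.95),
    Detection("bicycle", (60, 120, 200, 260), 0.88),
    Detection("dog", (300, 180, 380, 240), 0.30),  # dropped: low confidence
]
pairs = enumerate_pairs(detections)
for human, obj in pairs:
    print(human.label, "->", obj.label)
```

Each candidate pair would then be passed to a second stage that classifies the interaction, which is where pairwise features and contextual cues come into play.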

Purpose of the Study:

  • To propose a novel two-stage method that integrates the strengths of both CNN and Transformer approaches for HOI detection.
  • To enhance human-object pair representations by fusing CNN and Transformer features.
  • To improve the contextual understanding in HOI detection models.

Main Methods:

  • The Pairwise Convolutional neural network (CNN)-Transformer (PCT) model is proposed as a two-stage approach.
  • Pairwise CNN features are extracted from a CNN backbone and fused with pairwise Transformer features.
  • Global features from the Transformer are utilized to provide contextual cues.
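
The fusion described above can be sketched roughly as follows. The summary does not say which fusion operator PCT uses or how the global features are injected, so this example assumes the simplest option: concatenating the pairwise CNN features, the pairwise Transformer features, and a global context vector, then applying one linear layer with a ReLU. All names (`fuse_pair`, `random_layer`, the feature dimensions) are hypothetical, and the weights are random placeholders rather than trained parameters.

```python
import random
from typing import List, Tuple

Vector = List[float]
Layer = Tuple[List[Vector], Vector]  # (weight rows, bias)


def concat(*vectors: Vector) -> Vector:
    """Concatenate feature vectors end to end."""
    out: Vector = []
    for v in vectors:
        out.extend(v)
    return out


def linear(x: Vector, layer: Layer) -> Vector:
    """Apply y = W x + b, with W stored as a list of rows."""
    weight, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def random_layer(in_dim: int, out_dim: int, rng: random.Random) -> Layer:
    """Placeholder weights; a real model would learn these."""
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
              for _ in range(out_dim)]
    return weight, [0.0] * out_dim


def fuse_pair(cnn_human: Vector, cnn_object: Vector,
              trans_human: Vector, trans_object: Vector,
              global_ctx: Vector, layer: Layer) -> Vector:
    """Fuse CNN and Transformer features for one human-object pair.

    Assumption: fusion is concatenation followed by a linear projection.
    """
    pair_cnn = concat(cnn_human, cnn_object)        # pairwise CNN feature
    pair_trans = concat(trans_human, trans_object)  # pairwise Transformer feature
    fused_in = concat(pair_cnn, pair_trans, global_ctx)
    return relu(linear(fused_in, layer))


rng = random.Random(0)
dim, out_dim = 4, 8
feats = [[rng.random() for _ in range(dim)] for _ in range(5)]
layer = random_layer(in_dim=5 * dim, out_dim=out_dim, rng=rng)
fused = fuse_pair(*feats, layer=layer)
print(len(fused))
```

In practice the fused vector would feed an interaction classifier. Concatenation is chosen here only because it keeps both feature types intact before projection; other operators such as summation or attention would fit the same interface.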

Main Results:

  • Fusing CNN and Transformer features yields pairwise representations that outperform either feature type alone.
  • Experimental comparisons demonstrate that CNN features retain a significant advantage in HOI detection.
  • The PCT model achieves competitive performance compared to state-of-the-art methods on HICO-DET and V-COCO datasets.

Conclusions:

  • The proposed PCT model effectively combines object detection capabilities with rich contextual information for HOI detection.
  • Integrating CNN and Transformer features offers a synergistic advantage for HOI detection tasks.
  • The study highlights the continued relevance of CNN features in advanced HOI detection architectures.