Updated: Jun 27, 2026

An efficient hybrid CNN-transformer framework for real-time weapon detection and face recognition.

P Shanthi¹, V Manjula¹

¹School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India.

Frontiers in Artificial Intelligence

June 26, 2026

Summary

This summary is machine-generated.

This study introduces ConViDeTR, a hybrid deep learning framework for smart surveillance. It achieves high accuracy in real-time weapon detection and face recognition, outperforming existing methods.

Area of Science:

Computer Vision
Artificial Intelligence
Deep Learning

Background:

Smart surveillance demands accurate, real-time weapon detection and face recognition.
Existing Convolutional Neural Network (CNN) or Vision Transformer (ViT) methods struggle with feature extraction and complex conditions.
Robustness against occlusion, illumination changes, and complex backgrounds is crucial.

Purpose of the Study:

To present ConViDeTR, a novel hybrid deep learning framework.
To enable synchronous weapon detection and face recognition within a unified system.
To enhance accuracy and robustness in intelligent surveillance.

Keywords:

convolutional neural network; detection transformer; face recognition; vision transformer; weapon detection

Main Methods:

Integration of CNN, Vision Transformer (ViT), and Detection Transformer (DETR) architectures.
Introduction of a deep feature fusion layer for integrating diverse feature types.
Development of a shared feature space for synchronous task execution.
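
The summary does not say how the fusion layer or the shared feature space is built. As a rough illustration only, feature-level fusion is often done by concatenating the descriptors from each branch and projecting them into one vector that several task heads read from. The sketch below assumes that design. Every name and dimension in it (`fuse`, `init_layer`, `CNN_DIM`, `VIT_DIM`, `SHARED_DIM`, the class counts) is hypothetical, the weights are random placeholders rather than trained parameters, and the DETR detection stage is not modeled.

```python
import random


def linear(vec, weights, bias):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def relu(vec):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in vec]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (placeholders, not trained values)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse(cnn_feat, vit_feat, fusion_layer):
    """Assumed fusion: concatenate both descriptors, then project and apply ReLU."""
    weights, bias = fusion_layer
    return relu(linear(cnn_feat + vit_feat, weights, bias))


rng = random.Random(0)
CNN_DIM, VIT_DIM, SHARED_DIM = 8, 6, 4    # hypothetical feature sizes
N_WEAPON_CLASSES, N_IDENTITIES = 3, 5     # hypothetical label counts

fusion_layer = init_layer(rng, SHARED_DIM, CNN_DIM + VIT_DIM)
weapon_head = init_layer(rng, N_WEAPON_CLASSES, SHARED_DIM)
face_head = init_layer(rng, N_IDENTITIES, SHARED_DIM)

# Stand-ins for the descriptors a CNN branch and a ViT branch would produce.
cnn_feat = [rng.random() for _ in range(CNN_DIM)]
vit_feat = [rng.random() for _ in range(VIT_DIM)]

# Both heads read the same fused vector, so the two tasks share one feature space.
shared = fuse(cnn_feat, vit_feat, fusion_layer)
weapon_logits = linear(shared, *weapon_head)
face_logits = linear(shared, *face_head)
print(len(shared), len(weapon_logits), len(face_logits))  # prints: 4 3 5
```

Computing the fused vector once and reusing it for both heads is one way a single forward pass could serve weapon detection and face recognition together. The published architecture may differ.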

Main Results:

Achieved 98.9% accuracy in weapon detection and 97.34% in face recognition.
Demonstrated superior performance compared to existing techniques on benchmark datasets.
Real-time processing capability with 25-30 FPS and low latency.
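
The summary gives no latency figure, but the reported frame rate implies a per-frame time budget. The short calculation below converts 25-30 FPS into milliseconds per frame, assuming frames are processed one at a time with no batching or pipelining. With pipelining, end-to-end latency can exceed this budget. The function name `frame_budget_ms` is illustrative.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given throughput."""
    return 1000.0 / fps


for fps in (25, 30):
    print(f"{fps} FPS -> {frame_budget_ms(fps):.1f} ms per frame")
# prints:
# 25 FPS -> 40.0 ms per frame
# 30 FPS -> 33.3 ms per frame
```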

Conclusions:

ConViDeTR offers an effective, robust, and scalable solution for intelligent surveillance.
The hybrid approach overcomes limitations of standalone CNN and ViT models.
The framework supports next-generation smart surveillance systems.