Evaluating Features and Variations in Deepfake Videos Using the CoAtNet Model.

Eman Alattas¹,², John Clark², Arwa Al-Aama³

¹Computer Science Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia.

Journal of Imaging

June 25, 2025

Summary

This summary is machine-generated.

The CoAtNet model shows strong deepfake video detection capabilities, excelling in both intra-dataset and cross-dataset evaluations. This hybrid convolution-transformer architecture demonstrates superior generalization for identifying manipulated videos.

Area of Science:

Artificial Intelligence
Computer Vision
Digital Security

Background:

Deepfake video detection is crucial for combating misinformation and enhancing digital security.
The generalization ability of advanced AI models across diverse datasets is not fully understood.
CoAtNet, a hybrid convolution-transformer architecture, has shown promise in computer vision tasks.
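
As a rough illustration of what "hybrid convolution-transformer" means, the sketch below chains a convolution step (local mixing of neighboring tokens) with a self-attention step (global mixing across all tokens). It is a toy under stated assumptions, not the architecture evaluated in the study: the tokens are a 1D sequence of small feature vectors, the attention uses identity projections, and the function names `depthwise_conv1d`, `self_attention`, and `hybrid_block` are hypothetical. The published CoAtNet design uses MBConv blocks and relative self-attention on 2D feature maps, neither of which is reproduced here.

```python
import math

def depthwise_conv1d(tokens, kernel):
    """Convolve each channel independently along the token axis (zero padding)."""
    n, d = len(tokens), len(tokens[0])
    half = len(kernel) // 2
    out = []
    for i in range(n):
        row = []
        for c in range(d):
            acc = 0.0
            for k, w in enumerate(kernel):
                j = i + k - half
                if 0 <= j < n:  # positions outside the sequence count as zero
                    acc += w * tokens[j][c]
            row.append(acc)
        out.append(row)
    return out

def self_attention(tokens):
    """Single-head scaled dot-product attention with identity projections (assumption)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        m = max(logits)  # subtract the max for a numerically stable softmax
        weights = [math.exp(l - m) for l in logits]
        total = sum(weights)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens)) / total
                    for c in range(d)])
    return out

def add(a, b):
    """Element-wise sum of two token sequences (residual connection)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def hybrid_block(tokens, kernel):
    """Local mixing by convolution, then global mixing by attention."""
    local = add(tokens, depthwise_conv1d(tokens, kernel))
    return add(local, self_attention(local))

# Hypothetical input: four tokens with two channels each.
frame_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
features = hybrid_block(frame_tokens, [0.25, 0.5, 0.25])
```

The convolution only sees a token's immediate neighbors, while the attention step lets every token weigh every other token, which is the complementary behavior a hybrid design aims to combine.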

Purpose of the Study:

To evaluate the generalization capabilities of the CoAtNet model for deepfake video detection across various datasets.
To explore CoAtNet's performance in cross-dataset scenarios, identifying key features and variations in deepfake videos.
To benchmark CoAtNet against state-of-the-art models in both intra-dataset and cross-dataset deepfake detection.

Main Methods:

Extensive experiments were conducted using the CoAtNet model.
The model was trained with diverse input and processing configurations.
Performance was evaluated on recognized public deepfake datasets, including Celeb-DF and DFDC.

Keywords:

CoAtNet
Generative Adversarial Networks (GANs)
computer vision (CV)
deepfake
digital multimedia forensics

Main Results:

CoAtNet achieved superior intra-dataset performance with an Area Under the Curve (AUC) ranging from 81.4% to 99.9%.
The model demonstrated strong cross-dataset generalization, achieving an AUC of 78%.
CoAtNet exhibited the best AUC for both intra-dataset and cross-dataset deepfake detection, particularly on Celeb-DF.
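
For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen fake video receives a higher score than a randomly chosen real one. The sketch below computes it by pairwise comparison and applies it to one intra-dataset and one cross-dataset case. The labels and scores are made-up illustrative values, not data from the study, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (fake, real) pairs ranked correctly; ties count half."""
    fake = [s for y, s in zip(labels, scores) if y == 1]
    real = [s for y, s in zip(labels, scores) if y == 0]
    if not fake or not real:
        raise ValueError("need at least one fake and one real sample")
    wins = 0.0
    for f in fake:
        for r in real:
            if f > r:
                wins += 1.0
            elif f == r:
                wins += 0.5
    return wins / (len(fake) * len(real))

# Illustrative values only: 1 = fake, 0 = real; scores are model outputs.
# Intra-dataset: test videos come from the same dataset used for training.
intra_labels = [1, 1, 1, 0, 0, 0]
intra_scores = [0.95, 0.90, 0.80, 0.30, 0.20, 0.10]

# Cross-dataset: test videos come from a dataset the model never saw.
cross_labels = [1, 1, 1, 0, 0, 0]
cross_scores = [0.90, 0.60, 0.40, 0.70, 0.50, 0.20]

intra_auc = roc_auc(intra_labels, intra_scores)
cross_auc = roc_auc(cross_labels, cross_scores)
print(f"intra-dataset AUC: {intra_auc:.3f}")
print(f"cross-dataset AUC: {cross_auc:.3f}")
```

In this toy case the cross-dataset scores overlap more between the two classes, so fewer pairs are ranked correctly and the AUC drops, which is the kind of gap that intra-dataset and cross-dataset comparisons are meant to expose.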

Conclusions:

CoAtNet exhibits excellent generalization capabilities for deepfake video detection.
The model's hybrid architecture effectively identifies deepfakes across different datasets.
CoAtNet represents a significant advancement in robust deepfake detection technology.