Lightweight malicious URL detection using deep learning and large language models.

Hareem Kibriya¹, Rashid Amin², Sultan S Alshamrani³

¹Department of Computer Science, Air University, Islamabad, Pakistan.

Scientific Reports | December 2, 2025

Summary

This summary is machine-generated.

This study introduces a deep learning framework that uses Large Language Model embeddings to detect malicious URLs, reaching 97.5% accuracy. The system classifies threats such as defacement, malware, and phishing efficiently, and uses LIME explanations to make its decisions more transparent.

Area of Science:

Cybersecurity
Artificial Intelligence
Machine Learning

Background:

The proliferation of malicious websites poses significant cybersecurity risks, including data compromise and identity theft.
Existing Machine Learning (ML) methods for detecting malicious URLs often rely on manual feature engineering and struggle with evolving threats.
There is a critical need for automated, adaptive solutions to identify and mitigate online threats effectively.

Purpose of the Study:

To develop a fully automated deep learning (DL) framework for detecting malicious Uniform Resource Locators (URLs).
To leverage Large Language Models (LLMs) for generating URL embeddings without manual feature engineering.
To classify URLs into malicious (defacement, malware, phishing) and benign categories with high accuracy and efficiency.
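
The four-way classification goal can be illustrated with a minimal sketch, assuming the classifier's final layer emits one raw score (logit) per class. The function names and the example logits are hypothetical and are not taken from the paper.

```python
import math

# Class names follow the four categories listed in the summary.
CLASSES = ["benign", "defacement", "malware", "phishing"]

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_label(logits):
    """Return the most probable class and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]

# Hypothetical logits, as if produced by a classifier's output layer.
label, prob = predict_label([0.2, -1.0, 0.5, 3.1])
```

Here the largest logit sits in the last position, so the sketch selects "phishing" for this made-up input.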

Main Methods:

Utilized Large Language Models (LLMs) to create high-quality URL embeddings, capturing intricate patterns and token relationships.
Employed a customized deep learning model incorporating Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) layers for dependency analysis.
Integrated Bidirectional Encoder Representations from Transformers (BERT) with the DL model and used eXplainable AI (XAI) techniques such as Local Interpretable Model-Agnostic Explanations (LIME) for transparency.
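
The recurrent layers named above consume a sequence of embeddings one step at a time. The following is a minimal pure-Python sketch of a GRU update, assuming the common formulation in which the reset gate scales the recurrent term of the candidate state. The weight names, layer sizes, random initialisation, and toy embeddings are hypothetical and do not reflect the paper's actual architecture.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def gru_step(x, h, p):
    """One GRU update: x is the input embedding, h the previous hidden state."""
    # Update gate: how much of the previous state to keep.
    z = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"])]
    # Reset gate: how much of the previous state feeds the candidate.
    r = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"])]
    # Candidate state, with the recurrent term scaled by the reset gate.
    n = [math.tanh(a + ri * b + c)
         for a, ri, b, c in zip(matvec(p["Wn"], x), r, matvec(p["Un"], h), p["bn"])]
    # Blend the candidate with the previous state.
    return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

def random_params(input_size, hidden_size, seed=0):
    """Hypothetical untrained weights, for illustration only."""
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    p = {}
    for gate in "zrn":
        p["W" + gate] = mat(hidden_size, input_size)
        p["U" + gate] = mat(hidden_size, hidden_size)
        p["b" + gate] = [0.0] * hidden_size
    return p

def encode_sequence(embeddings, hidden_size, params):
    """Run the GRU over a token-embedding sequence; return the final state."""
    h = [0.0] * hidden_size
    for x in embeddings:
        h = gru_step(x, h, params)
    return h

# Toy 3-dimensional "embeddings" standing in for LLM-generated URL embeddings.
params = random_params(input_size=3, hidden_size=4)
toy_embeddings = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1], [0.7, 0.1, 0.2]]
final_state = encode_sequence(toy_embeddings, 4, params)
```

The final hidden state summarises the whole sequence and would typically be passed to a classification layer. An LSTM follows the same step-by-step pattern with an additional cell state and a third gate.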

Main Results:

Achieved the highest accuracy, 97.5%, using the BERT + DL model for malicious URL detection.
The BERT + DL model was also highly efficient, classifying samples in 0.119 ms with only 0.5 million parameters.
Local Interpretable Model-Agnostic Explanations (LIME) provided transparency into the model's decision-making process.
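
To put the efficiency figures in context, the sketch below converts the reported latency into a throughput and counts parameters for a small recurrent stack. It assumes that 0.119 ms is a per-sample latency, and it uses the single-bias textbook formulas for LSTM and GRU layers (some frameworks store extra bias vectors, so their counts differ slightly). The layer sizes are hypothetical and are not taken from the paper.

```python
def throughput_per_second(latency_ms):
    """Samples per second, assuming latency_ms is the time per sample."""
    return 1000.0 / latency_ms

def lstm_params(input_size, hidden_size):
    # Four weight blocks (three gates plus the candidate), each with
    # input weights, recurrent weights, and one bias vector.
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)

def gru_params(input_size, hidden_size):
    # Three weight blocks (two gates plus the candidate).
    return 3 * (hidden_size * (input_size + hidden_size) + hidden_size)

def dense_params(input_size, output_size):
    return input_size * output_size + output_size

# Reported latency from the summary.
samples_per_second = throughput_per_second(0.119)

# Hypothetical stack: 768-dim embeddings -> LSTM(64) -> GRU(64) -> Dense(4).
total_params = lstm_params(768, 64) + gru_params(64, 64) + dense_params(64, 4)
```

Under these assumptions the latency corresponds to roughly 8,400 samples per second, and the hypothetical stack has 238,276 trainable parameters, which shows how a recurrent classifier of this kind can stay below 0.5 million parameters.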

Conclusions:

The proposed deep learning framework effectively detects malicious URLs with state-of-the-art accuracy and efficiency.
The use of LLMs and BERT significantly reduces the need for manual feature engineering, improving adaptability.
The integration of XAI enhances model trustworthiness and reliability for real-time applications in cybersecurity.
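
The kind of transparency referred to above can be illustrated with a leave-one-token-out (occlusion) attribution. This is a simpler relative of LIME, not the LIME algorithm itself, which instead fits a weighted linear surrogate over many random perturbations. The tokenizer, the token weights, and the scoring function below are hypothetical stand-ins for a trained model.

```python
import re

# Hypothetical token weights standing in for a trained classifier.
TOY_WEIGHTS = {"login": 1.5, "verify": 1.2, "secure": 0.4, "exe": 2.0}

def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", url.lower()) if t]

def toy_score(tokens):
    """Stand-in 'maliciousness' score: sum of the weights of known tokens."""
    return sum(TOY_WEIGHTS.get(t, 0.0) for t in tokens)

def occlusion_attribution(url, score_fn=toy_score):
    """Rank tokens by how much the score drops when each one is removed."""
    tokens = tokenize(url)
    base = score_fn(tokens)
    contributions = []
    for i, tok in enumerate(tokens):
        reduced = tokens[:i] + tokens[i + 1:]
        contributions.append((tok, base - score_fn(reduced)))
    # Largest score drop first; ties keep their original order.
    return sorted(contributions, key=lambda c: c[1], reverse=True)

ranking = occlusion_attribution("http://secure-login.example.com/verify/account")
```

For this made-up URL the ranking starts with "login", "verify", and "secure", which is the sort of per-token evidence an analyst could inspect before trusting a prediction.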