Parameter-Efficient Adaptation of Large Vision-Language Models for Video Memorability Prediction.

Iván Martín-Fernández¹, Sergio Esteban-Romero¹, Fernando Fernández-Martínez¹

¹Grupo de Tecnología del Habla y Aprendizaje Automático (THAU Group), Information Processing and Telecommunications Center, E.T.S.I. de Telecomunicación, Universidad Politécnica de Madrid (UPM), 28040 Madrid, Spain.

Sensors (Basel, Switzerland)

April 28, 2025

Summary

This summary is machine-generated.

This study adapts Large Vision-Language Models (LVLMs) to video memorability prediction using Quantized Low-Rank Adaptation (QLoRA). The fine-tuned Qwen-VL model achieved a state-of-the-art Spearman Rank Correlation Coefficient (SRCC) of 0.744 on the Memento10k dataset, a result relevant to media retrieval, classification, and generation.

Keywords:

efficient adaptation; large visual language models; multimedia perception; video memorability

Area of Science:

Artificial Intelligence
Computer Vision
Multimedia Analysis

Background:

Accurate video memorability modeling is crucial for efficient media retrieval, classification, and generation.
A video's visual semantics correlate strongly with its memorability, so accurate prediction requires advanced visual comprehension.
Large Vision-Language Models (LVLMs) excel at high-level semantic understanding due to extensive multimodal pre-training.

Purpose of the Study:

To leverage LVLMs for video memorability prediction.
To explore efficient adaptation techniques for LVLMs in memorability modeling.
To investigate the impact of LoRA hyperparameters on memorability prediction performance.
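
To make the two LoRA hyperparameters concrete, the sketch below follows the standard LoRA formulation, in which a frozen weight matrix W0 is augmented by a low-rank product B·A scaled by alpha / rank. It is a minimal pure-Python illustration, not the authors' code: the function names and matrix sizes are hypothetical, and the 4-bit quantization of the frozen weights that distinguishes QLoRA is deliberately left out.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_effective_weight(w0, b, a, alpha):
    """Return W0 + (alpha / rank) * B @ A.

    w0: frozen weights, shape d_out x d_in
    b:  trainable factor, shape d_out x rank (usually initialised to zeros)
    a:  trainable factor, shape rank x d_in
    """
    rank = len(a)
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w0[i][j] + scale * delta[i][j] for j in range(len(w0[0]))]
        for i in range(len(w0))
    ]


def trainable_fraction(d_out, d_in, rank):
    """Share of parameters that LoRA trains for one d_out x d_in layer."""
    return rank * (d_out + d_in) / (d_out * d_in)


if __name__ == "__main__":
    w0 = [[1.0, 0.0], [0.0, 1.0]]   # illustrative 2x2 frozen layer
    b = [[1.0], [2.0]]              # d_out x rank, with rank = 1
    a = [[3.0, 4.0]]                # rank x d_in
    print(lora_effective_weight(w0, b, a, alpha=2.0))
    print(trainable_fraction(4096, 4096, rank=8))
```

Rank controls how many directions the update can span (and therefore the number of trainable parameters), while alpha rescales the update relative to the frozen weights, which is why the two are typically tuned together.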

Main Methods:

Fine-tuning the Qwen-VL model using the Quantized Low-Rank Adaptation (QLoRA) technique.
Utilizing memorability-related data from the Memento10k dataset for adaptation.
Transforming Qwen-VL into a memorability score regressor.
Optimizing LoRA hyperparameters (rank and alpha) via 5-Fold Cross-Validation.
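
The hyperparameter search in the last step can be sketched as a grid search with 5-fold cross-validation. The code below is an illustrative outline only: the candidate values for rank and alpha, the function names, and the `train_and_score` callback (which stands in for fine-tuning on the training folds and returning a validation score such as SRCC) are all assumptions, since the summary does not state them.

```python
import random
from itertools import product
from statistics import mean


def k_fold_indices(n_items, k=5, seed=0):
    """Shuffle range(n_items) and split it into k near-equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_items, ranks, alphas, train_and_score, k=5, seed=0):
    """Return {(rank, alpha): mean validation score across k folds}."""
    folds = k_fold_indices(n_items, k, seed)
    results = {}
    for rank, alpha in product(ranks, alphas):
        scores = []
        for held_out, val_idx in enumerate(folds):
            # Train on every fold except the one held out for validation.
            train_idx = [
                j for f, fold in enumerate(folds) if f != held_out for j in fold
            ]
            scores.append(train_and_score(train_idx, val_idx, rank, alpha))
        results[(rank, alpha)] = mean(scores)
    return results


def select_best(results):
    """Pick the (rank, alpha) pair with the highest mean score."""
    return max(results, key=results.get)


if __name__ == "__main__":
    # Placeholder scorer: a real one would fine-tune the model and evaluate it.
    def toy_score(train_idx, val_idx, rank, alpha):
        return -(abs(rank - 16) + abs(alpha - 32))

    grid = cross_validate(100, [8, 16, 32], [16, 32, 64], toy_score)
    print(select_best(grid))
```

Averaging over folds keeps the choice of rank and alpha from depending on a single lucky train/validation split, at the cost of k training runs per configuration.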

Main Results:

Achieved a state-of-the-art Spearman Rank Correlation Coefficient (SRCC) of 0.744 on the Memento10k dataset.
Demonstrated the effectiveness of QLoRA for adapting LVLMs to memorability prediction.
Identified optimal LoRA hyperparameters for improved performance.
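
For readers unfamiliar with the reported metric, SRCC is the Pearson correlation computed on ranks rather than raw values, so it rewards a model for ordering videos correctly even when its scores are miscalibrated. The following pure-Python sketch shows that computation with average ranks for ties; the function names and sample scores are illustrative and are not taken from the paper.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman_rank_correlation(predicted, target):
    """SRCC between predicted and ground-truth scores."""
    if len(predicted) != len(target):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(predicted), average_ranks(target))


if __name__ == "__main__":
    predicted = [0.90, 0.70, 0.80, 0.60]   # made-up model outputs
    target = [0.95, 0.60, 0.85, 0.70]      # made-up ground truth
    print(spearman_rank_correlation(predicted, target))
```

A value of 1 means the predicted ordering matches the ground truth exactly, 0 means no monotonic relationship, and -1 means the ordering is fully reversed.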

Conclusions:

Adapting an LVLM with QLoRA advances video memorability modeling, reaching state-of-the-art SRCC on Memento10k.
The proposed methodology offers a robust approach for predicting video memorability.
High-level semantic understanding is key to accurate video memorability prediction.