



Parameter-Efficient Adaptation of Large Vision-Language Models for Video Memorability Prediction.

Iván Martín-Fernández¹, Sergio Esteban-Romero¹, Fernando Fernández-Martínez¹

  • ¹ Grupo de Tecnología del Habla y Aprendizaje Automático (THAU Group), Information Processing and Telecommunications Center, E.T.S.I. de Telecomunicación, Universidad Politécnica de Madrid (UPM), 28040 Madrid, Spain.

Sensors (Basel, Switzerland) | April 28, 2025
Summary
This summary is machine-generated.

This study enhances video memorability prediction by adapting Large Vision-Language Models (LVLMs) using Quantized Low-Rank Adaptation (QLoRA). The fine-tuned Qwen-VL model achieved state-of-the-art results, improving media analysis and generation.

Keywords:
efficient adaptation, large visual language models, multimedia perception, video memorability


Area of Science:

  • Artificial Intelligence
  • Computer Vision
  • Multimedia Analysis

Background:

  • Accurate video memorability modeling is crucial for efficient media retrieval, classification, and generation.
  • A video's visual semantics correlate strongly with its memorability, so accurate prediction requires advanced visual comprehension.
  • Large Vision-Language Models (LVLMs) excel at high-level semantic understanding due to extensive multimodal pre-training.

Purpose of the Study:

  • To leverage LVLMs for video memorability prediction.
  • To explore efficient adaptation techniques for LVLMs in memorability modeling.
  • To investigate the impact of LoRA hyperparameters on memorability prediction performance.
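The two LoRA hyperparameters named above, rank and alpha, enter the standard LoRA formulation as follows: a frozen weight matrix W is augmented with a low-rank product B·A scaled by alpha / rank. The sketch below illustrates that general formulation in plain Python; it is not the authors' code, and the function names and toy matrices are assumptions made for illustration.

```python
# Illustrative sketch of the generic LoRA update, not the paper's implementation.
# All names and values here are hypothetical.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, b, a, alpha):
    """Return W + (alpha / r) * (B @ A), with r taken from the shape of A.

    W has shape (d_out, d_in), B has shape (d_out, r), A has shape (r, d_in).
    """
    rank = len(a)
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, rank):
    """Count the adapter values (B and A) trained for one weight matrix."""
    return rank * (d_out + d_in)
```

Raising the rank increases the adapter's capacity and its trainable parameter count, while alpha controls how strongly the learned update is weighted relative to the frozen weights, which is why the two are usually tuned together.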

Main Methods:

  • Fine-tuning the Qwen-VL model using the Quantized Low-Rank Adaptation (QLoRA) technique.
  • Utilizing memorability-related data from the Memento10k dataset for adaptation.
  • Transforming Qwen-VL into a memorability score regressor.
  • Optimizing LoRA hyperparameters (rank and alpha) via 5-Fold Cross-Validation.
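The hyperparameter search in the last step can be pictured as a grid over (rank, alpha) pairs scored by 5-fold cross-validation. The following is a minimal sketch of that loop under stated assumptions: the grid values are placeholders, and `train_and_score` is a hypothetical callback that would fine-tune the model on the training indices and return a validation score such as SRCC. The source does not describe the authors' actual search procedure in this detail.

```python
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Shuffle indices 0..n_items-1 and split them into k nearly equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_items, grid, train_and_score, k=5, seed=0):
    """Score every config in the grid by its mean validation score over k folds.

    train_and_score(config, train_idx, val_idx) is a hypothetical callback.
    Returns (best_config, mean_scores).
    """
    folds = k_fold_indices(n_items, k, seed)
    mean_scores = {}
    for config in grid:
        scores = []
        for i, val_idx in enumerate(folds):
            # Every fold except the i-th is used for training.
            train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
            scores.append(train_and_score(config, train_idx, val_idx))
        mean_scores[config] = sum(scores) / len(scores)
    best = max(mean_scores, key=mean_scores.get)
    return best, mean_scores


# Placeholder grid of (rank, alpha) pairs; these values are assumptions.
GRID = [(r, a) for r in (8, 16) for a in (16, 32)]
```

Averaging over folds reduces the chance of picking a configuration that only looks good on one particular validation split.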

Main Results:

  • Achieved a state-of-the-art Spearman Rank Correlation Coefficient (SRCC) of 0.744 on the Memento10k dataset.
  • Demonstrated the effectiveness of QLoRA for adapting LVLMs to memorability prediction.
  • Identified optimal LoRA hyperparameters for improved performance.
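For readers unfamiliar with the metric, the Spearman Rank Correlation Coefficient is the Pearson correlation computed on the ranks of the predicted and ground-truth scores, so it rewards getting the ordering of videos right rather than the exact values. The sketch below is a plain-Python illustration of the standard definition with average ranks for ties; the function names are assumptions, and it is not the evaluation code used in the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(predicted, truth):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(predicted), average_ranks(truth))
```

A value of 1 means the predicted scores order the videos exactly as the ground truth does, and a value of -1 means the order is fully reversed.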

Conclusions:

  • Adapting an LVLM with QLoRA advances video memorability modeling, reaching a state-of-the-art SRCC on Memento10k.
  • The proposed methodology offers a robust approach for predicting video memorability.
  • High-level semantic understanding is key to accurate video memorability prediction.