MS2-CL: Multi-Scale Self-Supervised Learning for Camera to LiDAR Cross-Modal Place Recognition.

Wen Liu¹, Lei Ma¹, Xuanshun Zhuang¹

¹School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China.

Sensors (Basel, Switzerland)

March 14, 2026

Summary

This summary is machine-generated.

This study introduces MS2-CL, a multi-scale self-supervised method for camera-to-LiDAR cross-modal place recognition, which lets robots and autonomous vehicles localize a camera image within a 3D point cloud map. The approach learns a unified embedding space that narrows the domain gap between the two modalities, reaching state-of-the-art performance on KITTI and KITTI-360 and improving generalization.

Area of Science:

Robotics and Autonomous Systems
Computer Vision
Machine Learning

Background:

Place recognition is crucial for autonomous navigation, but cross-modal localization (e.g., visual to 3D point clouds) faces significant challenges.
Existing methods struggle with domain gaps, computational costs, and learning viewpoint/scale-invariant features.

Purpose of the Study:

To develop a robust cross-modal place recognition framework that addresses the limitations of current approaches.
To enable accurate visual localization within large-scale 3D point cloud maps.

Keywords:

Swin Transformer, autonomous driving, cross-modal place recognition, self-supervised learning

Main Methods:

Formulated cross-modal recognition as learning a scale-invariant, unified embedding space.
Employed a hierarchical Swin Transformer for multi-scale feature extraction from unified 2D representations.
Utilized a multi-scale self-distillation paradigm for intra-modal knowledge transfer.
Achieved inter-modal alignment using a global contrastive loss on 'teacher' embeddings.
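
The global contrastive loss listed under Main Methods can be illustrated with a generic symmetric InfoNCE objective. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the temperature value of 0.07, and the use of cosine similarity are illustrative choices that the summary does not specify. It assumes row i of each batch describes the same place and that no embedding is the zero vector.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(image_embs, lidar_embs, temperature=0.07):
    """Symmetric InfoNCE loss over paired embeddings (hypothetical helper).

    image_embs[i] and lidar_embs[i] are assumed to be the global 'teacher'
    embeddings of the same place; every other pairing is a negative.
    """
    img = [l2_normalize(v) for v in image_embs]
    pcd = [l2_normalize(v) for v in lidar_embs]
    n = len(img)

    # Cosine similarities between every image and every point cloud,
    # divided by the temperature (an assumed hyperparameter).
    logits = [
        [sum(a * b for a, b in zip(img[i], pcd[j])) / temperature
         for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy_on_diagonal(rows):
        # Mean of -log softmax(row)[i], computed with the log-sum-exp trick.
        total = 0.0
        for i, row in enumerate(rows):
            peak = max(row)
            log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    columns = [list(col) for col in zip(*logits)]
    # Average the image-to-LiDAR and LiDAR-to-image directions.
    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(columns))


aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch, `aligned` is close to zero because each image embedding matches its paired point cloud embedding, while `swapped` is large because every pair is mismatched. Minimizing such a loss pulls matching camera and LiDAR embeddings together in the shared space.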

Main Results:

Achieved state-of-the-art performance on KITTI and KITTI-360 datasets.
Demonstrated high accuracy in visual localization within 3D point cloud maps.
Achieved over 60% Recall@1 on KITTI-360 without fine-tuning, using a KITTI-trained model.
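
For readers unfamiliar with the metric, Recall@K in place recognition is commonly computed as the fraction of queries for which at least one of the K nearest map entries in embedding space lies close to the query's true position. The sketch below follows that common definition. The function name, the squared Euclidean ranking, and the 10 m success radius are assumptions for illustration; the summary does not state the evaluation threshold used in the paper.

```python
import math


def recall_at_k(query_embs, map_embs, query_xy, map_xy, k=1, radius_m=10.0):
    """Fraction of queries with a true match among the top-k retrievals.

    A retrieval counts as a true match when the retrieved map entry lies
    within radius_m (an assumed threshold) of the query's ground-truth
    position.
    """
    hits = 0
    for q_emb, q_pos in zip(query_embs, query_xy):
        # Rank map entries by squared Euclidean distance in embedding space.
        ranked = sorted(
            range(len(map_embs)),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(q_emb, map_embs[j])),
        )
        if any(math.dist(q_pos, map_xy[j]) <= radius_m for j in ranked[:k]):
            hits += 1
    return hits / len(query_embs)


# Toy data: three map places and two queries (all values are made up).
map_embs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
map_xy = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
query_embs = [[0.1, 0.0], [0.9, 0.1]]
query_xy = [(2.0, 1.0), (0.0, 95.0)]

r1 = recall_at_k(query_embs, map_embs, query_xy, map_xy, k=1)
r3 = recall_at_k(query_embs, map_embs, query_xy, map_xy, k=3)
```

Here the first query retrieves a nearby place at rank 1, while the second query's correct place only appears further down the ranking, so `r1` is 0.5 and `r3` is 1.0.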

Conclusions:

The proposed method effectively bridges the domain gap between visual and 3D point cloud data for place recognition.
The scale-invariant unified embedding space and self-distillation approach enhance generalization and performance.
The framework offers a promising solution for reliable cross-modal localization in robotics and autonomous vehicles.