LaTP: LiDAR-aided multimodal token pruning for efficient trajectory prediction of autonomous driving
Summary
This summary is machine-generated. We introduce LiDAR-aided Token Prune (LaTP), a novel token pruning method for Large Vision Language Models (LVLMs) in autonomous driving. LaTP makes trajectory prediction more efficient by using LiDAR distance data to prune visual tokens, reducing computational load while maintaining prediction accuracy.
Area Of Science
- Computer Vision
- Artificial Intelligence
- Robotics
Background
- Large Vision Language Models (LVLMs) are advancing autonomous driving, particularly in trajectory prediction.
- Onboard computational demands of autonomous vehicles challenge LVLM deployment on resource-constrained systems.
- Token pruning speeds up LVLM inference without retraining, but current methods are not tailored to autonomous driving.
Purpose Of The Study
- To address limitations of general token pruning in autonomous driving trajectory prediction.
- To develop a specialized token pruning method that considers content and distance information crucial for driving.
- To improve the efficiency of LVLMs for onboard autonomous driving systems.
Main Methods
- Proposed LiDAR-aided Token Prune (LaTP), a novel method for LVLM-based trajectory prediction.
- Integrated LiDAR point data to provide essential distance information for camera inputs.
- Developed a content- and distance-aware token importance indicator to discard irrelevant visual tokens.
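The summary does not give the exact form of the importance indicator, so the following is only a minimal sketch of how a content- and distance-aware score could drive pruning. Everything in it is an assumption made for illustration: the function names `token_importance` and `prune_tokens`, the weighted-sum blend, the `alpha` weight, the `max_range_m` cutoff, and the choice to treat image patches without a LiDAR return as far away.

```python
def token_importance(content_scores, distances_m, alpha=0.5, max_range_m=50.0):
    """Hypothetical indicator: blend a content score with a closeness score.

    content_scores: per-token relevance in [0, 1] (e.g. normalized attention).
    distances_m: per-token distance in meters taken from LiDAR points projected
                 onto the image patch; None means no point fell on that patch.
    """
    scores = []
    for content, dist in zip(content_scores, distances_m):
        if dist is None:
            # Assumption: a patch with no LiDAR return is treated as far away.
            closeness = 0.0
        else:
            # Nearer patches score closer to 1, patches at max range score 0.
            closeness = 1.0 - min(dist, max_range_m) / max_range_m
        scores.append(alpha * content + (1.0 - alpha) * closeness)
    return scores


def prune_tokens(scores, pruning_ratio=0.75):
    """Return the indices of the tokens kept, in their original order."""
    n_keep = max(1, round(len(scores) * (1.0 - pruning_ratio)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


# Eight visual tokens with made-up scores and distances, pruned at 75%.
content = [0.9, 0.1, 0.5, 0.2, 0.8, 0.3, 0.4, 0.6]
distance = [5.0, 45.0, None, 10.0, 40.0, 2.0, 30.0, 20.0]
kept = prune_tokens(token_importance(content, distance), pruning_ratio=0.75)
```

With these made-up inputs, only two of the eight tokens survive. Returning the kept indices in their original order preserves the spatial ordering of the remaining tokens, which a downstream model would typically expect.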
Main Results
- LaTP achieved significant inference speed gains at pruning ratios of up to 75%.
- Maintained high prediction accuracy with an Average Displacement Error (ADE) of 2.03 meters.
- Demonstrated a low Collision Rate (col.) of 2.35%, outperforming general token pruning baselines on the nuScenes dataset.
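For readers unfamiliar with the ADE metric, it is commonly computed as the mean Euclidean distance between predicted and ground-truth waypoints over the prediction horizon. The sketch below follows that common definition; the function name and the sample trajectories are illustrative, and the exact evaluation protocol used in the paper (horizon, sampling rate, averaging over scenes) is not stated in this summary.

```python
import math


def average_displacement_error(predicted, ground_truth):
    """Mean Euclidean distance (in the inputs' units) between paired waypoints."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)


# Made-up 2D waypoints in meters: per-step errors are 0, 1, and 2.
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
ground_truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade = average_displacement_error(predicted, ground_truth)
```

Here the per-step errors of 0 m, 1 m, and 2 m average to an ADE of 1 m.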
Conclusions
- LaTP effectively reduces computational load for LVLMs in autonomous driving.
- The method successfully integrates LiDAR data to enhance token pruning for trajectory prediction.
- LaTP offers a promising solution for deploying efficient and accurate LVLMs in real-world autonomous driving scenarios.

