Automated multimodel segmentation and tracking for AR-guided open liver surgery using scene-aware self-prompting
Summary
This summary is machine-generated. This study presents a real-time semantic segmentation and tracking system for augmented reality (AR)-guided open liver surgery. The approach improves both segmentation accuracy and speed by integrating multiple AI models with a novel scene-aware re-prompting strategy.
Area Of Science
- Medical Imaging
- Computer Vision
- Surgical Technology
Background
- Augmented Reality (AR) integration in surgery requires real-time, accurate visual data processing.
- Open liver surgery presents complex anatomical variations demanding robust segmentation and tracking.
Purpose Of The Study
- To develop a multimodel, real-time semantic segmentation and tracking system for AR-guided open liver surgery.
- To leverage foundation models and scene-aware re-prompting for balancing accuracy and speed in surgical AR applications.
Main Methods
- Integrated ESANet (RGBD model), SAM (segmentation foundation model), and DeAOT (video object segmentation).
- Developed an auto-promptable pipeline with a scene-aware re-prompting algorithm adapting to surgical scene changes.
- Evaluated on intraoperative RGBD videos from 10 open liver surgeries using a head-mounted AR device.
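The pipeline described above can be sketched as a control loop: DeAOT propagates the current mask frame to frame, and when the scene changes abruptly the system re-prompts, using a coarse ESANet prediction to generate point prompts for SAM and re-initializing the tracker with SAM's refined mask. The sketch below is a hypothetical reconstruction, not the authors' code: the models are stand-in callables, the change metric (mask-overlap drop) and its threshold are assumptions, and masks are represented as sets of pixel coordinates for simplicity.

```python
# Hypothetical sketch of a scene-aware re-prompting loop. The real system
# combines ESANet, SAM, and DeAOT; here each model is an abstract callable so
# the control flow is runnable. The scene-change heuristic and threshold are
# illustrative assumptions, not details from the paper.

def mask_iou(a, b):
    """IoU between two binary masks given as sets of pixel coordinates."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def reprompt(frame, esanet, sam):
    """Re-initialize: coarse ESANet mask -> sampled point prompts -> SAM mask."""
    coarse = esanet(frame)        # coarse semantic mask from the RGBD model (stub)
    prompts = sorted(coarse)[:3]  # sample a few points inside the coarse mask
    return sam(frame, prompts)    # refined mask from the promptable model (stub)

def track(frames, esanet, sam, deaot, change_thresh=0.5):
    """Propagate masks with DeAOT; re-prompt when the scene changes abruptly."""
    masks, n_reprompts = [], 0
    current = reprompt(frames[0], esanet, sam)  # initial auto-prompted mask
    masks.append(current)
    for frame in frames[1:]:
        proposed = deaot(frame, current)  # temporal propagation (stub)
        # Scene-change heuristic: a large drop in mask overlap between the
        # propagated and previous mask triggers the re-prompting pipeline.
        if mask_iou(proposed, current) < change_thresh:
            current = reprompt(frame, esanet, sam)
            n_reprompts += 1
        else:
            current = proposed
        masks.append(current)
    return masks, n_reprompts
```

This structure reflects the trade-off reported in the results: propagation alone is fast, while selective re-prompting recovers accuracy only when the scene warrants it.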
Main Results
- The multimodel approach achieved 71% median IoU at 13.2 FPS without re-prompting.
- Outperformed the individual models, with higher segmentation accuracy than ESANet and higher temporal resolution (frame rate) than SAM.
- Scene-aware re-prompting reached 74.7% IoU at 11.5 FPS, matching DeAOT's performance.
Conclusions
- The scene-aware re-prompting strategy effectively balances segmentation accuracy and temporal resolution for real-time AR liver surgery.
- Integrating complementary models ensures robust and accurate segmentation in complex surgical environments.

