Mini Drone Shop AI
SlotVTG: Pinpointing Drone Video Events with Object-Centric Precision
A new framework, SlotVTG, significantly improves how Multimodal Large Language Models (MLLMs) analyze drone video. It enables them to precisely identify when specific events occur without sacrificing generalization.
Mar 28 · 1 min read · Technology