Abstract: Underwater litter detection is a crucial technology for maintaining the balance of underwater ecosystems. To address the large variation in target scale encountered in underwater litter detection, we propose YOLO11-MDA, an improved model based on YOLO11. First, we propose a multidomain feature extraction module (MFEM) that extracts target features in both the spatial and frequency domains, capturing features at different scales from the input feature map and strengthening the representation of both global features and local information. Second, we introduce the lightweight dynamic up-sampling module DySample to integrate contextual information and improve the quality and efficiency of up-sampling. Finally, we adopt the adaptive threshold focal loss (ATFL) as the classification loss to reduce the impact of the uneven distribution of multi-scale samples on the detection results and to improve detection accuracy for multi-scale targets. Experimental results show that YOLO11-MDA reaches an mAP of 91.4% on the TrashCan dataset and 97% on the Trash_ICRA19 dataset, improvements of 3.1% and 10.7% over the baseline model, respectively, with a detection speed of 354.3 FPS. These results indicate that the improved model outperforms the other algorithms in overall performance and provides an effective method for the automated monitoring of underwater environments.