Multi-scale remote sensing retrieval algorithm with global-local feature fusion
Affiliation:

School of Artificial Intelligence and Computer Science, Xi′an University of Science and Technology, Xi′an 710054, China

CLC Number: TP391.3; TP18; TN919.8

Abstract:

To address interference from redundant image information, insufficient multi-scale information extraction, and the low retrieval accuracy caused by ineffective integration of global and local information in cross-modal retrieval of remote sensing images, this paper proposes IGMR, a cross-modal remote sensing image retrieval network designed for multi-scale tasks. First, a multi-dimensional perception-enhanced convolution module (MFE) is designed to extract local information while filtering redundant features; it also integrates a multi-attention module that focuses on high-frequency image information to strengthen feature expression. Second, a multi-scale patch attention network (RFPA) is developed to capture contextual information at different scales. Finally, an adaptive feature fusion module (AFFM) dynamically fuses the extracted global and local features, emphasizing high-quality information. Experiments on the public RSICD and RSITMD datasets show that the proposed IGMR method raises the mean recall (mR) by 1.83% and 3.21%, respectively, in remote sensing cross-modal retrieval, with retrieval accuracies reaching 19.73% and 31.83%; overall retrieval performance is significantly improved.
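The paper gives no implementation details for the AFFM, but the abstract describes it as dynamically weighting the global and local branches before combining them. A minimal NumPy sketch of one common form of such dynamic fusion, a learned gate that produces per-sample branch weights via softmax, is shown below; the weight matrix `W`, bias `b`, and the gating scheme itself are assumptions for illustration, not the authors' actual module.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_fusion(g, l, W, b):
    """Gated fusion sketch: compute per-sample weights from the
    concatenated global (g) and local (l) features, then return a
    convex combination of the two branches."""
    z = np.concatenate([g, l], axis=-1)         # (batch, 2d)
    alpha = softmax(z @ W + b, axis=-1)         # (batch, 2) branch weights
    return alpha[:, :1] * g + alpha[:, 1:] * l  # (batch, d)

# toy example: batch of 4 samples, feature dimension 8
g = rng.standard_normal((4, 8))   # stand-in global features
l = rng.standard_normal((4, 8))   # stand-in local features
W = rng.standard_normal((16, 2)) * 0.1  # hypothetical gate parameters
b = np.zeros(2)
fused = adaptive_fusion(g, l, W, b)
```

Because the weights are a softmax over both branches, each fused vector is a convex combination of its global and local inputs, so the gate can emphasize whichever branch carries higher-quality information for that sample.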

History
  • Online: May 08, 2026