Structured image description network for remote sensing images

Affiliation:

1. School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China; 2. School of Geology and Surveying, Tianjin Chengjian University, Tianjin 300384, China

CLC number:

TP753

Fund Project:

Supported by the National Natural Science Foundation of China (52178295)

    Abstract:

    Standard attention methods generate only coarse-grained attention regions; they neither capture the geographical relationships between remote sensing objects nor make full use of the semantic content of remote sensing images. To address this, a structured image description network for remote sensing images, GRSRC (geo-object relational segmentation for remote sensing image captioning), is proposed. First, because remote sensing image features are highly structured, a feature extraction method based on structured semantic segmentation of remote sensing images is proposed, which strengthens the encoder's feature extraction capability and yields a more accurate representation; an attention mechanism also weights the segmented regions so that the model focuses on the most important semantic information. Second, because the positional relationships among spatial objects in remote sensing images are relatively well defined, geospatial relations are fused into the attention mechanism, making the generated descriptions more accurate and spatially consistent. Finally, the model is evaluated on three public remote sensing datasets: RSICD, UCM, and Sydney. On the UCM dataset, BLEU-1 reaches 84.06, METEOR reaches 44.35, and ROUGE_L reaches 77.01, improvements of 2.32%, 1.15%, and 1.88%, respectively, over the classical models used for comparison. The results indicate that the model makes fuller use of the semantic content of remote sensing images and performs well on the remote sensing image captioning task.
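
The abstract does not give the GRSRC formulation, so the following is only an illustrative sketch of the general idea it describes: pool encoder features inside each segmented region, let regions exchange information with a bias that depends on their spatial relation, then weight the regions with attention. Every name here (`region_features`, `relation_aware_regions`, `attend`, the temperature `tau`) is hypothetical, and the distance-based bias is an assumed stand-in for whatever geospatial relation encoding the paper actually uses.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def region_features(feature_map, seg_mask, num_classes):
    """Average-pool an (H, W, D) feature map inside each segmentation class.

    Returns pooled features, (row, col) centroids, and class ids for the
    classes that actually occur in seg_mask.
    """
    feats, centroids, ids = [], [], []
    for c in range(num_classes):
        rows, cols = np.nonzero(seg_mask == c)
        if rows.size == 0:          # class absent from this image
            continue
        feats.append(feature_map[rows, cols].mean(axis=0))
        centroids.append([rows.mean(), cols.mean()])
        ids.append(c)
    return np.array(feats), np.array(centroids), np.array(ids)

def relation_aware_regions(feats, centroids, tau=1.0):
    """Self-attention among regions with an additive spatial bias.

    Assumption: the spatial relation is reduced to centroid distance,
    so nearer regions receive a larger attention score.
    """
    d = feats.shape[1]
    content = feats @ feats.T / np.sqrt(d)                  # (R, R)
    dist = np.linalg.norm(centroids[:, None] - centroids[None, :], axis=-1)
    weights = softmax(content - dist / tau, axis=-1)        # rows sum to 1
    return weights @ feats, weights

def attend(query, feats):
    """Weight regions by scaled dot-product similarity to a decoder query."""
    alpha = softmax(feats @ query / np.sqrt(query.size))
    return alpha @ feats, alpha

# Toy input: a 4x4 feature map with 8 channels and 3 of 4 classes present.
rng = np.random.default_rng(0)
feature_map = rng.normal(size=(4, 4, 8))
seg_mask = np.array([[0, 0, 1, 1],
                     [0, 0, 1, 1],
                     [3, 3, 1, 1],
                     [3, 3, 3, 3]])
feats, centroids, ids = region_features(feature_map, seg_mask, num_classes=4)
refined, rel_w = relation_aware_regions(feats, centroids, tau=2.0)
query = rng.normal(size=8)
context, alpha = attend(query, refined)
print(ids, alpha.round(3))
```

In this sketch the attention weights `alpha` are defined per segmented region rather than per grid cell, which is one way to obtain finer-grained, object-aligned attention than a uniform grid provides.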

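Cite this article

Li Guoyan, Tian Mingda, Dong Chunhua, Hao Zhipeng. Structured image description network for remote sensing images[J]. Journal of Electronic Measurement and Instrumentation, 2024, 38(5): 148-157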

History
  • Published online: 2024-08-30