Transformer-based channel-by-channel point cloud analysis network

Affiliation:

1. Liaoning Technical University; 2. Shenyang Ligong University

Fund Project:

Applied Basic Research Project of the Liaoning Provincial Department of Science and Technology (2022JH2/101300274); Basic Scientific Research Project of Higher Education Institutions of Liaoning Province (LJKMZ20220679)

Abstract:

3D point clouds capture the geometry of target objects in detail and are widely applicable in fields such as autonomous driving, medical imaging, and robotics. However, existing methods do not treat the features in different channels differently, and they apply a single encoding strategy to both low-level spatial coordinates and high-level semantic features, which leaves point cloud feature extraction incomplete. This paper therefore proposes a Transformer-based channel-by-channel point cloud analysis network. First, because conventional graph convolution struggles to separate useful information once channels are mixed, a depthwise separable edge convolution is designed; it extracts features channel by channel, preserving local geometric information while markedly improving discrimination between channels. Second, to address the insufficient information extraction caused by the Transformer encoding low-level spatial coordinates and high-level semantic features in the same way, two feature encoding strategies are proposed: adaptive positional encoding, which explores implicit geometric structure in the low-level space, and spatial context encoding, which explores complex contextual relationships in the high-level space. Finally, an effective fusion strategy is proposed to form a more discriminative feature representation. To validate the proposed model, point cloud classification experiments were conducted on the public datasets ModelNet40 and ScanObjectNN, where overall classification accuracy reached 93.7% and 83.2%, respectively; on the public dataset ShapeNet Part, the mean intersection over union (mIoU) for overall part segmentation reached 86.0%. The proposed method thus achieves advanced performance on both classification and segmentation tasks.
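The abstract does not give the layer definitions, so the following is only a minimal NumPy sketch of what a depthwise separable edge convolution could look like, not the authors' implementation. It assumes a DGCNN-style edge feature built from a point and its k nearest neighbours, a depthwise step in which each channel has its own two scalar weights, max pooling over neighbours, and a pointwise step that mixes channels afterwards. The function names `knn_indices` and `depthwise_separable_edge_conv` and the weight names `w_center`, `w_diff`, and `w_point` are hypothetical.

```python
import numpy as np


def knn_indices(points, k):
    """Return the indices of the k nearest neighbours of each point (self excluded)."""
    diff = points[:, None, :] - points[None, :, :]   # (N, N, 3) pairwise offsets
    dist2 = np.sum(diff * diff, axis=-1)             # (N, N) squared distances
    np.fill_diagonal(dist2, np.inf)                  # never select the point itself
    return np.argsort(dist2, axis=1)[:, :k]          # (N, k)


def depthwise_separable_edge_conv(feats, idx, w_center, w_diff, w_point):
    """Illustrative layer (assumed design, not taken from the paper).

    feats:    (N, C) per-point features
    idx:      (N, k) neighbour indices
    w_center: (C,)   per-channel weight on the centre feature
    w_diff:   (C,)   per-channel weight on the neighbour offset
    w_point:  (C, C_out) pointwise weights that mix channels
    """
    neighbours = feats[idx]                          # (N, k, C)
    center = feats[:, None, :]                       # (N, 1, C)
    # Depthwise step: channel c only ever sees channel c of the edge feature.
    edge = w_center * center + w_diff * (neighbours - center)
    edge = np.maximum(edge, 0.0)                     # ReLU
    pooled = edge.max(axis=1)                        # (N, C) max over neighbours
    # Pointwise step: channels are mixed only after per-channel aggregation.
    return pooled @ w_point                          # (N, C_out)


# Example usage on random data (shapes only; the weights are untrained).
rng = np.random.default_rng(0)
points = rng.normal(size=(32, 3))
idx = knn_indices(points, k=8)
out = depthwise_separable_edge_conv(
    points, idx,
    w_center=rng.normal(size=3),
    w_diff=rng.normal(size=3),
    w_point=rng.normal(size=(3, 16)),
)
```

In this sketch the channels stay separate until the pointwise product, which is one way to read the claim that channel-by-channel extraction keeps local geometry while improving discrimination between channels. A standard edge convolution would instead apply a shared dense layer to the concatenated edge feature, mixing channels before pooling.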

History
  • Received: 2024-07-17
  • Last revised: 2024-12-21
  • Accepted: 2024-12-25
Journal of Electronic Measurement and Instrumentation (《电子测量与仪器学报》)