Transformer-based channel-by-channel point cloud analysis network


Affiliations: 1. School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125105, China; 2. School of Automation and Electrical Engineering, Shenyang Ligong University, Shenyang 110159, China
CLC number: TP391.4; TN911.7
Funding: Supported by the Applied Basic Research Project of the Department of Science and Technology of Liaoning Province (2022JH2/101300274) and the Basic Scientific Research Project of Higher Education Institutions of Liaoning Province (LJKMZ20220679)


Abstract:

3D point clouds fully describe the geometry of target objects and have broad application prospects in autonomous driving, medical imaging, and robotics. However, existing methods do not treat the features of different channels differently, and they apply a single encoding strategy to both low-level spatial coordinates and high-level semantic features, so point cloud features are extracted incompletely. To address this, a Transformer-based channel-by-channel point cloud analysis network is proposed. First, because conventional graph convolution struggles to separate useful information once channels are mixed, a depthwise separable edge convolution is designed; it extracts features channel by channel, preserving local geometric information while markedly improving discrimination between channels. Second, because a Transformer that encodes low-level spatial coordinates and high-level semantic features in the same way extracts insufficient information, two feature encoding strategies are proposed: adaptive positional encoding, which explores implicit geometric structure in the low-level space, and spatial context encoding, which explores complex contextual relationships in the high-level space. Finally, an effective fusion strategy is proposed to form a more discriminative feature representation. To validate the model, point cloud classification experiments were conducted on the public datasets ModelNet40 and ScanObjectNN, reaching overall accuracies of 93.7% and 83.2%, respectively; on the public dataset ShapeNet Part, the overall mean intersection over union (mIoU) for part segmentation reaches 86.0%. The method thus achieves advanced performance on both classification and segmentation tasks.
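The abstract does not give the formulation of the depthwise separable edge convolution, so the following is only a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes a k-nearest-neighbour graph, relative edge features (neighbour minus centre), one pair of scalar weights per channel for the depthwise step, max pooling over neighbours, and a pointwise projection that mixes channels afterwards. The function names, weight shapes, and the ReLU are all illustrative assumptions.

```python
import numpy as np


def knn_indices(points, k):
    """Return the indices of the k nearest neighbours of every point (self excluded)."""
    diff = points[:, None, :] - points[None, :, :]   # (N, N, D) pairwise offsets
    dist2 = np.sum(diff ** 2, axis=-1)               # (N, N) squared distances
    np.fill_diagonal(dist2, np.inf)                  # never pick the point itself
    return np.argsort(dist2, axis=1)[:, :k]          # (N, k)


def depthwise_separable_edge_conv(feats, neighbours, w_depth, w_point):
    """Hypothetical depthwise separable edge convolution (assumed form).

    feats:      (N, C) per-point features
    neighbours: (N, k) neighbour indices
    w_depth:    (2, C) per-channel weights for [centre, neighbour - centre]
    w_point:    (C, C_out) pointwise weights that mix channels
    """
    centre = feats[:, None, :]                       # (N, 1, C)
    edge = feats[neighbours] - centre                # (N, k, C) relative features
    # Depthwise step: each channel is scaled by its own two scalars,
    # so nothing is mixed across channels here.
    per_channel = np.maximum(centre * w_depth[0] + edge * w_depth[1], 0.0)  # ReLU
    pooled = per_channel.max(axis=1)                 # (N, C) max over neighbours
    # Pointwise step: channels are mixed only after the per-channel filtering.
    return pooled @ w_point                          # (N, C_out)


rng = np.random.default_rng(0)
points = rng.normal(size=(32, 3))                    # toy point cloud, xyz as features
idx = knn_indices(points, k=8)
out = depthwise_separable_edge_conv(
    points, idx, rng.normal(size=(2, 3)), rng.normal(size=(3, 16))
)
print(out.shape)  # (32, 16)
```

In this sketch the depthwise step keeps every channel separate until the pointwise projection, which is one plausible reading of "channel-by-channel" feature extraction; the network described in the paper may differ in every detail.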


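Cite this article

冯凯浩, 陶志勇, 李衡, 李铭朗, 林森. Transformer-based channel-by-channel point cloud analysis network [J]. Journal of Electronic Measurement and Instrumentation, 2025, 39(2): 49-59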


History

Published online: 2025-04-23
