Classification method for scrap steel images based on cross-layer fusion semantic enhanced features

Affiliation:

1. School of Automation, Nanjing University of Science and Technology, Nanjing 210094, China; 2. School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China

CLC number:

TP399; TN911.73; TP391

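    Abstract (translated from Chinese):

    To address the severe stacking of scrap steel and the need for refined scrap classification, this paper proposes a fine-grained scrap steel image classification method based on cross-layer fusion semantic enhanced features. First, motion detection is used to retrieve, from video sequences, scrap steel images that contain no moving objects such as the grapple. Second, the Segment Anything Model (SAM), a large vision model, performs semantic segmentation on these images to segment out the scrap steel instances. Finally, a scrap steel image classification model based on cross-layer fusion semantic enhanced features is proposed (EfficientNetB5-cross layer fusion semantically enhanced feature, EfficientNetB5-CLFSEF). The model uses the feature extractor of EfficientNetB5 and performs classification through the cross-layer fusion semantic enhanced feature (CLFSEF) module. The CLFSEF module comprises a cross layer fusion (CLF) part and a semantically enhanced feature (SEF) part. CLF fuses features from different layers of the feature extractor, so that the model captures deep semantic information while retaining low-level information such as boundaries. SEF groups the fused features according to the semantic similarity between channels and combines knowledge distillation with maximum entropy regularization to improve the model's understanding of the most discriminative parts of the input scrap steel image. Experiments on a self-built dataset show that the proposed EfficientNetB5-CLFSEF model accurately classifies general scrap (统废), sheared scrap 1 and 2 (剪料1, 剪料2), furnace charge 1 and 2 (炉料1, 炉料2), steel plate scrap (钢板料), and heavy scrap (重废). The model reaches an accuracy of 90.51% on the test set, outperforming the classification models it was compared against.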
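    The first stage, selecting frames that contain no moving objects, can be illustrated with simple frame differencing. The abstract does not say which motion detection algorithm the authors use, so the sketch below is only one plausible approach. The names `motion_ratio` and `select_static_frames` and the thresholds `pixel_thresh` and `max_ratio` are hypothetical, chosen for illustration.

```python
from typing import List, Sequence

# A grayscale frame as rows of 0-255 intensities (plain lists keep the sketch dependency-free).
Frame = List[List[int]]


def motion_ratio(prev: Frame, curr: Frame, pixel_thresh: int = 25) -> float:
    """Fraction of pixels whose intensity changed by more than pixel_thresh."""
    changed = 0
    total = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += 1
            if abs(a - b) > pixel_thresh:
                changed += 1
    return changed / total if total else 0.0


def select_static_frames(
    frames: Sequence[Frame],
    pixel_thresh: int = 25,   # assumed value, not from the paper
    max_ratio: float = 0.01,  # assumed value, not from the paper
) -> List[int]:
    """Indices of frames that barely differ from the frame before them."""
    keep = []
    for i in range(1, len(frames)):
        if motion_ratio(frames[i - 1], frames[i], pixel_thresh) <= max_ratio:
            keep.append(i)
    return keep


# Toy sequence: a bright 4x4 block stands in for a grapple passing through frame 2.
still = [[10] * 8 for _ in range(8)]
moving = [row[:] for row in still]
for r in range(2, 6):
    for c in range(2, 6):
        moving[r][c] = 200

demo_frames = [still, still, moving, still, still]
print(select_static_frames(demo_frames))  # prints [1, 4]
```

    Frame differencing only detects change between consecutive frames. A grapple that hangs motionless in view would pass this filter, so a practical system would likely need background modelling or an additional check.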

    Abstract:

    To address the severe stacking of scrap steel samples and the need for refined classification of scrap steel types, this paper proposes a scrap steel image classification method based on cross-layer fusion semantic enhanced features. The method consists of several stages. First, motion detection extracts scrap steel images that contain no moving objects, such as grapples, from video sequences, so that irrelevant objects are excluded from the dataset used in subsequent analysis. Next, the Segment Anything Model (SAM), a large vision model, performs semantic segmentation on these images to segment the scrap steel instances they contain. The core contribution of this paper is a scrap steel image classification model, EfficientNetB5-CLFSEF, designed to handle both the subtle differences between scrap steel categories and the significant morphological variation within each category. The model uses EfficientNetB5 as the feature extractor, chosen for its efficiency and high performance in visual recognition tasks, and integrates a novel cross-layer fusion semantic enhanced feature (CLFSEF) module to improve classification accuracy. The CLFSEF module consists of two components: cross-layer feature fusion (CLF) and semantic-enhanced features (SEF). CLF fuses features from different layers of the EfficientNetB5 feature extractor, enabling the model to capture deep semantic information together with low-level details such as boundaries, which helps distinguish similar scrap steel categories. The SEF module groups the fused features by the semantic similarity between channels, which lets the model focus on the most discriminative features in the image. The SEF module also integrates knowledge distillation and maximum entropy regularization to strengthen the model's ability to recognize the most significant parts of the input scrap steel images. To validate the method, experiments were conducted on a custom-built scrap steel classification dataset. The baseline EfficientNetB5 achieved an accuracy of 87.98% on the test set. Adding the CLF module alone raised the accuracy to 89.63%, adding the SEF module alone raised it to 89.23%, and combining CLF and SEF into the complete CLFSEF module raised it to 90.51%. These are gains of 1.65, 1.25, and 2.53 percentage points over the baseline, respectively. The proposed model also outperforms the other classification models used for comparison.
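    The abstract names knowledge distillation and maximum entropy regularization but gives no loss formula. The sketch below shows one common way such terms are combined with cross-entropy. It is an assumption, not the authors' objective: the weights `lam` and `alpha`, the `temperature`, and the idea of a separate set of "teacher" logits are all hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    s = sum(exps)
    return [e / s for e in exps]


def entropy(probs: List[float]) -> float:
    """Shannon entropy H(p) in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def kl_divergence(p: List[float], q: List[float]) -> float:
    """KL(p || q); assumes every q_i is strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def total_loss(
    student_logits: List[float],
    teacher_logits: List[float],
    label: int,
    lam: float = 0.1,          # weight of the entropy bonus (assumed)
    alpha: float = 0.5,        # weight of the distillation term (assumed)
    temperature: float = 2.0,  # softening temperature (assumed)
) -> float:
    """Cross-entropy minus an entropy bonus plus a distillation term."""
    p = softmax(student_logits)
    ce = -math.log(p[label])
    # Maximum entropy regularization: subtracting H(p) penalizes over-confident outputs.
    h = entropy(p)
    # Distillation: match the softened student distribution to the softened teacher one.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kd = temperature ** 2 * kl_divergence(p_teacher, p_student)
    return ce - lam * h + alpha * kd


# Seven logits, one per scrap category; the values are made up.
student = [2.0, 0.5, 0.1, -1.0, 0.0, 0.3, -0.5]
teacher = [2.5, 0.2, 0.0, -1.2, 0.1, 0.4, -0.7]
print(total_loss(student, teacher, label=0))
```

    Subtracting the entropy term discourages the classifier from putting nearly all probability on one class, a common motivation for this regularizer in fine-grained tasks where categories look alike. How the paper actually weights and applies these terms is not stated in the abstract.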

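Cite this article

梁凯朔, 朱倍孝, 赵高鹏, 徐皓远. Classification method for scrap steel images based on cross-layer fusion semantic enhanced features[J]. 电子测量与仪器学报 (Journal of Electronic Measurement and Instrumentation), 2025, 39(12): 279-288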

History
  • Published online: 2026-02-12