Research on multimodal sentiment analysis technology based on conversations
DOI:
CSTR:
Author:
Affiliation:

School of Computer Science and Technology, North University of China, Taiyuan 030051, China

Author biography:

Corresponding author:

CLC number:

TP391; TN912.34

Fund project:

Supported by the 2024 Shanxi Province Postgraduate Education Innovation Program (2024SZ23) and the 2024 Shanxi Province Higher Education Institutions Teaching Reform and Innovation Project (J20240839)




    Abstract:

    To address the problem that multimodal emotion recognition in conversation (MERC) struggles to effectively capture cross-modal semantic associations across conversation turns and has limited ability to distinguish minority-class and semantically confusable emotion categories, a new multimodal sentiment analysis model (FuseNet) is proposed. The model adopts a bidirectional attention dialogue encoder (BiDRN) to capture conversational context dependencies and effectively integrate audio and visual cues from different speakers, achieves dynamic multimodal fusion through a Hi-gated fusion module built on a hierarchical gating mechanism, and introduces a class-aware multimodal contrastive (CAMC) loss to enhance inter-class discriminability and improve the discrimination of minority classes and semantically similar emotion categories. Experimental results on the two benchmark ERC datasets IEMOCAP and MELD show that, compared with the state-of-the-art model CORECT, FuseNet improves the F1 score by 2.91% and 2.00%, respectively, outperforming existing baseline models on most emotion categories, with particularly notable gains in recognizing minority-class and semantically similar emotions.
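The hierarchical gated fusion the abstract describes can be illustrated with a minimal sketch. This is not the paper's implementation: the sigmoid-gate form, the feature dimension `d`, the random stand-in weights, and the two-level fusion order (audio with visual first, then with text) are all assumptions for illustration; a real system would learn these weights end to end in a deep-learning framework.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fuse(x, y, weights, bias):
    """Element-wise gate: fused[j] = g[j]*x[j] + (1-g[j])*y[j],
    where g = sigmoid(W @ [x; y] + b) decides how much of each input to keep."""
    xy = x + y  # list concatenation plays the role of [x; y]
    fused = []
    for j in range(len(x)):
        z = sum(w * v for w, v in zip(weights[j], xy)) + bias[j]
        g = sigmoid(z)
        fused.append(g * x[j] + (1.0 - g) * y[j])
    return fused

random.seed(0)
d = 4  # hypothetical shared feature dimension
rand_vec = lambda n: [random.uniform(-1.0, 1.0) for _ in range(n)]

# Per-utterance modality features, e.g. produced by a dialogue encoder.
text, audio, visual = rand_vec(d), rand_vec(d), rand_vec(d)

# Random stand-in gate parameters (learned in practice).
W_av = [rand_vec(2 * d) for _ in range(d)]; b_av = [0.0] * d
W_t  = [rand_vec(2 * d) for _ in range(d)]; b_t  = [0.0] * d

av = gated_fuse(audio, visual, W_av, b_av)  # level 1: fuse audio + visual
fused = gated_fuse(av, text, W_t, b_t)      # level 2: fuse (audio,visual) + text
```

Because each gate produces a per-dimension convex combination, every fused value lies between the two inputs it blends, which is what lets the model weight modalities dynamically per utterance.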

Cite this article

ZHAO Yafang, LIANG Zhijian. Research on multimodal sentiment analysis technology based on conversations [J]. Electronic Measurement Technology, 2026, 49(6): 20-28.


History
  • Received:
  • Revised:
  • Accepted:
  • Online published: 2026-05-13
  • Published:
