FPGA-based design of high-efficiency approximate multipliers

Affiliation:

1.School of Information Engineering, Dalian Ocean University, Dalian 116023, China; 2.School of Mechanical and Power Engineering, Dalian Ocean University, Dalian 116023, China; 3.School of Software, Shandong University, Jinan 250100, China

CLC number:

TN43;TN791

Fund Project:

Supported by the Liaoning Province Applied Basic Research Program (2023-179) and the Liaoning Province Graduate Education and Teaching Reform Research Project (LNYJG2024198)

    Abstract:

    To address the incomplete models, high on-chip resource consumption, and limited performance of field-programmable gate arrays (FPGAs) when accelerating approximate-computing workloads such as convolutional neural networks and image processing algorithms, five approximate multiplier designs are proposed. Starting from a lookup-table (LUT) based 8 bit×8 bit unsigned approximate multiplier without a carry chain, the critical path is optimized and the structure simplified by encoding the INIT parameter values of the LUTs. Combined with a compressed recursive invocation method and a sub-product recombination method, this yields two LUT-based 8 bit×8 bit unsigned approximate multipliers suited to different practical scenarios. Within an acceptable accuracy range, they save up to 60% of area, about 60.76% of power consumption, and about 25.4% of critical path delay (CPD) compared with multipliers of the same type. To serve more complex scenarios, the operand width is then doubled, giving two LUT-based 16 bit×16 bit unsigned approximate multipliers. Compared with multipliers of the same type, these save up to about 41.2% of area, about 77% of power consumption, and about 35.4% of CPD, savings that offset the loss caused by the reduced accuracy. In addition, combining these designs with the proposed signed-number calculation module, a LUT-based 16 bit×16 bit signed approximate multiplier is proposed as a replacement for the Multiplier IP core from Xilinx (now AMD). It is deployed in the convolutional layers of a convolutional neural network for handwritten digit recognition and tested on handwritten digit images from the MNIST dataset, saving about 32.48% of area, about 41.21% of power consumption, and about 24.28% of CPD at the cost of a 3.4% drop in accuracy. The experimental results show that these multipliers meet the needs of FPGA-accelerated convolutional neural networks well and strike a good balance between accuracy and resource overhead.
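    The abstract does not give the internal structure of the multipliers, so the following is only a minimal behavioural sketch of the general idea of building a wider multiplier from narrower sub-products and wrapping it for signed operands. Everything in it is an assumption made for illustration: the function names, the toy approximation (omitting the lowest 4×4 sub-product of the least-significant 8×8 block), and the sign-magnitude wrapper are hypothetical and are not the authors' LUT-level designs.

```python
from typing import Callable

Mul = Callable[[int, int], int]


def exact_mul4(a: int, b: int) -> int:
    """Exact 4x4 unsigned product; stands in for a small LUT-based block."""
    return (a & 0xF) * (b & 0xF)


def mul8_from_mul4(a: int, b: int, drop_low: bool = False) -> int:
    """8x8 unsigned product recombined from four 4x4 sub-products.

    drop_low=True omits the low x low sub-product. This is a toy
    approximation chosen for illustration, not the paper's scheme.
    """
    a_hi, a_lo = (a >> 4) & 0xF, a & 0xF
    b_hi, b_lo = (b >> 4) & 0xF, b & 0xF
    low = 0 if drop_low else exact_mul4(a_lo, b_lo)
    cross = exact_mul4(a_hi, b_lo) + exact_mul4(a_lo, b_hi)
    return (exact_mul4(a_hi, b_hi) << 8) + (cross << 4) + low


def exact_mul8(a: int, b: int) -> int:
    return mul8_from_mul4(a, b, drop_low=False)


def approx_mul8(a: int, b: int) -> int:
    return mul8_from_mul4(a, b, drop_low=True)


def mul16_from_mul8(a: int, b: int, low_mul8: Mul = exact_mul8) -> int:
    """16x16 unsigned product from four 8x8 sub-products.

    Only the least-significant sub-product uses low_mul8, so an
    approximate block there changes the result by at most 225 (15 * 15).
    """
    a_hi, a_lo = (a >> 8) & 0xFF, a & 0xFF
    b_hi, b_lo = (b >> 8) & 0xFF, b & 0xFF
    cross = exact_mul8(a_hi, b_lo) + exact_mul8(a_lo, b_hi)
    return (exact_mul8(a_hi, b_hi) << 16) + (cross << 8) + low_mul8(a_lo, b_lo)


def signed_mul16(a: int, b: int, low_mul8: Mul = exact_mul8) -> int:
    """Signed wrapper (assumed sign-magnitude style): multiply the
    magnitudes, then restore the sign. Inputs are in [-32768, 32767]."""
    negative = (a < 0) != (b < 0)
    magnitude = mul16_from_mul8(abs(a), abs(b), low_mul8)
    return -magnitude if negative else magnitude


if __name__ == "__main__":
    a, b = -1234, 5678
    print("exact :", a * b)
    print("approx:", signed_mul16(a, b, approx_mul8))
```

    In this sketch the approximation is confined to the least-significant sub-product, which keeps the absolute error bounded while removing one 4×4 block. Real FPGA designs trade accuracy for area, power, and delay at the LUT level, which a behavioural model like this does not capture.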

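Cite this article

杨宏浩, 隋欣宇, 王一然, 刘恩华, 潘澜澜, 李响. FPGA-based design of high-efficiency approximate multipliers[J]. Journal of Electronic Measurement and Instrumentation, 2025, 39(9): 224-232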

History
  • Published online: 2025-12-09
Journal of Electronic Measurement and Instrumentation