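Chen Peng*, He Jianbin**, Chen Nuo**, Yu Tianwei*, Huan Ruohong*. Research on real-time FPGA-based video target detection method[J]. Chinese High Technology Letters, 2022, 32(3): 239-247 |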
Research on real-time FPGA-based video target detection method |
|
DOI:10.3772/j.issn.1002-0470.2022.03.003 |
Keywords: single shot multibox detector (SSD) network, channel attention module, depthwise separable convolution, field programmable gate array (FPGA), fixed-point quantization |
Funding: |
Author | Affiliation | Chen Peng* | | He Jianbin** | | Chen Nuo** | | Yu Tianwei* | | Huan Ruohong* | |
|
Abstract views: 1579 |
Full-text downloads: 1003 |
Abstract: |
To address the low real-time performance, high power consumption, and high cost of real-time target detection networks on graphics processing unit (GPU) accelerators, this paper proposes a neural network model, the attention-based depthwise separable single shot multibox detector (AtDS-SSD), which combines a channel attention mechanism with depthwise separable convolution, and optimizes and deploys the network on a field programmable gate array (FPGA). Starting from the SSD model, the AtDS-SSD network replaces the VGG16 feature extraction network with the MobileNet network, which is built mainly from depthwise separable convolutions, and adds a channel attention module. An 8-bit fixed-point quantization method is used to quantize the network model parameters. The quantized AtDS-SSD network model is deployed on the ZCU102 platform and tested on the PASCAL VOC data set. With a drop of only 0.58% in mean average precision, the accelerator performance increases from 85 fps to 311.7 fps, and the measured power consumption is equivalent to 11% of that of an NVIDIA RTX 2080Ti. The experimental results show that, on the FPGA platform, a network model combining the attention mechanism with depthwise separable convolution can improve real-time performance, reduce power consumption, and limit the accuracy loss caused by reduced network complexity, which verifies the effectiveness of the proposed method. |
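The abstract states that the model parameters are quantized to 8-bit fixed point but does not give the scheme. As a rough illustration only, the sketch below assumes a signed 8-bit word with a power-of-two scale (a per-tensor number of fractional bits) and saturation at the range limits. The function names `choose_frac_bits`, `quantize`, and `dequantize` are hypothetical and are not taken from the paper.

```python
import math


def choose_frac_bits(values, total_bits=8):
    """Pick fractional bits so the largest magnitude fits a signed word.

    Assumption: one shared power-of-two scale per tensor.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return total_bits - 1
    # Bits needed for the integer part of the magnitude (sign bit excluded).
    int_bits = math.floor(math.log2(max_abs)) + 1
    return total_bits - 1 - int_bits


def quantize(values, frac_bits, total_bits=8):
    """Map floats to signed integer codes, saturating at the word limits.

    Python's round() rounds exact halves to the nearest even integer.
    """
    scale = 2.0 ** frac_bits
    q_min = -(2 ** (total_bits - 1))
    q_max = 2 ** (total_bits - 1) - 1
    return [max(q_min, min(q_max, round(v * scale))) for v in values]


def dequantize(codes, frac_bits):
    """Recover approximate float values from the integer codes."""
    scale = 2.0 ** frac_bits
    return [c / scale for c in codes]


if __name__ == "__main__":
    weights = [0.5, -0.25, 0.126, 0.9]  # made-up example values
    frac_bits = choose_frac_bits(weights)
    codes = quantize(weights, frac_bits)
    print(frac_bits, codes, dequantize(codes, frac_bits))
```

With a power-of-two scale, rescaling between layers reduces to bit shifts, which is one common reason to prefer this form on FPGA fabric; whether the authors made the same choice is not stated in the abstract.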