JIANG Lin (蒋林)**, DUAN Xueyao*, XIE Xiaoyan***. [J]. High Technology Letters, 2022, 28(4): 392-400 |
|
A simplified hardware-friendly contour prediction algorithm in 3D-HEVC and parallelization design |
|
DOI:10.3772/j.issn.1006-6748.2022.04.007 |
Chinese keywords: |
English keywords: depth modeling mode 4 (DMM-4), contour prediction, 3D high efficiency video coding (3D-HEVC), parallelization, reconfigurable array processor |
Funding: |
Author Name | Affiliation
JIANG Lin (蒋林)** | (**Laboratory of Integrated Circuit Design, Xi’an University of Science and Technology, Xi’an 710054, P.R. China)
DUAN Xueyao* | (*College of Safety Science and Engineering, Xi’an University of Science and Technology, Xi’an 710054, P.R. China)
XIE Xiaoyan*** | (***School of Computer, Xi’an University of Posts and Telecommunications, Xi’an 710121, P.R. China) |
|
Chinese abstract: |
|
English abstract: |
After the extension of depth modeling mode 4 (DMM-4) in 3D high efficiency video coding (3D-HEVC), the computational complexity increases sharply, which degrades the real-time performance of video coding. To reduce the computational complexity of DMM-4, this paper proposes a simplified, hardware-friendly contour prediction algorithm. Exploiting the similarity between the texture and depth maps, the proposed algorithm codes depth blocks directly to calculate edge regions, which reduces the number of reference blocks. Verified with test sequences on HTM16.1, the proposed algorithm reduces coding time by 9.42% compared with the original algorithm. To avoid the time cost of serial coding on HTM, a parallelization design of the proposed algorithm based on a reconfigurable array processor (DPR-CODEC) is also proposed. The parallelization design reduces storage access time and configuration time, and saves storage cost. Verified on a Xilinx Virtex 6 FPGA, experimental results show that the parallelization design can process HD 1080p video at more than 30 frames per second. Compared with related work, the scheme reduces LUTs by 42.3%, REG by 85.5%, and hardware resources by 66.7%. The data loading speedup ratio of the parallel scheme can reach 3.4539. Averaged over the different template sizes, the serial/parallel speedup ratio of encoding time reaches 2.446. |
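The idea of deriving an edge partition directly from a depth block can be illustrated with a minimal sketch. It assumes the simplest possible rule: threshold the depth block at its own mean to obtain a two-region contour mask, then predict each region by its mean value. The function names (`contour_mask`, `region_means`, `predict_block`) and the mean-threshold rule are illustrative assumptions, not the paper's actual algorithm or the HTM implementation.

```python
def contour_mask(block):
    """Binary partition of a block: True where a sample exceeds the block mean.

    Assumption: a plain mean threshold; the paper's exact rule is not given
    in the abstract.
    """
    samples = [v for row in block for v in row]
    threshold = sum(samples) / len(samples)
    return [[v > threshold for v in row] for row in block]


def region_means(block, mask):
    """Mean sample value of the low (False) and high (True) regions."""
    sums = [0.0, 0.0]
    counts = [0, 0]
    for row, mask_row in zip(block, mask):
        for value, is_high in zip(row, mask_row):
            idx = 1 if is_high else 0
            sums[idx] += value
            counts[idx] += 1
    low = sums[0] / counts[0] if counts[0] else 0.0
    high = sums[1] / counts[1] if counts[1] else 0.0
    return low, high


def predict_block(block):
    """Predict every sample by the mean of the region it falls in."""
    mask = contour_mask(block)
    low, high = region_means(block, mask)
    return [[high if is_high else low for is_high in mask_row]
            for mask_row in mask]


if __name__ == "__main__":
    # Hypothetical 4x4 depth block with a sharp vertical edge.
    depth = [[10, 10, 200, 200] for _ in range(4)]
    for row in predict_block(depth):
        print(row)
```

Because each output sample depends only on the block mean and its own region, the per-sample comparisons are independent of one another, which is the kind of property that makes a rule like this amenable to parallel hardware.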