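Tan Hongze, Wang Jian. A storage efficient last-level branch target buffer based on dynamic compression [J]. 高技术通讯 (Chinese High Technology Letters, Chinese edition), 2024, 34(7): 671-680 |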
A storage efficient last-level branch target buffer based on dynamic compression |
|
DOI: 10.3772/j.issn.1002-0470.2024.07.001 |
Keywords: branch prediction; branch target buffer (BTB); compression; skewed associativity |
Funding: |
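Author | Affiliation |
Tan Hongze | (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190); (University of Chinese Academy of Sciences, Beijing 100049) |
Wang Jian | |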
|
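Chinese abstract (in translation): |
As software systems grow in scale and complexity, the sheer number of instructions causes frequent misses in instruction caches and branch target buffers (BTBs), degrading central processing unit (CPU) performance. Modern industrial CPU designs use sufficiently large multi-level BTBs in a decoupled front end to reduce the performance loss caused by misses. Because the storage resources of a real chip are limited, a large-capacity last-level BTB needs higher storage efficiency. However, existing compressed BTBs statically allocate storage space for target offsets and cannot adjust to the actual storage needs of each branch, which results in low storage efficiency. To address this problem, this work proposes ZBTB, a BTB based on dynamic compression. ZBTB represents target offsets with a variable-length encoding, dynamically allocates target-offset storage space, and combines this with least-recently-used (LRU) replacement that needs no extra storage and with skewed associativity to mitigate conflicts, thereby improving storage efficiency. In an evaluation based on the traces released for the First Instruction Prefetching Championship (IPC-1), ZBTB reduces the number of mispredictions by 66% compared with existing BTBs at a capacity of 33.5 kB. |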
English abstract: |
With the increasing size and complexity of software systems, the large number of instructions causes frequent misses in instruction caches and branch target buffers (BTBs) and hurts central processing unit (CPU) performance. Modern industrial CPU designs use sufficiently large multi-level BTBs in a decoupled front end to reduce the performance degradation from misses, which results in vast BTB storage requirements. However, current compressed BTBs use static allocation policies that cannot adapt to upcoming branches. To overcome the limitations of current BTBs, this work proposes a dynamically compressed BTB called the zipped branch target buffer (ZBTB). ZBTB uses an adaptive allocation policy enabled by variable-length target offsets, together with a storage-free least-recently-used (LRU) replacement policy and skewed associativity to reduce conflicts. ZBTB is evaluated on traces from the First Instruction Prefetching Championship (IPC-1). Compared with state-of-the-art storage-efficient BTBs, ZBTB can reduce misses by over 66% with a 33.5 kB storage budget. |
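The storage argument behind variable-length target offsets can be illustrated with a small calculation. The sketch below is not the ZBTB encoding, which the abstract does not specify; it only assumes that a target is stored as a signed offset from the branch address and compares a fixed-width slot per entry with a width sized to each offset. The function names and the sample addresses are hypothetical, and the bits a real design would need to record each offset's length are ignored.

```python
def offset_bits(branch_pc: int, target_pc: int) -> int:
    """Minimum two's-complement width, in bits, that holds target_pc - branch_pc."""
    offset = target_pc - branch_pc
    # A non-negative offset needs its magnitude bits plus one sign bit;
    # a negative offset needs the magnitude bits of its complement plus one sign bit.
    magnitude = offset if offset >= 0 else ~offset
    return magnitude.bit_length() + 1


def static_bits(branches, slot_bits: int) -> int:
    """Offset storage when every entry reserves the same fixed-width slot."""
    return slot_bits * len(branches)


def dynamic_bits(branches) -> int:
    """Offset storage when each entry takes only the bits its own offset needs.

    Assumption: the cost of encoding each offset's length is not counted here.
    """
    return sum(offset_bits(pc, target) for pc, target in branches)


if __name__ == "__main__":
    # Hypothetical (branch address, target address) pairs: two short jumps, one long.
    sample = [(0x1000, 0x1008), (0x2000, 0x1FF0), (0x3000, 0x7000)]
    widest = max(offset_bits(pc, target) for pc, target in sample)
    print("fixed-width total:", static_bits(sample, widest))
    print("per-offset total:", dynamic_bits(sample))
```

With a fixed slot, every entry pays for the widest offset that must be representable, so short-range branches waste most of their slot; sizing storage per offset removes that waste at the cost of some length bookkeeping.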