Li Ce, Zhang Longbing. Design of graph data prefetcher based on community structure[J]. Chinese High Technology Letters, 2022, 32(12): 1251-1261 |
Design of graph data prefetcher based on community structure |
DOI: 10.3772/j.issn.1002-0470.2022.12.005 |
Keywords: graph analytics; prefetcher; community structure; storage regularity; timeliness |
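Author | Affiliation |
Li Ce (李策) | State Key Laboratory of Computer Architecture (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190; School of Computer Science, University of Chinese Academy of Sciences, Beijing 100190 |
Zhang Longbing (章隆兵) | State Key Laboratory of Computer Architecture (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190; School of Computer Science, University of Chinese Academy of Sciences, Beijing 100190 |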
Abstract: |
Graph data are large and irregularly structured, so graph applications generate many high-latency memory accesses at run time, which greatly reduces the efficiency of general-purpose processors. This paper designs a dedicated prefetcher for graph analytics using a combined software and hardware approach. By exploiting the memory access characteristics of graph data and the storage regularity of community structure, the prefetcher performs hybrid prefetching of graph data, shortening the memory access latency of graph analytics and yielding significant performance gains on graph datasets that contain many communities. Experiments on different graph algorithms and graph datasets show that the prefetcher achieves a 65%-176% performance improvement over the no-prefetch baseline, 6%-21% over the stream prefetcher, and 4%-18% over the traditional graph data prefetcher. |