Journal on Communications (通信学报), Vol.37, No.9, September 2016
Sliding-window-based hardware algorithm for detecting data races in multi-core programs
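ZHU Su-xia (1,2), CHEN De-yun (1,2), JI Zhen-zhou (3), SUN Guang-lu (1)
(1. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, Heilongjiang, China;
2. Postdoctoral Research Station, School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, Heilongjiang, China;
3. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, Heilongjiang, China)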
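Abstract: Data races are a main cause of concurrency bugs in multi-core programs. To address the high hardware overhead of existing hardware-based happens-before data race detection methods, a lightweight hardware algorithm for detecting memory races is proposed. The algorithm uses sliding windows to dynamically detect data races that occur at a short distance during program execution and are therefore more likely to trigger concurrency bugs. Taking the race distance into account, concurrent thread segments are subdivided into locked concurrent race regions and unlocked concurrent race regions that contain the thread's recent execution sequence. A pair of alternately moving, rewritable sliding windows holds the memory operations in the unlocked concurrent race region, and one rewritable sliding window of variable size holds the memory operations in the locked concurrent race region. A data race is detected when a remote shared access conflicts with a memory access held in a window. In the hardware implementation, only three pairs of small hardware signature registers are added to each processor core to hold the data addresses in the concurrent race regions, and the existing cache coherence protocol is left unchanged. The bandwidth overhead is low, and dynamic data races that arise during the concurrent execution of multi-core programs are detected quickly, which provides effective guidance for diagnosing concurrency bugs during both development and production runs.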
Key words: data race; sliding window; hardware signature; concurrency bug; multi-core program
CLC number: TP303    Document code: A
Hardware data race detection algorithm based on sliding windows
ZHU Su-xia (1,2), CHEN De-yun (1,2), JI Zhen-zhou (3), SUN Guang-lu (1)
(1. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China;
2. Postdoctoral Research Station, School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China;
3. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China)
Abstract: Data races are a major cause of concurrency bugs in multi-core programs. To address the high hardware cost of existing happens-before detection proposals, a lightweight hardware data race detection approach based on sliding windows is proposed. It uses sliding windows to save the recent memory instructions of each thread and dynamically detects data races with a small race distance, which are more likely to lead to concurrency bugs. Considering the race distance, parallel thread segments are subdivided into concurrent race regions with lock and concurrent race regions without lock. A pair of alternating rewritable sliding windows stores the memory instructions in the concurrent race region without lock, and a sliding window of variable size stores the memory instructions in the concurrent race region with lock. A data race is detected when a remote shared access conflicts with a memory access held in a sliding window. In the hardware implementation, the addresses of the data in the sliding windows are automatically encoded into three pairs of small hardware signatures per processor core. Data races can be detected quickly without modifying the L1 cache or the cache coherence protocol messages. The approach provides effective guidance for diagnosing concurrency bugs that occur during the development and production runs of multi-core programs, with small hardware and bandwidth overhead.
Key words: data race, sliding window, hardware signature, concurrency bug, multi-core program
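The detection idea summarized in the abstract can be illustrated with a small software model. The following is a minimal sketch under stated assumptions, not the paper's hardware design: the class names (`Signature`, `Window`, `CoreDetector`), the two hash functions, the 256-bit signature width, and the fixed window size are all hypothetical choices made for illustration. It models only the pair of alternating windows for the region without lock; the variable-size window for the region with lock is omitted.

```python
class Signature:
    """Bloom-filter-style address set, a software stand-in for a hardware signature."""

    def __init__(self, bits=256):
        self.bits = bits
        self.vector = 0  # bit vector stored as a Python int

    def _positions(self, addr):
        # Two simple hash functions (an assumption, not taken from the paper).
        return (addr % self.bits, ((addr * 2654435761) >> 7) % self.bits)

    def insert(self, addr):
        for p in self._positions(addr):
            self.vector |= 1 << p

    def may_contain(self, addr):
        # May report false positives, never false negatives.
        return all((self.vector >> p) & 1 for p in self._positions(addr))

    def clear(self):
        self.vector = 0


class Window:
    """One sliding window: a read/write signature pair plus an access counter."""

    def __init__(self, bits):
        self.reads = Signature(bits)
        self.writes = Signature(bits)
        self.count = 0

    def record(self, addr, is_write):
        (self.writes if is_write else self.reads).insert(addr)
        self.count += 1

    def conflicts(self, addr, remote_is_write):
        # A conflict needs the same address and at least one write.
        if self.writes.may_contain(addr):
            return True
        return remote_is_write and self.reads.may_contain(addr)

    def clear(self):
        self.reads.clear()
        self.writes.clear()
        self.count = 0


class CoreDetector:
    """Per-core detector with two alternating, rewritable windows."""

    def __init__(self, window_size=4, bits=256):
        self.window_size = window_size
        self.windows = [Window(bits), Window(bits)]
        self.active = 0

    def local_access(self, addr, is_write):
        if self.windows[self.active].count == self.window_size:
            # Active window is full: switch to the other one and overwrite it,
            # so the oldest recorded accesses age out of the recent history.
            self.active ^= 1
            self.windows[self.active].clear()
        self.windows[self.active].record(addr, is_write)

    def remote_access(self, addr, is_write):
        # A remote shared access is checked against both windows.
        return any(w.conflicts(addr, is_write) for w in self.windows)
```

In this model the recorded history always covers between `window_size` and `2 * window_size` recent local accesses, which is one plausible reading of "a pair of alternating rewritable sliding windows". Because signatures are lossy, a reported conflict may be a false positive.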
Received: 2016-04-05; Revised: 2016-07-14
Foundation Items: The National Natural Science Foundation of China for Youths (No.61502123), Heilongjiang Province Science Foundation for Youths (No.QC2015084), The China Postdoctoral Science Foundation (No.2015M571429), The National Natural Science Foundation of China (No.61472100), The National Basic Research Program of China (973 Program) (No.2011CB302501)
doi:10.11959/j.issn.1000-436x.2016173