Experimental Exploration of Intel Cache Allocation Technology

Updated 2024-07-14 · PDF, 960 KB
"这篇资源是Pawel Szostek在2015年HTCCC会议上关于Intel缓存分配技术的一些实验展示。主要内容涉及现代计算机架构中的缓存机制、Intel的Cache Allocation Technology(CAT)以及Cache Monitoring Technology(CMT)的工作原理和应用。" 在现代计算机科学中,缓存是提升内存访问速度的关键技术。缓存存储数据以备后续使用,只有当数据被重复利用时,缓存才能发挥其优势。对于流式应用,如果数据不被重用,缓存则无法提供性能提升。在现代x86架构中,通常有三级缓存——L1数据缓存(L1D)、L1指令缓存(L1I)、L2缓存和L3缓存(也称为Last Level Cache,LLC)。L3缓存是多核处理器中所有核心共享的,而L1和L2缓存则是包含关系,意味着所有更高级别的缓存都包含低级别缓存的所有数据。 Intel的Cache Allocation Technology(CAT)旨在将L3缓存划分为多个部分,并将这些部分独立开来,允许为每个核心定义不同的分配类别。这样做的好处是可以动态指定哪些缓存区域可以在从主内存引入新缓存行时被替换,同时结合进程固定(pinning)或未来的控制组(cgroups),可以减少缓存污染。这一技术在某些Haswell SKU(如E5-25x8v3)上可用。 与CAT一同提供的还有Cache Monitoring Technology(CMT)。CMT允许对每个核心的L3缓存分配进行监控,这有助于分析和优化系统性能,理解不同核心如何使用L3缓存,以及在运行多线程应用时如何更有效地管理缓存资源。 这些实验和介绍对理解Intel处理器的高级特性,尤其是如何通过调整缓存策略来提升多核系统的性能,具有重要的参考价值。开发者和系统管理员可以通过应用CAT和CMT,更精确地控制和监控缓存行为,从而优化应用程序的运行效率。
Uploaded 2023-02-13

4 Experiments

This section examines the effectiveness of the proposed IFCS-MOEA framework. First, Section 4.1 presents the experimental settings. Second, Section 4.2 examines the effect of IFCS on MOEA/D-DE. Then, Section 4.3 compares the performance of IFCS-MOEA/D-DE with five state-of-the-art MOEAs on 19 test problems. Finally, Section 4.4 compares the performance of IFCS-MOEA/D-DE with five state-of-the-art MOEAs on four real-world application problems.

4.1 Experimental Settings

MOEA/D-DE [23] is integrated with the proposed framework for the experiments, and the resulting algorithm is named IFCS-MOEA/D-DE. Five surrogate-based MOEAs, i.e., FCS-MOEA/D-DE [39], CPS-MOEA [41], CSEA [29], MOEA/D-EGO [43], and EDN-ARMOEA [12], are used for comparison. The UF1–10 and LZ1–9 test problems [44, 23], which have complicated PSs, are used for the experiments. Among them, UF1–7, LZ1–5, and LZ7–9 have 2 objectives, and UF8–10 and LZ6 have 3 objectives. UF1–10, LZ1–5, and LZ9 have 30 decision variables, and LZ6–8 have 10 decision variables. The population size N is set to 45 for all compared algorithms. The maximum number of FEs is set to 500, since the problems are viewed as expensive MOPs [39]. For each test problem, each algorithm is executed 21 times independently. For IFCS-MOEA/D-DE, wmax is set to 30 and η is set to 5. For the other algorithms, we use the settings suggested in their papers. The IGD [6] metric is used to evaluate the performance of each algorithm. All algorithms are examined on the PlatEMO [34] platform.
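For readers unfamiliar with the IGD metric mentioned above, the following is a minimal sketch of its commonly used definition: the mean, over a set of reference points sampled from the true Pareto front, of the Euclidean distance to the nearest obtained solution. The function name `igd` and the three-point reference front are illustrative assumptions; this is not the PlatEMO implementation or the sampling used in the paper.

```python
import math


def igd(reference_front, obtained_set):
    """Inverted generational distance: lower is better, 0 means every
    reference point is matched exactly by some obtained solution."""
    if not reference_front or not obtained_set:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for ref in reference_front:
        # Distance from this reference point to its closest obtained solution.
        total += min(math.dist(ref, sol) for sol in obtained_set)
    return total / len(reference_front)


# Hypothetical 2-objective reference front on the line f1 + f2 = 1.
reference = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]

perfect = igd(reference, reference)                  # all points covered
sparse = igd(reference, [(0.0, 1.0), (1.0, 0.0)])    # middle point missed

print(f"IGD, full coverage:   {perfect:.4f}")
print(f"IGD, missing middle:  {sparse:.4f}")
```

Because every reference point contributes to the mean, IGD penalizes both poor convergence and gaps in coverage of the front, which is why a set that misses the middle of the front scores worse here even though its two points lie exactly on it.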

Uploaded 2023-05-24