MODERN PROCESSOR DESIGN
Fundamentals of Superscalar Processors
JOHN PAUL SHEN ● MIKKO H. LIPASTI
Advanced Computer Architecture
Instructor: Professor 蔡智強 [2010.09]
John Paul Shen
John Paul Shen is the Director of Intel's Microarchitecture
Research Lab (MRL), providing leadership to about two dozen
highly skilled researchers located in Santa Clara, CA;
Hillsboro, OR; and Austin, TX. MRL is responsible for
developing innovative microarchitecture techniques that can
potentially be used in future microprocessor products from
Intel. MRL researchers collaborate closely with microarchitects
from product teams in joint advanced-development
efforts. MRL frequently hosts visiting faculty and Ph.D.
interns and conducts joint research projects with academic
research groups.
Prior to joining Intel in 2000, John was a professor in the
electrical and computer engineering department of Carnegie
Mellon University, where he headed up the CMU Microarchitecture Research Team
(CMuART). He has supervised a total of 16 Ph.D. students during his years at CMU.
Seven are currently with Intel, and five have faculty positions in academia. He won
multiple teaching awards at CMU. He was an NSF Presidential Young Investigator.
He is an IEEE Fellow and has served on the program committees of ISCA, MICRO,
HPCA, ASPLOS, PACT, ICCD, ITC, and FTCS.
He has published over 100 research papers in diverse areas, including
fault-tolerant computing, built-in self-test, process defect and fault analysis,
concurrent error detection, application-specific processors, performance evaluation,
compilation for instruction-level parallelism, value locality and prediction,
analytical modeling of superscalar processors, systematic microarchitecture test
generation, performance simulator validation, precomputation-based prefetching,
database workload analysis, and user-level helper threads.
John received his M.S. and Ph.D. degrees from the University of Southern
California, and his B.S. degree from the University of Michigan, all in electrical
engineering. He attended Kimball High School in Royal Oak, Michigan. He is
happily married and has three daughters. His family enjoys camping, road trips, and
reading The Lord of the Rings.
(continued on back inside cover)
Modern Processor Design
Fundamentals of Superscalar Processors
John Paul Shen
Intel Corporation
Mikko H. Lipasti
University of Wisconsin
Tata McGraw-Hill Publishing Company Limited
NEW DELHI
McGraw-Hill Offices
New Delhi New York St Louis San Francisco Auckland Bogota Caracas
Kuala Lumpur Lisbon London Madrid Mexico City Milan Montreal
San Juan Santiago Singapore Sydney Tokyo Toronto
Our parents:
Paul and Sue Shen
Tarja and Simo Lipasti
Our spouses:
Amy C. Shen
Erica Ann Lipasti
Our children:
Priscilla S. Shen, Rachael S. Shen, and Valentia C. Shen
Emma Kristiina Lipasti and Elias Joel Lipasti
Tata McGraw-Hill
MODERN PROCESSOR DESIGN: FUNDAMENTALS OF
SUPERSCALAR PROCESSORS
Copyright © 2005 by The McGraw-Hill Companies, Inc.
All rights reserved. No part of this publication may be reproduced or distributed
in any form or by any means, or stored in a data base or retrieval system, without the
prior written consent of The McGraw-Hill Companies, Inc., including, but not
limited to, in any network or other electronic storage or transmission, or broadcast
for distance learning.
Some ancillaries, including electronic and print components, may not be available
to customers outside the United States.
Tata McGraw-Hill Edition
RZXQCRBIRQQDD
Reprinted in India by arrangement with The McGraw-Hill Companies, Inc.,
New York
Sales territories: India, Pakistan, Nepal, Bangladesh, Sri Lanka and Bhutan
ISBN 0-07-059033-8
Published by Tata McGraw-Hill Publishing Company Limited,
7 West Patel Nagar, New Delhi 110 008, and printed at
Shivam Printers, Delhi 110 032
The McGraw-Hill Companies
Table of Contents
Table of Contents
Additional Resources
Preface
1 Processor Design
1.1 The Evolution of Microprocessors
1.2 Instruction Set Processor Design
1.2.1 Digital Systems Design
1.2.2 Architecture, Implementation, and Realization
1.2.3 Instruction Set Architecture
1.2.4 Dynamic-Static Interface
1.3 Principles of Processor Performance
1.3.1 Processor Performance Equation
1.3.2 Processor Performance Optimizations
1.3.3 Performance Evaluation Method
1.4 Instruction-Level Parallel Processing
1.4.1 From Scalar to Superscalar
1.4.2 Limits of Instruction-Level Parallelism
1.4.3 Machines for Instruction-Level Parallelism
1.5 Summary
2 Pipelined Processors
2.1 Pipelining Fundamentals
2.1.1 Pipelined Design
2.1.2 Arithmetic Pipeline Example
2.1.3 Pipelining Idealism
2.1.4 Instruction Pipelining
2.2 Pipelined Processor Design
2.2.1 Balancing Pipeline Stages
2.2.2 Unifying Instruction Types
2.2.3 Minimizing Pipeline Stalls
2.2.4 Commercial Pipelined Processors
2.3 Deeply Pipelined Processors
2.4 Summary
3 Memory and I/O Systems
3.1 Introduction
3.2 Computer System Overview
3.3 Key Concepts: Latency and Bandwidth
3.4 Memory Hierarchy
3.4.1 Components of a Modern Memory Hierarchy
3.4.2 Temporal and Spatial Locality
3.4.3 Caching and Cache Memories
3.4.4 Main Memory
3.5 Virtual Memory Systems
3.5.1 Demand Paging
3.5.2 Memory Protection
3.5.3 Page Table Architectures
3.6 Memory Hierarchy Implementation
3.7 Input/Output Systems
3.7.1 Types of I/O Devices
3.7.2 Computer System Busses
3.7.3 Communication with I/O Devices
3.7.4 Interaction of I/O Devices and Memory Hierarchy
3.8 Summary
4 Superscalar Organization
4.1 Limitations of Scalar Pipelines
4.1.1 Upper Bound on Scalar Pipeline Throughput
4.1.2 Inefficient Unification into a Single Pipeline
4.1.3 Performance Lost Due to a Rigid Pipeline
4.2 From Scalar to Superscalar Pipelines
4.2.1 Parallel Pipelines
4.2.2 Diversified Pipelines
4.2.3 Dynamic Pipelines
4.3 Superscalar Pipeline Overview
4.3.1 Instruction Fetching
4.3.2 Instruction Decoding
4.3.3 Instruction Dispatching
4.3.4 Instruction Execution
4.3.5 Instruction Completion and Retiring
4.4 Summary
5 Superscalar Techniques
5.1 Instruction Flow Techniques
5.1.1 Program Control Flow and Control Dependences
5.1.2 Performance Degradation Due to Branches
5.1.3 Branch Prediction Techniques
5.1.4 Branch Misprediction Recovery
5.1.5 Advanced Branch Prediction Techniques
5.1.6 Other Instruction Flow Techniques
5.2 Register Data Flow Techniques
5.2.1 Register Reuse and False Data Dependences
5.2.2 Register Renaming Techniques
5.2.3 True Data Dependences and the Data Flow Limit
5.2.4 The Classic Tomasulo Algorithm
5.2.5 Dynamic Execution Core
5.2.6 Reservation Stations and Reorder Buffer
5.2.7 Dynamic Instruction Scheduler
5.2.8 Other Register Data Flow Techniques
5.3 Memory Data Flow Techniques
5.3.1 Memory Accessing Instructions
5.3.2 Ordering of Memory Accesses
5.3.3 Load Bypassing and Load Forwarding
5.3.4 Other Memory Data Flow Techniques
5.4 Summary
6 The PowerPC 620
6.1 Introduction
6.2 Experimental Framework
6.3 Instruction Fetching
6.3.1 Branch Prediction
6.3.2 Fetching and Speculation
6.4 Instruction Dispatching
6.4.1 Instruction Buffer
6.4.2 Dispatch Stalls
6.4.3 Dispatch Effectiveness
6.5 Instruction Execution
6.5.1 Issue Stalls
6.5.2 Execution Parallelism
6.5.3 Execution Latency
6.6 Instruction Completion
6.6.1 Completion Parallelism
6.6.2 Cache Effects
6.7 Conclusions and Observations
6.8 Bridging to the IBM POWER3 and POWER4
6.9 Summary
7 Intel's P6 Microarchitecture
7.1 Introduction
7.1.1 Basics of the P6 Microarchitecture
7.2 Pipelining
7.2.1 In-Order Front-End Pipeline
7.2.2 Out-of-Order Core Pipeline
7.2.3 Retirement Pipeline
7.3 The In-Order Front End
7.3.1 Instruction Cache and ITLB
7.3.2 Branch Prediction
7.3.3 Instruction Decoder
7.3.4 Register Alias Table
7.3.5 Allocator