Mastering GPGPU Programming: A Practical Guide to High-Performance Computing for Games and Science
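Updated 2024-07-20
"GPGPU Programming for Games and Science" is a practical, in-depth guide to high-performance computing with general-purpose GPU (GPGPU) techniques, aimed at game developers and other computing professionals. Written by David H. Eberly, it combines principles, practice, and software engineering, and is intended to help readers understand and apply GPGPU techniques in a DirectX 11 environment.
Besides the theory, the book provides many examples and high-quality source code. Its sample algorithms and complete compute and graphics engine let readers start writing shaders without first building a large amount of infrastructure code, so even beginners can put together simple applications quickly without reinventing the wheel.
GPGPU stands for general-purpose computing on graphics processing units: using GPU hardware for numerical computation as well as for graphics rendering. It exploits the GPU's parallelism to accelerate work that would be slow on a CPU, particularly large-scale data processing and scientific computing such as physics simulation, machine learning, and image processing. The book shows how to move compute-intensive workloads onto the GPU effectively to improve application performance.
The summary lists CUDA (NVIDIA's parallel computing platform), OpenCL (a cross-platform parallel computing standard), and the DirectX 11 shader programming model among the key concepts covered. It also mentions hardware features such as texture memory, thread blocks, and shared memory, and the optimization of data layout and memory access for performance.
The book is a useful reference both for developers who want GPU acceleration in games and for researchers who want to use graphics hardware for scientific computing.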
List of Tables
2.1 The binary encodings for 8-bit floating-point numbers
2.2 Quantities of interest for binary8
2.3 Quantities of interest for binary16
2.4 Quantities of interest for binary32
2.5 Quantities of interest for binary64
3.1 SIMD comparison operators
3.2 SIMD arithmetic operators
3.3 Inverse square root accuracy and performance
3.4 Minimax polynomial approximations to $\sqrt{1+x}$
3.5 Minimax polynomial approximations to $f(x) = 1/\sqrt{1+x}$
3.6 Minimax polynomial approximations to $f(x) = \sin(x)$
3.7 Minimax polynomial approximations to $f(x) = \cos(x)$
3.8 Minimax polynomial approximations to $f(x) = \tan(x)$
3.9 Minimax polynomial approximations to $f(x) = \mathrm{asin}(x)$
3.10 Minimax polynomial approximations to $f(x) = (\pi/2 - \mathrm{asin}(x))/\sqrt{1-x}$
3.11 Minimax polynomial approximations to $f(x) = \mathrm{atan}(x)$
3.12 Minimax polynomial approximations to $f(x) = 2^x$
3.13 Minimax polynomial approximations to $f(x) = \log_2(1+x)$
4.1 The transformation pipeline
5.1 Vertex and pixel shader performance measurements
5.2 Compute shader performance measurements
5.3 Depth, stencil, and culling state performance measurements
6.1 Error balancing for several n in the Remez algorithm
6.2 Rotation conventions
7.1 Numerical ill conditioning for least squares
7.2 Performance comparisons for convolution implementations
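The binary32 entries above concern the IEEE 754 single-precision layout: 1 sign bit, 8 biased-exponent bits (bias 127), and 23 trailing-significand bits. As a rough illustration of that layout, and not the book's own code, the following C++ sketch splits a `float` into those fields and reassembles it. The names `Binary32Fields`, `Decode`, and `Encode` are hypothetical, and the sketch assumes `float` is a 32-bit IEEE 754 type; it copies bits with `std::memcpy` rather than a union.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical container for the three fields of a binary32 encoding.
struct Binary32Fields
{
    std::uint32_t sign;      // 1 bit
    std::uint32_t biased;    // 8-bit biased exponent (bias 127)
    std::uint32_t trailing;  // 23-bit trailing significand
};

// Assumption: float is the 32-bit IEEE 754 binary32 type on this platform.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

// Split a float into sign, biased exponent, and trailing significand.
inline Binary32Fields Decode(float x)
{
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));  // well-defined bit copy
    return { bits >> 31, (bits >> 23) & 0xFFu, bits & 0x007FFFFFu };
}

// Reassemble a float from its three fields.
inline float Encode(Binary32Fields const& f)
{
    assert(f.sign <= 1u && f.biased <= 0xFFu && f.trailing <= 0x007FFFFFu);
    std::uint32_t bits = (f.sign << 31) | (f.biased << 23) | f.trailing;
    float x;
    std::memcpy(&x, &bits, sizeof(x));
    return x;
}
```

For example, `Decode(1.0f)` yields a zero sign, a biased exponent of 127, and a zero trailing significand, and `Encode(Decode(x))` returns `x` unchanged for any non-NaN input.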
Listings
2.1 Inexact representation of floating-point inputs
2.2 Simple implementation for computing distance between two points
2.3 Incorrect distance computation due to input problems
2.4 Conversion of rational numbers to binary scientific numbers
2.5 A union is used to allow accessing a floating-point number or manipulating its bits via an unsigned integer
2.6 Decoding an 8-bit floating-point number
2.7 Decoding a 16-bit floating-point number
2.8 Decoding a 32-bit floating-point number
2.9 Decoding a 64-bit floating-point number
2.10 Integer and unsigned integer quantities that are useful for encoding and decoding floating-point numbers
2.11 The general decoding of floating-point numbers
2.12 Convenient wrappers for processing encodings of floating-point numbers
2.13 Classification of floating-point numbers
2.14 Queries about floating-point numbers
2.15 An implementation of the nextUp(x) function
2.16 An implementation of the nextDown(x) function
2.17 An implementation of rounding with ties-to-even
2.18 An implementation of rounding with ties-to-away
2.19 An implementation of rounding toward zero
2.20 An implementation of rounding toward positive
2.21 An implementation of rounding toward negative
2.22 Conversion of a 32-bit signed integer to a 32-bit floating-point number
2.23 Conversion from a 32-bit floating-point number to a rational number
2.24 Conversion from a 64-bit floating-point number to a rational number
2.25 Conversion from a rational number to a 32-bit floating-point number
2.26 Conversion of an 8-bit floating-point number to a 16-bit floating-point number
2.27 Conversion of a narrow floating-point format to a wide floating-point format
2.28 Conversion from a wide floating-point format to a narrow floating-point format
2.29 The conversion of a wide-format number to a narrower format
2.30 Correctly rounded result for square root
2.31 The standard mathematics library functions
2.32 Subtractive cancellation in floating-point arithmetic
2.33 Another example of subtractive cancellation and how bad it can be
2.34 Numerically incorrect quadratic roots when using the modified quadratic formula
2.35 An example of correct root finding, although at first glance they look incorrect
2.36 The example of Listing 2.35 but computed using double-precision numbers
3.1 Computing a dot product of 4-tuples using SSE2
3.2 Computing the matrix-vector product as four row-vector dot products in SSE2
3.3 Computing the matrix-vector product as a linear combination of columns in SSE2
3.4 Computing the matrix-vector product as four row-vector dot products in SSE4.1
3.5 Transpose of a 4 × 4 matrix using shuffling
3.6 Normalizing a vector using SSE2 with a break in the pipeline
3.7 Normalizing a vector using SSE2 without a break in the pipeline
3.8 The definition of the Select function for flattening branches
3.9 Flattening a single branch
3.10 Flattening a two-level branch where the outer-then clause has a nested branch
3.11 Flattening a two-level branch where the outer-else clause has a nested branch
3.12 Flattening a two-level branch where the outer clauses have nested branches
3.13 A fast approximation to 1/sqrt(x) for 32-bit floating-point
3.14 A fast approximation to 1/sqrt(x) for 64-bit floating-point
3.15 One Remez iteration for updating the locations of the local extrema
4.1 A vertex shader and a pixel shader for simple vertex coloring of geometric primitives
4.2 A vertex shader and a pixel shader for simple texturing of geometric primitives
4.3 HLSL code to draw square billboards
4.4 A compute shader that implements small-scale Gaussian blurring
4.5 The output assembly listing for the vertex shader of VertexColoring.hlsl for row-major matrix storage
4.6 The output assembly listing for the matrix-vector product of the vertex shader of VertexColoring.hlsl for column-major matrix storage
4.7 The output assembly listing for the pixel shader of VertexColoring.hlsl
4.8 The output assembly listing for the pixel shader of Texturing.hlsl
4.9 The output assembly listing for the vertex shader of Billboards.hlsl
4.10 The output assembly listing for the geometry shader of Billboards.hlsl
4.11 The output assembly listing for the pixel shader of Billboards.hlsl
4.12 The output assembly listing for the compute shader of GaussianBlurring.hlsl
4.13 The output assembly listing for the compute shader of GaussianBlurring.hlsl with loop unrolling
4.14 The signature for the D3DCompile function
4.15 The signature for the D3DReflect function
4.16 Compile an HLSL program at runtime and start the shader reflection system
4.17 An example of nested structs for which constant buffers have one member layout but structured buffers have another member layout
4.18 A modified listing of the FXC output from the compute shader of Listing 4.17
4.19 The non-default-value members of D3D11_SHADER_DESC for the compute shader of Listing 4.17
4.20 Descriptions about the constant buffers in the compute shader of Listing 4.17
4.21 Creating a swap chain for displaying graphics data to a window
4.22 Creating a back buffer
4.23 Common code for setting the usage and CPU access for a description structure
4.24 The description for a shader resource view and the code to create the view
4.25 The description for an unordered access view and the code to create the view
4.26 The descriptions for render target and depth-stencil views and the code to create the views
4.27 Common code for creating an ID3D11Buffer object
4.28 Creating a constant buffer
4.29 Creating a texture buffer
4.30 Creating a vertex buffer
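The entry for Listing 3.13 names a fast approximation to 1/sqrt(x) for 32-bit floating-point numbers. The preview does not show the book's implementation, so the C++ sketch below uses the widely known bit-manipulation variant instead: an initial guess from the constant `0x5F3759DF`, followed by one Newton-Raphson step. The function name `FastInvSqrt` is hypothetical, and the sketch assumes a 32-bit IEEE 754 `float` and a positive, finite, normal input.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumption: float is the 32-bit IEEE 754 binary32 type on this platform.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

// Approximate 1/sqrt(x) for positive, finite, normal x.
// This is the well-known bit-trick variant, not necessarily the book's method.
inline float FastInvSqrt(float x)
{
    assert(x > 0.0f);

    // Reinterpret the float's bits as an unsigned integer.
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));

    // Halving the bit pattern roughly halves the exponent, and subtracting
    // from the constant negates it, which gives a crude guess for x^(-1/2).
    bits = 0x5F3759DFu - (bits >> 1);

    float y;
    std::memcpy(&y, &bits, sizeof(y));

    // One Newton-Raphson step on g(y) = 1/y^2 - x to refine the guess.
    y = y * (1.5f - 0.5f * x * y * y);
    return y;
}
```

With a single refinement step the result is typically within a fraction of a percent of the exact value, which suits tasks such as normalizing direction vectors. Further Newton-Raphson steps trade speed for accuracy.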