GPU Gems 3: Exploring Modern GPU Programming Techniques

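"GPU Gems 3 is a professional book published by Addison-Wesley Professional on August 12, 2007, focusing on the latest developments in programming modern graphics processing units (GPUs). As GPUs have become more programmable, developers can not only implement their own custom designs but also apply this computing power to nongraphics applications such as physics simulation, financial analysis, and even virus detection, particularly with the support of the CUDA architecture. Although graphics remains the primary use of the GPU, the latest algorithms presented in the book can produce ultrarealistic characters, better lighting, and post-rendering compositing effects."

As the third volume in the series, GPU Gems 3 aims to distill essential knowledge about graphics rendering and real-time computing. Its chapters cover several key areas, including geometry, animation, and DirectX 10. Some of the content is described below:

1. Part I: Geometry
 - Chapter 1: Generating Complex Procedural Terrains Using the GPU. This chapter explores how to use the GPU to generate detailed terrain. It introduces the Marching Cubes algorithm and density functions, gives an overview of the terrain generation system, explains how polygons are generated within a terrain block, and covers texturing and shading techniques. The author also discusses practical considerations such as performance optimization.
 - Chapter 2: Animated Crowd Rendering. Rendering animated crowds is a common requirement in games and film visual effects. This chapter presents the motivation, briefly reviews instancing, details the techniques for rendering crowds efficiently, and considers other factors such as dynamic behavior and collision detection.

2. DirectX 10
Chapter 3 deals with DirectX 10, the graphics API from Microsoft that supports next-generation games and applications. This part may cover the new features of DirectX 10, hardware acceleration, the use of Shader Model 4.0, and how to use these tools to improve graphics quality and performance.

Each chapter ends with a list of references for readers who want to study a topic further. GPU Gems 3 is a valuable resource for engineers and technical staff who want to improve their GPU programming skills, especially those working on real-time rendering and 3D graphics. By studying the examples and practices in the book, readers can learn current GPU programming techniques and apply the power of the GPU to a wide range of scenarios.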

Section 33.1. Parallel Processing

Section 33.2. The Physics Pipeline

Section 33.3. Determining Contact Points

Section 33.4. Mathematical Optimization

Section 33.5. The Convex Distance Calculation

Section 33.6. The Parallel LCP Solution Using CUDA

Section 33.7. Results

Section 33.8. References

Chapter 34. Signed Distance Fields Using Single-Pass GPU Scan Conversion of Tetrahedra

Section 34.1. Introduction

Section 34.2. Leaking Artifacts in Scan Methods

Section 34.3. Our Tetrahedra GPU Scan Method

Section 34.4. Results

Section 34.5. Conclusion

Section 34.6. Future Work

Section 34.7. Further Reading

Section 34.8. References

Part VI: GPU Computing

Chapter 35. Fast Virus Signature Matching on the GPU

Section 35.1. Introduction

Section 35.2. Pattern Matching

Section 35.3. The GPU Implementation

Section 35.4. Results

Section 35.5. Conclusions and Future Work

Section 35.6. References

Chapter 36. AES Encryption and Decryption on the GPU

Section 36.1. New Functions for Integer Stream Processing

Section 36.2. An Overview of the AES Algorithm

Section 36.3. The AES Implementation on the GPU

Section 36.4. Performance

Section 36.5. Considerations for Parallelism

Section 36.6. Conclusion and Future Work

Section 36.7. References

Chapter 37. Efficient Random Number Generation and Application Using CUDA

Section 37.1. Monte Carlo Simulations

Section 37.2. Random Number Generators

Section 37.3. Example Applications

Section 37.4. Conclusion

Section 37.5. References

Chapter 38. Imaging Earth's Subsurface Using CUDA

Section 38.1. Introduction

Section 38.2. Seismic Data

Section 38.3. Seismic Processing

Section 38.4. The GPU Implementation

Section 38.5. Performance

Section 38.6. Conclusion

Section 38.7. References

Chapter 39. Parallel Prefix Sum (Scan) with CUDA

Section 39.1. Introduction

Section 39.2. Implementation

Section 39.3. Applications of Scan

Section 39.4. Conclusion

Section 39.5. References

Chapter 40. Incremental Computation of the Gaussian

Section 40.1. Introduction and Related Work

Section 40.2. Polynomial Forward Differencing

Section 40.3. The Incremental Gaussian Algorithm

Section 40.4. Error Analysis

Section 40.5. Performance

Section 40.6. Conclusion

Section 40.7. References

Chapter 41. Using the Geometry Shader for Compact and Variable-Length GPU Feedback

Section 41.1. Introduction

Section 41.2. Why Use the Geometry Shader?

Section 41.3. Dynamic Output with the Geometry Shader

Section 41.4. Algorithms and Applications

Section 41.5. Benefits: GPU Locality and SLI

Section 41.6. Performance and Limits

Section 41.7. Conclusion

Section 41.8. References

Addison-Wesley Warranty on the DVD

NVIDIA Statement on the Software

DVD System Requirements

Inside Back Cover

Geometry

Light and Shadows

Rendering

Image Effects

Physics Simulation

GPU Computing

Index

About the Cover: The image on the cover has been rendered in real time in the "Human Head" technology demonstration created by the NVIDIA Demo Team. It illustrates the extreme level of realism achievable with the GeForce 8 Series of GPUs. The demo renders skin by using a physically based model that was previously used only in high-profile prerendered movie projects. Actor Doug Jones is the model represented in the demo. He recently starred as the Silver Surfer in Fantastic Four: Rise of the Silver Surfer.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

GeForce™, CUDA™, and NVIDIA Quadro® are trademarks or registered trademarks of NVIDIA Corporation.

The authors and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

NVIDIA makes no warranty or representation that the techniques described herein are free from any Intellectual Property claims. The reader assumes all risk of any such claims based on his or her use of these techniques.

The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests. For more information, please contact:

U.S. Corporate and Government Sales

(800) 382-3419

corpsales@pearsontechgroup.com

For sales outside of the United States, please contact:

International Sales

international@pearsoned.com

Visit us on the Web: www.awprofessional.com

Library of Congress Cataloging-in-Publication Data

GPU gems 3 / edited by Hubert Nguyen.

p. cm.

Includes bibliographical references and index.

ISBN-13: 978-0-321-51526-1 (hardback : alk. paper)

ISBN-10: 0-321-51526-9

1. Computer graphics. 2. Real-time programming. I. Nguyen, Hubert.

T385.G6882 2007

006.6'6--dc22

2007023985

the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic,

mechanical, photocopying, recording, or likewise. For information regarding permissions, write to:

Pearson Education, Inc.

Rights and Contracts Department

501 Boylston Street, Suite 900

Boston, MA 02116

Fax: (617) 671-3447

ISBN-13: 978-0-321-51526-1

Text printed in the United States on recycled paper at Courier in Kendallville, Indiana.

Second printing, December 2007

Foreword

Composition, the organization of elemental operations into a nonobvious whole, is the essence of imperative programming. The instruction set architecture (ISA) of a microprocessor is a versatile composition interface, which programmers of software renderers have used effectively and creatively in their quest for image realism. Early graphics hardware increased rendering performance, but often at a high cost in composability, and thus in programmability and application innovation. Hardware with microprocessor-like programmability did evolve (for example, the Ikonas Graphics System), but the dominant form of graphics hardware acceleration has been organized around a fixed sequence of rendering operations, often referred to as the graphics pipeline. Early interfaces to these systems, such as CORE and later PHIGS, allowed programmers to specify rendering results, but they were not designed for composition.

OpenGL, which I helped to evolve from its Silicon Graphics-defined predecessor IRIS GL in the early 1990s, addressed the need for composability by specifying an architecture (informally called the OpenGL Machine) that was accessed through an imperative programmatic interface. Many features (for example, tightly specified semantics; table-driven operations such as stencil and depth-buffer functions; texture mapping exposed as a general 1D, 2D, and 3D lookup function; and required repeatability properties) ensured that programmers could compose OpenGL operations with powerful and reliable results. Some of the useful techniques that OpenGL enabled include texture-based volume rendering, shadow volumes using stencil buffers, and constructive solid geometry algorithms such as capping (the computation of surface planes at the intersections of clipping planes and solid objects defined by polygons). Ultimately, Mark Peercy and the coauthors of the SIGGRAPH 2000 paper "Interactive Multi-Pass Programmable Shading" demonstrated that arbitrary RenderMan shaders could be accelerated through the composition of OpenGL rendering operations.

During this decade, increases in the raw capability of integrated circuit technology allowed the OpenGL architecture (and later, Direct3D) to be extended to expose an ISA interface. These extensions appeared as programmable vertex and fragment shaders within the graphics pipeline and now, with the introduction of CUDA, as a data-parallel ISA in near parity with that of the microprocessor. Although the cycle toward complete microprocessor-like versatility is not complete, the tremendous power of graphics hardware acceleration is more accessible than ever to programmers.

And what computational power it is! At this writing, the NVIDIA GeForce 8800 Ultra performs over 400 billion floating-point operations per second, more than the most powerful supercomputer available a decade ago, and five times more than today's most powerful microprocessor. The data-parallel programming model the Ultra supports allows its computational power to be harnessed without concern for the number of processors employed. This is critical, because while today's Ultra already includes over 100 processors, tomorrow's will include thousands, and then more. With no end in sight to the annual compounding of integrated circuit density known as Moore's Law, massively parallel systems are clearly the future of computing, with graphics hardware leading the way.

GPU Gems 3 is a collection of state-of-the-art GPU programming examples. It is about putting data-parallel processing to work. The first four sections focus on graphics-specific applications of GPUs in the areas of geometry, lighting and shadows, rendering, and image effects. Topics in the fifth and sixth sections broaden the scope by providing concrete examples of nongraphical applications that can now be addressed with data-parallel GPU technology. These applications are diverse, ranging from rigid-body simulation to fluid flow simulation, from virus signature matching to encryption and decryption, and from random number generation to computation of the Gaussian.

Where is this all leading? The cover art reminds us that the mind remains the most capable parallel computing system of all. A long-term goal of computer science is to achieve and, ultimately, to surpass the capabilities of the human mind. It's exciting to think that the computer graphics community, as we identify, address, and master the challenges of massively parallel computing, is contributing to the realization of this dream.

Kurt Akeley

Microsoft Research

Preface

It has been only three years since the first GPU Gems book was introduced, and some areas of real-time graphics have truly become ultrarealistic. Chapter 14, "Advanced Techniques for Realistic Real-Time Skin Rendering," illustrates this evolution beautifully, describing a skin rendering technique that works so well that the data acquisition and animation will become the most challenging problem in rendering human characters for the next couple of years.

All this progress has been fueled by a sustained rhythm of GPU innovation. These processing units continue to become faster and more flexible in their use. Today's GPUs can process enormous amounts of data and are used not only for rendering 3D scenes, but also for processing images or performing massively parallel computing, such as financial statistics or terrain analysis for finding new oil fields.

Whether they are used for computing or graphics, GPUs need a software interface to drive them, and we are in the midst of an important transition. The new generation of APIs brings additional orthogonality and exposes new capabilities such as generating geometry programmatically. On the computing side, the CUDA architecture lets developers use a C-like language to perform computing tasks rather than forcing the programmer to use the graphics pipeline. This architecture will allow developers without a graphics background to tap into the immense potential of the GPU.

More than 200 chapters were submitted by the GPU programming community, covering a large spectrum of GPU usage ranging from pure 3D rendering to nongraphics applications. Each of them went through a rigorous review process conducted both by NVIDIA's engineers and by external reviewers.

We were able to include 41 chapters, each of which went through another review, during which feedback from the editors and peer reviewers often significantly improved the content. Unfortunately, we could not include some excellent chapters, simply due to the space restriction of the book. It was difficult to establish the final table of contents, but we would like to thank everyone who sent a submission.

Intended Audience

For the graphics-related chapters, we expect the reader to be familiar with the fundamentals of computer graphics including graphics APIs such as DirectX and OpenGL, as well as their associated high-level programming languages, namely HLSL, GLSL, or Cg. Anyone working with interactive 3D applications will find in this book a wealth of applicable techniques for today's and tomorrow's GPUs.

Readers interested in computing and CUDA will find it best to know parallel computing concepts. C programming knowledge is also expected.

Trying the Code Samples

GPU Gems 3 comes with a disc that includes samples, movies, and other demonstrations of the techniques described in this book. You can also go to the book's Web page to find the latest updates and supplemental materials: developer.nvidia.com/gpugems3

Acknowledgments

This book represents the dedication of many people, especially the numerous authors who submitted their most recent work to the GPU community by contributing to this book. Without a doubt, these inspirational and powerful chapters will help thousands of developers push the envelope in their applications.

Our section editors (Cyril Zeller, Evan Hart, Ignacio Castaño Aguado, Kevin Bjorke, Kevin Myers, and Nolan Goodnight) took on an invaluable role, providing authors with feedback and guidance to make the chapters as good as they could be. Without their expertise and contributions above and beyond their usual workload, this book could not have been published.

Ensuring the clarity of GPU Gems 3 required numerous diagrams, illustrations, and screen shots. A lot of diligence went into unifying the graphic style of about 500 figures, and we thank Michael Fornalski and Jim Reed for their wonderful work on these. We are grateful to Huey Nguyen and his team for their support for many of our projects. We also thank Rory Loeb for his contribution to the amazing book cover design and many other graphic elements of the book.

We would also like to thank Catherine Kilkenny and Teresa Saffaie for tremendous help with copyediting as chapters were being worked on.

Randy Fernando, the editor of the previous GPU Gems books, shared his wealth of experience acquired in producing those volumes.

We are grateful to Kurt Akeley for writing our insightful and forward-looking foreword.

At Addison-Wesley, Peter Gordon, John Fuller, and Kim Boedigheimer managed this project to completion before handing the marketing aspect to Curt Johnson. Christopher Keane did fantastic work on the copyediting and typesetting.

The support from many executive staff members from NVIDIA was critical to this endeavor: Tony Tamasi and Dan Vivoli continually value the creation of educational material and provided the resources necessary to accomplish this project.

We are grateful to Jen-Hsun Huang for his continued support of the GPU Gems series and for creating an environment that encourages innovation and teamwork.

We also thank everyone at NVIDIA for their support and for continually building the technology that changes the way people think about computing.

Hubert Nguyen

NVIDIA Corporation

Contributors

Thomas Alexander, Polytime

Thomas Alexander cofounded Exapath, a startup focused on mapping networking algorithms onto GPGPUs. Previously he was at Juniper Networks working in the Infrastructure Product Group building core routers. Thomas has a Ph.D. in electrical engineering from Duke University, where he also worked on a custom-built parallel machine for ray casting.

Kavita Bala, Cornell University

Kavita Bala is an assistant professor in the Computer Science Department and Program of Computer Graphics at Cornell University. Bala specializes in scalable rendering for high-complexity illumination, interactive global illumination, perceptually based rendering, and image-based texturing. Bala has published research papers and served on the program committees of several conferences, including SIGGRAPH. In 2005, Bala cochaired the Eurographics Symposium on Rendering. She has coauthored the graduate-level textbook Advanced Global Illumination, 2nd ed. (A K Peters, 2006). Before Cornell, Bala received her S.M. and Ph.D. from the Massachusetts Institute of Technology, and her B.Tech. from the Indian Institute of Technology Bombay.

Kevin Bjorke, NVIDIA Corporation

Kevin Bjorke is a member of the Technology Evangelism group at NVIDIA, and continues his roles as editor and contributor to the previous volumes of GPU Gems. He has a broad background in production of both live-action and animated films, TV, advertising, theme park rides, print, and, of course, games. Kevin has been a regular speaker at events such as SIGGRAPH and GDC since the mid-1980s. His current work focuses on applying NVIDIA's horsepower and expertise to help developers fulfill their individual ambitions.

Jean-Yves Blanc, CGGVeritas

Jean-Yves Blanc received a Ph.D. in applied mathematics in 1991 from the Institut National Polytechnique de Grenoble, France. He joined CGG in 1992, where he introduced and developed parallel processing for high-performance computing seismic applications. He is now in charge of IT strategy for the Processing and Reservoir product line.

Jim Blinn, Microsoft Research

Jim Blinn began doing computer graphics in 1968 while an undergraduate at the University of Michigan. In 1974 he became a graduate student at the University of Utah, where he did research in specular lighting models, bump mapping, and environment/reflection mapping and received a Ph.D. in 1977. He then went to JPL and produced computer graphics animations for various space missions to Jupiter, Saturn, and Uranus, as well as for Carl Sagan's PBS series "Cosmos" and for the Annenberg/CPB-funded project "The Mechanical Universe," a 52-part telecourse to teach college-level physics. During these productions he developed several other techniques, including work in cloud simulation, displacement mapping, and a modeling scheme variously called blobbies or metaballs. Since 1987 he has written a regular column in the IEEE Computer Graphics and Applications journal, where he describes mathematical techniques used in computer graphics. He has just published his third volume of collected articles from this series. In 1995 he joined Microsoft Research as a Graphics Fellow. He is a MacArthur Fellow, a member of the National Academy of Engineering, has an honorary Doctor of Fine Arts degree from Otis Parsons School of Design, and has received both the SIGGRAPH Computer Graphics Achievement Award (1983) and the Steven A. Coons Award (1999).

George Borshukov, Electronic Arts

George Borshukov is a CG supervisor at Electronic Arts. He holds an M.S. from the University of California, Berkeley, where he was one of the creators of The Campanile Movie and real-time demo (1997). He was technical designer for the "bullet time" sequences in The Matrix (1999) and received an Academy Scientific and Technical Achievement Award for the image-based rendering technology used in the film. Borshukov led the development of photoreal digital actors for The Matrix sequels (2003) and received a Visual Effects Society Award for the design and application of the Universal Capture system in those films. Other film credits include What Dreams May Come (1998), Mission: Impossible 2 (2000), and Michael Jordan to the Max (2000). He is also a co-inventor of the UV pelting approach for parameterization and seamless texturing of polygonal or subdivision surfaces. He joined Electronic Arts in 2004 to focus on setting a new standard for facial capture, animation, and rendering in next-generation interactive entertainment. He conceived the Fight Night Round 3 concept and the Tiger Woods tech demos presented at Sony's E3 events in 2005 and 2006.

Tamy Boubekeur, LaBRI–INRIA, University of Bordeaux

Tamy Boubekeur is a third-year Ph.D. student in computer science at INRIA in Bordeaux, France. He received an M.Sc. in computer science from the University of Bordeaux in 2004. His current research focuses on 3D geometry processing and real-time rendering. He has developed new algorithms and data structures for the 3D acquisition pipeline, publishing several scientific papers in the fields of efficient processing and interactive editing of large 3D objects, hierarchical space subdivision structures, point-based graphics, and real-time surface refinement methods. He also teaches geometric modeling and virtual reality at the University of Bordeaux.

Ralph Brunner, Apple
