Imagination Technologies Copyright
POWERVR SGX 1 Revision 1.8f
POWERVR SGX
OpenGL ES 2.0 Application Development
Recommendations
Copyright © 2011, Imagination Technologies Ltd. All Rights Reserved.
This document is confidential. Neither the whole nor any part of the information contained in, nor the
product described in, this document may be adapted or reproduced in any material form except with
the written permission of Imagination Technologies Ltd. This document can only be distributed to
Imagination Technologies employees, and employees of companies that have signed a
Non-Disclosure Agreement with Imagination Technologies Ltd.
Filename : POWERVR SGX.OpenGL ES 2.0 Application Development
Recommendations.1.8f.External.doc
Version : 1.8f External Issue (Package: POWERVR SDK 2.08.28.0634)
Issue Date : 03 Aug 2009
Author : POWERVR
Contents
1. Introduction
1.1. What’s new?
1.1.1. Vertex and fragment Shaders
1.1.2. Unified Shader Architecture
1.1.3. Multiple Precisions and Scalar Processing
1.1.4. New Texture Formats
1.1.5. Stencil Buffer
1.1.6. Improved Anti-Aliasing
1.2. What’s not-so-new (but important)?
1.2.1. Perfect Hidden Surface Removal
1.2.2. Designed for Efficiency and Low Power Consumption
1.2.3. Massive Depth/Stencil Fillrate
2. Golden Rules
3. Optimisation Strategies
3.1. SGX Hardware Architecture
3.2. CPU
3.3. Memory Bus
3.4. Vertex Shader (USSE)
3.4.1. Vertex Processing FIFO
3.5. Tiling Co-Processor
3.5.1. Clipping and Culling
3.5.2. Tiling
3.6. Image Synthesis Processor (ISP)
3.6.1. Setup
3.6.2. Z-Load/Store
3.7. Fragment Shader (USSE)
3.7.1. Setup
3.7.2. Texture Fetches
3.7.3. Shader Instructions
3.8. Parameter Buffer
3.9. Texture Cache
4. Render State Management and Batching
4.1. Minimize the Number of State Settings and Draw Calls
4.1.1. Avoid Redundant Render State Settings
4.1.2. Texture Atlases
4.1.3. Mesh Groups and Dynamic Geometry
4.2. Cull Invisible Objects
4.3. Rendering Order
4.3.1. Opaque First, Blend Last
4.3.2. Sorting Opaque Objects by Render State
4.3.3. Sorting Transparent Objects by Depth
5. Vertex data
5.1.1. Primitive Type
5.1.2. Interleaved Attributes or Separate Arrays?
5.1.3. Client Side Arrays or Vertex Buffer Objects?
5.1.4. Attribute Data Types
6. Using Textures
6.1. Texture Size
6.1.1. NPOT Textures
6.2. Texture Formats and Texture Compression
6.2.1. Basic OpenGL ES 2.0 Formats
6.2.2. Float and Half Float Texture Formats
6.2.3. Compressed Texture Formats
6.2.4. Depth Textures
6.2.5. How Texture Formats Affect Shaders
Use Mipmaps!
6.2.6. Mipmap Selection Caveats
6.3. Texture Upload
6.4. Render to Texture
6.5. Math lookups
6.6. Texture Sampling Performance
6.6.1. Trilinear Filtering
6.6.2. Dependent Texture Reads
6.6.3. Wide Floating Point Textures
7. Shaders
7.1. Algorithms and Shader Length
7.2. Choosing the Right Precision
7.3. Attributes
7.4. Varyings
7.4.1. Varying Precision
7.5. Samplers
7.6. Uniforms
7.6.1. Uniform Calculations
7.7. Scalar Operation
7.7.1. Sparse Matrices
7.8. Know Your Spaces
7.9. Flow Control
7.10. Discard
GL State That Affects Shader Execution
7.10.1. Vertex Attribute Type Conversion
7.10.2. Texture Formats
7.10.3. Framebuffer Blending and Colour Mask
7.11. Dos and Don’ts
8. Target a fixed framerate
List of Figures
Figure 1 SGX HW Architecture
Figure 2 Comparison of PVRTC and S3TC for a skybox texture
1. Introduction
POWERVR SGX is a new family of GPU cores from Imagination Technologies designed specifically
for shader-based APIs like OpenGL ES 2.0. Due to its scalable architecture, the SGX family spans a
huge performance range.
This document contains recommendations and advice for developers who wish to use version 2.0 of
the OpenGL ES API on POWERVR SGX-enabled devices. Therefore, references to hardware features
and characteristics apply to members of the POWERVR SGX family targeted at the mobile and
embedded markets, programmed using the OpenGL ES APIs.
1.1. What’s new?
1.1.1. Vertex and fragment Shaders
The most significant departure from the previous generation is the introduction of vertex and fragment
shaders, replacing a large part of the fixed-function vertex and fragment pipeline with fully
programmable stages. This gives developers a huge amount of flexibility to create complex and
convincing visual effects, and to offload work to the GPU that previously had to be performed by the
CPU. A large part of this document is devoted to guidelines for shader programming.
Some parts of the pipeline – such as rasterization, visibility testing, and texture filtering – remain fixed-
function to guarantee maximum efficiency.
1.1.2. Unified Shader Architecture
The architecture of the POWERVR SGX family is based around its fully programmable multi-threaded
Universal Scalable Shader Engine (USSE), which performs all the processing for both vertex and
fragment shaders. This unification of vertex and fragment processing has two very important
implications:
• Load Balancing: In an architecture with separate units for vertex and fragment processing,
precious cycles are often wasted because the workloads do not exactly match the hardware
capabilities. Reducing the vertex workload will not improve performance if the fragment
shader is the bottleneck. On POWERVR SGX, however, a reduction in vertex processing
results in more cycles available for fragment shading, and vice versa, thus always resulting in
maximum utilization of the USSE core.
• Mostly Identical Capabilities: Both vertex and fragment shaders support full flow control.
Both can read from textures. The same precisions are available in both shader types. Some
limitations are imposed by the API and by the inherent differences between vertices and
fragments, though.
The USSE pipeline also performs some “fixed-function” tasks such as alpha blending and conversion
of vertex attributes to floating point values. While this implies a performance penalty when using these
features, it increases the overall GPU efficiency because there are no specialized hardware units
which would be idle when those features are not used.
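The load-balancing argument above can be made concrete with a toy model. The sketch below is an illustration only: the work units, the unit counts, and the function names are hypothetical, every shader unit is assumed to retire one unit of work per cycle, and scheduling overhead is ignored.

```c
#include <assert.h>

/* Split design: vertex and fragment units cannot help each other, so the
 * frame is done only when the slower of the two sides is done. */
static double split_frame_cycles(double vertex_work, double fragment_work,
                                 int vertex_units, int fragment_units)
{
    assert(vertex_units > 0 && fragment_units > 0);
    double vertex_cycles = vertex_work / vertex_units;
    double fragment_cycles = fragment_work / fragment_units;
    return vertex_cycles > fragment_cycles ? vertex_cycles : fragment_cycles;
}

/* Unified design: all units draw from one shared pool of work. */
static double unified_frame_cycles(double vertex_work, double fragment_work,
                                   int total_units)
{
    assert(total_units > 0);
    return (vertex_work + fragment_work) / total_units;
}
```

With 100 units of vertex work and 900 units of fragment work on four shader units, the 2+2 split design in this model needs 450 cycles and the unified design needs 250. Halving the vertex work to 50 leaves the split design at 450 cycles, because the fragment side is still the bottleneck, while the unified design drops to 237.5.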
1.1.3. Multiple Precisions and Scalar Processing
To achieve best performance for a variety of tasks, the USSE supports multiple precisions. The GLSL
ES precision modifiers, lowp, mediump, and highp, map to 10-bit fixed point, 16-bit float and 32-bit
float types, respectively. While lowp computations are performed as 3- and 4-component vector
operations, mediump operations use 2-component vectors. When using highp precision, the USSE
operates on scalar values.
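To get a feel for what 10-bit fixed point means for lowp values, the following sketch emulates the quantization on the CPU. It assumes the 10 bits span the range [-2, 2), which is the minimum lowp range GLSL ES requires and gives a step of 1/256; the exact bit layout, the rounding mode, and the function names are assumptions made for illustration, not a description of the hardware.

```c
#include <assert.h>

/* Assumed lowp layout: 10-bit signed fixed point spanning [-2, 2),
 * i.e. integer codes -512..511 with a step of 1/256. */
#define LOWP_SCALE 256.0f
#define LOWP_MIN_CODE (-512)
#define LOWP_MAX_CODE 511

/* Converts a float to the nearest representable code, clamping to range. */
static int lowp_encode(float x)
{
    float scaled = x * LOWP_SCALE;
    if (scaled <= (float)LOWP_MIN_CODE) return LOWP_MIN_CODE;
    if (scaled >= (float)LOWP_MAX_CODE) return LOWP_MAX_CODE;
    /* Round to nearest; the cast itself truncates toward zero. */
    return (int)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
}

/* Converts a code back to the value it represents. */
static float lowp_decode(int code)
{
    assert(code >= LOWP_MIN_CODE && code <= LOWP_MAX_CODE);
    return (float)code / LOWP_SCALE;
}
```

Under these assumptions, 0.3 is stored as 77/256 (about 0.3008), which is ample for colour values, while 3.0 clamps to 511/256. A step of 1/256 is too coarse to address individual texels in a texture wider than 256 texels, which is one reason to reserve lowp for colours and similar data.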
This scalar processing has a significant influence on low-level shader optimizations. Scalar code can
be a lot more efficient than vector code since you only need to calculate those vector components that
contribute to the final result. But keep in mind that you don’t get calculations “for free” by squeezing
multiple scalars into a vector.
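The point about computing only the components that contribute to the result can be illustrated with a matrix-vector product. The sketch below is plain C rather than shader code, and the count it returns is a simple tally of scalar multiply-adds, not a USSE instruction count; the function name and the row-major layout are assumptions.

```c
#include <assert.h>

/* Computes only the first `count` components of m * v (m is row-major)
 * and returns the number of scalar multiply-adds that were performed. */
static int transform_partial(const float m[4][4], const float v[4],
                             int count, float out[4])
{
    assert(count >= 0 && count <= 4);
    int multiply_adds = 0;
    for (int row = 0; row < count; ++row) {
        float sum = 0.0f;
        for (int col = 0; col < 4; ++col) {
            sum += m[row][col] * v[col];
            ++multiply_adds;
        }
        out[row] = sum;
    }
    return multiply_adds;
}
```

If only x and y of the transformed vector are needed, the scalar formulation performs 8 multiply-adds instead of 16. On a pipeline that executes highp work one scalar at a time, skipping unused components is a real saving, whereas packing unrelated scalars into a vector saves nothing.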
1.1.4. New Texture Formats
In addition to the texture formats already supported by POWERVR MBX (including common formats
like 16 and 32-bit RGBA and PVRTC compressed textures), members of the POWERVR SGX family
can handle 16 and 32-bit floating point textures as well as depth textures.
POWERVR SGX also supports the ETC compressed texture format which is expected to be
implemented by a wide range of devices, and introduces support for cube map textures and limited
non-power-of-two (NPOT) textures as required by OpenGL ES 2.0.
1.1.5. Stencil Buffer
One of the fixed-function features missing from POWERVR MBX, stencil buffer support has been
added to POWERVR SGX to allow techniques such as stencil shadows or constructive solid geometry
(CSG).
1.1.6. Improved Anti-Aliasing
POWERVR SGX further improves on the anti-aliasing performance and quality of the previous
generation MBX family by providing 4-sample sparse grid multisampling anti-aliasing (MSAA),
whose quality often comes close to that of 16-sample ordered grid anti-aliasing.
Anti-aliasing is essential for achieving the best image quality. Since the POWERVR architecture can
keep the multisample buffers entirely on-chip and only writes out the resolved framebuffer to memory,
POWERVR SGX does not suffer from the large framebuffer memory overhead that typical immediate
mode renderers (IMRs) have to bear when using multisampling.
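The memory argument can be estimated with simple arithmetic. The sketch below assumes an immediate mode renderer that stores every colour and depth/stencil sample uncompressed in external memory plus one resolved colour buffer, and a tile-based renderer that writes out only the resolved colour buffer. Both models and the function names are simplifications for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* IMR model: multisampled colour and depth/stencil buffers live in external
 * memory, and a resolved colour buffer is needed for display. */
static size_t imr_msaa_bytes(size_t width, size_t height, size_t samples,
                             size_t colour_bytes, size_t depth_bytes)
{
    assert(samples > 0);
    size_t pixels = width * height;
    return pixels * samples * (colour_bytes + depth_bytes)
         + pixels * colour_bytes;
}

/* Tile-based model: samples stay on-chip, only resolved colour is stored. */
static size_t tbdr_msaa_bytes(size_t width, size_t height, size_t colour_bytes)
{
    return width * height * colour_bytes;
}
```

For a VGA (640x480) target with 4 samples, 32-bit colour, and 32-bit depth/stencil, this model gives 11,059,200 bytes for the immediate mode renderer and 1,228,800 bytes for the tile-based one.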
1.2. What’s not-so-new (but important)?
While POWERVR SGX is in many ways a massive leap forwards from the MBX family, the
fundamentals of the proven concept have been retained.
1.2.1. Perfect Hidden Surface Removal
POWERVR graphics cores employ a method called Tile Based Deferred Rendering (TBDR). All the
geometry data and state required for a frame is captured into a scene buffer in memory, and fragment
processing is deferred until the scene has been completely submitted.
Because the whole scene is known at the time of rendering, the POWERVR architecture can perfectly
compute the visibility of any surface before performing any fragment processing, independent of the
order in which scene elements were submitted. This means applications do not have to resort to
depth sorting a scene to improve rendering performance. With this method, only those parts of
surfaces that are visible in the final image will be computed by the fragment pipeline.
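A one-dimensional software model shows why submission order stops mattering once visibility is resolved before shading. This is a sketch of the idea only, not of how the hardware is implemented: the `Span` type, the 8-pixel scanline, and both function names are hypothetical, and the immediate mode model assumes a depth test before shading.

```c
#include <assert.h>
#include <float.h>

#define LINE_WIDTH 8

/* An opaque span covering pixels [x0, x1); smaller depth means nearer. */
typedef struct { int x0, x1; float depth; } Span;

/* Immediate mode: shade each fragment that passes the depth test at the
 * moment it is submitted, even if a later span will cover it. */
static int shaded_immediate(const Span *spans, int count)
{
    float zbuffer[LINE_WIDTH];
    int shaded = 0;
    for (int x = 0; x < LINE_WIDTH; ++x) zbuffer[x] = FLT_MAX;
    for (int i = 0; i < count; ++i) {
        assert(spans[i].x0 >= 0 && spans[i].x1 <= LINE_WIDTH);
        for (int x = spans[i].x0; x < spans[i].x1; ++x) {
            if (spans[i].depth < zbuffer[x]) {
                zbuffer[x] = spans[i].depth;
                ++shaded;
            }
        }
    }
    return shaded;
}

/* Deferred: find the nearest span per pixel first, then shade once. */
static int shaded_deferred(const Span *spans, int count)
{
    int nearest[LINE_WIDTH];
    int shaded = 0;
    for (int x = 0; x < LINE_WIDTH; ++x) nearest[x] = -1;
    for (int i = 0; i < count; ++i) {
        assert(spans[i].x0 >= 0 && spans[i].x1 <= LINE_WIDTH);
        for (int x = spans[i].x0; x < spans[i].x1; ++x) {
            if (nearest[x] < 0 || spans[i].depth < spans[nearest[x]].depth)
                nearest[x] = i;
        }
    }
    for (int x = 0; x < LINE_WIDTH; ++x)
        if (nearest[x] >= 0) ++shaded;
    return shaded;
}
```

For three nested spans submitted back to front, the immediate model shades 14 fragments and the deferred model shades 8. Submitted front to back, both shade 8, so the deferred result is independent of order.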
For rendering, the viewport is divided into tiles: rectangular regions that are rendered individually
using on-chip colour and depth buffers. The use of on-chip buffers ensures that no external bandwidth
is used for framebuffer blending and visibility testing, significantly lowering overall bandwidth
requirements.
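The tiling arithmetic is straightforward to sketch. The text does not state a tile size, so the square tile size is a parameter here and the 16-pixel value used in the note below is purely an assumption; the function names are hypothetical as well.

```c
#include <assert.h>

/* Number of tiles needed to cover `pixels` along one axis (rounded up). */
static int tiles_along(int pixels, int tile_size)
{
    assert(pixels > 0 && tile_size > 0);
    return (pixels + tile_size - 1) / tile_size;
}

/* Total number of square tiles covering a width x height viewport. */
static int tile_count(int width, int height, int tile_size)
{
    return tiles_along(width, tile_size) * tiles_along(height, tile_size);
}

/* On-chip storage for one tile's colour and depth/stencil values. */
static int tile_buffer_bytes(int tile_size, int colour_bytes, int depth_bytes)
{
    return tile_size * tile_size * (colour_bytes + depth_bytes);
}
```

With an assumed 16x16 tile, a VGA viewport is covered by 1200 tiles, and one tile with 32-bit colour and 32-bit depth/stencil needs 2048 bytes of on-chip storage in this model. A buffer that small can stay on-chip, which is what keeps blending and depth traffic off the external memory bus.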
1.2.2. Designed for Efficiency and Low Power Consumption
In mobile devices, battery life is of paramount importance. The POWERVR SGX family therefore
employs sophisticated power-saving techniques to keep battery drain to a minimum. While
POWERVR SGX provides a feature set that rivals that of recent desktop PC hardware, the power
consumption of those mobile devices is more than two orders of magnitude below that of high-end PC
graphics cards. Obviously this also has an impact on performance, and you should not expect to run
the latest, hundreds-of-instructions PC game shaders on a mobile device at interactive framerates,
even with a screen at just VGA resolution.
1.2.3. Massive Depth/Stencil Fillrate
Because the visibility determination requires an enormous number of depth and stencil tests,
POWERVR graphics cores are designed to perform these at a very high rate. This, along with the
bandwidth savings provided by the on-chip depth buffer as well as the addition of stencil functionality
and depth textures, makes POWERVR SGX a perfect match for techniques such as stencil shadows
and depth shadow maps.