OpenCL 2.2 Development Guide: API and Command Queues in Detail

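The OpenCL 2.2 Reference Guide is a compact reference for developers who use OpenCL for efficient parallel computing. Much of it covers the OpenCL API (Application Programming Interface), the host-side runtime interface that manages OpenCL objects such as command queues, memory objects, program objects, and kernel objects, together with the operations performed on them.

A key type in the API is `cl_command_queue`: an object that holds the commands to be executed on one specific device. Host command queues are created with `clCreateCommandQueueWithProperties`, which takes four parameters:

1. `cl_context context`: the context, that is, the environment the program runs in, which contains the devices, memory objects, and other resources.
2. `cl_device_id device`: the device that executes the commands, which can be a CPU, a GPU, or another hardware accelerator.
3. `const cl_queue_properties *properties`: a zero-terminated list of property names, each followed by its value. The properties include, among others:
   - `CL_QUEUE_PROPERTIES`: a bit-field that can combine several flags, such as `CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE` (out-of-order execution), `CL_QUEUE_PROFILING_ENABLE` (command profiling), `CL_QUEUE_ON_DEVICE` (a device-side queue), and `CL_QUEUE_ON_DEVICE_DEFAULT` (the default device-side queue).
   - `CL_QUEUE_SIZE`: the size of a device-side queue in bytes; it applies only when `CL_QUEUE_ON_DEVICE` is set.
   - `CL_QUEUE_THROTTLE_KHR`: a throttle hint, one of `CL_QUEUE_THROTTLE_{HIGH,MED,LOW}_KHR`; requires the `cl_khr_throttle_hints` extension.
   - `CL_QUEUE_PRIORITY_KHR`: a priority hint, one of `CL_QUEUE_PRIORITY_HIGH_KHR`, `CL_QUEUE_PRIORITY_MED_KHR`, or `CL_QUEUE_PRIORITY_LOW_KHR`; requires the `cl_khr_priority_hints` extension.
4. `cl_int *errcode_ret`: receives the error code of the call; it may be NULL.

With these properties a developer can tailor the behavior of a command queue to the performance and resource needs of the application. Once the queue exists, it is used to enqueue kernel launches and reads or writes of memory objects, which is how parallel work reaches the device.

In short, the OpenCL 2.2 Reference Guide summarizes how to create, configure, and manage command queues, and is worth knowing well for anyone implementing parallel processing across different hardware platforms.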

OpenCL 2.2 Reference Guide Page 5

OpenCL C++ Language

OpenCL C++ and C++ 14

The OpenCL C++ programming language is based on the ISO/IEC JTC1 SC22 WG21 N3690 language (a.k.a. C++14) specification with specific restrictions and exceptions. Section numbers denoted here with § refer to the C++14 specification.

• Implicit conversions for pointer types follow the rules described in the C++14 specification.

• Conversions between integer types follow the conversion rules specified in the C++14 specification except for specific out-of-range behavior and saturated conversions.

• The preprocessing directives defined by the C++14 specification are supported.

• Macro names defined by the C++14 specification but not currently supported by OpenCL are reserved for future use.

• The OpenCL C++ standard library implements a modified version of the C++14 numeric limits library.

• OpenCL C++ implements the following parts of the C++14 iterator library: primitives, iterator operations, predefined iterators, and range access.

• The OpenCL C++ kernel language doesn’t support variadic functions and variable length arrays.

• The OpenCL C++ library implements most of the C++14 tuples except for allocator related traits (§ 20.4.2.8).

• OpenCL C++ supports type traits defined in the C++14 specification with additions and changes to the following:
  ◦ UnaryTypeTraits (§ 3.15.1)
  ◦ BinaryTypeTraits (§ 3.15.2)
  ◦ TransformationTraits (§ 3.15.3)

• The OpenCL C++ standard library implements most C++14 tuples excluding allocator related traits.

• C++14 features not supported by OpenCL C++:
  ◦ the dynamic_cast operator (§ 5.2.7)
  ◦ type identification (§ 5.2.8)
  ◦ recursive function calls (§ 5.2.2, item 9) unless they are a compile-time constant expression
  ◦ non-placement new and delete operators (§ 5.3.4, 5.3.5)
  ◦ goto statement (§ 6.6)
  ◦ virtual function qualifier (§ 7.1.2)
  ◦ function pointers (§ 8.3.5, 8.5.3) unless they are a compile-time constant expression
  ◦ virtual functions and abstract classes (§ 10.3, 10.4)
  ◦ exception handling (§ 15)
  ◦ the C++ standard library (§ 17 … 30)
  ◦ asm declaration (§ 7.4)
  ◦ no implicit lambda to function pointer conversion (§ 5.1.2)

OpenCL C++ Language Reference

Secon and table references are to the OpenCL 2.2 C++ Language specicaon.

Qualiers and Oponal Aributes

Funcon Qualier [2.6.1]

__kernel, kernel

Type and Variable Aributes [2.8]

[[cl::aligned(X)]] [[cl::aligned]]

Species a minimum alignment (in bytes) for variables of the

specied type.

[[cl::packed]]

Species that each member of the structure or union is placed to

minimize the memory required.

Kernel Funcon Aributes [2.8.3]

[[cl::work_group_size_hint(X, Y, Z)]]

A hint to the compiler to specify the value most likely

to be specied by the local_work_size argument to

clEnqueueNDRangeKernel.

[[cl::required_work_group_size(X, Y, Z)]]

The work-group size that must be used as the local_work_size

argument to clEnqueueNDRangeKernel.

[[cl::required_num_sub_groups(X)]]

The number of sub-groups that must be generated by a kernel

launch.

[[cl::vec_type_hint(<type>)]]

A hint to the compiler as a representaon of the computaonal

width of the kernel.

Kernel Parameter Aribute [2.8.4]

[[cl::max_size(n)]]

The value of the aribute species the maximum size in bytes of

the corresponding memory object.

Loop Aributes [2.8.5]

[[cl::unroll_hint(n)]] [[cl::unroll_hint]]

Used to specify that a loop (for, while, and do loops) can be

unrolled.

[[cl::ivdep(len)]] [[cl::ivdep]]

A hint to indicate that the compiler may assume there are

no memory dependencies across loop iteraons in order to

autovectorize consecuve iteraons of loop.

Conversions and Reinterpretaon

Header <opencl_convert>

Conversion types [3.2]

Conversions are available for the scalar types bool, char,

uchar, short, ushort, int, uint, long, ulong, half (if cl_khr_fp16

extension is enabled), oat, double (if cl_khr_fp64 is enabled),

and derived vector types.

template <class T, rounding_mode rmode, class U>

T convert_cast(U const& arg);

template <class T, rounding_mode rmode>

T convert_cast(T const& arg);

// and more...

Rounding modes [3.2.3]

::rte  to nearest even
::rtz  toward zero
::rtp  toward + infinity
::rtn  toward - infinity
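As a rough host-side illustration of what the four rounding modes mean numerically, the sketch below mimics them in standard C++. It is not OpenCL C++ kernel code and does not call convert_cast: the `mode` enum and the `round_to_int` helper are hypothetical names introduced here, the rte case assumes the default FE_TONEAREST floating-point environment, and saturation of out-of-range values is not modelled.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical host-side stand-in for the four rounding modes listed above.
enum class mode { rte, rtz, rtp, rtn };

// Rounds x to an integral value according to m, then converts it to int.
// Assumption: the rounded value fits in int (no saturation is modelled).
inline int round_to_int(double x, mode m) {
    switch (m) {
    case mode::rte:
        // nearbyint honours the current rounding mode; the default
        // FE_TONEAREST mode rounds halfway cases to the even neighbour.
        return static_cast<int>(std::nearbyint(x));
    case mode::rtz:
        return static_cast<int>(std::trunc(x));  // toward zero
    case mode::rtp:
        return static_cast<int>(std::ceil(x));   // toward +infinity
    case mode::rtn:
        return static_cast<int>(std::floor(x));  // toward -infinity
    }
    return 0;  // unreachable for valid enum values
}
```

For example, `round_to_int(2.5, mode::rte)` gives 2 while `round_to_int(3.5, mode::rte)` gives 4, which is the "to nearest even" behavior; the other three modes differ only in the direction they push a non-integral value.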

Reinterpreng types [3.3]

Header <opencl_reinterpret>

Supported data types except bool and void may be

reinterpreted as another data type of the same size using the

as_type funcon for scalar and vector data types.

template <class T, class U>

T as_type(U const& arg);
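The same-size reinterpretation described above can be approximated on the host in standard C++. The sketch below is only an analogue, not the OpenCL C++ as_type itself: `bit_reinterpret` is a hypothetical name, it copies bytes with std::memcpy, and the bit patterns mentioned afterwards assume a 32-bit IEEE 754 float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical host-side analogue of as_type<T>(arg): reinterprets the bytes
// of arg as a value of type T. Both types must have the same size.
template <class T, class U>
T bit_reinterpret(const U& arg) noexcept {
    static_assert(sizeof(T) == sizeof(U),
                  "source and destination types must have the same size");
    static_assert(std::is_trivially_copyable<T>::value &&
                      std::is_trivially_copyable<U>::value,
                  "both types must be trivially copyable");
    T out;
    std::memcpy(&out, &arg, sizeof(T));  // byte-wise copy, no value conversion
    return out;
}
```

On such a platform `bit_reinterpret<std::uint32_t>(1.0f)` yields 0x3F800000, whereas a value conversion of 1.0f to an integer would yield 1. A size mismatch is rejected at compile time by the first static_assert.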

Preprocessor Direcves & Macros [2.7]

#pragma OPENCL FP_CONTRACT on-o-switch

on-o-switch: ON, OFF, or DEFAULT

#pragma OPENCL EXTENSION extensionname : behavior

#pragma OPENCL EXTENSION all : behavior

__FILE__ Current source le

__LINE__ Integer line number

__OPENCL_CPP_VERSION__ Integer version number, e.g: 100

__func__ Current funcon name

Supported Data Types [3.1]

Header <opencl_def>

cl_* types have exactly the same size as their host counterparts defined in the <cl_platform.h> file. Half types require cl_khr_fp16. Double types require that cl_khr_fp64 be enabled and that CL_DEVICE_DOUBLE_FP_CONFIG is not zero.

Built-in scalar data types

OpenCL Type               API Type    Description
bool                      --          true (1) or false (0)
char                      cl_char     8-bit signed
unsigned char, uchar      cl_uchar    8-bit unsigned
short                     cl_short    16-bit signed
unsigned short, ushort    cl_ushort   16-bit unsigned
int                       cl_int      32-bit signed
unsigned int, uint        cl_uint     32-bit unsigned
long                      cl_long     64-bit signed
unsigned long, ulong      cl_ulong    64-bit unsigned
float                     cl_float    32-bit float
double                    cl_double   64-bit IEEE 754
half                      cl_half     16-bit float (storage only)
void                      void        empty set of values

Built-in vector data types

n is 2, 3, 4, 8, or 16. The halfn vector data type is required to be supported as a data storage format.

OpenCL Type   API Type        Description
booln
[u]charn      cl_[u]charn     8-bit [un]signed
[u]shortn     cl_[u]shortn    16-bit [un]signed
[u]intn       cl_[u]intn      32-bit [un]signed
[u]longn      cl_[u]longn     64-bit [un]signed
floatn        cl_floatn       32-bit float
doublen       cl_doublen      64-bit float
halfn         cl_halfn        16-bit float

Other types [3.7.1, 3.8.1]

Header <opencl_image>

Image and sampler types require that CL_DEVICE_IMAGE_SUPPORT is CL_TRUE. See header <opencl_pipe> for the pipe type. See header <opencl_device_queue> for the device_queue type.

Type in OpenCL C++               API type for application
cl::sampler                      cl_sampler
cl::image[1d, 2d, 3d]            cl_image
cl::image1d_[buffer, array]      cl_image
cl::image2d_ms                   cl_image
cl::image2d_array[_ms]           cl_image
cl::image2d_depth[_ms]           cl_image
cl::image2d_array_depth[_ms]     cl_image
cl::pipe                         cl_pipe
cl::device_queue                 cl_queue

half wrapper [3.6.1]

Header <opencl_half>

OpenCL C++ implements a wrapper class for the built-in half data type. The class methods perform implicit vload_half and vstore_half operations from the Vector Data Load and Store Functions section.

fp16(const half &r) noexcept;     Constructs an object with a half built-in type.
fp16(const float &r) noexcept;    Constructs an object with a float built-in type.
fp16(const double &r) noexcept;   Constructs an object with a double built-in type.

ndrange [3.13.6]

Header <opencl_device_queue>

The ndrange type is used to represent the size of the enqueued workload with a dimension from 1 to 3.

struct ndrange {
  explicit ndrange(size_t global_work_size) noexcept;
  ndrange(size_t global_work_size, size_t local_work_size) noexcept;
  ndrange(size_t global_work_offset, size_t global_work_size,
          size_t local_work_size) noexcept;
  template <size_t N>
  ndrange(const size_t (&global_work_size)[N]) noexcept;
  template <size_t N>
  ndrange(const size_t (&global_work_size)[N],
          const size_t (&local_work_size)[N]) noexcept;
  template <size_t N>
  ndrange(const size_t (&global_work_offset)[N],
          const size_t (&global_work_size)[N],
          const size_t (&local_work_size)[N]) noexcept;
};

Example

#include <opencl_device_queue>
#include <opencl_work_item>

using namespace cl;

kernel void foo(device_queue q) {
  q.enqueue_kernel(cl::enqueue_policy::no_wait, cl::ndrange( 1 ),
                   [](){ uint d = get_global_id(0); } );
}
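To make the ndrange constructor overloads more concrete, here is a minimal host-side mock in standard C++. It is not the device-side cl::ndrange and cannot be passed to enqueue_kernel: the `ndrange_mock` type, its public members, and the `total_work_items` helper are hypothetical, and storing 0 for an unspecified local size is an assumption of this sketch. Only the scalar overloads and the three-array overload are mirrored.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical host-side mock of the ndrange constructors (illustration only).
struct ndrange_mock {
    std::size_t dims;       // number of dimensions, 1 to 3
    std::size_t offset[3];  // global work offset per dimension
    std::size_t global[3];  // global work size per dimension (1 if unused)
    std::size_t local[3];   // local work size per dimension (0 = unspecified)

    // One-dimensional overloads delegate to the (offset, global, local) form.
    explicit ndrange_mock(std::size_t g) noexcept : ndrange_mock(0, g, 0) {}
    ndrange_mock(std::size_t g, std::size_t l) noexcept : ndrange_mock(0, g, l) {}
    ndrange_mock(std::size_t o, std::size_t g, std::size_t l) noexcept
        : dims(1), offset{o, 0, 0}, global{g, 1, 1}, local{l, 0, 0} {}

    // N-dimensional overload: N is deduced from the array arguments.
    template <std::size_t N>
    ndrange_mock(const std::size_t (&o)[N], const std::size_t (&g)[N],
                 const std::size_t (&l)[N]) noexcept
        : dims(N), offset{0, 0, 0}, global{1, 1, 1}, local{0, 0, 0} {
        static_assert(N >= 1 && N <= 3, "ndrange supports 1 to 3 dimensions");
        for (std::size_t i = 0; i < N; ++i) {
            offset[i] = o[i];
            global[i] = g[i];
            local[i] = l[i];
        }
    }

    // Total number of work-items described by the global size.
    std::size_t total_work_items() const noexcept {
        return global[0] * global[1] * global[2];
    }
};
```

Constructing the mock from three `std::size_t[2]` arrays deduces N = 2, so a 64 by 32 global size describes 2048 work-items, while the scalar overloads always describe a one-dimensional range.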
