CUDA C++ Programming Guide PG-02829-001_v11.1|xvi
I.7.1. Architecture ..... 326
I.7.2. Global Memory ..... 327
I.7.3. Shared Memory ..... 327
Appendix J. Driver API ..... 328
J.1. Context ..... 330
J.2. Module ..... 331
J.3. Kernel Execution ..... 332
J.4. Interoperability between Runtime and Driver APIs ..... 334
Appendix K. CUDA Environment Variables ..... 335
Appendix L. Unified Memory Programming ..... 338
L.1. Unified Memory Introduction ..... 338
L.1.1. System Requirements ..... 339
L.1.2. Simplifying GPU Programming ..... 339
L.1.3. Data Migration and Coherency ..... 340
L.1.4. GPU Memory Oversubscription ..... 341
L.1.5. Multi-GPU ..... 341
L.1.6. System Allocator ..... 342
L.1.7. Hardware Coherency ..... 342
L.1.8. Access Counters ..... 343
L.2. Programming Model ..... 344
L.2.1. Managed Memory Opt In ..... 344
L.2.1.1. Explicit Allocation Using cudaMallocManaged() ..... 344
L.2.1.2. Global-Scope Managed Variables Using __managed__ ..... 345
L.2.2. Coherency and Concurrency ..... 345
L.2.2.1. GPU Exclusive Access To Managed Memory ..... 346
L.2.2.2. Explicit Synchronization and Logical GPU Activity ..... 347
L.2.2.3. Managing Data Visibility and Concurrent CPU + GPU Access with Streams ..... 348
L.2.2.4. Stream Association Examples ..... 349
L.2.2.5. Stream Attach With Multithreaded Host Programs ..... 349
L.2.2.6. Advanced Topic: Modular Programs and Data Access Constraints ..... 350
L.2.2.7. Memcpy()/Memset() Behavior With Managed Memory ..... 351
L.2.3. Language Integration ..... 352
L.2.3.1. Host Program Errors with __managed__ Variables ..... 352
L.2.4. Querying Unified Memory Support ..... 353
L.2.4.1. Device Properties ..... 353
L.2.4.2. Pointer Attributes ..... 353
L.2.5. Advanced Topics ..... 353
L.2.5.1. Managed Memory with Multi-GPU Programs on pre-6.x Architectures ..... 353