Fundamentals of Accelerated Computing with OpenACC
Learn the basics of OpenACC, a high-level, directive-based programming model for GPUs. This course is for anyone with some C/C++ experience who is interested in accelerating the performance of their applications beyond the …
Introduction to Accelerated Computing
Explore the three techniques for accelerating code on a GPU: using GPU-accelerated libraries, using compiler directives like OpenACC, and writing code directly in CUDA-enabled languages. Upon completion, you’ll understand how to demonstrate the potential speed-ups and ease …
GPU Memory Optimizations with CUDA C/C++
Explore memory optimization techniques for programming with CUDA C/C++ on an NVIDIA GPU, and how to use the NVIDIA Visual Profiler (NVVP) to support these optimizations. You’ll learn how to: Implement a naive matrix …
Accelerating Applications with GPU-Accelerated Libraries in Python
Learn how to use GPU libraries to accelerate Python code on NVIDIA GPUs by using the cuRAND library to accelerate a Monte Carlo pricer and optimizing data movement between the CPU and GPU. Upon …
Using Thrust to Accelerate C++
Thrust is a parallel algorithms library loosely based on the C++ Standard Template Library. It enables developers to quickly embrace the power of parallel computing and supports multiple system back-ends such as OpenMP and Intel’s …
Profiling and Parallelizing with OpenACC
Get started on the first two steps of the OpenACC programming cycle: identifying parallelism and expressing parallelism. You’ll learn how to profile a provided C or Fortran application using NVIDIA NVPROF, use the PGI OpenACC …
Expressing Data Movement and Optimizing Loops with OpenACC
Learn intermediate OpenACC programming techniques by adding OpenACC data management directives and optimizing applications using the OpenACC loop directive. Upon completion, you’ll be able to optimize data transfers and fine-tune application parallelism with …
Introduction to Multi-GPU Programming with MPI and OpenACC
Learn how to program multi-GPU systems or GPU clusters using the Message Passing Interface (MPI) and OpenACC. You’ll learn how to exchange data between different GPUs using CUDA-aware MPI and OpenACC, handle …
Advanced Multi-GPU Programming with MPI and OpenACC
Learn how to improve multi-GPU MPI + OpenACC accelerated applications by overlapping communication with computation to hide communication times and handling noncontiguous halo updates with a 2D tiled domain decomposition. Upon completion, you’ll …