Fundamentals of Accelerated Computing with CUDA Python (FACCP)

Course Overview

This course explores how to use Numba—the just-in-time, type-specializing Python function compiler—to accelerate Python programs to run on massively parallel NVIDIA GPUs. You’ll learn how to:

  • Use Numba to compile CUDA kernels from NumPy universal functions (ufuncs)
  • Use Numba to create and launch custom CUDA kernels
  • Apply key GPU memory management techniques

Upon completion, you’ll be able to use Numba to compile and launch CUDA kernels to accelerate your Python applications on NVIDIA GPUs.
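
As a preview of the ufunc-based approach, the following is a minimal sketch, assuming the numba and numpy packages and a CUDA-capable GPU are available; the function and variable names are illustrative and not taken from the course materials:

  # Minimal sketch: compiling a NumPy-style ufunc into a CUDA kernel with Numba.
  # Assumes numba, numpy, and a CUDA-capable GPU; add_gpu is an illustrative name.
  import numpy as np
  from numba import vectorize

  @vectorize(['float32(float32, float32)'], target='cuda')
  def add_gpu(a, b):
      # Executed element-wise on the GPU; Numba compiles this into a CUDA kernel.
      return a + b

  a = np.arange(1_000_000, dtype=np.float32)
  b = np.arange(1_000_000, dtype=np.float32)
  result = add_gpu(a, b)  # inputs are copied to the device, computed, and copied back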

Prerequisites

  • Basic Python competency, including familiarity with variable types, loops, conditional statements, functions, and array manipulations
  • NumPy competency, including the use of ndarrays and ufuncs
  • No previous knowledge of CUDA programming is required

Course Objectives

At the conclusion of the workshop, you’ll understand the fundamental tools and techniques for GPU-accelerating Python applications with CUDA and Numba, and you’ll be able to:

  • GPU-accelerate NumPy ufuncs with a few lines of code.
  • Configure code parallelization using the CUDA thread hierarchy.
  • Write custom CUDA device kernels for maximum performance and flexibility (see the sketch after this list).
  • Use memory coalescing and on-device shared memory to increase CUDA kernel bandwidth.
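A custom kernel and its launch configuration might look like the following minimal sketch, assuming numba, numpy, and a CUDA-capable GPU; the kernel and variable names are illustrative, not from the course materials:

  # Minimal sketch: a custom CUDA kernel with an explicit thread hierarchy and
  # explicit host/device memory management. Names are illustrative.
  import numpy as np
  from numba import cuda

  @cuda.jit
  def scale_kernel(x, out, factor):
      i = cuda.grid(1)          # this thread's unique index in the 1D launch grid
      if i < x.size:            # guard against threads past the end of the array
          out[i] = x[i] * factor

  x = np.arange(1_000_000, dtype=np.float32)
  d_x = cuda.to_device(x)                  # explicit host-to-device copy
  d_out = cuda.device_array_like(d_x)      # allocate the output on the device

  threads_per_block = 128
  blocks_per_grid = (x.size + threads_per_block - 1) // threads_per_block
  scale_kernel[blocks_per_grid, threads_per_block](d_x, d_out, 2.0)

  out = d_out.copy_to_host()               # explicit device-to-host copy

The explicit block and grid sizes in the launch configuration reflect the CUDA thread hierarchy mentioned above, and the explicit device allocations and copies are examples of the memory management techniques the course covers.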

Prices & Delivery Methods

Online Training

Duration
1 day

Price
  • on request

Classroom Training

Duration
1 day

Price
  • on request

Schedule

Currently there are no training dates scheduled for this course.