Fundamentals of Accelerated Computing with CUDA Python (FACCP)

Course Overview

This course explores how to use Numba, the just-in-time, type-specializing Python function compiler, to accelerate Python programs to run on massively parallel NVIDIA GPUs. You’ll learn how to:

Use Numba to compile CUDA kernels from NumPy universal functions (ufuncs)
Use Numba to create and launch custom CUDA kernels
Apply key GPU memory management techniques

Upon completion, you’ll be able to use Numba to compile and launch CUDA kernels to accelerate your Python applications on NVIDIA GPUs.

Prerequisites

Basic Python competency, including familiarity with variable types, loops, conditional statements, functions, and array manipulations
NumPy competency, including the use of ndarrays and ufuncs
No previous knowledge of CUDA programming is required

Course Objectives

At the conclusion of the workshop, you’ll have an understanding of the fundamental tools and techniques for GPU-accelerated Python applications with CUDA and Numba:

GPU-accelerate NumPy ufuncs with a few lines of code.
Configure code parallelization using the CUDA thread hierarchy.
Write custom CUDA device kernels for maximum performance and flexibility.
Use memory coalescing and on-device shared memory to increase CUDA kernel bandwidth.
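
As a rough illustration of the thread-hierarchy objective above, the sketch below emulates on the CPU, in plain Python, how a one-dimensional launch configuration maps (block, thread) pairs to array elements. It is not course material and does not run on a GPU; the function names `launch_config` and `simulate_launch` and the default of 128 threads per block are assumptions chosen for illustration. In a real Numba kernel the same index would come from `cuda.blockIdx.x * cuda.blockDim.x + cuda.threadIdx.x`, or equivalently `cuda.grid(1)`.

```python
def launch_config(n_elements, threads_per_block=128):
    """Return (blocks_per_grid, threads_per_block) covering n_elements.

    Hypothetical helper: rounds up so that every element gets a thread.
    """
    blocks_per_grid = (n_elements + threads_per_block - 1) // threads_per_block
    return blocks_per_grid, threads_per_block


def simulate_launch(n_elements, threads_per_block=128):
    """Emulate sequentially which element each (block, thread) pair handles."""
    blocks_per_grid, threads_per_block = launch_config(n_elements, threads_per_block)
    touched = []
    for block_idx in range(blocks_per_grid):
        for thread_idx in range(threads_per_block):
            # Same arithmetic a 1D CUDA kernel uses for its global index.
            global_idx = block_idx * threads_per_block + thread_idx
            # Bounds guard: the last block may have surplus threads.
            if global_idx < n_elements:
                touched.append(global_idx)
    return touched


if __name__ == "__main__":
    blocks, threads = launch_config(1000)
    print(blocks, threads)
    print(len(simulate_launch(1000)))
```

The bounds guard matters because the grid is rounded up to a whole number of blocks, so some threads in the last block have no element to process.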

Pricing & Delivery Methods

Online Training

Duration
1 day

Price

Inquire about price and availability

Dates and Registration

Request a training date

Classroom training

Duration
1 day

Price

Inquire about price and availability

Dates and Registration

Request a training date

Schedule

There are currently no dates scheduled for this course