Course Overview
Explore how to use GPU-accelerated tools and techniques to significantly improve data engineering pipelines.
Prerequisites
- Intermediate knowledge of Python (list comprehension, objects)
- Familiarity with pandas is a plus
- Introductory statistics (mean, median, mode)
Course Objectives
- How data moves within a computer, and how to strike the right balance between CPU, DRAM, disk storage, and GPUs.
- How different file formats can be read and manipulated by hardware.
- How to scale an ETL pipeline with multiple GPUs using NVTabular.
- How to build an interactive Plotly dashboard where users can filter on millions of data points in less than a second.