Enhancing Data Science Outcomes With Efficient Workflow (EDSOEW)

Course Overview

Learn how to create an end-to-end, hardware-accelerated machine learning pipeline for large datasets. Throughout the development process, you’ll use diagnostic tools to identify delays and learn to mitigate common pitfalls.

Prerequisites

  • Basic knowledge of a standard data science workflow on tabular data. To gain an adequate understanding, we recommend this article.
  • Knowledge of distributed computing using Dask. To gain an adequate understanding, we recommend the “Get Started” guide from Dask.
  • Completion of the DLI’s Fundamentals of Accelerated Data Science course, or the ability to manipulate data using cuDF along with some experience building machine learning models using cuML.

Course Objectives

  • Develop and deploy an accelerated end-to-end data processing pipeline for large datasets
  • Scale data science workflows using distributed computing
  • Perform DataFrame transformations that take advantage of hardware acceleration and avoid hidden slowdowns
  • Enhance machine learning solutions through feature engineering and rapid experimentation
  • Improve data processing pipeline performance by optimizing memory management and hardware utilization

Prices & Delivery Methods

Online Training

Duration
0.5 days

Price
  • on request

Classroom Training

Duration
0.5 days

Price
  • on request

Schedule

Currently there are no training dates scheduled for this course.