DW613G Outline - IBM BigInsights Foundation

Detailed Course Outline

(DW6A1)

(DW6B1)

Unit 1: IBM Open Platform with Apache Hadoop
Exercise 1: Exploring the HDFS
Unit 2: Apache Ambari
Exercise 2: Managing Hadoop clusters with Apache Ambari
Unit 3: Hadoop Distributed File System
Exercise 3: File access & basic commands with HDFS
Unit 4: MapReduce and YARN
Topic 1: Introduction to MapReduce based on MR1
Topic 2: Limitations of MR1
Topic 3: YARN and MR2
Exercise 4: Creating and coding a simple MapReduce job (possibly followed by a second, more complex exercise)
Unit 5: Apache Spark
Exercise 5: Working with Spark RDDs to build a Spark job
Unit 6: Coordination, management, and governance
Exercise 6: Apache ZooKeeper, Apache Slider, Apache Knox
Unit 7: Data Movement
Exercise 7: Moving data into Hadoop with Flume and Sqoop
Unit 8: Storing and Accessing Data
Topic 1: Representing Data: CSV, XML, JSON, and YAML
Topic 2: Open Source Programming Languages: Pig, Hive, and others (R, Python, etc.)
Topic 3: NoSQL Concepts
Topic 4: Accessing Hadoop data using Hive
Exercise 8: Performing CRUD operations using the HBase shell
Topic 5: Querying Hadoop data using Hive
Exercise 9: Using Hive to Access Hadoop / HBase Data
Unit 9: Advanced Topics
Topic 1: Controlling job workflows with Oozie
Topic 2: Search using Apache Solr
No lab exercises