This 4-day training course is designed for developers who need to create applications that analyze Big Data stored in Apache Hadoop using Pig and Hive. Topics include Hadoop, YARN, HDFS, MapReduce, data ingestion, workflow definition, using Pig and Hive to perform data analytics on Big Data, and an introduction to Spark Core and Spark SQL.
PREREQUISITES
Students should be familiar with programming principles and have experience in software development. Knowledge of SQL is helpful but not required. No prior Hadoop knowledge is required.
TARGET AUDIENCE
Software developers who need to understand and develop applications for Hadoop.
AGENDA SUMMARY
Day 1: An Introduction to the Hadoop Distributed File System
Day 2: An Introduction to Apache Pig
Day 3: An Introduction to Apache Hive
Day 4: Working with Spark Core, Spark SQL, and Oozie