Hadoop Course Content



Big Data
- What is Big Data?
- Dimensions of Big Data
- Big Data in Advertising
- Big Data in Banking
- Big Data in Telecom
- Big Data in E-commerce
- Big Data in Healthcare
- Big Data in Defense
- Processing options of Big Data
- Hadoop as an option
Hadoop
- What is Hadoop
- How Hadoop Works
- HDFS
- MapReduce
- How Hadoop has an edge
Hadoop Ecosystem
- Sqoop
- Oozie
- Pig
- Hive
- Flume
Hadoop Hands On
- Setting up Hadoop on a single-node cluster
- Running HDFS commands
- Running your MapReduce program
- Running Sqoop Import and Sqoop Export
- Creating Hive tables directly from Sqoop
• Cross company (will see in Container)
- Creating Hive tables
- Querying Hive tables
- Running an Oozie workflow
- Analyzing twitter data using Flume
Multinode Setup
Cluster Capacity Planning
- Setting up a multinode cluster on Amazon EC2
- Setting up a multinode cluster on the classroom machines
- Setting up Cloudera Manager on the cloud
- Setting up Cloudera Manager on local setup
Advanced MapReduce
- MapReduce Code Walkthrough
- ToolRunner
- MR Unit
- Distributed Cache
- Combiner
- Partitioner
- Setup and Cleanup methods
- Using Java API to access HDFS
- Map Side joins
- Reduce side joins
- Input Types in MapReduce
- Output Types in MapReduce
- Custom Input Data types
- Custom Output Data types
- Multiple reducer MR program
- Zero Reducer Mapper
Advanced MapReduce Hands On
- MR Unit Hands On
- Distributed Cache Hands On
- Partitioner Hands On
- Combiner Hands On
- Accessing files using HDFS API Hands On
- Map Side joins Hands On
- Reduce side joins Hands On
MapReduce Design Patterns:
- Searching
- Sorting
- Filtering
- Inverted Index
- TF-IDF
- Word Co-occurrence
MapReduce Design Patterns Hands On:
- Searching Hands On
- Sorting Hands On
- Filtering Hands On
- Inverted Index Hands On
- TF-IDF Hands On
- Word Co-occurrence Hands On
Pig
- Introduction
- Basic Data Analysis
- Complex Data Analysis
- Multi Data Set Analysis
- UDFs in Pig
- Troubleshooting and Optimizing Pig
- Pig Hands On
Data Analysis Using Pentaho as an ETL tool
- Introduction
- Setting up Pentaho
- Loading Data to HDFS
- Loading Data to Hive
- Aggregation through MapReduce
- Transforming Data with Hive
- Transforming Data with Pig
- Loading Data from HDFS to RDBMS
- Loading Data from Hive to RDBMS
- Reporting on HDFS Data
Hive
- Introduction
- Basic Data Analysis with Hive
- Hive Data Management
- Text Processing with Hive
- Transformations in Hive
- Optimizing Hive
- Hive Hands On
Scheduling in Hadoop
- FIFO Scheduling
- Fair Scheduling
Cluster Monitoring
- Basic Monitoring
- Log Management
- Using Ganglia for monitoring
Cluster Maintenance
- Cluster Upgrades
- Failover Mechanism

CONTACT US

10521 Shadow Ridge Ln Apt 203,
Louisville, KY 40241
Contact No: 606 909 4722
Email: contact@coloursit.com

Copyright 2014 Colours IT Consulting | All Rights Reserved.