Thursday, October 16, 2014

Hadoop (Weekend) on 18th Oct (Sunday) 11:00AM at Marthahalli (Bangalore) by Mr. Anil Kumar

Syllabus:

HADOOP DEVELOPMENT
• Hadoop and HDFS architecture
Hadoop Architecture and Eco System
Understanding distributed systems & parallel computing
HDFS daemons: NameNode, Secondary NameNode, and DataNode
MapReduce daemons: JobTracker and TaskTracker
Block Replacement, Data Integrity, Re-balancer
HDFS user/admin commands.
Anatomy of a Hadoop Cluster
• Setting up Hadoop cluster
Install and configure Apache Hadoop
Make a Pseudo distributed Hadoop cluster on a single laptop/desktop
Monitoring the cluster using UI
• MapReduce Programming
MapReduce framework and architecture
Hadoop Data Types
Developing MapReduce Programs in:
♠ Local Mode
♠ Pseudo-distributed Mode
♠ Fully distributed mode
Writing MapReduce Programs
Examining MapReduce Programming
♠ ToolRunner
♠ Basic API Concepts (Driver code, Mapper, Reducer)
• Delving Deeper Into the Hadoop API
The configure and close Methods
Input and Output Formatters
♠ Text Format
♠ KeyValue Format
♠ NLine Format
♠ SequenceFile Format
Partitioners
• Tuning for Performance
Reducing network traffic with combiner
Reducing the amount of input data
Running with speculative execution
• Advanced MapReduce Programming
A Recap of the MapReduce Flow
Custom Writables and WritableComparables
Map-Side Joins
Reduce-Side Joins
Using The Distributed Cache
• Monitoring and debugging on a Production Cluster
Counters
Skipping Bad Records
Rerunning failed tasks with Isolation Runner
Schedulers (FIFO, Capacity, and Fair)
• YARN Introduction & Architecture
• Pig - ETL
Introduction, Pig vs Hive, Pig vs MapReduce and SQL
Pig's Data Model
Pig Architecture
• Hive – Data warehousing platform
Architecture of Hive
Hive Services, Clients, Metastore
Hive Data Model and File Formats
Hive Query Language
DDL in Hive
Joins, Unions, Indexing, Views
• HBase – NoSQL Database
HBase Overview & Architecture
HBase Installation
Usage Scenarios of HBase, CRUD
HBase Data Model
♠ Table and Row
♠ Column Family & Column Qualifier
♠ Cell and its Versioning
♠ Regions and Region Server
• SQOOP
Overview of Sqoop import/export
Install and configure Sqoop on cluster
MySQL Installation and connection
Sqoop commands
Various Options to Import Data
♠ Table Imports
♠ Filtering Imports
♠ Hive Imports
• Flume
Introduction and Architecture
Install and configure Flume
Flume Components
Flume Events
Hands-on Exercise
Gathering Twitter data using Flume
Pig Latin, Transformations
Installing and Running Pig in Local & Distributed modes
Advanced Pig concepts, Debugging
Hands-on Exercise
Statistics & Archiving with Hive
Hive Partitions, Buckets
Hive UDF, UDAF, UDTF
Hive SerDe properties
Hive Optimizations and best practices
Hands-on Exercise
HBase operations (Get/Scan, Put, Delete, ...)
HBase Admin - Create database, develop and run sample applications
HBase Clients
♠ Thrift
♠ Java API
♠ REST
MapReduce & Hive Integration with HBase
