DURGASOFT: Hadoop on 4th Feb (Wednesday) 7:30AM-8:30AM at Madhapur by Mr. Rama Krishna

Thursday, January 29, 2015

Syllabus:
• BigData

What is BigData
Characteristics of BigData
Problems with BigData
Handling BigData

• Distributed Systems

Introduction to Distributed Systems
Problems with existing distributed systems in dealing with BigData
Requirements of a new approach
HADOOP history

• HADOOP Core Concepts

HDFS
MapReduce

• HADOOP Cluster

Install a pseudo-distributed cluster
Install a multi-node cluster
Introduction to HADOOP Cluster configuration
The five daemons at work
- NameNode
- JobTracker
- SecondaryNameNode
- TaskTracker
- DataNode

Introduction to HADOOP EcoSystem projects

• Writing MapReduce programs

Understanding HADOOP API
Basic form of a HADOOP MapReduce application
- Driver Code
- Mapper Code
- Reducer Code
Eclipse integration with HADOOP for Rapid Application Development

• Understanding ToolRunner

More about ToolRunner
Combiner
Reducer
configure and close methods

• Common MapReduce Algorithms

Sorting
Searching
Indexing
TF-IDF
Word Co-Occurrence

• HADOOP EcoSystem

Flume
Sqoop
- Importing data from RDBMS using Sqoop
Hive
- Introduction to Hive
- Creating tables in Hive
- Running queries
Pig
- Introduction to Pig
- Different modes of Pig
- When to use Hive and when to use Pig
HBASE
- Basics of HBASE

• Advanced MapReduce Programming

Developing custom Writable
Developing custom WritableComparable
Understanding Input and Output formats

• Introduction to Oozie
• Hands-on exercises for each concept
