Syllabus:
• BigData
- What is BigData
- Characteristics of BigData
- Problems with BigData
- Handling BigData
- Introduction to Distributed Systems
- Problems with existing Distributed Systems in dealing with BigData
- Requirements of a New Approach
- HADOOP history
- HDFS
- MapReduce
- Install Pseudo cluster
- Install Multi node cluster
- Configuration
- Introduction to HADOOP Cluster
- The Five Daemons and how they work
• NameNode
• JobTracker
• SecondaryNameNode
• TaskTracker
• DataNode
- Introduction to HADOOP EcoSystem projects
- Understanding HADOOP API
- Basic programs of HADOOP MapReduce Application Form
- Driver Code
- Mapper Code
- Reducer Code
- Eclipse integration with HADOOP for Rapid Application Development
- More about ToolRunner
- Combiner
- Reducer
- configure and close methods
- Sorting
- Searching
- Indexing
- TF-IDF
- Word Co-occurrence
- Flume
- Sqoop
- Importing data from RDBMS using Sqoop
- Hive
- Introduction to Hive
- Creating tables in Hive
- Running queries
- Pig
- Introduction to Pig
- Different modes of Pig
- When to use Hive and when to use Pig
- HBASE
- Basics of HBASE
- Developing custom Writable
- Developing custom WritableComparable
- Understanding Input Output formats
• Hands-on exercises for each concept