Description: Big Data Hadoop Developer Online Classroom Training
Location: London, England, United Kingdom
Event Type: Class
Industry: Big Data
Regions: Europe, Middle East, and Africa (EMEA)
Event Organizers: TACT
Start Date: Aug 28, 2017
End Date: Sep 14, 2017
Venue Name: Online Classroom
Event URL: tactlearn.com/online-training-courses/big-data-...
Why get Big Data Hadoop Developer Certification from TACT?
Big Data Hadoop development is one of the most in-demand skill sets in the world today, and the demand for certified Big Data and Hadoop developers is rising rapidly. According to a recent study, by 2018 there will be around 180,000 vacant roles in the U.S. alone. The Big Data and Hadoop market is expected to grow at a compound annual growth rate (CAGR) of 58% by 2020 and to surpass $16 billion.
The Big Data Hadoop Developer certification course by TACT is a comprehensive course on managing Big Data proficiently using Apache's open source platform Hadoop. The course gives you in-depth knowledge of the core concepts and detailed hands-on experience applying those concepts to a wide range of real-world use cases. It equips you to write code in the MapReduce framework and also includes advanced modules such as YARN, ZooKeeper, Oozie, Flume and Sqoop.
Big Data Hadoop Developer Course Objectives
Write complex code in MapReduce on both MRv1 and MRv2 (YARN) and thoroughly understand the Hadoop architecture
Learn how to perform analytics using the high-level scripting frameworks Pig and Hive
Get a detailed understanding of the Hadoop system and its advanced elements such as Flume and the Apache workflow scheduler Oozie
Get familiar with other important components such as HBase, ZooKeeper and Sqoop
Get hands-on expertise in various Hadoop cluster configurations and environments
Learn about optimization and troubleshooting
Get thorough knowledge of the Hadoop architecture by learning about the Hadoop Distributed File System (HDFS 1.0 & HDFS 2.0)
Get hands-on experience working on a real-life project built to industry standards
Pre-Requisites
Anyone who wants to build a career in Big Data Hadoop will benefit from a basic understanding of core Java. Knowledge of Java is not mandatory, however: TACT provides complimentary self-paced Java tutorials for those who do not know the language.
Project 1: “Twitter Analysis”
Today, around 80% of the world's data is unstructured and only 20% is structured. A conventional RDBMS can store and process only structured data, while Hadoop enables us to store and process unstructured data as well.
Twitter is one of the most popular social networking portals and has become a reliable source of data for analyzing what customers think about a particular topic (sentiment analysis). During the case study we will extract data from Twitter and use that data to do some interesting analysis.
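To give a feel for the kind of processing involved, here is a minimal sketch in plain Python of a MapReduce-style count over tweets. It is not the course's project code: it counts hashtags as a simple stand-in for sentiment analysis, and the sample tweets and the names `map_phase`, `shuffle`, `reduce_phase` and `count_hashtags` are all made up for illustration. On a real cluster Hadoop would run the map and reduce steps across many machines.

```python
import re
from collections import defaultdict

HASHTAG = re.compile(r"#(\w+)")


def map_phase(tweet):
    """Emit a (hashtag, 1) pair for every hashtag found in one tweet."""
    return [(tag.lower(), 1) for tag in HASHTAG.findall(tweet)]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one hashtag."""
    return key, sum(values)


def count_hashtags(tweets):
    """Run map, shuffle and reduce in sequence on an in-memory list."""
    pairs = [pair for tweet in tweets for pair in map_phase(tweet)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample data, not real tweets.
sample_tweets = [
    "Loving the new phone #gadget #happy",
    "Battery died again #gadget #sad",
    "No tags here",
]
hashtag_counts = count_hashtags(sample_tweets)
print(hashtag_counts)
```

The same three steps (emit pairs, group by key, aggregate per key) are what a MapReduce job expresses, only distributed over data stored in HDFS.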
Project 2: “Clickstream Analysis”
E-commerce websites have been observed to have a large impact on the economy of their region, a trend seen globally. Every e-commerce website keeps a record of user activity and stores it as a "clickstream". This record is used to analyze the browsing patterns of a particular user, which helps the site recommend products with high accuracy the next time the user visits. It also helps e-commerce websites design personalized promotional emails for their users.
In this case study we will see how to analyze clickstream and user data using Pig and Hive. We will gather the user data from an RDBMS and capture the user behaviour (clickstream) into HDFS using Flume. We will then analyze this data using Pig and Hive, and automate the clickstream analysis using the workflow engine Oozie.
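As a rough illustration of the analysis step, the sketch below joins user records with clickstream lines and finds each user's most visited page, which is the kind of join-and-aggregate query one would typically write in Pig or Hive. It is plain Python on in-memory data, not the course's project code; the tab-separated record layout, the sample rows and the function name `top_page_per_user` are assumptions made for the example.

```python
from collections import Counter

# Hypothetical user table, standing in for rows loaded from an RDBMS.
users = {
    "u1": "Alice",
    "u2": "Bob",
}

# Hypothetical clickstream lines ("user_id<TAB>page"), standing in for
# log records that Flume would deliver into HDFS.
clickstream = [
    "u1\t/products/shoes",
    "u1\t/products/shoes",
    "u1\t/products/hats",
    "u2\t/products/books",
    "u3\t/products/toys",
]


def top_page_per_user(users, click_lines):
    """Join clicks to users and return each user's most visited page."""
    counts = {}
    for line in click_lines:
        user_id, page = line.split("\t")
        if user_id not in users:
            # Inner join: skip clicks that have no matching user record.
            continue
        counts.setdefault(user_id, Counter())[page] += 1
    return {users[uid]: c.most_common(1)[0][0] for uid, c in counts.items()}


top_pages = top_page_per_user(users, clickstream)
print(top_pages)
```

A result like this could feed product recommendations or personalized emails; in the case study the equivalent logic would run over data in HDFS rather than Python lists.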
Agenda:
INTRODUCTION TO LINUX AND BIG DATA VIRTUAL MACHINE (VM)
UNDERSTANDING BIG DATA
HDFS (THE HADOOP DISTRIBUTED FILE SYSTEM)
HOW HDFS ADDRESSES FAULT TOLERANCE?
HDFS INTERFACES
ADVANCED HDFS FEATURES
MAP REDUCE – 1 (THEORETICAL CONCEPTS)
MAPREDUCE ARCHITECTURE
MR ALGORITHM AND DATA FLOW
ALTERNATIVES TO MR – BSP (BULK SYNCHRONOUS PARALLEL)
MAP REDUCE – 2 (PRACTICE)
DEVELOPING, DEBUGGING AND DEPLOYING MR PROGRAMS
WRITABLECOMPARABLES
OPTIMIZATION TECHNIQUES
MR ALGORITHMS (NON-GRAPH)
MR ALGORITHMS (GRAPH)
HIGHER LEVEL ABSTRACTIONS FOR MR (PIG)
HIGHER LEVEL ABSTRACTIONS FOR MR (HIVE)
COMPARISON OF PIG AND HIVE
DIFFERENT TYPES OF NOSQL DATABASES
COLUMNAR DATABASES CONCEPTS
NOSQL DATABASES – 2 (PRACTICE)
INTERFACES TO HBASE (FOR DDL AND DML OPERATIONS)
ADVANCED HBASE FEATURES
SPARK
INTRODUCTION TO YARN
INTRODUCTION TO OOZIE
INTRODUCTION TO FLUME
INTRODUCTION TO SQOOP
SETTING UP A HADOOP CLUSTER USING APACHE HADOOP
SSH CONFIGURATION
HADOOP ECOSYSTEM AND USE CASES
PROOF OF CONCEPTS AND USE CASES