09864032319, 09678176577 datacomes@gmail.com

Course Name: Big Data/Hadoop Development & Administration

Course Duration: 3 months

Course Fees: 30,000

Syllabus:

The Big Data Economy

  • Data! Data! Data!
  • Data Economy
  • Data Analytics
  • Data Science
  • Traditional Data Processing Technologies

Apache Hadoop Architecture and Ecosystem

  • Hadoop Background
  • Hadoop Architecture
  • Hadoop and RDBMS
  • Hadoop Subprojects
  • Hadoop Distributions
  • Hadoop Documentation

Setting up Hadoop

  • Installing Hadoop
  • Configuring Hadoop
  • Starting Hadoop
  • Running Hadoop Clients
  • Browsing Hadoop UI Consoles

HDFS Architecture

  • Hadoop 1.0 HDFS Architecture
  • Hadoop 1.0 HDFS Architectural Capabilities – Performance, Scalability, Availability, Installability, Configurability, Operability, Usability, Security
  • Hadoop 2.0 HDFS Architecture

HDFS Programming Basics

  • Hadoop Configuration API
  • HDFS API Overview
  • HDFS File CRUD API
  • HDFS Directory CRUD API

HDFS Programming Advanced

  • File Compression and Decompression
  • Type Serialization and Deserialization
  • Sequence Files

MapReduce Architecture

  • Hadoop 1.0 MapReduce Architecture
  • Hadoop 1.0 MapReduce Architectural Capabilities – Performance, Scalability, Availability, Installability, Configurability, Operability, Usability, Programmability
  • Hadoop 2.0 MapReduce Architecture

MapReduce Programming Basics

  • MapReduce Programming Concepts – Map Phase and Reduce Phase
  • MapReduce API – Key Java Classes and their Hierarchy
  • Steps to Write a MapReduce Program
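
To illustrate the Map Phase and Reduce Phase listed above, here is a minimal word-count sketch in plain Python. It runs in memory and does not use the Hadoop Java API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit one (word, 1) pair for every word in an input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group values by key, standing in for the framework's
    sort-and-group step between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

In a real Hadoop job the map and reduce functions would be written as Mapper and Reducer classes and the grouping step would be handled by the framework across the cluster; the sketch only shows the data flow.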

MapReduce Programming Intermediate

  • Setting Mapper Counts and Reducer Counts
  • MapReduce Configuration
  • Combiners
  • Partitioners
  • Speculative Execution
  • Task JVM Reuse
  • Compression

MapReduce Programming Advanced

  • Output Format
  • Custom Data Format
  • Input Format
  • Built-in Mappers and Reducers
  • Counters
  • Multithreading
  • Distributed Cache

MapReduce Streaming and Pipes

  • MapReduce using Hadoop Streaming
  • MapReduce using Hadoop Pipes

MapReduce Development Best Practices

  • Logging in Hadoop
  • Exception Handling
  • Running Jobs Locally
  • Unit Testing with MRUnit
  • Top 10 Hadoop Anti-Patterns

Querying Data using Hive

  • Hive Background
  • Hive Architecture
  • Downloading, Installing and Configuring Hive
  • Simple Hive Example
  • Loading Data into Hive
  • Hive Query Statements
  • Hive Schema Violations
  • Using Built-in Hive Functions
  • Partitioning Data using Hive
  • Joining Data

Querying Data using Pig

  • Pig Background
  • Architecture
  • Downloading, Installing and Configuring Pig
  • Running Pig
  • Pig Latin Language Basics
  • Core Relational Operators – DISTINCT, FILTER, SPLIT, ORDER BY, LIMIT, GROUP, FOREACH
  • Built-in Functions
  • Relational Join Operators
  • Debug Operators

Realtime Database using HBase

  • HBase Overview
  • Data Model
  • Architecture
  • Downloading, Installing and Configuring HBase
  • HBase Shell
  • HBase Java API for CRUD Operations