Course Outline
Introduction
- Hadoop history and concepts
- Ecosystem
- Distributions
- High level architecture
- Hadoop myths
- Hadoop challenges (hardware/software)
Planning and installation
- Selecting software and Hadoop distributions
- Sizing the cluster and planning for growth
- Selecting hardware and network
- Rack topology
- Installation
- Multi-tenancy
- Directory structure and logs
- Benchmarking
HDFS operations
- Concepts (horizontal scaling, replication, data locality, rack awareness)
- Nodes and daemons (NameNode, Secondary NameNode, HA Standby NameNode, and DataNode)
- Health monitoring
- Command-line and browser-based administration
- Adding storage and replacing defective drives
MapReduce operations
- Parallel computing before MapReduce: compare HPC versus Hadoop administration
- MapReduce cluster loads
- Nodes and daemons (JobTracker and TaskTracker)
- MapReduce UI walkthrough
- MapReduce configuration
- Job config
- Job schedulers
- Administrator view of MapReduce best practices
- Optimizing MapReduce
- Foolproofing MapReduce: what to tell your programmers
- YARN: architecture and use
Advanced topics
- Hardware monitoring
- System software monitoring
- Hadoop cluster monitoring
- Adding and removing servers and upgrading Hadoop
- Backup, recovery, and business continuity planning
- Cluster configuration tweaks
- Hardware maintenance schedule
- Oozie scheduling for administrators
- Securing your cluster with Kerberos
- The future of Hadoop
Target Audience
Experienced system administrators who are responsible for maintaining a Hadoop cluster and its related components.
What You'll Learn
Join an engaging hands-on learning environment, where you’ll:
- Understand the benefits of distributed computing
- Understand the Hadoop architecture (including HDFS and MapReduce)
- Define administrator participation in Big Data projects
- Plan, implement, and maintain Hadoop clusters
- Deploy and maintain additional Big Data tools (Pig, Hive, Flume, etc.)
- Plan, deploy, and maintain HBase on a Hadoop cluster
- Monitor and maintain hundreds of servers
- Pinpoint performance bottlenecks and fix them
Inclusions
With CCS Learning Academy, you’ll receive:
- Instructor-led training
- Training Seminar Student Handbook
- Pre- and post-course assessments/evaluations
- Collaboration with classmates (not currently available for self-paced course)
- Real-world learning activities and scenarios
- Exam scheduling support*
- Job placement assistance for the first 12 months after course completion
- This course is eligible for CCS Learning Academy’s Learn and Earn Program: get a tuition fee refund of up to 50% if you are placed in a job through CCS Global Tech’s Placement Division*
- Government and private pricing available*
*For more details, call 858-208-4141 or email training@ccslearningacademy.com or sales@ccslearningacademy.com