
Introduction to Apache Spark Essentials (TTSK7502)

* Looking for a flexible schedule (after hours or weekends)? Please call 858-208-4141 or email us at sales@ccslearningacademy.com.

Student financing options are available.

Transitioning military and Veterans, please contact us to sign up for a free consultation on training and hiring options.

Looking for group training? Contact Us

psinghal
Last Update December 12, 2023

About This Course

Course Description

Learn the essentials of using Spark for your big data workloads.

Apache Spark is a cluster computing engine for big data and an important component of the Hadoop ecosystem. It can run on top of Hadoop YARN and HDFS, and it processes data in memory, which makes many computing tasks faster than with MapReduce. It can be programmed in Java, Scala, Python, and R, along with SQL-based front ends.
This course introduces Scala, Python, or R developers to Spark programming. It begins with an overview of the ecosystem and hands-on experience with the platform, including working with the Spark Shell, RDDs, and DataFrames. You’ll then get a broader introduction to NoSQL, Spark Streaming, Spark SQL, and Spark MLlib, and to how the pieces fit together in a larger application.

Learning Objectives

The essentials of Spark architecture and applications
How to execute Spark programs
How to create and manipulate both RDDs (Resilient Distributed Datasets) and DataFrames
How Spark core components come together for complete applications

Inclusions

  • Instructor-led training
  • Training Seminar Student Handbook
  • Collaboration with classmates (not currently available for self-paced course)
  • Real-world learning activities and scenarios
  • Exam scheduling support*
  • Job placement assistance for the first 12 months after course completion
  • This course is eligible for CCS Learning Academy’s Learn and Earn Program: get a tuition fee refund of up to 50% if you are placed in a job through CCS Global Tech’s Placement Division*
  • Government and Private pricing available.*

Pre-requisites

  • Experience programming in Java, Python, R, or Scala (only one language needed)
  • Basic understanding of SQL

Target Audience

  • Data Scientists, Data Engineers, Software Engineers, Architects, and Developers.

Curriculum

30 Lessons, 16h

1. Overview of Spark

Hadoop Ecosystem
Hadoop YARN vs. Mesos
Spark vs. MapReduce
Spark: Lambda Architecture
Spark in the Enterprise Data Science Architecture

2. Spark Component Overview

3. RDDs: Resilient Distributed Datasets

4. DataFrames

5. Advanced Spark Overview

Your Instructors

psinghal


$795.00

Level: Intermediate
Duration: 16 hours
Lectures: 30