Apache Spark for Data Scientists

* Looking for a flexible schedule (after hours or weekends)? Please call 858-208-4141 or email us: sales@ccslearningacademy.com.

Student financing options are available.

Transitioning military and Veterans, please contact us to sign up for a free consultation on training and hiring options.

Looking for group training? Contact Us

psinghal
Last Update December 12, 2023

About This Course

Course Description

Learn Spark skills from a data science perspective to build unified big data applications combining batch, streaming, and interactive analytics on your data.

Apache Spark is a powerful, open-source processing engine for data in a Hadoop cluster, optimized for speed, ease of use, and sophisticated analytics. The Spark framework supports streaming data processing and complex iterative algorithms, enabling applications to run up to 100x faster than traditional Hadoop MapReduce programs. With Spark, you can write sophisticated applications that drive faster decisions and real-time actions across a wide variety of use cases, architectures, and industries.

This hands-on course explores using Spark for common data-related activities from a data science perspective.

Learning Objectives

The essentials of Spark architecture and applications
How to execute Spark programs
How to create and manipulate both RDDs (Resilient Distributed Datasets) and DataFrames
How to integrate machine learning into Spark applications
How to use Spark Streaming

Inclusions

  • Instructor-led training
  • Training Seminar Student Handbook
  • Collaboration with classmates (not currently available for self-paced course)
  • Real-world learning activities and scenarios
  • Exam scheduling support*
  • Job placement assistance for the first 12 months after course completion
  • This course is eligible for CCS Learning Academy’s Learn and Earn Program: get a tuition fee refund of up to 50% if you are placed in a job through CCS Global Tech’s Placement Division*
  • Government and Private pricing available.*

Pre-requisites

  • Introduction to Java Programming (at least exposure to basic Java syntax)
  • Introduction to SQL (familiarity with SQL basics)
  • Basic knowledge of Statistics and Probability
  • Data Science background

Target Audience

  • Data Scientists, System Administrators, Testers, and other technical business professionals who seek to use Spark for data processing and analysis.

Curriculum

40 Lessons, 24h

1. Spark

Data Science: The State of the Art
Hadoop, YARN, and Spark
Architectural Overview
Spark and Storm
MLlib and Mahout
Distributed vs. Local Run Modes
Hello, Spark

2. Spark Overview

3. DataFrames

4. Spark SQL

5. Spark MLlib

6. Spark Streaming

7. Spark GraphX

8. Performance and Tuning

9. Cluster Mode

Your Instructors

psinghal


$1,995.00

Level: Intermediate
Duration: 24 hours
Lectures: 40