Machine Learning With Apache Spark

Available since October 31, 2019

Course description

Course Agenda:
Applied Data Science and Business Analytics
Machine Learning Algorithms, Techniques and Common Analytical Methods
Apache Spark Introduction
Spark’s MLlib Machine Learning Library

Target audience

Data Scientists, Business Analysts, Software Developers, IT Architects

Course requirements

Participants should have a general knowledge of statistics and programming.

Course Plan

	Section 01, Chapter 1: Machine Learning Algorithms
		Supervised vs Unsupervised Machine Learning
		Supervised Machine Learning Algorithms
		Unsupervised Machine Learning Algorithms
		Choose the Right Algorithm
		Life-cycles of Machine Learning Development
		Classifying with k-Nearest Neighbors (SL)
		k-Nearest Neighbors Algorithm
		The Error Rate
		Decision Trees (SL)
		Random Forests
		Unsupervised Learning Type: Clustering
		K-Means Clustering (UL)
		K-Means Clustering in a Nutshell
		Regression Analysis
		Logistic Regression
		Summary

	Section 02, Chapter 2: Introduction to Functional Programming
		What is Functional Programming (FP)?
		Terminology: Higher-Order Functions
		Terminology: Lambda vs Closure
		A Short List of Languages that Support FP
		FP with Java
		FP With JavaScript
		Imperative Programming in JavaScript
		The JavaScript map (FP) Example
		The JavaScript reduce (FP) Example
		Using reduce to Flatten an Array of Arrays (FP) Example
		The JavaScript filter (FP) Example
		Common High-Order Functions in Python
		Common High-Order Functions in Scala
		Elements of FP in R
		Summary

	Section 03, Chapter 3: Introduction to Apache Spark
		What is Apache Spark?
		A Short History of Spark
		Where to Get Spark?
		The Spark Platform
		Spark Logo
		Common Spark Use Cases
		Languages Supported by Spark
		Running Spark on a Cluster
		The Driver Process
		Spark Applications
		Spark Shell
		The spark-submit Tool
		The spark-submit Tool Configuration
		The Executor and Worker Processes
		The Spark Application Architecture
		Interfaces with Data Storage Systems
		Limitations of Hadoop's MapReduce
		Spark vs MapReduce
		Spark as an Alternative to Apache Tez
		The Resilient Distributed Dataset (RDD)
		Spark Streaming (Micro-batching)
		Spark SQL
		Example of Spark SQL
		Spark Machine Learning Library
		GraphX
		Spark vs R
		Summary

	Section 04, Chapter 4: The Spark Shell
		The Spark Shell UI
		Spark Shell Options
		Getting Help
		The Spark Context (sc) and SQL Context (sqlContext)
		The Shell Spark Context
		Loading Files
		Saving Files
		Basic Spark ETL Operations
		Summary

	Section 05, Chapter 5: Spark Machine Learning Library
		What is MLlib?
		Supported Languages
		MLlib Packages
		Dense and Sparse Vectors
		Labeled Point
		Python Example of Using the Labeled Point Class
		LIBSVM format
		An Example of a LIBSVM File
		Loading LIBSVM Files
		Local Matrices
		Example of Creating Matrices in MLlib
		Distributed Matrices
		Example of Using a Distributed Matrix
		Classification and Regression Algorithm
		Clustering
		Summary

	Section 06, Chapter 6: Text Mining
		What is Text Mining?
		The Common Text Mining Tasks
		What is Natural Language Processing (NLP)?
		Some of the NLP Use Cases
		Machine Learning in Text Mining and NLP
		Machine Learning in NLP
		TF-IDF
		The Feature Hashing Trick
		Stemming
		Example of Stemming
		Stop Words
		Popular Text Mining and NLP Libraries and Packages
		Summary

	Lab Exercises
		Lab 1. Learning the Lab Environment
		Lab 2. The Spark Shell
		Lab 3. Using Random Forests for Classification with Spark MLlib
		Lab 4. Using k-means Algorithm from MLlib
		Lab 5. Text Classification with Spark ML Pipeline

Total Price: Request Quotation

Skill level: Beginner

Language: English

Certificate: No

Max students: 10

Total Duration: 2 days
