Building Recommendation Engines with PySpark

Skill Level: Advanced
4.8+
224 reviews
Updated 04/2026
Learn tools and techniques to leverage your own big data to facilitate positive experiences for your users.
Spark | Machine Learning | 4 hr | 15 videos | 56 Exercises | 4,550 XP | 14,009 | Statement of Accomplishment


Course Description

This course will show you how to build recommendation engines using Alternating Least Squares in PySpark. Using the popular MovieLens dataset and the Million Songs dataset, this course will take you step by step through the intuition of the Alternating Least Squares algorithm as well as the code to train, test and implement ALS models on various types of customer data.

Prerequisites

Supervised Learning with scikit-learn
Introduction to PySpark
1. Recommendations Are Everywhere

This chapter shows how powerful recommendation engines can be and draws the key distinctions between collaborative-filtering and content-based engines, as well as between the implicit and explicit data that recommendation engines can use. You will also learn how to uncover latent features: hidden features that you may not even know exist in customer datasets.
2. How does ALS work?

3. Recommending Movies

4. What if you don't have customer ratings?

In most real-life situations, you won't have "perfect" customer data available to build an ALS model. This chapter teaches you how to use customer behavior data to infer customer ratings and then use those inferred ratings to build an ALS recommendation engine. Using the Million Songs Dataset as well as another version of the MovieLens dataset, it shows you how to build a recommendation engine with ALS from the data available to you and evaluate its performance.

4.8 from 224 reviews. Rating breakdown: 86%, 13%, 1%, 0%, 0%.

FAQs

What recommendation algorithm does this PySpark course focus on?

The course focuses on the Alternating Least Squares (ALS) algorithm for collaborative filtering, covering its mathematical foundation, hyperparameters, and implementation in PySpark.
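As a rough illustration of the hyperparameters mentioned above, the sketch below collects typical settings for the `ALS` estimator in `pyspark.ml.recommendation`. The column names (`userId`, `movieId`, `rating`) and all values are assumptions chosen for a MovieLens-style table, not settings taken from the course, and `build_als` is a hypothetical helper name.

```python
# Hypothetical settings; column names assume a MovieLens-style ratings table.
als_params = {
    "userCol": "userId",
    "itemCol": "movieId",
    "ratingCol": "rating",
    "rank": 10,                   # number of latent features
    "maxIter": 10,                # number of alternating passes
    "regParam": 0.1,              # regularization strength
    "nonnegative": True,          # constrain factors to be non-negative
    "coldStartStrategy": "drop",  # drop rows whose prediction would be NaN
    "implicitPrefs": False,       # explicit ratings in this sketch
}


def build_als(params):
    """Create an untrained ALS estimator (hypothetical helper).

    pyspark is imported lazily so the settings above can be inspected
    without Spark installed; calling this needs an active SparkSession.
    """
    from pyspark.ml.recommendation import ALS
    return ALS(**params)
```

With a running Spark session and a ratings DataFrame (called `train` here purely for illustration), training would look like `model = build_als(als_params).fit(train)`.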

What datasets are used for building recommendation engines?

You will work with the MovieLens dataset to build and evaluate a cross-validated ALS model, and the Million Songs dataset to practice with implicit feedback data.

Does the course cover recommendations when explicit ratings are not available?

Yes. The final chapter teaches you how to infer ratings from customer behavior data and build ALS recommendation engines using implicit feedback.
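One simple way to infer ratings from behavior is to count how often each user interacts with each item. The sketch below does this in plain Python for a made-up listening log; the data, the `infer_ratings` name, and the choice of raw play counts as the inferred rating are all assumptions for illustration, not the course's exact method.

```python
from collections import Counter

# Hypothetical listening log: one (user_id, song_id) tuple per play event.
plays = [
    ("u1", "s1"), ("u1", "s1"), ("u1", "s2"),
    ("u2", "s2"), ("u2", "s2"), ("u2", "s2"),
    ("u3", "s1"),
]


def infer_ratings(events):
    """Count plays per (user, song); the count stands in for a rating."""
    counts = Counter(events)
    return [(user, song, n) for (user, song), n in sorted(counts.items())]


implicit_ratings = infer_ratings(plays)
# [('u1', 's1', 2), ('u1', 's2', 1), ('u2', 's2', 3), ('u3', 's1', 1)]
```

In PySpark, counts like these are typically passed to ALS with `implicitPrefs=True`, which treats the values as confidence in a preference rather than as explicit scores.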

What PySpark and Python prerequisites should I have?

You should be comfortable with pandas, intermediate Python, the material in Introduction to PySpark, basic SQL, and supervised learning with scikit-learn. This is an advanced-level course.

What is matrix factorization and why does it matter for recommendations?

Matrix factorization decomposes a large user-item matrix into smaller matrices to uncover latent features. It is the mathematical core of ALS and helps predict missing ratings.
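To make the idea concrete, here is a minimal NumPy sketch of alternating least squares on a tiny ratings matrix. It is a teaching sketch under stated assumptions, not Spark's implementation: the toy data, the `als_factorize` name, the rank, and the regularization value are all made up, and a 0 is used to mark a missing rating.

```python
import numpy as np


def als_factorize(R, mask, rank=2, reg=0.1, n_iters=50, seed=0):
    """Approximate R (users x items) as U @ V.T using observed entries only."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, rank))
    V = rng.normal(scale=0.1, size=(n_items, rank))
    ridge = reg * np.eye(rank)
    for _ in range(n_iters):
        # Hold item factors fixed; solve a small ridge regression per user.
        for u in range(n_users):
            obs = mask[u]
            if obs.any():
                Vo = V[obs]
                U[u] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ R[u, obs])
        # Hold user factors fixed; solve a small ridge regression per item.
        for i in range(n_items):
            obs = mask[:, i]
            if obs.any():
                Uo = U[obs]
                V[i] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ R[obs, i])
    return U, V


# Toy ratings matrix (hypothetical data); 0 marks a missing rating.
R = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [0.0, 1.0, 5.0, 4.0],
])
mask = R > 0
U, V = als_factorize(R, mask)
predictions = U @ V.T  # filled-in matrix, including the missing cells
```

Each row of `U` and `V` holds the latent features for one user or item, and the entries of `predictions` at the masked-out positions are the predicted missing ratings.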
