Course
Unsupervised Learning in Python
Intermediate Skill Level
Updated 12/2025
Included with Premium or Teams
Python · Machine Learning · 4 hr · 13 videos · 52 Exercises · 4,150 XP · 170K+ learners · Statement of Accomplishment
Course Description
What you'll learn
- Assess intrinsic dimensionality by interpreting PCA explained-variance ratios and selecting optimal n_components for compression
- Distinguish between k-means, agglomerative hierarchical clustering, and t-SNE based on their algorithms, input requirements, and visualization outputs
- Evaluate cluster quality using inertia plots, dendrogram linkage distances, and cross-tabulations against known categories
- Identify appropriate preprocessing, clustering, and dimension-reduction tools in scikit-learn for specific unsupervised learning tasks
- Recognize significant latent features produced by NMF and apply cosine similarity to recommend documents or images with related topics or patterns
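One of the skills listed above is evaluating clusters against known categories with a cross-tabulation. A minimal sketch of that idea, using the iris dataset as a stand-in for the course's own datasets:

```python
# Sketch: comparing cluster labels to known categories with a cross-tab.
# (Iris is a stand-in here; the course uses its own datasets.)
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

iris = load_iris()

# Fit k-means with as many clusters as there are known species
labels = KMeans(n_clusters=3, n_init=10, random_state=1).fit_predict(iris.data)

# Each row of the cross-tab shows how one cluster splits across species;
# a good clustering concentrates each row in a single column
df = pd.DataFrame({'labels': labels, 'species': iris.target})
ct = pd.crosstab(df['labels'], df['species'])
print(ct)
```

If the clustering recovers the species well, each cluster's counts pile up under one species column.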
Prerequisites
Supervised Learning with scikit-learn
1
Clustering for Dataset Exploration
Learn how to discover the underlying groups (or "clusters") in a dataset. By the end of this chapter, you'll be clustering companies using their stock market prices, and distinguishing different species by clustering their measurements.
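A minimal sketch of the k-means workflow this chapter builds toward, using synthetic blobs instead of the course's stock-price or measurement data:

```python
# Sketch: k-means clustering with scikit-learn on synthetic data.
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Synthetic data with 3 known groups (stands in for the course datasets)
X, y = make_blobs(n_samples=300, centers=3, random_state=42)

# Scaling first puts all features on a comparable footing
X_scaled = StandardScaler().fit_transform(X)

# Fit k-means and assign each sample to a cluster
model = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = model.fit_predict(X_scaled)

# Inertia (within-cluster sum of squares) is the quantity plotted
# against n_clusters when choosing the number of clusters
print(model.inertia_)
```

Plotting inertia for a range of `n_clusters` values and looking for an "elbow" is the standard way to pick the number of clusters.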
2
Visualization with Hierarchical Clustering and t-SNE
In this chapter, you'll learn about two unsupervised learning techniques for data visualization: hierarchical clustering and t-SNE. Hierarchical clustering merges the data samples into ever-coarser clusters, yielding a tree visualization of the resulting cluster hierarchy. t-SNE maps the data samples into 2D space so that the proximity of the samples to one another can be visualized.
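A sketch of both techniques on a small dataset, assuming SciPy for the hierarchical clustering (whose `linkage` output is what dendrograms are drawn from) and scikit-learn for t-SNE:

```python
# Sketch: hierarchical clustering (SciPy) and t-SNE (scikit-learn).
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

X = load_iris().data

# Agglomerative clustering: 'mergings' encodes the full merge tree,
# which scipy's dendrogram() function can draw
mergings = linkage(X, method='complete')

# Cut the tree to extract a flat clustering with 3 clusters
labels = fcluster(mergings, t=3, criterion='maxclust')

# t-SNE maps the samples into 2D for visualization
xy = TSNE(n_components=2, learning_rate=200.0,
          init='random', random_state=0).fit_transform(X)
print(xy.shape)
```

The two columns of `xy` can be passed straight to a scatter plot, colored by `labels`, to see whether the clusters separate visually.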
3
Decorrelating Your Data and Dimension Reduction
Dimension reduction summarizes a dataset using its commonly occurring patterns. In this chapter, you'll learn about the most fundamental of dimension reduction techniques, "Principal Component Analysis" ("PCA"). PCA is often used before supervised learning to improve model performance and generalization. It can also be useful for unsupervised learning. For example, you'll employ a variant of PCA that will allow you to cluster Wikipedia articles by their content!
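A sketch of the core PCA moves the chapter covers: inspecting explained-variance ratios to judge intrinsic dimensionality, then keeping just enough components for compression (iris again stands in for the course data):

```python
# Sketch: PCA explained variance and choosing n_components.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data

# Fit PCA with all components to see how variance splits across them;
# a steep drop-off suggests low intrinsic dimensionality
pca = PCA()
pca.fit(X)
print(pca.explained_variance_ratio_)

# Passing a float asks scikit-learn to keep just enough components
# to retain that fraction of the variance (here, 95%)
pca95 = PCA(n_components=0.95).fit(X)
X_reduced = pca95.transform(X)
print(X_reduced.shape)
```

For sparse inputs such as tf-idf word frequencies of Wikipedia articles, ordinary `PCA` does not apply directly; scikit-learn's `TruncatedSVD` is the variant that handles sparse matrices.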
4
Discovering Interpretable Features
In this chapter, you'll learn about a dimension reduction technique called "Non-negative matrix factorization" ("NMF") that expresses samples as combinations of interpretable parts. For example, it expresses documents as combinations of topics, and images in terms of commonly occurring visual patterns. You'll also learn to use NMF to build recommender systems that can find you similar articles to read, or musical artists that match your listening history!
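A sketch of the NMF-plus-cosine-similarity recommender idea on a tiny hypothetical corpus (the documents below are made up for illustration; the course works with real articles):

```python
# Sketch: NMF topic features + cosine similarity for recommendations.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF
from sklearn.preprocessing import normalize

# Hypothetical tiny corpus: two finance docs, two pet docs
docs = [
    "stocks market shares trading",
    "market trading prices stocks",
    "cats dogs pets animals",
    "dogs pets animals companions",
]

# tf-idf word frequencies are non-negative, as NMF requires
tfidf = TfidfVectorizer().fit_transform(docs)

# Each row of 'features' expresses a document as a combination of topics
nmf = NMF(n_components=2, random_state=0)
features = nmf.fit_transform(tfidf)

# After L2-normalizing rows, dot products equal cosine similarities
norm_features = normalize(features)
similarities = norm_features @ norm_features[0]
print(similarities)
```

The highest entries of `similarities` (other than the document's similarity with itself) are the recommendations: here the second finance document should score close to the first, and the pet documents far from it.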