After establishing the foundations of recommendation systems, we will now construct our first type of recommender: a content-based filter. The logic of this approach is straightforward. It recommends items based on their intrinsic properties. If a user likes a movie with certain actors and genres, a content-based system suggests other movies with those same attributes. It operates on a simple principle: if you liked that item, you might also like this other item that is similar to it.
This chapter covers the complete workflow for building such a system. We will start by representing items as feature vectors, a process known as creating item profiles. You will learn to process item metadata, including how to convert unstructured text into a numerical format using the Term Frequency-Inverse Document Frequency (TF-IDF) technique.
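To make the TF-IDF idea concrete before the full treatment later in the chapter, here is a minimal sketch in plain Python. It assumes the textbook weighting, term frequency times `log(N / df)`. Libraries such as scikit-learn use smoothed variants, so their numbers will differ. The function name `tfidf_vectors` and the toy movie descriptions are illustrative, not part of the chapter's code.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) for a list of tokenized documents.

    Uses the unsmoothed textbook formula: tf * log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(df)

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vec = []
        for term in vocab:
            # Term frequency: share of this document's tokens that are `term`.
            tf = counts[term] / total if total else 0.0
            # Inverse document frequency: rarer terms get larger weights.
            idf = math.log(n_docs / df[term])
            vec.append(tf * idf)
        vectors.append(vec)
    return vocab, vectors


# Hypothetical toy item descriptions, already lowercased and tokenized.
docs = [
    "space adventure with robots".split(),
    "romantic comedy in paris".split(),
    "space battle with aliens".split(),
]
vocab, vectors = tfidf_vectors(docs)
```

In this sketch, a term that appears in only one description ("robots") receives a higher weight than one shared by several descriptions ("space"), and a term that appears in every document would receive a weight of zero.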
With our items represented as vectors, the next step is to measure their similarity. We will implement cosine similarity, a common metric for this task, which calculates the cosine of the angle between two vectors $A$ and $B$:
$$\text{similarity}(A, B) = \cos(\theta) = \frac{A \cdot B}{\|A\| \, \|B\|}$$
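Cosine similarity is the dot product of two vectors divided by the product of their lengths. A minimal sketch in plain Python follows. The function name `cosine_similarity` and the choice to return 0.0 for a zero-length vector are assumptions made for illustration, since the ratio is undefined in that case.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumed convention: a vector with no features matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1, 2, 3], [2, 4, 6])
orthogonal = cosine_similarity([1, 0], [0, 1])
```

Because the measure depends only on direction, two vectors that point the same way score close to 1 regardless of their magnitudes, while vectors with no features in common score 0.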
Finally, we will combine these components to build a user profile that summarizes each user's preferences, then use it to produce a ranked list of recommendations. The chapter concludes with a hands-on practical where you will apply these techniques to build a functional movie recommender from scratch.
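As a preview of how the pieces fit together, the sketch below builds a user profile and ranks unseen items against it. It assumes one simple approach, averaging the vectors of the items a user liked, which is not necessarily the method developed later in the chapter. The item names, the three-feature vectors, and the helpers `build_user_profile` and `recommend` are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_user_profile(item_vectors, liked_ids):
    """Average the vectors of liked items (assumes at least one liked item)."""
    dims = len(next(iter(item_vectors.values())))
    profile = [0.0] * dims
    for item_id in liked_ids:
        for i, value in enumerate(item_vectors[item_id]):
            profile[i] += value
    return [value / len(liked_ids) for value in profile]


def recommend(item_vectors, liked_ids, top_n=2):
    """Rank items the user has not liked yet by similarity to the profile."""
    profile = build_user_profile(item_vectors, liked_ids)
    scored = [
        (item_id, cosine(profile, vector))
        for item_id, vector in item_vectors.items()
        if item_id not in liked_ids
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical item profiles with features [sci-fi, romance, action].
item_vectors = {
    "Star Quest": [1.0, 0.0, 1.0],
    "Paris Nights": [0.0, 1.0, 0.0],
    "Galaxy Wars": [1.0, 0.0, 0.8],
    "Love and Lasers": [0.5, 1.0, 0.0],
}
recommendations = recommend(item_vectors, liked_ids={"Star Quest"})
```

With these toy vectors, a user who liked only "Star Quest" sees "Galaxy Wars" ranked first, since it shares the same dominant features, followed by "Love and Lasers".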
2.1 The Mechanics of Content-Based Recommenders
2.2 Creating Item Profiles from Metadata
2.3 Vectorizing Text Data with TF-IDF
2.4 Computing Similarity with Cosine Distance
2.5 Generating User Profiles
2.6 Producing Content-Based Recommendations
2.7 Hands-on Practical: Building a Movie Recommender