An Overview of Recommender Systems

Recently, because of the quarantine, my younger sister asked me to recommend a video game for her to play so she would spend less time on social media. My plan was to ask her what genres of games she likes and dislikes, and then give a recommendation from there. However, when I asked about her likes and dislikes, she told me that she had no clue since she hasn’t really played any video games before.

Well, so much for my initial plan. I thought about what else I could do, and my mind went to what I knew about her preferences in other areas, such as the kinds of movies and TV shows she watches. What information could I glean from there, and how could I use it to recommend a video game? Maybe she likes shows with a good, heartfelt story, so she might also like an RPG with a good story. Maybe she likes mysteries, so I could recommend a mystery/puzzle game.

This also made me think about the recommender systems we come across everywhere, and I wondered how they work and what process they use to give recommendations. What I found was that there are three main categories of recommender systems, and they work very similarly to how I was thinking about giving a recommendation to my sister.

Before we get into those, let’s quickly talk about what a recommender system is. Recommender systems are software tools or techniques that attempt to suggest, or recommend, items that may be of interest to a user based on certain criteria. These ‘items’ can be anything from music to books to video games. The criteria and process are what differ between the various types of recommender systems. The three main types are content-based filtering, collaborative filtering, and hybrid systems. Generally speaking, all of these methods need some basic user profile to work from, meaning some kind of history of what the user has purchased, viewed, or rated in the past, in order to make suggestions. (Ricci)

Content-based Filtering process

The basis of content-based filtering is to recommend items based on their similarity to items already in the user profile. An example would be looking for movies that share the genre(s) of movies you have liked or shown interest in before. Of course, these systems do not use just one category or keyword to make recommendations. Continuing the movie example, a system could look at genres, actors, directors, etc., along with your ratings, and cross-reference them with other movies to come up with recommendations. This approach also works well with text-based items like user reviews and web search results, since it can match the keywords you’re looking for. Each word is assigned a weight based on factors such as how common it is (less weight goes to more common words like ‘the’ and ‘and’), and then each review or website is scored by the weighted keywords it contains. (Adomavicius, Tuzhilin)
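To make the keyword weighting a bit more concrete, here is a minimal sketch in Python. It uses TF-IDF, one common way of giving less weight to words that appear everywhere, and cosine similarity to compare items. The item names, descriptions, and function names are all made up for illustration, and a real system would likely use many more features than a one-line description.

```python
import math
from collections import Counter

# Hypothetical item descriptions, made up for this example.
ITEMS = {
    "Space Saga": "epic space adventure with a heartfelt story",
    "Galaxy Quest": "funny space adventure with a crew of actors",
    "Baking Duel": "competitive baking show with a panel of judges",
}


def tokenize(text):
    """Split a description into lowercase words."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return {name: {word: weight}}; words found in many documents weigh less."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(docs)
    # Number of documents each word appears in.
    doc_freq = Counter(word for words in tokens.values() for word in set(words))
    vectors = {}
    for name, words in tokens.items():
        counts = Counter(words)
        vectors[name] = {
            word: (count / len(words)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {word: weight} vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_similar(liked, docs):
    """Rank every other item by how similar its description is to the liked one."""
    vectors = tfidf_vectors(docs)
    scores = {
        name: cosine(vectors[liked], vec)
        for name, vec in vectors.items()
        if name != liked
    }
    return sorted(scores, key=scores.get, reverse=True)


print(recommend_similar("Space Saga", ITEMS))
```

In this toy data, words like ‘with’ and ‘a’ appear in every description, so their weight drops to zero and the ranking is driven by the more distinctive words the items share.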

Let’s talk about the pros of content-based filtering systems. First, it bases its recommendations solely on you: it takes only your preferences into account and not anyone else’s, which makes for a very personalized experience. It can also recommend new products right away, because it just matches keywords/features and does not need a product to be rated before suggesting it.

There are a few issues with content-based filtering systems as well. If you’re a new user, it can’t give you good suggestions, if any at all, because it has little to no user profile to work with. This can be partly worked around with a questionnaire or survey to get an idea of your likes and dislikes. Another downside is that it is very specific, so it does not easily transfer to other kinds of content. Information about your movie preferences does not make for good suggestions on music.

Collaborative filtering process (Wikipedia)

Next up we have the collaborative filtering system. This system uses information from other users whose preferences/ratings are similar to yours to give you suggestions. The gif above shows it really well: if you and other users have similar likes and dislikes in movies and music, the system looks at what else those users have liked and suggests it to you. Amazon is a really good example of this because it has such a diversity of products and gives recommendations on what else you should buy if you’re thinking of buying a certain product. It does this by seeing what other users have bought after buying that item and showing you the top items. (Adomavicius, Tuzhilin)
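The ‘people who bought this also bought’ idea can be sketched in a few lines of Python. This is only a simple co-purchase count under my own assumptions, not Amazon’s actual algorithm, and the shoppers, baskets, and function name are hypothetical.

```python
from collections import Counter

# Hypothetical purchase histories, made up for this example.
PURCHASES = {
    "ana": {"tent", "sleeping bag", "lantern"},
    "ben": {"tent", "sleeping bag", "camp stove"},
    "cho": {"tent", "lantern", "sleeping bag"},
    "dev": {"novel", "lantern"},
}


def also_bought(item, purchases, top_n=2):
    """Return the items that most often share a basket with the given item."""
    counts = Counter()
    for basket in purchases.values():
        if item in basket:
            # Count everything else this shopper bought alongside the item.
            counts.update(basket - {item})
    return [other for other, _ in counts.most_common(top_n)]


print(also_bought("tent", PURCHASES))
```

Note that nothing here looks at what the items actually are. The suggestions come purely from what other shoppers did, which is the key difference from content-based filtering.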

One upside to a collaborative filtering system is that it doesn’t need much information to start giving suggestions. As soon as you start viewing items, it can cross-reference that with other user data and begin giving recommendations. Another upside is that the recommendations are not limited to a certain category. For example, it can look at the user profiles of people with similar taste in music and movies and recommend a book you might enjoy.
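Here is a rough sketch of how that cross-category suggestion could work. It finds users whose ratings agree with yours and predicts your rating for things you haven’t seen as a similarity-weighted average of theirs. The users, ratings, the midpoint-centering trick, and the `min_score` cutoff are all assumptions I picked for illustration; real systems use more careful similarity measures.

```python
import math

# Hypothetical 1-5 ratings across movies, albums, and books, made up for this example.
RATINGS = {
    "you": {"movie:Alien": 5, "album:Kid A": 4},
    "sam": {"movie:Alien": 5, "album:Kid A": 5, "book:Dune": 5},
    "ria": {"movie:Alien": 1, "album:Kid A": 2, "book:Twilight": 5},
}
MIDPOINT = 3  # Centre of the 1-5 scale, so likes are positive and dislikes negative.


def similarity(a, b):
    """Cosine similarity over items both users rated, centred on the scale midpoint."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    va = [a[item] - MIDPOINT for item in shared]
    vb = [b[item] - MIDPOINT for item in shared]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return dot / norm if norm else 0.0


def recommend(user, ratings, min_score=4.0):
    """Suggest unseen items that like-minded users rated highly."""
    totals, weights = {}, {}
    for other, their_ratings in ratings.items():
        if other == user:
            continue
        sim = similarity(ratings[user], their_ratings)
        if sim <= 0:
            continue  # Ignore users whose taste does not match.
        for item, rating in their_ratings.items():
            if item in ratings[user]:
                continue  # Already rated by the user.
            totals[item] = totals.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: totals[item] / weights[item] for item in totals}
    keep = [item for item, score in predicted.items() if score >= min_score]
    return sorted(keep, key=predicted.get, reverse=True)


print(recommend("you", RATINGS))
```

In this toy data, ‘sam’ agrees with ‘you’ on a movie and an album, so sam’s favorite book gets suggested, while the book liked by ‘ria’ (whose taste is the opposite) is left out.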

One issue with this system is that it cannot quickly suggest new items: until an item has ratings from other users, the system won’t know whether you would be interested in it. Another issue is that as the database grows and more users are added, the computation required to sort through the data and determine recommendations also increases, so a lot of computational power is needed.

Hybrid systems are exactly what they sound like: recommender systems that combine multiple recommender processes. As we saw above, both content-based and collaborative filtering systems have issues, and some of the issues of one can be offset by the other. There are also other systems aside from the main ones mentioned that can be used in hybrid systems, but the idea is always for one method to help with or overcome an issue the other has.

Content-based filtering systems have trouble giving suggestions outside their own type of content, but couple them with collaborative filtering methods and you can cross-reference data from other types of content and other user profiles to give good suggestions. You can also get suggestions for items that do not match the features of items you previously viewed or purchased, because other users with similar preferences liked them.

A good example of this type of hybrid system is Netflix. It can give suggestions based on what users with similar profiles have thought of a movie or TV show (collaborative filtering) and also give you an ‘X% match’ based on ratings of what you’ve watched before (content-based filtering). You can read more about Netflix’s system here.

On the other hand, collaborative filtering systems struggle to suggest new items because those items have few or no ratings, but with content-based filtering a new item can still be recommended if it shares features with items you have liked in the past.
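One simple way to picture a hybrid is as a blend of two scores, where the system leans on content features while an item is new and shifts toward other users’ opinions as ratings come in. The sketch below is just that idea under my own assumptions: the function name, the linear blend, and the `full_trust_at` threshold are hypothetical, and this is not how any particular company does it.

```python
def hybrid_score(content_score, collab_score, n_ratings, full_trust_at=20):
    """Blend two scores in [0, 1], trusting the collaborative one more as ratings accumulate.

    content_score: how well the item's features match the user's past likes.
    collab_score: how much similar users liked the item, or None if nobody has rated it.
    n_ratings: number of ratings the item has received so far.
    full_trust_at: assumed rating count at which only the collaborative score is used.
    """
    if collab_score is None or n_ratings <= 0:
        # Brand-new item: fall back entirely on content features.
        return content_score
    trust = min(n_ratings / full_trust_at, 1.0)
    return trust * collab_score + (1.0 - trust) * content_score


# A new release with no ratings still gets a score from its features.
print(hybrid_score(0.8, None, 0))
# A well-rated item is scored by what similar users thought of it.
print(hybrid_score(0.8, 0.2, 20))
```

There are other ways to combine the two, such as switching between them or feeding one method’s output into the other, so treat the weighted blend here as just one option.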

Overall, recommender systems are varied, and the three that I looked at above are the most common types. One thing to note is that the type of recommender system you use does not necessarily depend on the type of item you are recommending. You can be recommending movies and use either content-based or collaborative filtering; one is not inherently better than the other. The choice depends on your goals and resources. Companies are also always trying to find ways to improve their recommender systems to make them more accurate and efficient.

Another thing is that I only gave a brief overview of recommender systems; there is much more math, machine learning, etc., that goes on throughout the process that I did not go into. I have not yet learned about machine learning, so that may be something to come back to at a later time.

Lastly, these systems can be very helpful, but in some cases they can also be very annoying. Some sites/apps like YouTube and Facebook use the data they collect to show you ads that you may not want to see at the time, and more often than not you can’t opt out of seeing them. However, that is a discussion for a different day.

Data Science student at Flatiron School