EDA on Popular Non-English Movies

Prompt: A company wants to break into the movie market with original content. Come up with recommendations on how to do so and what the best approach would be, supported by research and EDA (Exploratory Data Analysis).

Our Approach:

Deciding what foreign content to make is not straightforward. In this project we explore how non-English movies have performed over time and whether language and genre trends can help narrow down the choices.

Hypotheses:

  • China’s economic boom should be reflected in Chinese movie profits.
  • Growth of the US Spanish-speaking population should be noticeable in Spanish movie popularity.
  • With the world experiencing a tech boom, and with so much interest in (and so many conspiracy theories about) tech and AI, there may be increased interest in the Sci-Fi genre.

We’ll see whether these hold up later in our EDA.

Data Collection:

Our dataset lacked some information that we thought would be useful in our analysis, namely gross worldwide revenue, so we collected that data from IMDb (Internet Movie Database) through web scraping and merged it into our dataset.
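The merge step described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code we actually ran: the field names (`imdb_id`, `worldwide_gross`) and the sample records are assumptions made up for the example.

```python
# Hypothetical records; field names and values are assumptions,
# not the project's actual schema or data.
movies = [
    {"imdb_id": "tt0000001", "title": "Movie A", "budget": 10_000_000},
    {"imdb_id": "tt0000002", "title": "Movie B", "budget": 5_000_000},
    {"imdb_id": "tt0000003", "title": "Movie C", "budget": 8_000_000},
]

# Scraped worldwide gross keyed by IMDb ID; Movie C had no revenue listed.
scraped_gross = {"tt0000001": 45_000_000, "tt0000002": 3_000_000}


def merge_gross(movies, gross_by_id):
    """Left-join scraped revenue onto the movie records (None when missing)."""
    merged = []
    for movie in movies:
        row = dict(movie)  # copy so the original record is left untouched
        row["worldwide_gross"] = gross_by_id.get(movie["imdb_id"])
        merged.append(row)
    return merged


merged = merge_gross(movies, scraped_gross)
```

Keeping every movie and marking missing revenue as `None` makes it easy to count how many titles the scrape could not cover before deciding whether to drop them.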

Data Cleaning:

Before scraping the data from IMDb, we removed all movies whose original language was English. We also removed all movies released before 1990 so we could focus on the last three decades and see how things changed as non-English movies became more widespread and available.

After cleaning the data and dropping the groups mentioned above, we had 1010 movies left in our dataset.
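A filter along these lines would implement the two cleaning rules above. It is only a sketch: the field names (`original_language`, `release_year`), the `"en"` language code, and the sample rows are assumptions for illustration.

```python
# Hypothetical rows; the field names and the "en" language code are assumptions.
raw_movies = [
    {"title": "Movie A", "original_language": "en", "release_year": 2005},
    {"title": "Movie B", "original_language": "zh", "release_year": 1987},
    {"title": "Movie C", "original_language": "ja", "release_year": 1990},
    {"title": "Movie D", "original_language": "es", "release_year": 2006},
]


def keep_movie(movie, min_year=1990):
    """Keep non-English movies released in min_year or later."""
    return (
        movie["original_language"] != "en"
        and movie["release_year"] >= min_year
    )


cleaned = [m for m in raw_movies if keep_movie(m)]
kept_titles = [m["title"] for m in cleaned]
```

In this sample, Movie A is dropped for being English-language and Movie B for predating 1990, while a 1990 release such as Movie C is kept.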

Data visualization and EDA:

Language trends/visualizations

With regard to language, we found that Chinese and Japanese movies were the top earners over the past 10 years. We also found that Chinese movies seem to be on an upward trend, with their profits rising each decade.

Two other findings stood out. Spanish films saw a spike in success during the 2000s. French movies also appear to have lost money on average during the 1990s, but this needs further investigation to determine the cause and whether it is an accurate representation.
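The per-decade language comparison can be sketched as a simple group-and-average. This example assumes profit is worldwide gross minus budget, which is our illustrative definition here; the field names and numbers (in millions) are made up and are not results from the dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample in millions of dollars; not real figures.
sample = [
    {"original_language": "zh", "release_year": 1995, "budget": 10, "worldwide_gross": 20},
    {"original_language": "zh", "release_year": 2012, "budget": 20, "worldwide_gross": 60},
    {"original_language": "zh", "release_year": 2015, "budget": 30, "worldwide_gross": 100},
    {"original_language": "fr", "release_year": 1993, "budget": 8, "worldwide_gross": 5},
]


def decade(year):
    """Map a release year to the start of its decade, e.g. 1995 -> 1990."""
    return year // 10 * 10


def mean_profit_by_language_decade(movies):
    """Average profit (assumed: gross minus budget) per (language, decade)."""
    groups = defaultdict(list)
    for m in movies:
        key = (m["original_language"], decade(m["release_year"]))
        groups[key].append(m["worldwide_gross"] - m["budget"])
    return {key: mean(profits) for key, profits in groups.items()}


profit_table = mean_profit_by_language_decade(sample)
```

A negative average for a group, like the single French entry in this sample, is the kind of value that would show up as "lost money on average" and deserves a closer look at how many movies sit behind it.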

Genre trends/visualizations

There were many genres, so the ones we’re going to focus on have been circled in either red or green. The one genre circled in red, War, has high ratings but also high budgets with low revenue and profits. We would not recommend investing in this genre: over the past two decades War movies cost more to make on average than most other genres, yet their revenue and profits dropped between the 2000s and 2010s. In the last decade, they had the lowest average revenue and the second-lowest average profits of all genres.

The top two profit-earning genres are Sci-Fi and Action, and both are on an upward trend in profit and revenue over the last 30 years. Sci-Fi and Action movies have low ratings relative to other genres, but since there is no significant difference in average ratings across genres, this is a minor concern.

Animated movies, meanwhile, have higher-than-average revenue, ratings, and profits, so they seem to be doing well across all areas.
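A genre comparison like the one above could be computed roughly as follows. The sketch assumes each movie carries a list of genres and counts toward every genre it lists, and that profit is gross minus budget; both are assumptions for illustration, and the sample values (in millions) are invented.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample; genres, years, ratings, and figures are made up.
sample = [
    {"genres": ["Sci-Fi", "Action"], "release_year": 2014, "budget": 30, "worldwide_gross": 120, "rating": 6.1},
    {"genres": ["War"], "release_year": 2011, "budget": 40, "worldwide_gross": 35, "rating": 7.4},
    {"genres": ["Animated"], "release_year": 2016, "budget": 20, "worldwide_gross": 80, "rating": 7.0},
    {"genres": ["Action"], "release_year": 2018, "budget": 50, "worldwide_gross": 90, "rating": 6.3},
]


def genre_summary(movies, start_year, end_year):
    """Mean profit and rating per genre for releases in [start_year, end_year]."""
    profits, ratings = defaultdict(list), defaultdict(list)
    for m in movies:
        if start_year <= m["release_year"] <= end_year:
            for genre in m["genres"]:  # a movie counts once per listed genre
                profits[genre].append(m["worldwide_gross"] - m["budget"])
                ratings[genre].append(m["rating"])
    return {
        genre: {"mean_profit": mean(profits[genre]), "mean_rating": mean(ratings[genre])}
        for genre in profits
    }


summary = genre_summary(sample, 2010, 2019)
# Genres ordered from highest to lowest mean profit for the chosen decade.
ranked = sorted(summary, key=lambda g: summary[g]["mean_profit"], reverse=True)
```

Looking at mean rating next to mean profit in the same table makes the trade-off visible: a genre can rate well, as War does in this toy sample, and still rank last on profit.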

Conclusion/Final Recommendations:

Based on these trends, we recommend investing in or making Chinese or Japanese movies in the Sci-Fi, Action, and Animated genres.

Data Science student at Flatiron School