These are unprecedented times we’re living in. The global pandemic rages on, hundreds of cities are demonstrating against police brutality and racial injustice, and an election is just over a month away. Today I seek to find out how Spotify streaming has changed in this time.
As the pandemic has spread across the globe, it has created uncertainty and fear, caused nationwide lockdowns, disrupted or halted work and commutes, and, at the time of writing, infected 31.7 million people and killed 974,000. The United States’ death toll has now surpassed 200,000.
To recap, I will be working with the same dataset that I’ve discussed in the previous two posts. For a quick refresher on this dataset:
- Each day, Spotify shares the 200 most-streamed songs of the day globally and in 60 countries.
- These top 200 lists are archived back to 2017. In this writing I have pulled the charts through July 11th, 2020.
- This list provides streaming count numbers for each song that gets 1000 or more streams in a day.
- Stream counts are made available only for the 200 most-streamed songs, so this Spotify Charts data represents just a portion of total Spotify streaming.
What’s normal anyways?
To assess the impact COVID-19 has had on streaming, we must first understand what streaming numbers looked like when we weren’t living in a global pandemic. Basically, in order to determine that something is abnormal, we have to first see what normal looks like.
First, I took the dataset and added up all the streams for each country’s chart on a given day. This gave me the total number of Charts streams per day for the 60 countries in our dataset. Because the global chart consists of the top 200 most-streamed songs worldwide, I split up the individual country and the global charts data to make sure I was not double-counting streams, since each country’s streams are counted in the global chart.
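As a rough illustration of this step, here is a minimal pandas sketch. The column names (`date`, `region`, `streams`) and the toy values are assumptions made for the example, not the actual layout of the Spotify Charts export.

```python
import pandas as pd

# Hypothetical column names and toy values; the real charts data may differ.
charts = pd.DataFrame({
    "date":    ["2020-03-01"] * 5,
    "region":  ["global", "global", "us", "us", "it"],
    "streams": [500, 400, 300, 200, 100],
})

# Keep the global chart apart so its streams are not counted twice.
is_global = charts["region"] == "global"
global_daily = charts[is_global].groupby("date")["streams"].sum()
country_daily = charts[~is_global].groupby(["date", "region"])["streams"].sum()

# Cumulative "all countries" total per day.
all_countries_daily = country_daily.groupby(level="date").sum()
```

Splitting on the `region` value before summing is what prevents the double-counting described above.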
To get a sense of the magnitude of the total number of streams that Spotify Charts makes up, I wanted to see what a typical number of streams per day would be. The below plot shows the average daily number of total streams on the global chart and the cumulative number of streams in all countries for each year in the dataset.
The first thing that we can see in the figure above is that there has been a steady and consistent increase in streaming numbers. Average streams have increased year-over-year, and so far 2020 numbers are up more than 20% compared to 2017. This seems to indicate that Spotify is growing in popularity. We also can’t help but notice that streaming growth has slowed each year, most notably in 2020. This could be our first sign of the effects of the pandemic on streaming numbers.
Year | Global charts average daily streams (millions) [Year-over-year change] | All countries average daily streams (millions) [Year-over-year change] |
2017 | 206 | 272 |
2018 | 229 [+11%] | 332 [+22%] |
2019 | 244 [+7%] | 395 [+19%] |
2020 | 254 [+4%] | 413 [+5%] |
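The year-over-year changes in the table can be reproduced from the averages themselves. The short sketch below does that; `yoy_percent` is just a helper name chosen for this example.

```python
# Average daily streams (millions), copied from the table above.
years = [2017, 2018, 2019, 2020]
global_avg = [206, 229, 244, 254]
all_countries_avg = [272, 332, 395, 413]

def yoy_percent(values):
    """Percent change of each value relative to the previous year's value."""
    return [round(100 * (cur - prev) / prev) for prev, cur in zip(values, values[1:])]

global_yoy = yoy_percent(global_avg)          # changes for 2018, 2019, 2020
all_yoy = yoy_percent(all_countries_avg)

# Overall growth from 2017 to 2020 so far, in percent.
global_total = round(100 * (global_avg[-1] - global_avg[0]) / global_avg[0])
```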
Before I continue, I want to highlight that each year the cumulative sum of all countries’ charts increased more rapidly than the global charts total. This isn’t surprising, since the all countries data consists of many more unique songs. The number of unique songs on the global chart in this dataset is 5,428, while that of the all countries’ charts is 78,321. Since the global chart is capped at the 200 top songs a day, thousands of tracks are not included in this list.
As referenced in my previous post, we can use sklearn to do regressions and generate trendlines. When we sum the total streams in a given month and plot them for each year in our dataset (excluding July since we don’t yet have the full month of data) the growth in streaming can be further illustrated.
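For readers who want to see what a trendline fit looks like in code, here is a minimal sketch. It swaps in `numpy.polyfit` for sklearn, since a degree-1 least-squares fit gives the same straight line as a single-feature linear regression. The monthly totals are made-up numbers, not values from the dataset.

```python
import numpy as np

# Toy monthly totals (millions of streams); made-up numbers lying on an exact line.
month_index = np.arange(12)               # 0 = first month in the series
monthly_streams = 6000 + 50 * month_index

# Degree-1 least-squares fit: polyfit returns the slope first, then the intercept.
slope, intercept = np.polyfit(month_index, monthly_streams, deg=1)
trendline = slope * month_index + intercept
```

The `trendline` array holds the predicted total for each month, which is what the actual totals are compared against.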
Aside from showing the consistent year-over-year rise in streaming numbers, the above figure shows the monthly fluctuations and trends in streaming. Some months are higher than the trendline predicts, others lower. There seems to be a consistent spike in streaming in December; this trend has occurred the last three years. This plot also shows clearly the difference in this year’s streaming numbers, with the downward trend standing in sharp contrast to years past.
When we plot this same data for the entire time range covered by the dataset, we can see the steady rise in streams over time and the fluctuations in streaming each month or week. The weekly totals show these streaming numbers at a higher resolution, and a few things stand out. First, just as December consistently seems to be the highest streaming month (even when correcting for the steady growth in streams over time), the final week or two of each year are consistently among the highest streaming weeks. Second, we can see a couple of weeks that are noticeably lower than the typical fluctuations that we see on the rest of the chart.
Let’s investigate these two noticeably low outlier weeks:
- The first of these big weekly dips occurs in the 22nd week of 2017. Upon closer inspection, it appears that there simply wasn’t any Spotify Charts data for 5/30/17, 5/31/17, and 6/2/17, all of which belong to Week 22. So, this big drop can be explained by missing data, since 3 of the 7 days in the week being totaled are not counted. If we use our handy-dandy trendline, we can see that the predicted number of streams would be about 1,950 million for that particular week. If we scale this number down to account for the missing days, we get 4/7 × 1,950 million ≈ 1,114 million, which is really close to the actual total for that week.
- The second big dip, occurring in the first week of 2020, is due to a similar circumstance. The weekly streaming totals are grouped and summed by week and year. In the Pandas datetime convention, a week starts on Monday and ends on Sunday. This means that if the first days of a new year’s first week fall in the previous year (as happens between 2019 and 2020), the streams for those days will not be included in the first week’s total. This happens in 2020, whose week 1 total only counts the first 5 days of 2020. If we take the predicted streams from the trendline (~2,800 million) and scale them down by a factor of 5/7, we get ~2,000 million streams, which also closely matches that week’s true total.
In a nutshell, the general trend of Spotify charts in normal times (from the start of 2017 to the end of 2019) can be defined by one word: growth. While there is certainly some fluctuation in streaming totals week-to-week, with some weeks higher and some weeks lower, the key takeaway here is that outside of this “noise”, the streaming totals are steadily increasing.
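Both partial-week explanations can be checked with a few lines of standard-library Python. This sketch assumes the weekly grouping follows ISO week numbering (Monday to Sunday), and `scale_for_partial_week` is a helper name invented for the example.

```python
from datetime import date, timedelta

# ISO weeks run Monday to Sunday; 30 Dec 2019 already belongs to ISO week 1 of 2020.
iso_year, iso_week, _ = date(2019, 12, 30).isocalendar()

# Days of that ISO week that fall in calendar year 2020.
week_days = [date(2019, 12, 30) + timedelta(days=i) for i in range(7)]
days_in_2020 = sum(d.year == 2020 for d in week_days)

def scale_for_partial_week(predicted_millions, days_counted, days_in_week=7):
    """Scale a full-week trendline prediction down to the days actually counted."""
    return predicted_millions * days_counted / days_in_week

week22_2017 = scale_for_partial_week(1950, 4)             # 3 of 7 days missing
week1_2020 = scale_for_partial_week(2800, days_in_2020)   # only 1-5 Jan counted
```

The scaled predictions land at roughly 1,114 million and 2,000 million, in line with the totals quoted in the two bullets.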
How does 2020 compare?
In order to more clearly see the differences between 2020 and the years prior, we can take two corrective measures on our dataset:
- First, we can perform a “baseline correction” to help flatten out the consistent growth that we’ve already established is occurring over time. This makes it easier to recognize differences from the typical behavior. This is performed by taking the actual total weekly streams value and subtracting out the predicted total streams value (the trendline shown in the previous figures). Now our chart’s y-axis represents the difference between the actual and predicted total streams in a given week. This is also commonly called the “residual.” For example, in the final week of 2019 there was a total of ~3,400 million streams. The trendline predicted ~2,900 million streams for this week. If we subtract these two values, we get a residual of +500 million streams. In other words, the final week of 2019 overperformed the predicted weekly streaming total by half a billion streams.
- Second, we can correct out the outlier weeks discussed in the previous section. In the absence of Spotify providing the missing data or the Gregorian calendar being updated such that every year has 364 days (or some multiple of 7), we’re stuck making an educated guess about what this week would have looked like if it were complete. That is essentially what the trendline is doing for us. Therefore, if we assume that the actual streaming totals for these outlier weeks match the value predicted by the trendline, we are effectively setting the residual for these weeks (22nd of 2017, 1st of 2020) to zero.
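The two corrections above can be sketched in pandas as follows. The column names are assumptions for the example, the figures are the rounded values quoted in this post, and labeling the final week of 2019 as week 52 is also an assumption.

```python
import pandas as pd

# Rounded weekly totals and trendline predictions (millions) quoted in the post.
# Column names are hypothetical; week 52 for "final week of 2019" is assumed.
weekly = pd.DataFrame({
    "year":      [2017, 2019, 2020],
    "week":      [22, 52, 1],
    "actual":    [1114.0, 3400.0, 2000.0],
    "predicted": [1950.0, 2900.0, 2800.0],
})

# Baseline correction: residual = actual - predicted.
weekly["residual"] = weekly["actual"] - weekly["predicted"]

# Incomplete weeks are assumed to match the trendline, i.e. residual = 0.
incomplete = [(2017, 22), (2020, 1)]
mask = [(y, w) in incomplete for y, w in zip(weekly["year"], weekly["week"])]
weekly.loc[mask, "residual"] = 0.0
```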
So what does this residual plot look like then?
In this corrected plot, the fluctuations week-to-week are more clearly visible. As discussed before, the final week or two of the year are usually the highest streaming weeks of the year. Other weeks seem to surge here and there, but there is once again a sharp drop-off in streams once we get a few months into 2020.
To see just how abnormal a drop this is, the residual for these weeks in 2020 can be compared with that of the previous years. We can “smooth” the data from 2017 to 2019 by averaging each week’s residual across these three years, effectively creating a historical average behavior that is corrected for the consistent increase in streaming year-over-year; we’re finally comparing apples to apples!
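A minimal sketch of that averaging step, assuming a table with one residual per year and week number. The column names and values are made up for illustration.

```python
import pandas as pd

# Made-up residuals (millions) for two week numbers across four years.
residuals = pd.DataFrame({
    "year":     [2017, 2018, 2019, 2020, 2017, 2018, 2019, 2020],
    "week":     [1, 1, 1, 1, 12, 12, 12, 12],
    "residual": [30.0, 60.0, 90.0, 120.0, -10.0, 0.0, 10.0, -300.0],
})

# Historical baseline: mean residual per week number over 2017-2019 only.
history = residuals[residuals["year"].between(2017, 2019)]
historical_avg = history.groupby("week")["residual"].mean()

# 2020 residuals, indexed the same way, for a like-for-like comparison.
residual_2020 = residuals[residuals["year"] == 2020].set_index("week")["residual"]
gap = residual_2020 - historical_avg
```

Because both series are indexed by week number, the subtraction lines up week 12 of 2020 with the average week 12 of earlier years.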
When examining this chart we can see that 2020 was off to a hot start, exceeding the trendline and historical average for the first 11 weeks. A sharp dip in week 12 brought streams back down to predicted levels. Streaming residuals continued to slide in the following weeks before stabilizing at about -300 million. The 19th week saw another sharp drop to more than -600 million, but streams recovered to their previous low at about -300 million in the following weeks.
I think that this suggests that worldwide, Spotify Charts streaming has decreased sharply since the 11th or 12th week of the year (March 9th or 16th). Streaming totals have dipped to levels below what historical trends would predict.
What about for specific countries?
As the news cycle has churned during this pandemic, a recurring topic of reporting has been COVID-19 numbers surging in different places and countries imposing government-mandated lockdowns in response. I wanted to see how these rising caseloads and nationwide lockdowns affected Spotify Charts streaming totals in specific countries.
To see which countries had the most severe dips in streaming totals in consecutive weeks, I computed the percentage change in week-to-week streaming totals for each country and week in our dataset. For example, Hong Kong had ~10 million total streams in the 21st week of 2020 and ~6.4 million in the 22nd week. Thus, the week-to-week decrease, expressed here relative to the later week’s total, is (6.4 − 10)/6.4 = −56%. The first thing I noticed when I started looking through this list was that 12 of the 20 largest decreases occurred in 2020. Additionally, 7 of these 12 took place between the 10th and 12th week of the year. This three-week period spanned from March 2nd to March 23rd, which is notable since it lines up closely with the timeframe in which many nationwide lockdowns went into effect and interest in the virus peaked in different countries.
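The size of a percentage change depends on which week is used as the denominator. The sketch below works the Hong Kong example both ways, using the rounded totals quoted above. The second form is what `pct_change` in pandas computes.

```python
# Rounded Hong Kong weekly totals quoted above (millions of streams).
week_21 = 10.0
week_22 = 6.4

# Change expressed relative to the later week, as in the calculation above.
change_vs_later = (week_22 - week_21) / week_22

# Change expressed relative to the earlier week (what pandas' pct_change computes).
change_vs_earlier = (week_22 - week_21) / week_21
```

With these rounded inputs the first form gives about -56% and the second -36%, so it is worth stating the convention whenever the figures are compared with other sources.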
The top 20 most severe dips were all changes of 20% or more. To help illustrate just how severe and abnormal a week-to-week streaming drop of 20% is, see the below histogram. The vertical axis shows the number of weeks falling in a “bin”. The horizontal axis is divided into these bins, each of which covers a 2.5% range of week-to-week percentage change in streams. This histogram shows that about 2,100 weeks had between 0 and 2.5% increase in streaming numbers. For the entire dataset, the average and median percentage change in streaming numbers is 0.5%. The standard deviation is ~5%. This means that a 20% decrease in streaming totals in a week is about 4 standard deviations away from the mean behavior; in other words, it is extremely rare. To put it even more simply, of the 7,517 week-to-week drops recorded in our dataset, only 20 are 20% or greater.
Calculating the largest drops in consecutive weeks was helpful in establishing what constitutes extreme (or outlier) behavior and calling attention to specific timeframes where streaming has plummeted in different countries. However, it fails to tell the full story. With this in mind, I decided to plot a snapshot of the daily streaming totals to get a general sense of what happened in the weeks leading up to and following the “big drop” week.
In these check charts, I don’t care so much about the actual magnitudes of the streaming totals as the broader behavior of daily streaming numbers and how they fluctuated over the timeframe in question. The blue line represents the daily streaming totals. There can be significant peaks and valleys in streaming tendencies over the course of the week. Streaming tends to be middling on weekdays, spikes on Fridays and Saturdays, and drops sharply on Sundays. To help show a clearer perspective of the general trend without the noise of the daily fluctuations, I also plotted a seven-day moving average of the daily streams, shown as the dashed orange line. Finally, the two-week period in which the major drop took place was highlighted in red, and three weeks before and after this period were shown. To help call attention to specific weeks, all entries in the table occurring in 2020 were italicized and all entries that took place between the 10th and 12th week of 2020 were colored blue.
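The “about 4 standard deviations” figure follows directly from the summary statistics quoted above. A short worked check, treating the rounded mean and standard deviation as given:

```python
# Summary statistics quoted above (percent change in weekly streams).
mean_change = 0.5
std_change = 5.0   # "~5%" in the text, so the result is approximate

def z_score(value, mean, std):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std

z = z_score(-20.0, mean_change, std_change)

# Fraction of recorded week-to-week drops that are 20% or more.
share_of_drops = 20 / 7517
```

A 20% drop sits a little over 4 standard deviations below the mean, and such drops make up well under 1% of the recorded decreases.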
Week Span | Year | Country | Decrease In Weekly Streams | Daily Streaming Totals (blue), Seven-day moving average (orange) |
21st – 22nd | 2020 | Hong Kong | 56% | |
11th – 12th | 2020 | Peru | 38% | |
11th – 12th | 2020 | Chile | 31% | |
51st – 52nd | 2019 | Slovakia | 31% | |
25th – 26th | 2019 | Greece | 28% | |
45th – 46th | 2019 | Hong Kong | 26% | |
5th – 6th | 2019 | Taiwan | 25% | |
11th – 12th | 2020 | France | 25% | |
21st – 22nd | 2020 | Malaysia | 24% | |
10th – 11th | 2020 | Italy | 24% | |
42nd – 43rd | 2019 | Chile | 24% | |
22nd – 23rd | 2020 | Greece | 23% | |
11th – 12th | 2020 | Dominican Republic | 23% | |
4th – 5th | 2020 | Hong Kong | 22% | |
42nd – 43rd | 2019 | Bolivia | 22% | |
22nd – 23rd | 2020 | Ireland | 21% | |
25th – 26th | 2017 | Malaysia | 21% | |
11th – 12th | 2020 | Ecuador | 20% | |
10th – 11th | 2020 | Panama | 20% | |
17th – 18th | 2018 | Sweden | 20% | |
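The seven-day moving average used in these snapshot charts can be sketched with a pandas rolling window. The daily values are made up to mimic the weekly pattern described earlier, and the trailing (rather than centered) window is an assumption about how the average was computed.

```python
import pandas as pd

# Two weeks of made-up daily totals (Monday start) with a repeating weekly pattern.
dates = pd.date_range("2020-03-02", periods=14, freq="D")
daily = pd.Series([10, 10, 10, 10, 14, 14, 8] * 2, index=dates, dtype=float)

# Trailing seven-day mean; the first six days have no full window and stay NaN.
moving_avg = daily.rolling(window=7).mean()
```

Because every seven-day window contains exactly one of each weekday, the Friday and Saturday spikes and the Sunday dip cancel out, leaving only the underlying trend.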
Actually looking at this snapshot of daily streaming tells a different story than just the week-to-week decrease. For example, the 56% drop in Hong Kong doesn’t look like a case of streams dropping due to a major event. Instead, it appears that streaming spiked for some reason in the 21st week of the year and dropped back down to normal levels the following week – we’ll call this the spike-and-settle. This is not a unique phenomenon; similar behavior can be seen for Slovakia, Greece (2019 & 2020), Ireland, and Sweden in this table. Another trend that occurs repeatedly is streams peaking, dropping drastically to below-average levels in the following week, and steadily building back to average or slightly above. This can be seen for Chile (2019) among others. There are also several instances where streaming drops sharply from the average only to return to normal levels a week or so later (Taiwan). The common thread between these different trends is that they all feature a recovery in streaming numbers to roughly pre-drop levels within a week or two.
However, not all of the snapshots shown in this table show a quick recovery or consistent upward trend post-drop. In fact, for all of the drops that occur between the 10th and 12th week of 2020, streaming stays down and continues to trend down. This is seen in Peru, Chile, France, Italy, the Dominican Republic, Ecuador, and Panama.
Armed with this information, I wanted to see how the other countries in this dataset have fared during this particular time period. Additionally, as the final piece of this study, I wanted to see how national lockdowns lined up with the sharp decline in streaming seen in many countries this year.
How have national lockdowns affected streaming?
As discussed earlier, many countries have seen lower-than-usual streaming totals, with sharp and long-lasting drops, starting in the 10th or 11th week of the year (March 2nd or March 9th). As also mentioned earlier, this timeframe seems to roughly match the dates on which countries imposed national lockdowns or stay-at-home orders.
To see how well the date of the national lockdown and the drop in streaming numbers line up, I pulled the start dates of pandemic lockdowns and plotted them against the total daily streams and the seven-day moving average. It should be noted that I only plotted national lockdowns on the chart. Countries like the United States and Brazil never instituted nationwide orders, instead relying on states to make these decisions – as a result, these dates are not shown. All plots show the daily streaming numbers for each country from January 20th, 2020 to May 11th, 2020. In order to make sure that I’m not giving a false impression of the magnitude of these drops, I set the y-axis range to consistently be 5% more than the maximum and that same difference less than the minimum. This helps to keep the appearance of the magnitude of the drop consistent between countries with vastly different streaming totals. I’ve also elected to show all of the plots at the end of this post (can blog posts have appendices?), because our dataset consists of a total of 54 countries and they’re a lot to scroll through.
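One way to read the y-axis rule above is that the padding equals 5% of the maximum and is applied on both ends. The sketch below implements that reading. `padded_ylim` is a hypothetical helper name and the daily totals are made up.

```python
def padded_ylim(daily_streams, pad_fraction=0.05):
    """Return (low, high): a pad of 5% of the maximum, added above the max
    and subtracted below the min."""
    low, high = min(daily_streams), max(daily_streams)
    pad = pad_fraction * high
    return low - pad, high + pad

# Made-up daily totals for one country.
ylim = padded_ylim([800_000, 1_000_000, 900_000])
```

The resulting pair can be passed to a plotting library’s axis-limit call, so every country’s chart gets the same relative headroom regardless of its absolute streaming volume.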
Before I close, I want to call attention to one country in particular. Italy was one of the hardest-hit countries early in the pandemic, with strict lockdowns put in place and the highest daily number of deaths at the time. Of all the different plots shown below, Italy shows one of the clearest correlations between date of lockdown and significant and sustained drop in streaming.
Not all countries in the dataset showed this clear a relationship. Some countries, like Ireland, Germany, and Belgium, didn’t seem to have lockdowns affect their streaming much at all. Other countries, like Greece, even saw their streaming increase as the pandemic ramped up around the world and their national lockdown took effect. Finally, as you learn on Day 1 of Psychology and Statistics 101, correlation does not imply causation! The fact that there appears to be a relationship between two events does not guarantee that they are related.
With that said, I counted 27 of the 54 countries (A.K.A. half!) whose streaming totals, plotted below, show sharp drops in the timeframe of the surge of the virus and lockdowns. Furthermore, this dip in streaming didn’t rebound quickly like other significant drops that have occurred in the past; it was consistent and prolonged. If we do allow that there could be some relation between streaming totals and the global pandemic, the next step is to speculate on reasons why. Here are a few of my best guesses:
- Elimination of the commute – with so many people forced into unemployment or working from home, millions of streams are lost each day due to people not listening to music on their way to work.
- Not feeling the pop – as mentioned before, I can only pull the 200 most-streamed songs in a given day for a given country, so I can only see a small sample of the total streams in a day. If streaming habits have shifted away from pop music – which, by definition, makes up the top 200 – the drop could simply reflect people streaming pop music less frequently. The Weeknd’s “brooding nihilism” just isn’t that interesting if you’re worried about paying rent or your loved ones catching the virus.
- Less time for streaming – whether people are working from home or helping their kids with online learning, routines have been disrupted and maybe music streaming hasn’t filled into the cracks of the new normal yet.
Conclusion
To recap:
- From 2017 to 2019, the total number of streams on Spotify Charts grew steadily with time. Streaming totals for all countries combined have been growing at a faster rate than the global top 200, showing that Spotify is increasing in popularity in countries that do not dominate global charts.
- Starting in March 2020, total streams have decreased. These totals have been lower than adjusted historical values and underperformed predictions by ~10% on average.
- When looking at streaming totals on a country-by-country basis, some nations have seen sharper drops than others. 7 of the 20 largest week-to-week decreases in this dataset have occurred in March this year. Some fluctuation in streaming is common, but streaming totals in many countries in this timeframe have decreased and remained low for weeks, behavior that was not seen in large drops that have occurred in the past.
- There appears to be some correlation between the date a national lockdown was imposed and a drop in total charts streams in that country. Not all countries showed clear relationships, but half of the dataset exhibited sharp drops coinciding with the surge of the virus and the instituting of lockdowns.