The General Data Protection Regulation, which takes effect on May 25, is pushing publishers to take a hard look at just how dependent their outlets have become on the cookies and third-party trackers they load on their own sites to collect data from their visitors.
News sites actually load more third-party content and set more third-party cookies than other top websites, according to a new study of websites across seven European countries from the Reuters Institute.
News sites in those countries averaged 40 different third-party domains and 81 third-party cookies per page, compared to averages of 10 and 12, respectively, for the group of top websites in those countries. (Among sites that run some kind of advertising, the study found that news sites on average load four times as many third-party domains as other top websites.)
U.K. news sites were, on average, the most bloated of the bunch:
The prevalence of cookies and third-party tracking varies across news sites that rely on different revenue sources, and thus have different incentives and advertising needs.
Public media sites, most of which depend on neither subscription revenue nor advertising, share the least data with third parties. German news sites are generally more restrained than their U.K. counterparts. Compare the news site for the popular daily Bild, for instance, to the Daily Mirror site in the U.K., and BBC News to German public broadcaster ARD/Tagesschau:
Researchers used an open-source tool called webXray, which monitors and records the third-party content that loads on a given page in Chrome, to compare all the third-party requests made on a selection of news sites (chosen for their relative reach and prominence in their respective countries) with those made on the 500 overall most popular sites in Finland, France, Germany, Italy, Poland, Spain, and the U.K. webXray can identify about 400 different types of third-party services, 270 of which showed up in the Reuters analysis.
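To make the idea of a "third-party request" concrete, here is a minimal Python sketch of the basic bookkeeping: given a page's address and the addresses of everything it loaded, it collects the distinct outside domains. This is not webXray's actual code or method. The function names are hypothetical, and the short list of two-part suffixes is a stand-in assumption for the full Public Suffix List that a real audit would need.

```python
from urllib.parse import urlparse

# Assumption: a tiny, hand-picked set of two-label public suffixes.
# A real audit would consult the full Public Suffix List instead.
TWO_LABEL_SUFFIXES = {"co.uk", "org.uk", "com.pl"}


def registrable_domain(url: str) -> str:
    """Return a rough registrable domain (e.g. 'example.co.uk') for a URL."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    # Keep three labels when the host ends in a known two-label suffix.
    if len(labels) >= 3 and ".".join(labels[-2:]) in TWO_LABEL_SUFFIXES:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


def third_party_domains(page_url: str, request_urls: list[str]) -> set[str]:
    """Collect the distinct domains requested by a page that differ from its own."""
    first_party = registrable_domain(page_url)
    domains = (registrable_domain(u) for u in request_urls)
    return {d for d in domains if d and d != first_party}


if __name__ == "__main__":
    # Hypothetical page and request log, for illustration only.
    page = "https://www.example.co.uk/news"
    requests = [
        "https://static.example.co.uk/app.js",
        "https://www.google-analytics.com/analytics.js",
        "https://connect.facebook.net/en_US/sdk.js",
        "https://stats.g.doubleclick.net/r/collect",
    ]
    print(sorted(third_party_domains(page, requests)))
```

Counting the size of that set for each page, then averaging across pages, gives a "third-party domains per page" figure of the kind the study reports. Mapping each domain to the company that owns it would need a separate lookup table, which this sketch does not attempt.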
Surprise, surprise: Google services are on most of the pages the researchers analyzed (followed distantly by Facebook):
GDPR takes aim at the collection of identifiable data on internet users that the users have not knowingly consented to, and levies heavy fines for non-compliance, meaning news sites should have done their due diligence on what’s loading on their pages…like, yesterday. In their study, the Reuters researchers have included a handy rundown of the types of third-party content that a site might be carrying and the purposes of each, many of which are not inherently problematic. But just to give you a taste of the range: Load images from hosting services like AWS? Run Google Analytics? Load ads via Google’s DoubleClick network? Have a Facebook “Share” widget? Include Taboola/Outbrain recommendations on your page? That’s all part of this.
So what are news organizations to do about their sites, with GDPR coming into effect in a little over a week? Researchers Timothy Libert and Rasmus Kleis Nielsen offer a helpful matrix for understanding the relative privacy risk to users of each type of content loaded:
News organizations should be able to make some simple improvements to protecting users’ privacy pretty easily (see especially the “low risk,” “low effort to replace” items):
Similarly, social media buttons frequently set cookies and may link browsing data directly to users’ profiles, representing a high privacy risk. While social media companies provide code to enable sharing, it is possible to implement widgets on a first-party basis which facilitate social sharing. Even if social media companies would prefer sharing to happen with their widgets, they have no interest in preventing sharing.
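One way to read the first-party approach the researchers describe: rather than embedding a social network's script, a site can render an ordinary link that only contacts the network when a reader clicks it. The Python sketch below builds such links on the server. The function name is hypothetical, and the two share endpoints are the commonly used public ones, included here as assumptions rather than anything taken from the study.

```python
from urllib.parse import urlencode


def share_links(article_url: str, title: str) -> dict[str, str]:
    """Build plain share links that load nothing from the network until clicked."""
    return {
        # Assumption: Facebook's public sharer endpoint.
        "facebook": "https://www.facebook.com/sharer/sharer.php?"
        + urlencode({"u": article_url}),
        # Assumption: Twitter's public tweet-intent endpoint.
        "twitter": "https://twitter.com/intent/tweet?"
        + urlencode({"url": article_url, "text": title}),
    }


if __name__ == "__main__":
    # Hypothetical article, for illustration only.
    links = share_links("https://example.com/a?b=1", "Hello world")
    for network, href in links.items():
        print(f'<a href="{href}" rel="noopener">Share on {network}</a>')
```

Because the page itself only contains a hyperlink, no third-party request is made and no cookie is set while a visitor is simply reading the article.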
The full study is available here.