London - Paris

Websites surreptitiously tracking and reporting on visitors

As privacy and data protection advocates, we occasionally monitor major websites’ compliance with data protection regulations. We have previously written about website compliance and the cookie requirements of information, transparency and, where applicable, consent. The French Data Protection Authority, the CNIL, has played a leading role in clarifying the regulation. The expectation would be that French websites had followed this guidance, using the two years of preparation and the first year of GDPR application to comply. Sadly, this has not always been the case. Even pointing out the issues to some website owners has not had much effect. Compliance with cookie laws should not be that hard: a cookie banner based on a positive opt-in action should give visitors a free choice to accept or refuse tracking. Full transparency has to be provided in a corresponding privacy notice. See the ICO guidance ‘What privacy information should we provide?’ and the CNIL’s ‘Mentions sur votre site internet : les obligations à respecter’.

To ensure our findings on the websites were correct, we asked several IT and security experts to double-check them. One recognised expert in particular, Allen Woods(1), kindly agreed to analyse one of these websites further. The result is eye-opening. We knew about website tracking, but this analysis shed light on a whole different dimension: the depth of tracking, profiling and fingerprinting, and of data sharing with third parties, not only via cookies but also via JavaScript code used by the web developers. We invite anyone involved in data protection, IT security and cybersecurity to read this document carefully. Please share your views and observations.

Initially Allen Woods confirmed that ‘the use, by second, third and fourth parties of cookies constitutes an extension of the organisation boundary of site owners tracking the movements of devices and therefore their owners, with a considerable degree of accuracy‘.

Allen Woods then proceeded to further scrutiny, adding to his report:

‘with the sample URL taken from an apparently autogenerated module with the file name: “7D1mEqsUJEn.js”. Both beacon and file name appear to be automatically generated at run time on a basis similar to GUID identifiers. The direct impact of the module is an embedded call back to the react.js CDN web site (part of the Facebook estate) with the effect of forcing a regular Facebook monitoring presence on any page using or calling the react.js sub components or code functions concerned.’

One wonders how many lawyers, even IT lawyers advising on data protection compliance, or website owners, understand what this means. Let’s be honest: we are only just starting to pierce the mystery of coding, thanks to Allen and a few other security experts.

It is important to note that, for Allen Woods, the hidden JavaScript code on the website ‘represents a multiple risk in that source and components can be inserted for tracking or any other purpose malicious or otherwise but without making the client side users aware.’
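To make the idea of an external component more concrete, here is a minimal sketch in Python of how a site owner might list the external hosts from which a page loads scripts. It is only an illustration under stated assumptions: the function and class names are hypothetical, the sample page and its hosts (example.com, example.net, example.org) are invented, and it only inspects static HTML. Scripts injected at run time, such as the autogenerated module described in the report, would not appear in this list and would require inspecting the live page in a browser.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ScriptSourceCollector(HTMLParser):
    """Collects the src attribute of every <script> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:  # inline scripts have no src and are skipped
                self.sources.append(src)


def third_party_script_hosts(html, page_url):
    """Return the sorted script hosts that differ from the page's own host."""
    collector = ScriptSourceCollector()
    collector.feed(html)
    own_host = urlparse(page_url).hostname
    hosts = set()
    for src in collector.sources:
        # Resolve relative and protocol-relative URLs against the page URL.
        host = urlparse(urljoin(page_url, src)).hostname
        if host and host != own_host:
            hosts.add(host)
    return sorted(hosts)


# Invented sample page: one first-party script, two external ones, one inline.
SAMPLE_HTML = """
<html><head>
<script src="/js/site.js"></script>
<script src="https://cdn.example.net/widgets/like.js"></script>
<script src="//analytics.example.org/collect.js"></script>
<script>console.log("inline");</script>
</head></html>
"""

if __name__ == "__main__":
    for host in third_party_script_hosts(SAMPLE_HTML, "https://www.example.com/"):
        print(host)
```

A list like this is only a starting point for the kind of review Allen describes: it shows which outside parties a page invites in, not what their code does once loaded.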

Allen concludes his report by stating:

“Firstly, in the author’s opinion, the owners of the site under review have lost control of data flow into and out of the site concerned and should consider a complete rebuild using components that have been vetted in terms of their functionality and capabilities with a view to reversing that loss of control. There is now a legal imperative for doing as recommended in that current regulation places an emphasis on the “controller” (site owner) to ensure the protection of the integrity of personal data in particular.

Secondly, a review of the reasoning behind the need to collect and analyse visitor traffic should be carried out and an alternative to the use of Google Analytics should be seriously considered. In the current legal operating environment, which establishes definitions of controller and processor responsibilities that making use of Google Analytics and passing visitor data to its control to be processed where Google sees fit, when web site server side log files contain much the same data. It may well be that using a desk top log file analyser will give the kind of visitor analysis that the site owner requires.”

This is important because a great many websites use Google Analytics to measure their web traffic through visitors’ IP addresses. An IP address is an online identifier that falls within the definition of ‘personal data’ in Article 4 of the GDPR.
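As an illustration of the log file alternative mentioned in the report, the following Python sketch counts page hits from server-side access log lines while truncating each IP address before it is stored. It is a sketch under assumptions, not a definitive implementation: it assumes logs in the Common Log Format, the function names and sample lines are hypothetical, and truncating the address (last octet for IPv4, last 80 bits for IPv6) is one commonly used way of reducing identifiability, not a guarantee that the result is anonymous in the legal sense.

```python
import ipaddress
import re
from collections import Counter

# Start of a Common Log Format line: client address, identity, user,
# timestamp in brackets, then the quoted request (method and path).
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:\S+) (\S+)[^"]*"')


def anonymise_ip(ip_text):
    """Zero the last octet of an IPv4 address (or the last 80 bits of IPv6)."""
    ip = ipaddress.ip_address(ip_text)
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip_text}/{prefix}", strict=False)
    return str(network.network_address)


def summarise(log_lines):
    """Count hits per path and collect the distinct truncated addresses."""
    hits = Counter()
    visitors = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        ip_text, path = match.groups()
        try:
            visitors.add(anonymise_ip(ip_text))
        except ValueError:
            continue  # first field was a hostname, not an IP address
        hits[path] += 1
    return hits, visitors


# Invented sample lines using documentation-only address ranges.
SAMPLE_LOG = [
    '203.0.113.42 - - [25/May/2019:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '203.0.113.77 - - [25/May/2019:10:00:05 +0000] "GET /privacy.html HTTP/1.1" 200 2048',
    '198.51.100.9 - - [25/May/2019:10:01:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
]

if __name__ == "__main__":
    hits, visitors = summarise(SAMPLE_LOG)
    for path, count in hits.most_common():
        print(path, count)
    print("distinct truncated addresses:", len(visitors))
```

The point of the design is that the analysis stays on the site owner’s own machine and no visitor data is passed to a third party.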

As a matter of due diligence, when commissioning a site, site owners should take an active role in approving which components are going to be used by whoever it is that is commissioned to build their web site. If the author was asked to approve any of the components provided by major platforms generally, but particularly social media platforms like Twitter, Google and Facebook and external template platforms like WORDPRESS, that those suggesting the use of such components should guarantee site owners as to the reliability and robustness of the code in respect of compliance and furthermore prove that the code concerned had been forensically reviewed to confirm reliability and robust compliance.’

Disclaimer: this DataRainbow website is itself built on a WordPress template, and has used Google Fonts but no analytics. It has become extremely hard to find a web developer who does not use ready-made templates.

Knowing how much of users’ behaviour is tracked places a heavy burden and liability on website owners as data controllers, unless a certification assures that web developers have real competence and are not simply expert assemblers of ‘lego blocks’. Increasingly, web developers use templates such as WordPress that track users’ behaviour and location outside the users’ control.

Under the GDPR, there are two levels of liability: one for the data controller, basically whoever decides the purposes and means of the data processing, and one for the data processor, acting on the instructions and on behalf of the data controller. In the Fashion ID case, which concerned an embedded Facebook ‘like’ button capable of collecting data such as IP addresses and browser identification, the opinion of the Advocate General of the European Court of Justice considered Facebook to be a data controller. Additionally, the ECJ held that the hosting websites should be considered joint data controllers, sharing proportional liability. The so-called ‘Facebook Fanpage’ decision had already created co-controllership between Facebook and the administrator of the Facebook page.

A website owner that receives anonymous statistics is deemed to be a data controller for permitting third-party scripts, as the data it allows to be collected is not anonymous from the perspective of the analytics providers: a detailed picture of users is built in order to target advertising at them. Before that, the ECJ held in the Jehovan todistajat case that joint control and joint responsibility do not require each of the controllers to have access to (all of) the personal data concerned. Many website owners are entirely unaware of this shared liability.

Facebook is in close competition with Google, Amazon, Microsoft and Twitter. A privacy regulator might find it challenging to combat this kind of behaviour, as it has also become an antitrust issue of overgrown corporations. The GAFAM seem to have become too big for any regulator to take on.

Online user tracking is a vast subject. Facebook users cannot avoid location-based ads, an investigation finds: there is no combination of settings that users can enable to prevent their location data from being used by advertisers to target them, according to the privacy researcher Aleksandra Korolova. “Taken together,” Korolova says, “Facebook creates an illusion of control rather than giving actual control over location-related ad targeting, which can lead to real harm.” In reality, user tracking goes beyond simple geolocation. Many websites using free Facebook code libraries track users’ geolocation and browsing across all kinds of websites.

Helen Dixon, the Irish Data Protection Commissioner, announced in a recent interview that the first fines could come within the next few weeks: “It would be a most surprising outcome if, of the 20 big tech investigations we have underway, fines were not a feature of the process,” Dixon said in an interview in London, after speaking at Bloomberg’s Sooner Than You Think technology conference. Simply fining the GAFAM won’t stop them; fines need to be combined with injunctions to stop the fingerprinting and to delete all data illegitimately collected.

Collecting web visitors’ data, profiling them and selling targeted advertising has become a highly lucrative business. The Ad Tech GDPR complaint has been extended to four more European regulators. The complainants consider Real-Time Bidding a “vast scale personal data leakage by Google and other major companies” in the behavioural advertising industry: ‘GDPR complaints about Real-Time Bidding (RTB) in the online advertising industry were filed today with Data Protection Authorities in Spain, the Netherlands, Belgium, and Luxembourg. The complaints detail the vast scale of personal data leakage by Google and other major companies in the “Ad Tech” industry. This week marks one year since the introduction of the GDPR. The new complaints have been filed by Gemma Galdon Clavell (Eticas Foundation) and Diego Fanjul (Finch), David Korteweg (Bits of Freedom), Jef Ausloos (University of Amsterdam), Pierre Dewitte (University of Leuven), and Jose Belo (Exigo Luxembourg). Today’s filings extend the complaints initially filed in Ireland, the UK and Poland, to a total of seven EU countries.’

One can easily imagine information being gathered on each individual internet user to profile their personality, in order to predict needs and wishes that are then sold to ad bidders. Google’s wish is to predict your needs before you realise them. Young people are offered an Apple ‘smart’ watch with their Vitality insurance and health club membership, which monitors their heartbeat and exercise. This is all fine until an illness is detected. Women will no longer need to knock on the HR door to announce their pregnancy: their employer will probably be informed before they even notice it themselves. Consequently, new businesses have been created: dog Fitbit walkers that simulate exercise to trick the insurance company. Would that be sufficient? Probably not enough to hide an actual illness, which your insurance company would love to be the first to know about.

And if you wonder why privacy matters, think of the past, not that long ago, and of what the Stasi could have done had they had access to Google or Facebook. Think of totalitarian regimes. Think of the US, a democracy, asking visitors for their social media passwords. If you think you have nothing to hide, think twice. Think of over-reliance on technology and its margin of error.

Finally, a demonstration of what would happen if your shop assistant were an app.

(1) Allen Woods is now retired. Before retirement he worked primarily for the UK Ministry of Defence, both in and out of uniform, for nearly 50 years, the first 15 years on front-line operational duties, mainly with infantry battalions. The remainder of his time was spent working on information management related activities. On leaving the Army in 1995 he was described by his then commanding officer as the “backbone” of his Corps’ software development effort, with an exemplary record both in front-line service and subsequently for the UK Defence Logistics Support Chain.

Further reading: “More than two out of every three dollars spent on digital ads in the US goes to one of the three companies.”

Beacon Marketing Platform inMarket Introduces Predictive Targeting Program

In 2009, Facebook shuts down Beacon

On the expanding notion of data controller