Cookies are a fundamental mechanism in the advertising industry, used for ad personalisation, behavioural targeting, re-targeting, frequency capping and much more. Even now, the majority of ad technology providers that are disrupting* the advertising industry still rely on cookies to a great extent.
With the recent bad publicity around cookies, new European legislation (the so-called Cookie Law) and new default settings in various web browsers (e.g. DNT headers enabled in Internet Explorer, 3rd party cookies blocked in Firefox and Safari), it is becoming critical to start looking for alternatives and implementing them.
(*) In my opinion there isn’t much disruption happening, rather slow progress, but that’s another story about how crippled the advertising industry is.
What are cookies?
Cookies are key-value data stored by the browser at the domain or domain/path level. They are set either by an HTTP response that the browser receives from the server or by JavaScript code executed on the web page. The browser sends cookies back to the web server with every HTTP request, but only those cookies whose domain/path matches the domain/path being requested.
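To make this concrete, here is a minimal Python sketch of the three steps described above: building a Set-Cookie header, parsing the Cookie header a browser sends back, and deciding whether a cookie applies to a request. It uses the standard library's http.cookies module. The helper names (build_set_cookie, parse_cookie_header, domain_matches, path_matches) and the ads.example domain are assumptions made for illustration, and the matching rules are a simplified reading of RFC 6265, not a full implementation.

```python
from http.cookies import SimpleCookie
from typing import Dict


def build_set_cookie(name: str, value: str, domain: str, path: str = "/") -> str:
    """Return the value of a Set-Cookie response header (hypothetical helper)."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["domain"] = domain
    cookie[name]["path"] = path
    return cookie[name].OutputString()


def parse_cookie_header(header: str) -> Dict[str, str]:
    """Parse the Cookie request header sent by the browser into a dict."""
    cookie = SimpleCookie()
    cookie.load(header)
    return {key: morsel.value for key, morsel in cookie.items()}


def domain_matches(request_host: str, cookie_domain: str) -> bool:
    """Simplified domain match: same host, or a subdomain of the cookie domain."""
    request_host = request_host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    return request_host == cookie_domain or request_host.endswith("." + cookie_domain)


def path_matches(request_path: str, cookie_path: str) -> bool:
    """Simplified path match: identical path, or cookie path is a directory prefix."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


if __name__ == "__main__":
    print(build_set_cookie("uid", "abc123", "ads.example", "/track"))
    print(parse_cookie_header("uid=abc123; lang=en"))
    print(domain_matches("cdn.ads.example", "ads.example"))
    print(path_matches("/tracking", "/track"))
```

Note that a cookie scoped to /track is sent for /track/pixel but not for /tracking, because the match works on path segments, not on raw string prefixes.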
Tracking methods
In a typical tracker/ad server implementation, only a unique identifier of the visitor profile is stored in the cookie, and the rest of the data lives on the server. Most commonly, ad technology relies on 3rd party cookies (cookies set on a domain other than that of the website the user visited). Their big advantage over 1st party cookies is that an ad tag loaded from the same (ad server’s) domain on different sites still receives the same cookie from the browser. This is an easy way to track a user across the websites they have visited, the ads they have seen etc.
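The split between "identifier in the cookie, data on the server" can be sketched as follows. This is a simplified in-memory model, not how any particular ad server works: the ProfileStore class, its method names and the example URLs are all assumptions made for illustration.

```python
import uuid
from typing import Dict, List, Optional


class ProfileStore:
    """Hypothetical server-side store: the cookie carries only the visitor id."""

    def __init__(self) -> None:
        # visitor id -> list of page URLs on which the ad tag was loaded
        self._profiles: Dict[str, List[str]] = {}

    def handle_request(self, cookie_id: Optional[str], page_url: str) -> str:
        """Record a page view and return the id to send back in the cookie."""
        if cookie_id is not None and cookie_id in self._profiles:
            visitor_id = cookie_id
        else:
            # No cookie (or an unknown id): start a new profile.
            visitor_id = uuid.uuid4().hex
        self._profiles.setdefault(visitor_id, []).append(page_url)
        return visitor_id

    def history(self, visitor_id: str) -> List[str]:
        """Return a copy of the page views recorded for this visitor."""
        return list(self._profiles.get(visitor_id, []))


if __name__ == "__main__":
    store = ProfileStore()
    # First ad tag load: the browser has no cookie for the ad server's domain yet.
    vid = store.handle_request(None, "https://news.example/article")
    # Same browser on a different site: the 3rd party cookie is sent again.
    store.handle_request(vid, "https://shop.example/item")
    print(store.history(vid))
```

Because the ad tag is served from the same domain on both sites, the second request carries the same identifier and both page views end up in one profile. If the cookie is blocked or deleted, the request arrives without an id and a fresh, unrelated profile is created.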
The problem ad technologies face is that cookies become less persistent every year (due to blocking/rejection or automatic deletion), and the persistence of 3rd party cookies is declining the fastest.
What are the alternatives? Let’s look through the available methods by which we can deterministically get a unique identifier of the visitor in every request to the server:
- 1st party cookies: cookies stored under the same domain as the visited web page; stored/read via JavaScript or HTTP request/response,
- 3rd party cookies: cookies stored under a domain other than that of the visited web page; stored/read via JavaScript or HTTP request/response,
- Flash cookies (local shared objects, LSOs): data stored by Adobe Flash; stored/read via an Adobe Flash object,
- ETags: part of the HTTP protocol, a cache validation mechanism that may be used for tracking; stored/read via HTTP request/response,
- HTML5 local storage: persistent storage at domain scope; stored/read via JavaScript,
- Fingerprinting: matching the IP address together with browser configuration such as user-agent, plugins, screen size etc.; usually done via JavaScript, but can be done purely server-side (in that case only by IP + user-agent).
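Of the methods listed above, fingerprinting is the only one that stores nothing in the browser, so it is worth a short sketch. The version below simply hashes the IP address, the user-agent and any extra attributes into one identifier. The fingerprint function, the attribute names and the choice of SHA-256 are assumptions made for illustration. Real systems differ, and two visitors with identical configurations behind the same IP address would get the same value.

```python
import hashlib
from typing import Mapping, Optional


def fingerprint(ip: str, user_agent: str,
                extras: Optional[Mapping[str, str]] = None) -> str:
    """Derive an identifier from request and browser attributes (illustrative only).

    With extras=None this is the server-side-only variant (IP + user-agent).
    """
    parts = [("ip", ip), ("user_agent", user_agent)]
    # Sort the extra attributes so the result does not depend on their order.
    parts.extend(sorted((extras or {}).items()))
    canonical = "\n".join(f"{key}={value}" for key, value in parts)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    server_side = fingerprint("203.0.113.7", "Mozilla/5.0")
    with_js = fingerprint(
        "203.0.113.7",
        "Mozilla/5.0",
        {"screen": "1920x1080", "plugins": "flash,pdf"},
    )
    print(server_side)
    print(with_js)
```

Any change to an input (a new IP address, a browser update that changes the user-agent) produces a different identifier, which is one reason fingerprinting on its own is not a persistent identifier.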
Some systems take a hybrid approach and combine several of the above techniques. One such method is called respawning: restoring an HTTP cookie by reading the identifier back from a Flash cookie, an ETag or HTML5 local storage.
Even a hybrid approach (which basically means storing the visitor’s unique id via several of the methods listed above and retrieving it from whichever is available at the time) still does not work in 100% of cases.
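The hybrid logic can be modelled in a few lines. In the sketch below each storage mechanism is represented by a dictionary entry. The store names, the STORES priority order and the respawn function are assumptions made for illustration, and the code to read and write each real mechanism (Flash, ETag headers, local storage) is deliberately left out.

```python
import uuid
from typing import Dict, Optional

# Hypothetical names for the storage mechanisms, in lookup order.
STORES = ("http_cookie", "flash_lso", "etag", "local_storage")


def respawn(stores: Dict[str, Optional[str]]) -> str:
    """Recover the visitor id from any surviving store and write it to all of them."""
    visitor_id = None
    for name in STORES:
        value = stores.get(name)
        if value:
            visitor_id = value
            break
    if visitor_id is None:
        # Every store was cleared or blocked: the visitor looks brand new.
        visitor_id = uuid.uuid4().hex
    for name in STORES:
        stores[name] = visitor_id
    return visitor_id


if __name__ == "__main__":
    # The HTTP cookie was deleted, but the Flash cookie survived.
    browser = {"http_cookie": None, "flash_lso": "abc", "etag": None, "local_storage": None}
    print(respawn(browser))
    print(browser["http_cookie"])
```

The last branch is the failure case mentioned above: once every store has been wiped, nothing is left to respawn from and the visitor gets a new identifier.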
Additionally, users increasingly use multiple devices to interact with online services and content. The tracking techniques above cannot tackle the problem of cross-device tracking.
Better tracking approach.
Let’s start with the ideal properties of a method for tracking visitors. It should:
- provide a way to uniquely identify the user,
- identify the user persistently: the identifier stays the same not only within a session, but until the user opts out or forces a reset,
- enable cross-device tracking, which means we can identify the same user whether they are on a smartphone, tablet or desktop,
- provide a way to opt out of tracking,
- anonymize the tracked data, which should not only ensure that the user identifier stored by an advertiser cannot be linked back to personally identifiable information, but also let the user reset it (generate a new user identifier).
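Purely to make the properties above concrete, they can be written down as a small interface. The sketch below is an in-memory toy, not a proposal. The IdentifierRegistry class and its user_key parameter are hypothetical, and the code ignores the genuinely hard part: how a real system would recognise the same user across devices without relying on personally identifiable information.

```python
import uuid
from typing import Dict, Optional, Set


class IdentifierRegistry:
    """Toy model of a persistent, resettable identifier with opt-out."""

    def __init__(self) -> None:
        # user_key stands for "whatever recognises the same user on any device".
        self._ids: Dict[str, str] = {}
        self._opted_out: Set[str] = set()

    def identifier_for(self, user_key: str) -> Optional[str]:
        """Return the persistent random identifier, or None after opt-out."""
        if user_key in self._opted_out:
            return None
        if user_key not in self._ids:
            # Random value: nothing about the user can be derived from it.
            self._ids[user_key] = uuid.uuid4().hex
        return self._ids[user_key]

    def reset(self, user_key: str) -> Optional[str]:
        """Discard the current identifier and issue a fresh one."""
        self._ids.pop(user_key, None)
        return self.identifier_for(user_key)

    def opt_out(self, user_key: str) -> None:
        """Stop identifying this user and forget the current identifier."""
        self._opted_out.add(user_key)
        self._ids.pop(user_key, None)


if __name__ == "__main__":
    registry = IdentifierRegistry()
    first = registry.identifier_for("user-1")
    print(first == registry.identifier_for("user-1"))
    print(first == registry.reset("user-1"))
    registry.opt_out("user-1")
    print(registry.identifier_for("user-1"))
```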
What could potentially satisfy all of these properties? I haven’t yet seen anything that comes even close.
We already use several unique identifiers on the internet, e.g.:
- email address, which can already be used to target us with ads via Custom Audiences on Facebook,
- social profile ids: LinkedIn, Twitter, Facebook, Spotify and many other popular social networks may use their unique identifiers to track us across sites where their widgets are placed.
However, none of the above provides a way to anonymize or reset the user identifier, and each can easily be linked to a person, which raises a lot of privacy concerns.
Going further.
There are obviously a lot of issues with cookies, but because they are so widely adopted, change will be very hard.
The IAB has already created the Future of the Cookie Working Group, which is working on alternatives that would be more reliable, more universal and easy for ad technology suppliers to adopt widely.
The new standard should on the one hand guarantee reasonable privacy for users, and on the other hand provide a reliable tracking mechanism. I can hardly see a win-win situation here, but I can see a good compromise between publishers, advertisers and consumers.
Consumers have plenty of say on today’s internet, but I hope we won’t end up with a ridiculous solution like the one forced by the European Cookie Law, which littered websites with meaningless notices about the use of cookies.
Further reading
- Presentation: Future of the cookie by IAB http://www.slideshare.net/jordanmitchell/iab-ad-technology-council-future-of-the-cookie
- Article: A Primer on Information Theory and Privacy by Peter Eckersley https://www.eff.org/deeplinks/2010/01/primer-information-theory-and-privacy
- RFC 6265 – HTTP State Management Mechanism http://tools.ietf.org/html/rfc6265
Post illustration credit: gegen-den-strich