By Martin Gallo
Uniquely identifying the user’s device or browser when accessing an online resource can be useful in very different contexts, and the impact can be different according to those contexts. With respect to identity security, the intelligence can significantly contribute to adaptive authentication journeys and the evaluation of risks when allowing access to resources. In this article we introduce the concept of browser fingerprinting and explore some of the challenges the industry is facing when it comes to utilizing this tool in a secure and privacy-preserving fashion.
Browser Fingerprinting in the Identity World
Let’s begin by setting up the context and discussing browser fingerprinting from an identity security perspective.
First, it is important to understand that a large set of attributes and properties of the user, and of the request itself, are generally considered when evaluating an access request. Depending on the type of authentication workflows in place, those characteristics can vary, but it’s very common for Access Management solutions to consider attributes such as the date and time of access, the user’s group memberships, originating IP addresses, browser type, Operating System and others. Advanced and adaptive solutions make use of additional contextual, behavioral and threat-related attributes as well, building a profile of the user over time and ultimately improving both user experience and security.
Given this, having the ability to uniquely identify a user’s browser or the device from which the access is requested is of utmost importance. While some of the aforementioned attributes can represent signals of suspicious activity, like the use of an unknown or maliciously tagged IP address or accessing a document at an unusual time of the day, identifying that a given request originated from a device the user carries every day is a positive sign. While it should not be considered a strong authentication factor on its own, it is commonly used to step down and avoid requesting a second or additional factor if no other risk indicators are identified. It can also be included as a positive factor to reduce the use of passwords in password-less workflows and journeys.
However, the Web access model does not easily allow for this type of identification aside from the use of persistent cookies. Considering this, the industry started to explore ways to identify the user’s device, and more particularly its browser. One of those ideas is to produce a unique identifier for the user’s browser itself by capturing multiple attributes or “semi-identifiers”, such as the configured language, underlying Operating System, screen information, installed fonts and plugins, and more. The combination of these pieces of information comes pretty close to resembling a unique digital fingerprint of the user, and is the so-called “browser fingerprint”. A very extensive list of such semi-identifiers has been compiled by Mozilla. The Chromium project published a technical analysis of the different client identification mechanisms as well.
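As a rough illustration of the idea, the following minimal sketch combines a handful of semi-identifiers into a single identifier by hashing them. The function name `browser_fingerprint`, the attribute names and the sample values are all hypothetical, and the use of SHA-256 over a canonical JSON string is an assumption made for the example; it is not how any particular product or library computes fingerprints.

```python
import hashlib
import json


def browser_fingerprint(attributes: dict) -> str:
    """Return a hex digest derived from a set of browser attributes (illustrative only)."""
    # Serialize with sorted keys so the same attributes always produce the same string,
    # regardless of the order in which they were collected.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute values, for illustration only.
sample = {
    "language": "en-US",
    "platform": "Linux x86_64",
    "screen": "1920x1080x24",
    "fonts": ["Arial", "DejaVu Sans"],
    "plugins": [],
}
fingerprint = browser_fingerprint(sample)
```

Note that an exact hash like this changes whenever any single attribute changes (a new font, a browser update), so a sketch of this kind says nothing about how stable or how unique the resulting value is in practice.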
SecureAuth’s access management solution utilizes browser fingerprinting as one of the mechanisms to perform device recognition, as part of several innovations and enhancements that resulted in granted patents. Other vendors followed the trend and claim to have implemented similar approaches as well, and there are also some Open Source solutions available for providing browser fingerprinting. The overall idea is that in those workflows a user in possession of valid credentials will be challenged for an additional authentication factor when trying to access or execute a sensitive action in the target system if the “fingerprint” obtained from their browser doesn’t match the one known for the user. While it’s not perfect, and we’re going to discuss some of the issues associated with this approach in the next sections, it results in smooth and secure login experiences when combined with additional adaptive layers. It’s also used as a security and anti-fraud method in ecommerce, cryptocurrency and other online platforms.
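The decision described above can be sketched in a few lines. This is only a simplified illustration under stated assumptions: the function name `requires_step_up`, its parameters and the single boolean standing in for "other risk indicators" are hypothetical, and real adaptive engines weigh many more signals than an exact fingerprint match.

```python
def requires_step_up(known_fingerprints: set, presented: str,
                     other_risk_signals: bool = False) -> bool:
    """Return True when an additional authentication factor should be requested."""
    # A recognized browser is a positive sign, not a strong factor on its own:
    # any other risk indicator still triggers the additional challenge.
    recognized = presented in known_fingerprints
    return other_risk_signals or not recognized


# Hypothetical fingerprint values previously registered for a user.
known = {"3f2a9c", "b81d04"}
challenge_known_device = requires_step_up(known, "3f2a9c")
challenge_new_device = requires_step_up(known, "ffffff")
challenge_risky_known = requires_step_up(known, "3f2a9c", other_risk_signals=True)
```

In this sketch a matching fingerprint only ever removes friction; it never grants access without valid credentials, which mirrors the point that device recognition is a step-down signal and not an authenticator.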
However, as online threat and privacy models started to change in recent years, several new challenges came into play.
A Black Market for Fingerprints
What happens when certain security measures are put in place? Well, the general rule is that bad actors will try to circumvent them if the economic incentive is worth the effort. Browser fingerprinting is not an exception, as we have started to see in recent years.
The Genesis market is one of the most notable examples, with threat actors operating a pay-per-bot store as early as November 2018. The store, which was extensively exposed by Kaspersky in April 2019 in their “Digital Doppelgangers” report and followed up by several other analyses, takes advantage of hundreds of infections performed using commodity malware, such as AZORult, not only to steal credentials and cookies but also to gain access to the victim’s browser fingerprints. The market’s customers can then use a custom browser extension to impersonate the victim’s browser and apply the stolen digital fingerprints. This results in a higher probability of not triggering anti-fraud checks, multi-factor authentication prompts or other types of security protections such as the ones mentioned in the previous section.
As a more recent example, IntSights reported in August 2019 on a new store that emerged with a very similar model, named “Richlogs”. While it’s not clear what infection method or piece of malware is used on the backend, this black market is said to compete directly with Genesis and in the same way offers not only credentials but a wide range of personal information about the victims. Over time they even added phishing and other forms of proxy interception mechanisms.
The advent of these types of browsers and other malicious implants, plus the monetization of digital fingerprints, shows one of the weaknesses of browser fingerprinting as a defensive measure, and one of the reasons why it cannot be trusted as a strong authentication factor.
All in all, despite the ups and downs that particular stores might have over time, we suspect these types of markets will not cease to exist and will gain more notoriety as the use of fingerprint-based mechanisms continues to increase in security and anti-fraud solutions. It might also be that the growing market of compromised credentials will continue to incorporate browser fingerprints.
Ads, Ads Networks and the Monetization of the Web
On another front, and not long ago, when publishers started to move their business online, they needed to find a way to generate revenue for the content they were creating and publishing. The main answer was to look at the revenue model most publishers were accustomed to from the print-publishing era: advertising. Ads are the most intuitive and easiest way for online publishers to monetize their content. Advertising continues to be one of the main online revenue streams, and it might be difficult to conceive of the Internet without ads.
One of the great benefits of the Web is the principle of composability, and ads benefit remarkably from this. The advertising ecosystem continues to grow by incorporating new roles and players such as supply-side and demand-side platforms, ad exchanges, ad networks and other standalone players. In this context, it is important to understand that ads are included in a given Web page not necessarily by the site owner alone but mostly by third parties. Much of the tracking of ad views, clicks and conversions, and the generation of other metrics, relies largely on third-party cookies or other ways of establishing a cross-site identity, as the content is distributed among a wide range of sources. Recent studies show that approximately 90% of the Websites in lists of high-traffic sites perform some sort of tracking.
It’s also fair to say that, to a considerable degree, some advertisers, ad platforms, social media and analytics companies have been using all types of techniques, such as “supercookies” and more recently browser fingerprinting, as a pure tracking mechanism. The goal is to obtain greater success rates when correlating the visits of one user with their historical data, most of the time avoiding or bypassing any type of user consent. Moreover, tools such as browser fingerprinting, in combination with other indicators, can be used to regenerate normal cookies that the user has deleted or to help generate additional linkage elements, turning them into very powerful techniques.
It’s also known that untrusted parties will make their own attempts at using different techniques to manipulate advertising metrics for their own financial benefit, affecting the revenues of publishers and advertisers. Obtaining reliable measurements for those metrics is a continuous challenge, and anti-fraud protections are therefore required. Platforms started to expand on the methods used to uniquely identify users and visits. Adding the IP address to the mix helps, but then the challenge is how to differentiate users behind the same address (for example in NAT’ed or proxied environments) when cookies are not available. As cookie and IP address-based solutions fall short, many ad targeting tools currently make use of the same browser fingerprinting techniques to mitigate those situations and try to get more reliable measurements.
So, as can be observed, browser fingerprinting is currently being used by the advertising industry and its ecosystem both as an anti-fraud technique and as an outright tracking method. What are the next challenges then?
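To make the NAT example concrete, here is a small sketch of how counting visitors by IP address alone compares with counting by the pair of IP address and fingerprint. The function name `count_unique_visitors`, the fingerprint labels and the visit data are hypothetical; the addresses come from a range reserved for documentation.

```python
def count_unique_visitors(visits):
    """Count distinct (ip, fingerprint) pairs in an iterable of visits."""
    return len({(ip, fingerprint) for ip, fingerprint in visits})


# Hypothetical visits: two different browsers behind the same NAT'ed address,
# one of which comes back a second time.
visits = [
    ("203.0.113.7", "fp-a"),
    ("203.0.113.7", "fp-b"),
    ("203.0.113.7", "fp-a"),
]
unique_by_ip = len({ip for ip, _ in visits})
unique_by_pair = count_unique_visitors(visits)
```

With this data, counting by IP address alone sees a single visitor, while the combined key separates the two browsers and still recognizes the repeated visit.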
Privacy Threats and the Browser’s Ongoing War on Fingerprinting
As things developed, the use of tracking technologies encroached on some fundamental user rights. Most of the time, users are not aware of the number and reach of the third parties involved when visiting a Web site, let alone of what those third parties are doing with their data. In an ideal world, neither users, publishers, advertisers nor developers should have to worry about this, and technology should instead be helping to preserve users’ privacy.
There is a wide range of reasons why users want to remain anonymous or unidentified when navigating online. Those can range from personal safety due to threats of violence or imprisonment, to concerns about surveillance by nation-state intelligence agencies, to concerns about discrimination based on browsing history or opinions expressed on a Web site. The relatively short history of the Internet has proven that those concerns are not only valid but also have real consequences for the individuals and organizations involved.
The use of browser fingerprinting techniques heightens some of these concerns, as with this mechanism the correlation of browsing activity can be performed even against very privacy-conscious users who regularly clear cookies or other types of user agent storage. Unlike cookies or other standard browser features, fingerprinting allows for the collection and correlation of data without any type of user consent or any indication that the activity is being performed, which is not desirable.
In line with these concerns, organizations and individuals started to give more visibility to the situation. Efforts such as the General Data Protection Regulation and the ePrivacy Directive in Europe started to regulate the use of third-party cookies for tracking and mandated explicit user consent before relying on them. The Electronic Frontier Foundation released a research project called Panopticlick that can be used to understand whether the browser you’re using is safe against tracking, with a big focus on surveying fingerprinting techniques. Other projects such as AmIUnique allow users to understand how identifiable their browser is while helping build a corpus of data that can be used to study the different types of fingerprinting implementations.
Driven either by the need to determine the actual tracking efficiency or by the need to identify defensive measures and mitigations, a lot of material has been researched and written about browser and device fingerprinting. The academic research community has produced a very large number of papers, with findings ranging from the use of smartphone sensor calibration as a source of unique identifiers to proving that cross-browser fingerprinting can be performed by using Operating System and hardware level functions.
Different actors in the Web ecosystem have started to propose and execute work on different fronts to either prevent or reduce the impact of fingerprinting as a tracking method. Industry and academic organizations as well as individuals started to work on ways to perform user identification in a privacy-preserving fashion. A notable example of this work is the PrivacyPass protocol and browser extension. The World Wide Web Consortium (W3C) recently produced and published notes and guidance for Web specification authors on how to reduce the fingerprinting exposure.
Privacy-focused organizations and projects, such as the Tor Project, started to openly work on limiting the effectiveness of fingerprinting and reducing the attack surface when performed against their browsers. This was followed by almost all the major browser vendors, which announced and released plans to work on related initiatives. Mozilla published an anti-tracking policy, on which WebKit’s Tracking Prevention Policy is based. Google announced The Privacy Sandbox project for Chromium, the code base that powers the Chrome browser among others, in a mission to “create a thriving web ecosystem that is respectful of users and private by default”. Microsoft made its move as well, by leveraging some of those efforts and adding them on top of the tracking prevention mechanisms of the Edge browser.
All in all, these efforts seem to have the same ambition and direction: limiting the unconsented tracking of users. The proposed solutions vary and involve work on different fronts: for example, removing third-party cookies, introducing new APIs such as Privacy Budget or Trust Tokens, and reducing the fingerprinting surfaces by removing functionality or lowering the entropy of some of those surfaces. Some of these are ongoing or already planned efforts, while others are early proposals that might need prior work and consensus before plans can be established.
Although the technical means to achieve the overall goal are not the same for each organization and involve different types of technologies yet to be fully defined, it’s clear that browsers have declared war on user tracking and, as a consequence, on browser fingerprinting.
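The notion of "lowering the entropy" of a surface can be illustrated with the standard Shannon entropy formula applied to the values observed for one attribute across a population of browsers. The sketch below is only an illustration: the helper names `shannon_entropy` and `coarsen`, the bucket threshold and the four sample resolutions are hypothetical and not taken from any browser proposal.

```python
import math
from collections import Counter


def shannon_entropy(values) -> float:
    """Shannon entropy, in bits, of the observed distribution of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def coarsen(resolution: str) -> str:
    """Report only a coarse size bucket instead of the exact screen resolution."""
    width = int(resolution.split("x")[0])
    return "large" if width >= 1900 else "small"


# Hypothetical population: four browsers, each reporting a different resolution.
exact = ["1920x1080", "1366x768", "2560x1440", "1440x900"]
coarse = [coarsen(r) for r in exact]

bits_exact = shannon_entropy(exact)
bits_coarse = shannon_entropy(coarse)
```

In this toy population the exact resolution carries 2 bits of identifying information, while the two-bucket version carries 1 bit: fewer distinguishable values means each browser blends into a larger group.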
So, What is Next Then?
Considering the challenges shared, it looks like the long-term strategy for uniquely identifying users’ devices in the context of access management solutions might not involve fingerprinting browsers. However, different identity security initiatives have varying threat models and a wide range of constraints. Some consumer identity projects will probably continue to rely in the near term on some sort of fingerprinting capability, given the challenges of unmanaged devices and the need to keep friction in the user experience as low as possible.
The mid and long-term strategies are being redefined as the industry transitions on this journey, and we understand they will involve a variety of technologies according to the particularities and needs of each access initiative: custom endpoint and user-agent software, device management platforms, security keys and, why not, digital certificates and different shapes of Public Key Infrastructures that at one point in the past were deemed nefarious and deprecated but are now at the core of several novel innovations.
The blend of conflicting use and abuse cases led to the current situation, where browser fingerprinting is used as both a legitimate and an illegitimate mechanism to track users and devices, with a significant impact on users’ privacy. As browser vendors and standardization bodies work on mitigating and reducing the impacts, it is clear that identity journeys relying on this mechanism to perform device recognition should start analyzing pivots to other tools.