Note: This blog was co-authored by SecureAuth Security Researcher: Leandro Cuozzo.
The term browser fingerprinting refers to the process of assigning an identifier to the browser of each user who visits a website, using methods and techniques that gather attributes or properties from the browser, the operating system, and the device. This process is applied to correlate future visits from the same user or device with historical data. It is used in adaptive and continuous authentication schemes to strengthen security and give the user a lower-friction browsing experience.
SecureAuth’s access management solution uses browser fingerprinting as one of its mechanisms for device recognition, part of a series of innovations and enhancements for which patents have been granted. Other vendors have followed the trend and claim to have implemented similar approaches, and some open-source browser fingerprinting solutions are also available.
The overall idea is that a user in possession of valid credentials will be challenged for an additional authentication factor if the “fingerprint” obtained from their browser does not match a known one. While it’s not perfect, as stated before, it results in smooth and secure login experiences when combined with additional adaptive layers.
Basic authentication flow with the browser fingerprinting feature
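The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SecureAuth's implementation: the `Account` record, the `authenticate` function, and the three outcome strings are hypothetical names, and the password is compared in plain text purely to keep the sketch short.

```python
import hmac
from dataclasses import dataclass, field


@dataclass
class Account:
    # Hypothetical account record; a real system would store a password hash.
    password: str
    known_fingerprints: set = field(default_factory=set)


def authenticate(account, password, fingerprint, second_factor_passed=False):
    """Return 'denied', 'challenge', or 'granted' for one login attempt."""
    # Step 1: the credentials must be valid.
    if not hmac.compare_digest(account.password.encode(), password.encode()):
        return "denied"
    # Step 2: a known fingerprint lets the user in without extra friction.
    if fingerprint in account.known_fingerprints:
        return "granted"
    # Step 3: an unknown fingerprint triggers an additional factor.
    if not second_factor_passed:
        return "challenge"
    # Step 4: once the extra factor succeeds, remember the new fingerprint.
    account.known_fingerprints.add(fingerprint)
    return "granted"
```

In this sketch a new device is challenged once and then recognized on later visits, which is where the reduction in friction comes from.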
In this article, we will review the current state of the art in terms of browser fingerprinting. We will identify the players involved in the arms race, the main actions from both sides, and what we can do from our position.
The good and the bad in browser fingerprinting: the current state of the art
At the same time, as fingerprinting is being abused for tracking and advertising purposes, browsers are continuously adding measures to avoid or tamper with it. Consequently, the effectiveness of the fingerprinting process and the stability of the fingerprints obtained are affected. In this context, we are falling behind in a browser fingerprinting arms race.
Keeping alive the browser fingerprinting surface
By staying up to date with the state of the art in browser fingerprinting, we can provide better Identity Access Management solutions that avoid friction with users and, most importantly, are more secure, since we can identify users more precisely.
Luckily for us, there are hundreds of fingerprinting methods still available to use. We could divide them into four categories:
- Deterministic device information, which includes the characteristics of the device, the operating system, and the browser. Examples of these are: battery status, Canvas, ClientRects, clock skew and drift, CSS media queries/capabilities, emoji, error messages, feature list detection, fragment shader, header signature, HTTP/2 frames, image encoding and decoding, Java system properties, Line-Curve-Antialiasing, MathML, math routines, number of CPU virtual cores, screen resolution and color depth, social media login detection, transparency via Alpha Channel, user agent string (to be deprecated), vertex shader (WebGL component), Web Audio API, WebGL, Client Hints (experimental), (experimental).
- Network configurations, related to the architecture and lower-level configuration of the local network. Examples of these are: external client IP address, geolocation API, TCP/IP and TLS. There are more techniques, but they are not recommended since they clearly cross the line in terms of the privacy of users, like DNS leaks, WebRTC or STAR-ECHO.
- User behavior and preferences, such as online behaviors or local preferences. Examples of these are: keystroke dynamics, accelerometer readings, mouse gesture dynamics.
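To make the idea of turning gathered attributes into an identifier concrete, here is a minimal Python sketch. It assumes the attributes have already been collected on the client and sent to the server as a dictionary. The function name `fingerprint_id` and the attribute keys are hypothetical, and hashing a canonical serialization is just one simple way to derive an identifier, not necessarily what any given product does.

```python
import hashlib
import json


def fingerprint_id(attributes):
    """Derive a hex identifier from a dict of collected attributes."""
    # Sort keys so the same attributes always serialize the same way,
    # regardless of the order in which they were collected.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute values, for illustration only.
example = {
    "screen": "1920x1080x24",
    "cpu_cores": 8,
    "timezone": "America/Argentina/Buenos_Aires",
}
identifier = fingerprint_id(example)
```

Note that an exact-match hash like this changes completely when any single attribute changes, so a practical system would likely also keep the individual attributes to compare them one by one.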
This is good, but it carries a terrible curse: these methods can be used for both good and bad. As we stated in our previous note, browsing activity can be correlated even for very privacy-conscious users who regularly clear cookies or other types of user agent storage. Unlike cookies or other standard browser features, fingerprinting allows data to be collected and correlated without any user consent or any indication that the activity is taking place, which is undesirable.
This unleashed one of the first battles: the development of several anti-fingerprinting projects and initiatives to protect the privacy of users. Let us look at this work in more detail.
Identifying the anti-fingerprint actions
Meanwhile, all the main players involved in the WWW are taking different actions to prevent or mitigate fingerprinting techniques. From web browser vendors to standards organizations, all of them have started to play an important role in this arms race.
Web browser fingerprinting initiatives
Major web browsers take the problem of trackers very seriously, emphasizing how trackers use fingerprinting techniques to maximize user identification. They all have ongoing anti-tracking (privacy) programs, some more mature than others. For example, Chrome has the Privacy Sandbox project, Microsoft Edge has the Tracking Prevention feature, Mozilla Firefox has Enhanced Tracking Protection, Safari has Intelligent Tracking Prevention, and let us not forget the Tor Project and the Brave browser.
While Mozilla and Edge use the Disconnect open-source list to detect and classify trackers, Safari’s WebKit implements ITP. Moreover, Mozilla adds other strong anti-fingerprinting measures (in addition to blocking third-party trackers) based on the Tor program. The latter is the most mature program, since it was one of the first to address privacy as a key concept on the Web and adopts it as a pillar of its browser design. The main countermeasures that both programs apply are techniques that alter or spoof the fingerprints that first and third parties try to obtain. Furthermore, WebKit’s more extreme approach to the fingerprinting surface removes fingerprint vectors by disabling features and even removing APIs.
So, what can we expect from browsers? We expect them to keep stopping third-party sites from tracking users by using URL blockers, and even to add more intelligence to their algorithms. Browsers tend to implement initiatives that balance privacy, usability, and a “free Internet”. Specific countermeasures (those that do not consider the fingerprinting issue as a whole) that involve tampering with fingerprints are difficult to implement. These measures usually add noise that actually makes user identification easier and, sometimes, they may cause some sites or content to load incorrectly, making browsing problematic for the user (not to mention the problems that removing or restricting an existing API or feature can cause for developers!). Relatedly, programs that treat privacy as a design principle will be better placed to add effective countermeasures.
Standards organization efforts
Regarding the standards organizations, the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF) have also viewed privacy and tracking as major concerns to address at design time. The W3C has been developing guides that encourage specification authors to consider privacy impact and threat modeling at the design stage. To that end, it has published a couple of documents of interest. One of them is “Mitigating Browser Fingerprinting in Web Specifications”. This document covers the best practices and common actions that authors of specifications for Web features can take to mitigate the privacy impacts of browser fingerprinting. For its part, the IETF considers privacy from the point of view of Internet protocols. The document “Privacy Considerations for Internet Protocols” addresses the main topics related to this problem by offering guidance for developing privacy considerations for inclusion in protocol specifications.
Although these web specs are good and give greater consideration to privacy, we believe that the fingerprinting surface will keep expanding, since fingerprint vectors become observable once browsers implement the specifications. HTML APIs keep growing, providing access to an increasing amount of information about the browser and its environment, so potentially many more APIs will leak identifying information. Defining the best trade-off between rich features and privacy is a critical and difficult choice when setting up new APIs. Moreover, the privacy issues in protocols that are already implemented will persist, because modifying them is costly.
Is there something that we can do as users? Broadly speaking, there are two main defenses to mitigate browser fingerprinting. The first is to increase the diversity of fingerprints so that real ones are hidden in noise (that is, to submit random fingerprints). The intuition behind this method is that third parties rely on fingerprint stability to link fingerprints to a single device. By sending randomized or pre-defined values instead of the real ones, the collected fingerprints become so different and unstable that a tracking company is unable to identify devices on the web. The second is to block the fingerprint vector, for example by blocking a specific API.
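The two defenses can be illustrated with a small Python sketch. Everything here is hypothetical: the attribute names, the `randomized_view` and `blocked_view` helpers, and the use of a hash as the tracker's identifier are assumptions made for the example and do not describe how any particular browser or extension works.

```python
import hashlib
import json
import random


def fingerprint_id(attributes):
    # What a tracker might compute from the values it receives.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def randomized_view(real, noisy_keys, rng):
    """Defense 1: report fresh random values for selected attributes."""
    spoofed = dict(real)
    for key in noisy_keys:
        spoofed[key] = format(rng.getrandbits(64), "016x")
    return spoofed


def blocked_view(real, blocked_keys):
    """Defense 2: do not expose the selected attributes at all."""
    return {k: v for k, v in real.items() if k not in blocked_keys}


real = {"canvas_hash": "a1b2c3", "audio_hash": "d4e5f6", "timezone": "UTC-3"}
rng = random.Random()

# Two visits from the same device no longer produce the same identifier.
visit_1 = fingerprint_id(randomized_view(real, ["canvas_hash"], rng))
visit_2 = fingerprint_id(randomized_view(real, ["canvas_hash"], rng))

# With blocking, the tracker simply never sees the attribute.
reduced = blocked_view(real, ["canvas_hash", "audio_hash"])
```

Randomization breaks the stability that linking relies on, while blocking shrinks the set of attributes available to the tracker.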
Following these strategies, several add-on developers have launched tools to protect users against fingerprinting. Some of the most widely used browser extensions are: AudioContext fingerprint defender, Canvas fingerprint defender, NoScript, random User-Agent, uBlock Origin, WebGL defender.
Rather than protecting users, however, extensions in general add uniqueness. Because adoption of these plugins is low, the techniques some of them use make it possible to derive unique hashes for their users: the people who use them fall into a small group and are therefore more identifiable. In addition, there are several scripts that can detect the use of anti-fingerprinting protections, and this detection further increases user uniqueness. Another arguable drawback is that extensions are naturally limited in what they can do by the browser architecture.
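The "small group" effect can be shown with a toy Python sketch. The population below is invented for illustration, and `anonymity_set_size` is a hypothetical helper; the only point is that a rarely used, detectable extension puts its user in a much smaller crowd.

```python
from collections import Counter


def anonymity_set_size(observations, target):
    """Count how many observed browsers look exactly like `target`."""
    return Counter(observations)[target]


# Invented population: each entry is (browser, extension_detected).
population = [("Firefox", False)] * 999 + [("Firefox", True)]

plain_user = anonymity_set_size(population, ("Firefox", False))
extension_user = anonymity_set_size(population, ("Firefox", True))
```

In this made-up population the user without the extension shares a fingerprint with 998 others, while the extension user stands alone.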
We believe that in the end, privacy-conscious users will opt to use browsers with mature anti-fingerprint programs rather than install specific add-ons.
Hiding in the crowd
So, at this point, you are probably wondering what you can do to avoid browser fingerprinting today. To answer this, let us look at some key concepts.
For a fingerprint to be effective in correctly authenticating users, the collected information must follow two key principles: uniqueness and stability. Uniqueness is what provides enough ground for identification: the more unique a fingerprint, the more identifiable it is. When do we have a unique fingerprint? When the fingerprint has an attribute whose value is present only once in the whole dataset, or when the combination of all its attributes is unique in the whole dataset. While entropy measures the amount of uniqueness that a specific property exposed by the browser (such as the User-Agent header) introduces into a browser fingerprint, stability is what links the browser fingerprints that belong to the same device. For stability, the amount of information that changes each time the user’s fingerprint is obtained should be as small as possible.
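As a rough illustration of the stability principle, the following Python sketch measures how much of a fingerprint changed between two visits and links the visits when the change is small. The helper names, the attribute keys, and the 25% threshold are all assumptions made for the example, not values taken from any real product.

```python
def changed_fraction(previous, current):
    """Fraction of attributes whose value differs between two fingerprints."""
    keys = set(previous) | set(current)
    if not keys:
        return 0.0
    changed = sum(1 for key in keys if previous.get(key) != current.get(key))
    return changed / len(keys)


def likely_same_device(previous, current, max_changed=0.25):
    # Hypothetical rule: tolerate a small amount of drift (for example a
    # browser update changing the user agent) before treating it as new.
    return changed_fraction(previous, current) <= max_changed


before = {"user_agent": "UA/101", "timezone": "UTC-3", "cpu_cores": 8, "color_depth": 24}
after_update = {"user_agent": "UA/102", "timezone": "UTC-3", "cpu_cores": 8, "color_depth": 24}
other_device = {"user_agent": "UA/99", "timezone": "UTC+1", "cpu_cores": 4, "color_depth": 30}
```

Here a browser update changes one attribute out of four, so the two visits would still be linked, whereas the unrelated device differs on every attribute.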
Entropy is usually expressed in bits: the higher the entropy, the more unique and identifiable a fingerprint will be. We can therefore say that a technique is more or less effective in terms of its ability to establish that a fingerprint is unique. Since browsers are continually changing, the entire fingerprinting domain keeps evolving and its effectiveness can vary over time. In addition, over the course of its lifetime, a device exhibits different fingerprints, because web technologies are constantly evolving and browser components are continually updated.
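Entropy in bits can be computed directly from how often each value of an attribute is observed. The sketch below uses the standard Shannon entropy formula; the function name and the sample values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math
from collections import Counter


def shannon_entropy_bits(values):
    """Shannon entropy, in bits, of the observed values of one attribute."""
    counts = Counter(values)
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# An attribute with four equally common values carries 2 bits:
# knowing it narrows the population down to a quarter.
spread_out = shannon_entropy_bits(["a", "b", "c", "d"])

# An attribute that is the same for everyone carries no information.
identical = shannon_entropy_bits(["a", "a", "a", "a"])
```

An attribute that splits visitors into many equally sized groups contributes the most bits, which is why a value shared by everyone is useless for identification and harmless for privacy.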
So, what can we do to avoid browser fingerprinting? The answer is nothing in particular, just blend in. As in life, if you do not want to be spotted, be common.
Who is going to win the arms race for browser fingerprinting?
Generally speaking, we can see that browser fingerprinting could continue to be a solid technique to identify users on the web with a high degree of effectiveness and a good level of stability.
As the modern web has become richer and more dynamic than ever, it has laid the foundations to support an incredible ecosystem of diverse devices. This diversity opens the door for browser fingerprinting techniques to collect a vast list of device features across multiple layers of the system. Clients and servers have been sharing device-specific information since the beginning to improve user experience, and they will continue to do so. Why? Because sharing information and having a universal way of communicating between machines have been the essence of the WWW since its inception, which means that browser fingerprinting cannot be addressed with a simple patch or browser extension.
So, what can we expect to see in the near future? The identification surface will continue to grow despite the efforts of browser vendors and standards organizations. Each new browser version that adds, modifies, or even removes an API has a direct impact on this surface. Each new draft written by the W3C or any other organization introduces new capabilities and, consequently, new fingerprint vectors. However, these players are already working on ambitious privacy and anti-tracking programs that may alter the long-term future of fingerprinting. The major browsers take the problem of trackers very seriously, emphasizing how trackers use fingerprinting techniques to maximize user identification.
Stay tuned, folks: in our next post we will talk about how to approach efficient, up-to-date browser identification with a strategy built on the uniqueness and stability of the fingerprint attributes.