What is Fingerprinting?
Fingerprinting is a technique that may be used to identify the specific device, web browser, and operating system making a request, regardless of what the client says in its user-agent header. By helping organisations identify and characterise the attributes of a client's connection, fingerprinting can improve network security and help protect against malicious traffic.
Fingerprinting can also refer to techniques for following or uniquely identifying individual users across the web. That is a separate set of techniques and is not discussed in this article.
Transport Layer Security (TLS) Fingerprinting determines the specific characteristics of a client's TLS implementation by examining the initial TLS handshake packet, known as the "Client Hello." This packet contains fields and parameters such as supported cipher suites, extensions, and the client's preferred order of those parameters, which can be used to create a unique "fingerprint" of the client's TLS implementation.
Why is it used?
Fingerprinting has several uses, including bot protection, DDoS protection, and client identification. In each case the fingerprint gives defenders a signal about what software is actually making the connection, independent of what the client claims to be.
How does TLS Fingerprinting work?
TLS Fingerprinting examines the "Client Hello", the first message the client sends in the TLS handshake that establishes a secure connection between the client and the server. It contains information about the client's preferred encryption methods, extensions, and parameters, including:
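- Protocol Version: The version of the TLS protocol requested by the client. In TLS 1.3 this legacy field is fixed at 0x0303 (TLS 1.2), and the versions the client actually supports are carried in the supported_versions extension.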
- Random: A 32-byte random value generated by the client, used in key generation and derivation.
- Session ID: An optional session identifier for resuming a previous session.
- Cipher Suites: A list of supported encryption algorithms, ordered by preference.
- Compression Methods: A list of supported compression algorithms, ordered by preference.
- Extensions: Optional extensions that can negotiate additional parameters, such as Server Name Indication (SNI) and Supported Groups (formerly Elliptic Curves), which lists the elliptic curves and key-exchange groups the client supports.
The Client Hello packet is central to the operation and security of the TLS connection because it provides information the server uses to select encryption algorithms and parameters. The packet also enables the client and server to negotiate an appropriate encryption method for their communication. The Client Hello's variable content, based on the TLS version, library, cipher suites, extensions, and settings supported by the client, makes it a strong candidate for fingerprinting.
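The field layout listed above can be walked with a small parser. The following is a minimal sketch, not a production decoder: it assumes the input is the Client Hello body with the TLS record header and handshake header already stripped, and the function name `parse_client_hello` and the sample bytes are made up for illustration.

```python
import struct


def parse_client_hello(body: bytes) -> dict:
    """Parse a Client Hello body (record and handshake headers already removed)."""
    pos = 0

    def take(n: int) -> bytes:
        nonlocal pos
        if pos + n > len(body):
            raise ValueError("truncated Client Hello")
        chunk = body[pos:pos + n]
        pos += n
        return chunk

    def take_vector(length_size: int) -> bytes:
        # TLS vectors are prefixed with their length in bytes.
        return take(int.from_bytes(take(length_size), "big"))

    version = int.from_bytes(take(2), "big")
    random = take(32)
    session_id = take_vector(1)
    suites = take_vector(2)
    cipher_suites = [int.from_bytes(suites[i:i + 2], "big")
                     for i in range(0, len(suites), 2)]
    compression_methods = list(take_vector(1))

    extensions = []
    if pos < len(body):  # the extensions block is optional
        raw = take_vector(2)
        offset = 0
        while offset + 4 <= len(raw):
            ext_type, ext_len = struct.unpack_from("!HH", raw, offset)
            extensions.append((ext_type, raw[offset + 4:offset + 4 + ext_len]))
            offset += 4 + ext_len

    return {
        "version": version,
        "random": random,
        "session_id": session_id,
        "cipher_suites": cipher_suites,
        "compression_methods": compression_methods,
        "extensions": extensions,  # (type, raw data) pairs in the order sent
    }


# Hand-built sample: two cipher suites and one Supported Groups extension.
hello = (
    bytes.fromhex("0303") + bytes(32) + b"\x00"
    + bytes.fromhex("0004" "1301" "c02f")
    + bytes.fromhex("0100")
    + bytes.fromhex("0008" "000a" "0004" "0002001d")
)
parsed = parse_client_hello(hello)
```

The parser keeps extensions as ordered (type, data) pairs because both the set of extensions and their order can matter to a fingerprint.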
Common components used to create a TLS fingerprint include:
- Cipher Suites: The cipher suites supported by the client, in the order sent.
- Extensions: The extension types included in the Client Hello packet, such as SNI and Supported Groups.
- TLS Point Formats: The elliptic curve point encodings (for example, uncompressed) that the client can accept, as listed in its EC point formats extension.
- TLS Curves: The elliptic curves and other key-exchange groups the client supports, in preference order.
TLS fingerprinting has been a topic of research for several years, with a number of tools and techniques developed from that work. Notable examples include JA3, developed by John Althouse, Jeff Atkinson, and Josh Atkins of Salesforce, which uses a hash of the client's SSL/TLS parameters as a unique identifier for tracking and analysing SSL/TLS traffic. Another tool, Mercury by David McGrew and Blake Anderson, can be used to fingerprint client connections and identify the device, operating system, and application making the connection.
As with fingerprinting in general, TLS fingerprinting is used for bot protection, DDoS protection, malware identification, and client identification.
In production, TLS fingerprints are most useful when combined with IP intelligence and residential proxy detection, rather than treated as a standalone verdict.
Representation of a TLS Fingerprint
A TLS fingerprint is commonly represented as a string or hash that summarises the important components of the Client Hello packet, typically the supported cipher suites, extensions, elliptic curves, and point formats. The cipher suites are listed in the order the client presents them, while extensions and point formats are listed by their numeric identifiers. JA3 writes these values in decimal, while Mercury writes them in hexadecimal.
Raw JA3 signatures are represented by the following fields, which are then hashed with MD5. Fields are separated by commas, and the values within a field are separated by hyphens:
SSLVersion, Cipher, SSLExtension, EllipticCurve, EllipticCurvePointFormat
An example raw signature is:
771,4865-4867-4866-49195-49199-52393-52392-49196-49200-49162-49161-49171-49172-156-157-47-53,0-23-65281-10-11-35-16-5-34-51-43-13-45-28-21,29-23-24-25-256-257,0
An MD5 hash is then applied, resulting in the final signature:
579ccef312d18482fc42e2b822ca2430
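The JA3 construction described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the reference implementation: the helper names `ja3_string` and `ja3_hash` are made up, and the inputs are assumed to be already-parsed lists of integers. It drops GREASE placeholder values (RFC 8701), which JA3 ignores.

```python
import hashlib

# GREASE values (RFC 8701): 0x0A0A, 0x1A1A, ..., 0xFAFA.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Build the raw JA3 string: comma-separated fields, hyphen-separated values."""
    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    return ",".join([
        str(version),
        join(ciphers),
        join(extensions),
        join(curves),
        join(point_formats),
    ])


def ja3_hash(raw: str) -> str:
    """MD5 the raw string to obtain the 32-character JA3 fingerprint."""
    return hashlib.md5(raw.encode("ascii")).hexdigest()


# Values taken from the example raw signature above.
raw = ja3_string(
    771,
    [4865, 4867, 4866, 49195, 49199, 52393, 52392, 49196, 49200,
     49162, 49161, 49171, 49172, 156, 157, 47, 53],
    [0, 23, 65281, 10, 11, 35, 16, 5, 34, 51, 43, 13, 45, 28, 21],
    [29, 23, 24, 25, 256, 257],
    [0],
)
digest = ja3_hash(raw)
```

Keeping the raw string alongside the hash is useful in practice, because the string can be inspected and compared field by field while the hash cannot.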
Mercury signatures are represented by:
"tls/1" (TLS_Version) (TLS_Ciphersuite) [ Extension* ]
An example signature is:
tls/1/
(0303)
(130213031301c02cc030009fcca9cca8ccaac02bc02f009ec024c028006bc023c0270067c00ac0140039c009c0130033009d009c003d003c0035002f00ff)
[
(0000)
(000a000c000a001d0017001e00190018)
(000b000403000102)
(000d0030002e040305030603080708080809080a080b080408050806040105010601030302030301020103020202040205020602)
(0016)
(0017)
(0023)
(002b0009080304030303020301)
(002d00020101)
(0033)
]
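The layout of the example above can be assembled from parsed values as follows. This is a sketch of the string format only, not Mercury's own code: the function name is hypothetical, the caller is assumed to decide which extensions contribute their full data and which contribute only their two-byte type, and the extensions are sorted because the example lists them in ascending order. The output is joined into a single line.

```python
def mercury_style_fingerprint(version, cipher_suites, extensions) -> str:
    """Assemble a tls/1-style string from a version, cipher suites, and extensions.

    version       -- integer such as 0x0303
    cipher_suites -- integers, in the order the client sent them
    extensions    -- bytes objects: either a bare two-byte type or type plus data
    """
    version_hex = f"{version:04x}"
    suites_hex = "".join(f"{suite:04x}" for suite in cipher_suites)
    # Assumption: extensions are sorted, matching the ordering seen in the example.
    ext_hex = "".join(f"({ext.hex()})" for ext in sorted(extensions))
    return f"tls/1/({version_hex})({suites_hex})[{ext_hex}]"


example = mercury_style_fingerprint(
    0x0303,
    [0x1302, 0x1303, 0x1301],
    [bytes.fromhex("0017"), bytes.fromhex("0000"), bytes.fromhex("002d00020101")],
)
```

Because the string is plain hexadecimal rather than a hash, each element can be read back directly, at the cost of a longer identifier.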
Hash Functions for Representing TLS Fingerprints
Hashing algorithms, such as MD5, are commonly used to create a compact representation of a TLS fingerprint. These hash functions take the client's TLS parameters as input and produce a fixed-length output, which serves as an identifier for the client's TLS configuration. Clients that send identical parameters produce the same hash, so the value identifies a TLS implementation rather than an individual device. The hash value can be compared against a database of known TLS fingerprints to help determine the identity of the client.
Other techniques for representing TLS fingerprints include hexadecimal encoding of the client's TLS parameters, such as in the Mercury fingerprint.
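The database comparison can be as simple as a dictionary lookup. The sketch below is illustrative only: the table contents and labels are hypothetical placeholders, not entries from any real fingerprint database. Each hash maps to a list of candidates because several different clients can share one fingerprint.

```python
# Hypothetical lookup table: hash -> candidate client labels (placeholders).
KNOWN_FINGERPRINTS = {
    "579ccef312d18482fc42e2b822ca2430": ["example-browser", "example-http-library"],
}


def lookup(fingerprint_hash: str) -> list:
    """Return candidate clients for a hash, or an empty list if it is unknown."""
    return KNOWN_FINGERPRINTS.get(fingerprint_hash.strip().lower(), [])


candidates = lookup("579CCEF312D18482FC42E2B822CA2430")
unknown = lookup("0" * 32)
```

An empty result means only that the hash is absent from this table; it says nothing about whether the client is benign or malicious.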
Challenges with TLS Fingerprinting
TLS fingerprinting is not a foolproof method for identifying clients and their attributes. It has several limitations that need to be considered.
- False Positives: TLS fingerprinting assumes that a Client Hello packet identifies the connecting process by its TLS implementation. In practice, unrelated applications built on the same TLS library can produce identical fingerprints, and a client that customises its TLS parameters changes its Client Hello and can be matched to the wrong entry. This makes it important to use multiple methods for identifying clients. For example, Mercury takes into account destination ports to add additional context.
- False Negatives: While TLS fingerprinting can identify many different clients and their attributes, it is not capable of identifying all clients. Some clients may have a unique or unusual TLS implementation that cannot be accurately fingerprinted. Additionally, some clients may actively attempt to evade fingerprinting by customising TLS parameters or using tools to anonymise their connections.
- Forging of TLS Fingerprints: It is possible for attackers to deliberately forge or modify the information contained in their Client Hello packet to appear as a different client. This makes it difficult for fingerprinting tools to accurately identify the true identity of a client and can be used for malicious purposes, such as evading security measures or disguising the origin of an attack.
- Incomplete Data: TLS fingerprinting is limited by the information contained in the Client Hello packet, which may not contain all of the data needed to accurately identify a client. For example, a client may not send a full list of supported cipher suites or extensions, or it may use a modified version of the TLS protocol that the fingerprinting tool does not recognise. Even a well-formed fingerprint is of little use if it is not present in the available databases.
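The sensitivity to customised parameters can be seen with a toy fingerprint. The sketch below is a simplified stand-in, not JA3 or Mercury: the `fingerprint` function and the sample values are assumptions chosen for illustration. Reordering the same extensions changes an order-sensitive hash, while sorting before hashing keeps it stable.

```python
import hashlib


def fingerprint(ciphers, extensions, sort_extensions=False) -> str:
    """Toy fingerprint: MD5 over cipher suites and extension types."""
    exts = sorted(extensions) if sort_extensions else list(extensions)
    raw = "-".join(map(str, ciphers)) + "," + "-".join(map(str, exts))
    return hashlib.md5(raw.encode("ascii")).hexdigest()


ciphers = [4865, 4866, 4867]
original = [0, 23, 65281, 10, 11]
reordered = [10, 0, 11, 65281, 23]  # same extensions, different order

# Order-sensitive hashing: the reordered client looks like a different client.
changed = fingerprint(ciphers, original) != fingerprint(ciphers, reordered)

# Sorting before hashing: both orderings collapse to one fingerprint.
stable = (fingerprint(ciphers, original, sort_extensions=True)
          == fingerprint(ciphers, reordered, sort_extensions=True))
```

Sorting is a trade-off: it resists trivial reordering, but it also discards ordering information that can help tell genuinely different clients apart.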
Different fingerprinting implementations can result in different hashes for the same TLS connection, even though the underlying SSL/TLS protocol remains unchanged. This happens due to the various algorithms, parameters, and representations used by different fingerprinting tools.
For instance, implementation differences when generating the TLS fingerprint may cause hashes found in public databases to be inconsistent with a locally generated hash.
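One concrete source of such mismatches is how a tool handles GREASE placeholder values (RFC 8701). The sketch below uses two hypothetical tools that differ only in whether they drop GREASE before hashing; the function name and the observed values are illustrative assumptions.

```python
import hashlib

# GREASE values (RFC 8701): 0x0A0A, 0x1A1A, ..., 0xFAFA.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def hash_ciphers(ciphers, drop_grease: bool) -> str:
    """Hash a cipher suite list, optionally removing GREASE values first."""
    kept = [c for c in ciphers if not (drop_grease and c in GREASE)]
    return hashlib.md5("-".join(map(str, kept)).encode("ascii")).hexdigest()


observed = [0x0A0A, 4865, 4866, 4867]  # the same Client Hello seen by both tools

tool_a = hash_ciphers(observed, drop_grease=True)   # filters GREASE
tool_b = hash_ciphers(observed, drop_grease=False)  # keeps GREASE
```

The two tools see the same connection but report different hashes, so a lookup against a database built with the other convention would miss.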
Final Thoughts
Be aware of the limitations and differences between fingerprinting implementations, and choose the right tool and representation for your specific use case. Standardising the representation of fingerprints and using common hash algorithms can help avoid confusion and improve interoperability between databases.