A Lua script for HAProxy that generates JA4 TLS client fingerprints

FriendlyCaptcha/ja4-haproxy

JA4 fingerprints for HAProxy

A Lua script for HAProxy 3.2+ that generates JA4 TLS client fingerprints.

What is a JA4 fingerprint?

A JA4 fingerprint is a 36-char string that identifies a TLS client based on attributes in its Client Hello.

Example fingerprint: t13d1516h2_8daaf6152771_02713d6af862

Part   Example       Description
-----  ------------  -------------------------------------------------
JA4_a  t13d1516h2    Various fields (see table below)
JA4_b  8daaf6152771  Truncated SHA256 of sorted cipher suites
JA4_c  02713d6af862  Truncated SHA256 of sorted extensions + sig algos

These are the JA4_a fields:

Pos   Example  Field            Values
----  -------  ---------------  ------------------------------
1     t        Protocol         t = TCP, q = QUIC, d = DTLS
2-3   13       TLS version      10-13, s2-s3, d1-d3, 00
4     d        SNI              d = domain, i = IP (no SNI)
5-6   15       Cipher count     from 00 to 99 (capped at 99)
7-8   16       Extension count  from 00 to 99 (capped at 99)
9-10  h2       ALPN             first+last char (eg, h2) or 00
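
As an illustration of the layout above, here is a minimal Python sketch that splits a fingerprint into its fields. The `JA4Fields` type and `parse_ja4` function are hypothetical names used only for this example; they are not part of ja4.lua.

```python
from typing import NamedTuple


class JA4Fields(NamedTuple):
    protocol: str          # position 1: t, q or d
    tls_version: str       # positions 2-3, e.g. "13"
    sni: str               # position 4: d or i
    cipher_count: int      # positions 5-6
    extension_count: int   # positions 7-8
    alpn: str              # positions 9-10
    cipher_hash: str       # JA4_b
    extension_hash: str    # JA4_c


def parse_ja4(fingerprint: str) -> JA4Fields:
    """Split a 36-character JA4 string into the fields described above."""
    a, b, c = fingerprint.split("_")  # raises ValueError unless there are 3 parts
    if len(a) != 10 or len(b) != 12 or len(c) != 12:
        raise ValueError("not a JA4 fingerprint: " + fingerprint)
    return JA4Fields(a[0], a[1:3], a[3], int(a[4:6]), int(a[6:8]), a[8:10], b, c)


fields = parse_ja4("t13d1516h2_8daaf6152771_02713d6af862")
print(fields.tls_version, fields.cipher_count, fields.extension_count, fields.alpn)
# prints: 13 15 16 h2
```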

JA4 specification

Notes on the JA4 specification

See the upstream JA4 specification.

TLS version

  1. Identify TLS version:
    • if supported_versions extension exists, use the highest version value after GREASE filtering
    • fallback to legacy_version field (aka Protocol Version in JA4 spec)
  2. Output the version:
    • output 10, 11, 12 or 13 for TLS
    • output s2 or s3 for SSL
    • output d1, d2 or d3 for DTLS
    • if the version isn't recognized, output 00

Ignore legacy_record_version (aka Handshake Version in JA4 spec).
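
The version rules above can be sketched in Python as follows. This is an illustration rather than the code used by ja4.lua: the function names are hypothetical, the wire values in the lookup table come from the TLS and DTLS RFCs, and taking the numeric maximum is a simplification that is only meaningful for TLS values (DTLS wire values decrease as the version increases).

```python
from typing import Iterable, Optional

# Wire value -> JA4 version string.
VERSION_NAMES = {
    0x0304: "13", 0x0303: "12", 0x0302: "11", 0x0301: "10",  # TLS 1.3 .. 1.0
    0x0300: "s3", 0x0002: "s2",                              # SSL 3.0, SSL 2.0
    0xFEFF: "d1", 0xFEFD: "d2", 0xFEFC: "d3",                # DTLS 1.0, 1.2, 1.3
}


def is_grease(value: int) -> bool:
    """True for 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja4_version(legacy_version: int,
                supported_versions: Optional[Iterable[int]] = None) -> str:
    # Prefer supported_versions (GREASE removed), else fall back to legacy_version.
    candidates = [v for v in (supported_versions or []) if not is_grease(v)]
    version = max(candidates) if candidates else legacy_version
    return VERSION_NAMES.get(version, "00")  # unrecognized versions map to 00


print(ja4_version(0x0303, [0x0A0A, 0x0304, 0x0303]))  # prints: 13
print(ja4_version(0x0303))                            # prints: 12
```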

GREASE

GREASE (Generate Random Extensions And Sustain Extensibility) is a mechanism where TLS clients advertise fake cipher suites and extensions to prevent servers from becoming dependent on specific values. These placeholder values follow the pattern 0x?A?A (eg, 0x0A0A, 0x1A1A, ..., 0xFAFA).

GREASE values must be ignored, otherwise the same client would produce multiple different fingerprints. Filter out GREASE values from:

  • version detection
  • cipher counts and hashes
  • extension counts and hashes

Pass 1 to HAProxy ssl_fc_*_bin() functions to enable GREASE filtering.
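
HAProxy does the filtering for you when 1 is passed, so the following Python sketch is only meant to show what GREASE filtering amounts to. It assumes the 16 code points in which both bytes are identical (0x0A0A through 0xFAFA); `filter_grease` is a hypothetical helper name.

```python
# The 16 GREASE code points: 0x0A0A, 0x1A1A, ..., 0xFAFA.
GREASE_VALUES = frozenset((n << 12) | 0x0A00 | (n << 4) | 0x0A for n in range(16))


def filter_grease(values):
    """Drop GREASE placeholders, keeping the remaining values in order."""
    return [v for v in values if v not in GREASE_VALUES]


ciphers = [0x1A1A, 0x1301, 0x1302, 0xFAFA]
print([f"{v:04x}" for v in filter_grease(ciphers)])  # prints: ['1301', '1302']
```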

ALPN

ALPN (Application-Layer Protocol Negotiation) is a TLS extension that allows clients to indicate which application protocols they support (eg, h2, http/1.1). The JA4 fingerprint captures the first and last characters of the first ALPN value advertised by the client.

Output 00 for these cases:

  • no ALPN extension
  • no ALPN values
  • first ALPN value is empty

Otherwise:

  • If both first and last bytes are alphanumeric:
    • use those characters directly (eg, h2 => h2)
  • If either byte is non-alphanumeric:
    • convert the entire ALPN value to a hex string
    • take the first and last characters of that hex string (eg, 0xAB 0xCD => abcd => ad)

Examples:

ALPN bytes           Output  Notes
-------------------  ------  -----------------------------------------------
h2                   h2      two alphanumeric, use directly
x                    xx      one alphanumeric, use as first and last
http/1.1             h1      first byte h, last byte 1, use directly
0xAB                 ab      single byte is non-alphanumeric, convert to hex
0xAB 0xCD            ad      first byte is non-alphanumeric, convert to hex
0x30 0xAB            3b      last byte is non-alphanumeric, convert to hex
0x30 0x31 0xAB 0xCD  3d      last byte is non-alphanumeric, convert to hex
0x30 0xAB 0xCD 0x31  01      first byte 0, last byte 1, use directly
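
The ALPN rules and examples above can be reproduced with a short Python sketch. It assumes "alphanumeric" means ASCII 0-9, A-Z and a-z; `ja4_alpn` is a hypothetical name and not the function used inside ja4.lua.

```python
from typing import Optional


def _is_alnum(byte: int) -> bool:
    """ASCII 0-9, A-Z or a-z."""
    return (0x30 <= byte <= 0x39) or (0x41 <= byte <= 0x5A) or (0x61 <= byte <= 0x7A)


def ja4_alpn(first_alpn: Optional[bytes]) -> str:
    """Return the 2-character ALPN field for the first advertised ALPN value."""
    if not first_alpn:  # no extension, no values, or an empty first value
        return "00"
    first, last = first_alpn[0], first_alpn[-1]
    if _is_alnum(first) and _is_alnum(last):
        return chr(first) + chr(last)
    # Otherwise hex-encode the whole value and take its first and last characters.
    hex_string = first_alpn.hex()
    return hex_string[0] + hex_string[-1]


print(ja4_alpn(b"http/1.1"))          # prints: h1
print(ja4_alpn(b"\x30\x31\xab\xcd"))  # prints: 3d
```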

Hash computation

Cipher hash (JA4_b):

  1. If there are no ciphers (after GREASE filtering), return 000000000000.
  2. Format each cipher code as 4-character lowercase hex (eg, 1301, c02b).
  3. Sort the list hexadecimally (eg, 002f,0035,009c,...,c02b,c02f,cca8).
  4. Join with commas (no spaces).
  5. Compute SHA256 hash of the resulting string.
  6. Return the first 12 characters of the hash (lowercase).
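
A minimal Python sketch of the cipher hash steps, assuming the input has already had GREASE values removed. `ja4_cipher_hash` is a hypothetical name for illustration. Because every code is formatted as 4 lowercase hex characters, a plain string sort gives the same order as a numeric sort.

```python
import hashlib
from typing import Iterable


def ja4_cipher_hash(ciphers: Iterable[int]) -> str:
    """JA4_b: truncated SHA256 over the sorted, comma-joined cipher codes."""
    codes = sorted(f"{c:04x}" for c in ciphers)  # 4-char lowercase hex, sorted
    if not codes:
        return "000000000000"
    joined = ",".join(codes)                     # e.g. "1301,c02b,cca8"
    return hashlib.sha256(joined.encode("ascii")).hexdigest()[:12]


# The order in which the client lists its ciphers does not affect the hash.
print(ja4_cipher_hash([0xC02B, 0x1301]) == ja4_cipher_hash([0x1301, 0xC02B]))  # prints: True
```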

Extension hash (JA4_c):

  1. Start with the list of extensions (after GREASE filtering).
    • remove SNI (0000) and ALPN (0010), as already captured in JA4_a
    • if the list is now empty, return 000000000000
    • format each extension code as 4-character lowercase hex
    • sort the list hexadecimally
    • join with commas (no spaces)
  2. If the signature algorithms extension is present in the ClientHello:
    • append an underscore
    • append unsorted sig algorithms as 4-char lowercase hex, comma-separated
  3. Compute SHA256 hash of the resulting string.
  4. Return the first 12 characters of the hash (lowercase).
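
The extension hash steps can be sketched in the same way. In this illustration `sig_algs=None` stands for "no signature algorithms extension", which is an assumption of the sketch, and `ja4_extension_hash` is a hypothetical name. Inputs are assumed to be GREASE-filtered already.

```python
import hashlib
from typing import Iterable, Optional

SNI, ALPN = 0x0000, 0x0010  # already captured in JA4_a, so excluded here


def ja4_extension_hash(extensions: Iterable[int],
                       sig_algs: Optional[Iterable[int]] = None) -> str:
    """JA4_c: truncated SHA256 over sorted extensions plus unsorted sig algos."""
    codes = sorted(f"{e:04x}" for e in extensions if e not in (SNI, ALPN))
    if not codes:
        return "000000000000"
    text = ",".join(codes)
    if sig_algs is not None:
        # Signature algorithms keep the order in which the client sent them.
        text += "_" + ",".join(f"{s:04x}" for s in sig_algs)
    return hashlib.sha256(text.encode("ascii")).hexdigest()[:12]


# Only SNI and ALPN present: nothing left to hash.
print(ja4_extension_hash([0x0000, 0x0010]))  # prints: 000000000000
```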

Raw outputs

# replace JA4_b and JA4_c with unhashed values
JA4_r  = t13d0406h2_002f,0035,c02f,cca8_0005,000a,4469,ff01_0403,0804,0806,0601

# replace JA4_b and JA4_c with hash of unsorted values including SNI/ALPN
JA4_o  = t13d0406h2_d3d44e45f89a_ce448a0c7281

# replace JA4_b and JA4_c with unhashed, unsorted values including SNI/ALPN
JA4_ro = t13d0406h2_c02f,cca8,002f,0035_0000,ff01,000a,0010,0005,4469_0403,0804,0806,0601
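
To make the difference between JA4_r and JA4_ro concrete, the following Python sketch rebuilds both example strings from the same inputs. The helper names are hypothetical, the inputs are assumed to be GREASE-filtered, and the sketch assumes the signature algorithms extension is present.

```python
def hex_list(values, sort=False):
    """Format codes as comma-separated 4-char lowercase hex, optionally sorted."""
    codes = [f"{v:04x}" for v in values]
    return ",".join(sorted(codes) if sort else codes)


def ja4_r(ja4_a, ciphers, extensions, sig_algs):
    """Raw: sorted ciphers, sorted extensions without SNI/ALPN, unsorted sig algos."""
    kept = [e for e in extensions if e not in (0x0000, 0x0010)]
    return "_".join([ja4_a, hex_list(ciphers, sort=True),
                     hex_list(kept, sort=True), hex_list(sig_algs)])


def ja4_ro(ja4_a, ciphers, extensions, sig_algs):
    """Raw original: everything in the order sent, SNI/ALPN included."""
    return "_".join([ja4_a, hex_list(ciphers), hex_list(extensions), hex_list(sig_algs)])


ciphers = [0xC02F, 0xCCA8, 0x002F, 0x0035]
extensions = [0x0000, 0xFF01, 0x000A, 0x0010, 0x0005, 0x4469]
sig_algs = [0x0403, 0x0804, 0x0806, 0x0601]

print(ja4_r("t13d0406h2", ciphers, extensions, sig_algs))
# prints: t13d0406h2_002f,0035,c02f,cca8_0005,000a,4469,ff01_0403,0804,0806,0601
print(ja4_ro("t13d0406h2", ciphers, extensions, sig_algs))
# prints: t13d0406h2_c02f,cca8,002f,0035_0000,ff01,000a,0010,0005,4469_0403,0804,0806,0601
```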

Usage

  1. Install HAProxy 3.2 or later (built with Lua 5.4 or later).
  2. Install ja4.lua to /etc/haproxy/ja4.lua (or somewhere else).
  3. Put something like this in your haproxy.cfg:
global
    # Without this setting, there will be no data to feed into the Lua script.
    # See recommendations further down on how large to set this.
    tune.ssl.capture-buffer-size 336

    # This is optional, but recommended. The default is `pre-3.1-bug`.
    # ja4.lua accommodates either setting, though.
    tune.lua.bool-sample-conversion normal

    # Load the Lua script.
    lua-load-per-thread /etc/haproxy/ja4.lua

frontend foo
    bind *:443 ssl crt /etc/haproxy/certs/foo.pem

    # Compute JA4 fingerprint.
    http-request lua.ja4

The script stores the fingerprint in the txn.ja4 variable. How you use var(txn.ja4) is up to you.

To forward the fingerprint as an HTTP header to your backend:

http-request set-header X-TLS-JA4 %[var(txn.ja4)]
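
On the backend side, you may want to check that the header value looks like a JA4 fingerprint before using it. The Python sketch below is one way to do that. The regular expression is derived from the field tables in this README and is an assumption of this example, not something shipped with ja4.lua; `is_well_formed_ja4` is a hypothetical name.

```python
import re

# protocol, version, SNI flag, 2+2 count digits, 2 ALPN chars, then two 12-char hashes
JA4_PATTERN = re.compile(
    r"[tqd](?:1[0-3]|s[23]|d[1-3]|00)[di][0-9]{4}[0-9A-Za-z]{2}"
    r"_[0-9a-f]{12}_[0-9a-f]{12}"
)


def is_well_formed_ja4(value: str) -> bool:
    """True if the value has the shape of a (non-raw) JA4 fingerprint."""
    return JA4_PATTERN.fullmatch(value) is not None


# For example, with the value of the X-TLS-JA4 request header:
print(is_well_formed_ja4("t13d1516h2_8daaf6152771_02713d6af862"))  # prints: True
print(is_well_formed_ja4("not-a-fingerprint"))                     # prints: False
```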

To log it:

http-request capture var(txn.ja4) len 36

If you need the raw outputs (see JA4 specification), pass raw as the first argument when loading the script:

# WARNING: This reduces throughput by ~10%.
lua-load-per-thread /etc/haproxy/ja4.lua raw

TCP mode

Using with HAProxy TCP mode
defaults
    mode tcp
    timeout connect 10s
    timeout client  10s
    timeout server  10s

frontend foo
    bind *:443 ssl crt /etc/haproxy/certs/foo.pem

    # Wait for request data before running the Lua script.
    tcp-request inspect-delay 5s

    # Compute JA4 fingerprint in TCP context.
    tcp-request content lua.ja4

    # Switch to HTTP mode if needed.
    tcp-request content switch-mode http if HTTP

HAProxy capture buffer size

tune.ssl.capture-buffer-size sets the maximum per-connection buffer for capturing Client Hello data. If the buffer is too small, the JA4 fingerprint can be wrong. For example, if the cipher suites fill the whole buffer, there is no room left for extensions or signature algorithms: the extension count would be 00 and JA4_c would be 000000000000.

We recommend one of the following buffer sizes:

  • 336 bytes is probably enough for any real-world client.
    • 10M concurrent connections = extra ~3.4GB of RAM
  • 512 bytes is probably enough for almost any misbehaving client.
    • 10M concurrent connections = extra ~5.1GB of RAM
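
The memory figures follow from multiplying the buffer size by the number of concurrent connections. A quick Python sketch, assuming decimal gigabytes and no per-connection overhead beyond the buffer itself (both are assumptions of this example):

```python
def extra_ram_gb(buffer_bytes: int, connections: int) -> float:
    """Extra memory in decimal GB if every connection holds a full capture buffer."""
    return buffer_bytes * connections / 1e9


for size in (336, 512):
    print(f"{size} bytes x 10M connections = {extra_ram_gb(size, 10_000_000):.2f} GB")
# prints: 336 bytes x 10M connections = 3.36 GB
# prints: 512 bytes x 10M connections = 5.12 GB
```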

Capture buffer size analysis

Why 336 bytes?

The capture buffer stores these fields in this order:

Field                  Limit (bytes)  Reference
---------------------  -------------  --------------------------------------------------------------
Cipher suites          65,534         RFC 8446 §4.1.2 (cipher_suites<2..2^16-2>)
Extension type IDs     ~128           IANA TLS ExtensionType registry (~64 types × 2 bytes)
Supported groups (EC)  65,535         RFC 8446 §4.2.7 (named_group_list<2..2^16-1>)
EC point formats       255            RFC 8422 §5.1.2 (ec_point_format_list<1..2^8-1>)
Supported versions     254            RFC 8446 §4.2.1 (versions<2..254>)
Signature algorithms   65,534         RFC 8446 §4.2.3 (supported_signature_algorithms<2..2^16-2>)
Total                  197,240

If we set the buffer to the theoretical RFC limit, that would mean an extra 197KB of memory per connection. This is impractical at scale (eg, 1M concurrent connections means an extra 197 GB of RAM).
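
The total in the table, and the RAM figure derived from it, can be checked with a few lines of Python. The limits are copied from the table above; the dictionary keys are just labels for this example.

```python
# Theoretical per-field limits in bytes, as listed in the table above.
limits = {
    "cipher suites": 2**16 - 2,
    "extension type IDs": 64 * 2,
    "supported groups": 2**16 - 1,
    "EC point formats": 2**8 - 1,
    "supported versions": 254,
    "signature algorithms": 2**16 - 2,
}

total_bytes = sum(limits.values())
print(total_bytes)                    # prints: 197240
print(total_bytes * 1_000_000 / 1e9)  # extra GB for 1M connections, about 197
```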

The ceiling is a lot lower though, as there's a limited number of actual unique values defined by IANA (eg, unique cipher suites):

Field                  IANA definitions  Size (bytes)
---------------------  ----------------  ------------
Cipher suites          339               678
Extension type IDs     64                128
Supported groups (EC)  45                90
EC point formats       3                 3
Supported versions     8                 16
Signature algorithms   49                98
Total                                    1,013
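
The sizes in this table are the number of IANA definitions multiplied by the width of one entry. The Python sketch below reproduces them, assuming 2 bytes per entry for every field except EC point formats, which are 1 byte each (this matches the sizes listed above).

```python
# (number of IANA definitions, bytes per entry), taken from the table above
fields = {
    "cipher suites": (339, 2),
    "extension type IDs": (64, 2),
    "supported groups": (45, 2),
    "EC point formats": (3, 1),
    "supported versions": (8, 2),
    "signature algorithms": (49, 2),
}

sizes = {name: count * width for name, (count, width) in fields.items()}
print(sizes["cipher suites"])  # prints: 678
print(sum(sizes.values()))     # prints: 1013
```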

1KB is more palatable, but actually it'd be weird for a client to list every single cipher suite etc. During testing, I observed much lower values from real-world clients:

Field                  Chrome (bytes)  curl (bytes)
---------------------  --------------  ------------
Cipher suites          30              60
Extension type IDs     32              24
Supported groups (EC)  8               16
EC point formats       1               3
Supported versions     6               4
Signature algorithms   16              54
Total                  93              161

The upstream JA4 repo has a fingerprint for the SoftEther VPN Client with a cipher count of 88. This would use 88*2=176 bytes for the cipher suites field alone, so the whole buffer might need ~300 bytes.

The most cipher suites I could force curl to advertise is 101. So maybe this is a reasonable ceiling for the cipher suites field (101*2=202 bytes). If we take the guesstimated size needed for the SoftEther VPN Client and pretend it listed 101 ciphers, we'd need approx 300+(202-176)=326 bytes. So 336 bytes seems enough for any real-world client.
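
The estimate above, written out as a Python sketch. The 300-byte starting point is the guesstimate from the text, not a measured value, and the variable names are only for illustration.

```python
BYTES_PER_CIPHER = 2

softether_cipher_bytes = 88 * BYTES_PER_CIPHER   # 176
softether_buffer_guess = 300                     # guesstimate for the whole buffer
curl_max_cipher_bytes = 101 * BYTES_PER_CIPHER   # 202

# Pretend the SoftEther client listed 101 ciphers instead of 88.
estimate = softether_buffer_guess + (curl_max_cipher_bytes - softether_cipher_bytes)
print(estimate)         # prints: 326
print(estimate <= 336)  # prints: True
```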

Limitations

The last 2 characters of JA4_a are the ALPN field. The JA4 specification requires that you use the first ALPN advertised by the client. Unfortunately, when HAProxy terminates SSL (ie, bind ... ssl), the only information available to Lua scripts is the negotiated ALPN.

This means the ALPN field of the JA4 fingerprint returned by ja4.lua may be wrong. This may not matter for your use-case, but if you're matching with a JA4 fingerprint database then you may get discrepancies.

HAProxy does expose the raw ALPNs advertised by clients when used in TCP mode without SSL termination. However, this is out-of-scope for ja4.lua, as other data required for JA4 fingerprinting (eg, extension list, supported_versions etc) is only available after SSL termination.

Note

ja4.lua works in either mode http or mode tcp, as long as the HAProxy frontend terminates SSL.

Error handling

If a Lua script panics for some reason, HAProxy will just carry on (ie, you shouldn't get an HTTP 50x error).

Regardless, if ja4.lua encounters a problem, it returns a fallback fingerprint (t00i000000_000000000000_000000000000) and logs a warning (ja4.lua: fingerprint failed). This means that if you have downstream HAProxy rules that depend on txn.ja4 being set, a script failure shouldn't cause an HTTP 50x error.

If you use ja4.lua in a HAProxy frontend that doesn't terminate SSL (unsupported), this also results in the fallback fingerprint and a log warning.
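
If a backend consumes the fingerprint, it may be worth treating the fallback value as "no fingerprint" rather than as a real client identity. A minimal Python sketch of that idea follows; `usable_fingerprint` is a hypothetical helper, not part of ja4.lua.

```python
from typing import Optional

# Fallback value returned by ja4.lua when fingerprinting fails (see above).
FALLBACK_JA4 = "t00i000000_000000000000_000000000000"


def usable_fingerprint(value: Optional[str]) -> Optional[str]:
    """Return the fingerprint, or None if it is missing or the fallback value."""
    if not value or value == FALLBACK_JA4:
        return None
    return value


print(usable_fingerprint("t13d1516h2_8daaf6152771_02713d6af862"))
# prints: t13d1516h2_8daaf6152771_02713d6af862
print(usable_fingerprint(FALLBACK_JA4))  # prints: None
```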

Development

Testing

Unit tests and integration tests

Unit tests

Requirements:

  • just >= 1.46.0
  • busted >= 2.3.0

Run the tests:

$ just unit-test

Integration tests

Integration tests run TLS clients (curl, Chromium, uTLS) to connect to haproxy containers.

Requirements:

  • docker

Run the tests:

$ just integration-test

Benchmarks

Throughput and latency benchmarks

Running the benchmarks

$ just benchmark

See just help for details on benchmarking options (like thread count).

Methodology

We benchmark 4 scenarios under 2 threading modes (8 haproxy configs total):

Scenario    Description
----------  ----------------------------------------------
Baseline    No Lua
Lua (noop)  Load a noop Lua script (measures Lua overhead)
JA4         Load our ja4.lua script
JA4 (raw)   Load our ja4.lua script with raw argument

Threading  Threads     Lua directive
---------  ----------  -------------------
Single     nbthread 1  lua-load
Multi      nbthread 8  lua-load-per-thread

For each of the 8 configs:

  1. Throughput test: send requests as fast as possible (measure reqs/sec).
  2. Latency test: fixed arrival rate to measure percentiles without saturation.
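
The benchmark matrix is the cross product of the scenarios and the threading modes. The Python sketch below enumerates it; the scenario and mode labels are hypothetical and do not correspond to file names in the repository.

```python
from itertools import product

SCENARIOS = ["baseline", "lua-noop", "ja4", "ja4-raw"]
# (label, nbthread, Lua directive used when a script is loaded)
THREADING = [("single", 1, "lua-load"), ("multi", 8, "lua-load-per-thread")]

configs = [
    {
        "scenario": scenario,
        "threading": label,
        "nbthread": nbthread,
        # The baseline scenario loads no Lua script at all.
        "lua_directive": None if scenario == "baseline" else directive,
    }
    for scenario, (label, nbthread, directive) in product(SCENARIOS, THREADING)
]

print(len(configs))  # prints: 8
```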

Benchmark rig:

  • Fedora Linux 42
  • AMD Ryzen 9 9900X 12-Core Processor
  • 2 * 32GiB DDR5 6000 MT/s

Results

When single-threaded:

  • ~20% overhead just from loading a noop Lua script
  • ja4.lua has ~40% of baseline throughput

When multi-threaded (8 threads):

  • Lua overhead is much less noticeable
  • ja4.lua has ~80% of baseline throughput
--------------------------------------------------------------------
SINGLE-THREAD BENCHMARK
--------------------------------------------------------------------

Test         reqs/sec    % baseline   avg (ms)   p95 (ms)   p99 (ms)
--------------------------------------------------------------------
Baseline       143450           100       0.06       0.08       0.12
Lua (noop)     112640            78       0.08       0.12       0.17
JA4             55922            38       0.13       0.23       0.32
JA4 (raw)       46160            32       0.15       0.27       0.36

--------------------------------------------------------------------
MULTI-THREAD BENCHMARK (8 threads)
--------------------------------------------------------------------

Test         reqs/sec    % baseline   avg (ms)   p95 (ms)   p99 (ms)
--------------------------------------------------------------------
Baseline       262114           100       0.06       0.08       0.12
Lua (noop)     246956            94       0.06       0.11       0.19
JA4            207187            79       0.10       0.17       0.37
JA4 (raw)      186746            71       0.10       0.19       0.38
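
The "% baseline" column can be recomputed from the reqs/sec column. In the Python sketch below the percentage is truncated rather than rounded, which is an assumption of this example that happens to reproduce every value in both tables.

```python
def percent_of_baseline(reqs_per_sec: int, baseline: int) -> int:
    """Throughput as a whole-number percentage of the baseline (truncated)."""
    return 100 * reqs_per_sec // baseline


single = {"Baseline": 143450, "Lua (noop)": 112640, "JA4": 55922, "JA4 (raw)": 46160}
multi = {"Baseline": 262114, "Lua (noop)": 246956, "JA4": 207187, "JA4 (raw)": 186746}

for label, results in (("single", single), ("multi", multi)):
    for test, reqs in results.items():
        print(label, test, percent_of_baseline(reqs, results["Baseline"]))
```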
