|
1 | 1 | # Table of Contents |
2 | 2 |
|
3 | | - - [API and server behaviours](#api-and-server-behaviours) |
4 | | - - [Self-hosted tools](#self-hosted-tools) |
5 | | - - [Online services](#online-services) |
| 3 | +- [API and server behaviours](#api-and-server-behaviours) |
| 4 | +- [Self-hosted tools](#self-hosted-tools) |
| 5 | +- [Online services](#online-services) |
6 | 6 |
|
7 | 7 | This document describes how to read data from the Certificate Transparency |
8 | 8 | logs. |
9 | 9 |
|
10 | 10 | # API and server behaviours |
| 11 | + |
11 | 12 | Certificate Transparency logs make their entries available through the |
12 | 13 | `get-entries` endpoint, as defined in |
13 | 14 | [RFC6962](https://www.rfc-editor.org/rfc/rfc6962#section-4.6). |
14 | 15 |
|
15 | 16 | When fetching entries from a log, always keep in mind that: |
16 | 17 |
|
17 | | - 1. `Logs MAY restrict the number of entries that can be retrieved per |
18 | | - "get-entries" request`: this means that if you make a |
19 | | - `get-entries?start=10&end=40` request, the server might return a smaller slice, |
20 | | - such as `start=10&end=20`. The number of entries returned might vary between |
21 | | - requests, so always check how many entries each response contains, to adjust |
22 | | - the start index of the next request. In Trillian with a CT personality (which |
23 | | - powers most of the CT logs at the time of writing), there are two main |
24 | | - mechanisms that limit the number of entries in responses: |
25 | | - - Static: a log will never return more than X entries per response, where |
26 | | - X is set by the log operator. |
27 | | - - Dynamic: the server might [truncate responses to align the last leaf |
28 | | - index](https://github.com/google/certificate-transparency-go/blob/afde1b22ba618518e928a0379546db969803afb9/trillian/ctfe/handlers.go#L979-L988) |
29 | | - of a response on a geometric sequence. This increases the likelihood that |
30 | | - two different requests lead to the same response, hence making caching more |
31 | | - effective. |
| 18 | +1. `Logs MAY restrict the number of entries that can be retrieved per |
| 19 | +"get-entries" request`: this means that if you make a |
| 20 | + `get-entries?start=10&end=40` request, the server might return a smaller slice, |
| 21 | + such as `start=10&end=20`. The number of entries returned might vary between |
| 22 | + requests, so always check how many entries each response contains, to adjust |
| 23 | + the start index of the next request. In Trillian with a CT personality (which |
| 24 | + powers most of the CT logs at the time of writing), there are two main |
| 25 | + mechanisms that limit the number of entries in responses: - Static: a log will never return more than X entries per response, where |
| 26 | + X is set by the log operator. - Dynamic: the server might [truncate responses to align the last leaf |
| 27 | + index](https://github.com/google/certificate-transparency-go/blob/afde1b22ba618518e928a0379546db969803afb9/trillian/ctfe/handlers.go#L979-L988) |
| 28 | + of a response on a geometric sequence. This increases the likelihood that |
| 29 | + two different requests lead to the same response, hence making caching more |
| 30 | + effective. |
32 | 31 |
|
33 | | - 2. Serving infrastructure might rate limit requests to protect logs, based on |
34 | | - various parameters. Clients should react to rate limits, with exponential |
35 | | - backoff for instance, in order to release pressure on the servers until they |
36 | | - can serve a steady stream of entries. Rate limiting is dynamic and can vary |
37 | | - over time, so always re-evaluate the number of queries your clients send based |
38 | | - on the most recent server responses. Rate limits are often due to: |
39 | | - - network DoS protection mechanisms specific to each log operator |
40 | | - - a log server quota system, such as [Trillian's](https://github.com/google/trillian/blob/master/quota/quota.go) |
| 32 | +2. Serving infrastructure might rate limit requests to protect logs, based on |
| 33 | + various parameters. Clients should react to rate limits, with exponential |
| 34 | + backoff for instance, in order to release pressure on the servers until they |
| 35 | + can serve a steady stream of entries. Rate limiting is dynamic and can vary |
| 36 | + over time, so always re-evaluate the number of queries your clients send based |
| 37 | + on the most recent server responses. Rate limits are often due to: - network DoS protection mechanisms specific to each log operator - a log server quota system, such as [Trillian's](https://github.com/google/trillian/blob/master/quota/quota.go) |
41 | 38 |
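
The two points above (variable-size responses and rate limiting) can be combined in one read loop. The following is a minimal sketch rather than a reference client: `read_entries`, `make_http_fetch`, `RateLimited`, and the delay parameters are hypothetical names chosen for illustration, and the log URL is something you would supply. It assumes the `ct/v1/get-entries` path and the `entries` response field from RFC 6962, and it treats HTTP `429` as the rate-limit signal.

```python
import json
import random
import time
import urllib.error
import urllib.request
from typing import Callable, Iterator, List


class RateLimited(Exception):
    """Raised by a fetch function when the log answers HTTP 429."""


def make_http_fetch(log_url: str) -> Callable[[int, int], List[dict]]:
    """Build a fetch function for one log (log_url is supplied by the caller)."""

    def fetch(start: int, end: int) -> List[dict]:
        url = f"{log_url.rstrip('/')}/ct/v1/get-entries?start={start}&end={end}"
        try:
            with urllib.request.urlopen(url, timeout=30) as resp:
                return json.load(resp)["entries"]
        except urllib.error.HTTPError as err:
            if err.code == 429:
                raise RateLimited() from err
            raise

    return fetch


def read_entries(
    fetch: Callable[[int, int], List[dict]],
    start: int,
    end: int,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    max_retries: int = 8,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield entries start..end (inclusive), tolerating short responses and 429s."""
    next_index = start
    delay = base_delay
    retries = 0
    while next_index <= end:
        try:
            batch = fetch(next_index, end)
        except RateLimited:
            retries += 1
            if retries > max_retries:
                raise
            # Exponential backoff with a little jitter to avoid synchronized retries.
            sleep(delay + random.uniform(0, delay / 2))
            delay = min(delay * 2, max_delay)
            continue
        if not batch:
            raise RuntimeError("log returned no entries for a non-empty range")
        # A successful response resets the backoff state.
        retries = 0
        delay = base_delay
        yield from batch
        # Advance by what was actually returned, not by what was requested.
        next_index += len(batch)
```

With a real log this would be called as `read_entries(make_http_fetch(log_url), 10, 40)`. Passing the fetch function and `sleep` as parameters is a design choice of this sketch: it keeps the loop independent of any HTTP library and lets it be exercised without network access.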
|
42 | 39 | # Self-hosted tools |
| 40 | + |
43 | 41 | Here is a list of currently known tools to fetch entries from logs. This table |
44 | 42 | attempts to provide a high-level overview of them, with info on: |
45 | | - - Storage: how the entries are stored |
46 | | - - Parallelism: whether it allows making multiple queries to the same log in parallel |
47 | | - - Dynamic ranges: whether it checks the number of entries in `get-entries` responses and adjusts the start of following |
48 | | - requests |
49 | | - - Backoff: how it reacts to HTTP `429 Too Many Requests` |
50 | 43 |
|
51 | | -|Tool |Storage |Parallelism |Dynamic ranges |Backoff | |
52 | | -|-------------------------------------------------------------------------------------------------------|-----------------------------|-----------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------| |
53 | | -|[ScanLog](https://github.com/google/certificate-transparency-go/blob/master/scanner/scanlog/scanlog.go)|files (der) |[yes](https://github.com/google/certificate-transparency-go/blob/041b29b9b82cf2eb8972c5afef04e692524af8f0/scanner/scanlog/scanlog.go#L58)|[yes](https://github.com/google/certificate-transparency-go/blob/041b29b9b82cf2eb8972c5afef04e692524af8f0/scanner/fetcher.go#L298)|[exponential](https://github.com/google/certificate-transparency-go/blob/master/jsonclient/backoff.go) | |
54 | | -|[CTClone](https://github.com/google/trillian-examples/tree/master/clone) |MySQL |[yes](https://github.com/google/trillian-examples/blob/f2a13ca2666b721d527d61d68f4fe6768b1e5ad1/clone/cmd/ctclone/ctclone.go#L42) |[no](https://github.com/google/trillian-examples/blob/f2a13ca2666b721d527d61d68f4fe6768b1e5ad1/clone/internal/cloner/clone.go#L75)|[exponential](https://github.com/google/certificate-transparency-go/blob/master/jsonclient/backoff.go) | |
55 | | -|[Axeman](https://github.com/CaliDog/Axeman) |files (csv) |[yes](https://github.com/CaliDog/Axeman/blob/e8a195a3e31f10ee6156d564ec541e7dcc356a4c/axeman/core.py#L28) |[no](https://github.com/CaliDog/Axeman/blob/e8a195a3e31f10ee6156d564ec541e7dcc356a4c/axeman/certlib.py#L60) |no | |
56 | | -|[ctmon](https://github.com/sergiogarciadev/ctmon) |PostgreSQL |[yes](https://github.com/search?q=repo%3Asergiogarciadev%2Fctmon+path%3Astate.json+concurrency&type=code) |[yes](https://github.com/sergiogarciadev/ctmon/blob/e4a4f67f4b405821a2ab47ab1878d6ae0eebb72c/logclient/log.go#L73) |[randomized](https://github.com/sergiogarciadev/ctmon/blob/e4a4f67f4b405821a2ab47ab1878d6ae0eebb72c/logclient/log.go#L92) | |
57 | | -|[Scrape CT Log](https://github.com/mpalmer/scrape-ct-log) |files (json, cbor) |[yes](https://github.com/mpalmer/scrape-ct-log/blob/02314930ac59c23f6b0782fe156239aeff86b667/src/runner/mod.rs#L72) |[yes](https://github.com/mpalmer/scrape-ct-log/blob/02314930ac59c23f6b0782fe156239aeff86b667/src/fetcher/mod.rs#L246) |[randomized](https://github.com/mpalmer/scrape-ct-log/blob/02314930ac59c23f6b0782fe156239aeff86b667/src/fetcher/mod.rs#L183)| |
58 | | -|[crt.sh](https://github.com/crtsh) |SQL |[yes](https://github.com/crtsh/ct_monitor/blob/174e0d8d4954dacd80eaf45dedd90061d7e7a6f4/ct/logList.go#L24) |[yes](https://github.com/crtsh/ct_monitor/blob/174e0d8d4954dacd80eaf45dedd90061d7e7a6f4/ct/getEntries.go#L77) |[static](https://github.com/crtsh/ct_monitor/blob/174e0d8d4954dacd80eaf45dedd90061d7e7a6f4/ct/logList.go#L75) | |
59 | | -|[CertStream](https://github.com/CaliDog/certstream-server?tab=readme-ov-file) |files (json), last 25 entries|[yes](https://github.com/CaliDog/certstream-server/blob/41c054704316f9ade21a0cc89db19d51e10469e6/lib/certstream/ct_watcher.ex#L165) |[no](https://github.com/CaliDog/certstream-server-python/blob/790718da384d3710e7842bd32b8367d2e142cc14/certstream/watcher.py#L143)|no |
60 | | -|[certstream server go](https://github.com/d-Rickyy-b/certstream-server-go) | n/a | [yes](https://github.com/d-Rickyy-b/certstream-server-go/blob/22cc89fc7ea2994d4d2717e5dcc5ad17a444fee7/internal/certificatetransparency/ct-watcher.go#L233) | no | [yes](https://github.com/d-Rickyy-b/certstream-server-go/blob/22cc89fc7ea2994d4d2717e5dcc5ad17a444fee7/internal/certificatetransparency/ct-watcher.go#L230)| |
| 44 | +- Storage: how the entries are stored |
| 45 | +- Parallelism: whether it allows making multiple queries to the same log in parallel |
| 46 | +- Dynamic ranges: whether it checks the number of entries in `get-entries` responses and adjusts the start of following |
| 47 | + requests |
| 48 | +- Backoff: how it reacts to HTTP `429 Too Many Requests` |
| 49 | + |
| 50 | +| Tool | Storage | Parallelism | Dynamic ranges | Backoff | |
| 51 | +| ------------------------------------------------------------------------------------------------------- | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------- | |
| 52 | +| [ScanLog](https://github.com/google/certificate-transparency-go/blob/master/scanner/scanlog/scanlog.go) | files (der) | [yes](https://github.com/google/certificate-transparency-go/blob/041b29b9b82cf2eb8972c5afef04e692524af8f0/scanner/scanlog/scanlog.go#L58) | [yes](https://github.com/google/certificate-transparency-go/blob/041b29b9b82cf2eb8972c5afef04e692524af8f0/scanner/fetcher.go#L298) | [exponential](https://github.com/google/certificate-transparency-go/blob/master/jsonclient/backoff.go) | |
| 53 | +| [CTClone](https://github.com/google/trillian-examples/tree/master/clone) | MySQL | [yes](https://github.com/google/trillian-examples/blob/f2a13ca2666b721d527d61d68f4fe6768b1e5ad1/clone/cmd/ctclone/ctclone.go#L42) | [no](https://github.com/google/trillian-examples/blob/f2a13ca2666b721d527d61d68f4fe6768b1e5ad1/clone/internal/cloner/clone.go#L75) | [exponential](https://github.com/google/certificate-transparency-go/blob/master/jsonclient/backoff.go) | |
| 54 | +| [Axeman](https://github.com/CaliDog/Axeman) | files (csv) | [yes](https://github.com/CaliDog/Axeman/blob/e8a195a3e31f10ee6156d564ec541e7dcc356a4c/axeman/core.py#L28) | [no](https://github.com/CaliDog/Axeman/blob/e8a195a3e31f10ee6156d564ec541e7dcc356a4c/axeman/certlib.py#L60) | no | |
| 55 | +| [ctmon](https://github.com/sergiogarciadev/ctmon) | PostgreSQL | [yes](https://github.com/search?q=repo%3Asergiogarciadev%2Fctmon+path%3Astate.json+concurrency&type=code) | [yes](https://github.com/sergiogarciadev/ctmon/blob/e4a4f67f4b405821a2ab47ab1878d6ae0eebb72c/logclient/log.go#L73) | [randomized](https://github.com/sergiogarciadev/ctmon/blob/e4a4f67f4b405821a2ab47ab1878d6ae0eebb72c/logclient/log.go#L92) | |
| 56 | +| [Scrape CT Log](https://github.com/mpalmer/scrape-ct-log) | files (json, cbor) | [yes](https://github.com/mpalmer/scrape-ct-log/blob/02314930ac59c23f6b0782fe156239aeff86b667/src/runner/mod.rs#L72) | [yes](https://github.com/mpalmer/scrape-ct-log/blob/02314930ac59c23f6b0782fe156239aeff86b667/src/fetcher/mod.rs#L246) | [randomized](https://github.com/mpalmer/scrape-ct-log/blob/02314930ac59c23f6b0782fe156239aeff86b667/src/fetcher/mod.rs#L183) | |
| 57 | +| [crt.sh](https://github.com/crtsh) | SQL | [yes](https://github.com/crtsh/ct_monitor/blob/174e0d8d4954dacd80eaf45dedd90061d7e7a6f4/ct/logList.go#L24) | [yes](https://github.com/crtsh/ct_monitor/blob/174e0d8d4954dacd80eaf45dedd90061d7e7a6f4/ct/getEntries.go#L77) | [static](https://github.com/crtsh/ct_monitor/blob/174e0d8d4954dacd80eaf45dedd90061d7e7a6f4/ct/logList.go#L75) | |
| 58 | +| [CertStream](https://github.com/CaliDog/certstream-server?tab=readme-ov-file) | files (json), last 25 entries | [yes](https://github.com/CaliDog/certstream-server/blob/41c054704316f9ade21a0cc89db19d51e10469e6/lib/certstream/ct_watcher.ex#L165) | [no](https://github.com/CaliDog/certstream-server-python/blob/790718da384d3710e7842bd32b8367d2e142cc14/certstream/watcher.py#L143) | no | |
| 59 | +| [certstream server go](https://github.com/d-Rickyy-b/certstream-server-go) | n/a | [yes](https://github.com/d-Rickyy-b/certstream-server-go/blob/22cc89fc7ea2994d4d2717e5dcc5ad17a444fee7/internal/certificatetransparency/ct-watcher.go#L233) | [yes](https://github.com/google/certificate-transparency-go/blob/6227d7a256e6aba604df22e6e7706b3cffd70476/scanner/fetcher.go#L298) | [yes](https://github.com/google/certificate-transparency-go/blob/6227d7a256e6aba604df22e6e7706b3cffd70476/scanner/fetcher.go#L77) | |
61 | 60 |
|
62 | 61 | If you know of any other tool, or spot any error in this table, please send a PR! |
63 | 62 |
|
64 | | -# Online services |
| 63 | +# Online services |
| 64 | + |
65 | 65 | If you don't want to run your own log downloading and indexing system, there |
66 | 66 | exist various online services which do this and offer the results via API or |
67 | | -HTTP UIs. Let's Encrypt maintains a list [here](https://community.letsencrypt.org/t/certificate-transparency-search-resources/203368) |
| 67 | +HTTP UIs. Let's Encrypt maintains a list [here](https://community.letsencrypt.org/t/certificate-transparency-search-resources/203368) |