Changes from 1 commit
3 changes: 2 additions & 1 deletion docs/google/fetch-logs.md
@@ -56,7 +56,8 @@ attempts to provide a high level overview of them, with info on:
|[ctmon](https://github.com/sergiogarciadev/ctmon) |PostgreSQL |[yes](https://github.com/search?q=repo%3Asergiogarciadev%2Fctmon+path%3Astate.json+concurrency&type=code) |[yes](https://github.com/sergiogarciadev/ctmon/blob/e4a4f67f4b405821a2ab47ab1878d6ae0eebb72c/logclient/log.go#L73) |[randomized](https://github.com/sergiogarciadev/ctmon/blob/e4a4f67f4b405821a2ab47ab1878d6ae0eebb72c/logclient/log.go#L92) |
|[Scrape CT Log](https://github.com/mpalmer/scrape-ct-log) |files (json, cbor) |[yes](https://github.com/mpalmer/scrape-ct-log/blob/02314930ac59c23f6b0782fe156239aeff86b667/src/runner/mod.rs#L72) |[yes](https://github.com/mpalmer/scrape-ct-log/blob/02314930ac59c23f6b0782fe156239aeff86b667/src/fetcher/mod.rs#L246) |[randomized](https://github.com/mpalmer/scrape-ct-log/blob/02314930ac59c23f6b0782fe156239aeff86b667/src/fetcher/mod.rs#L183)|
|[crt.sh](https://github.com/crtsh) |SQL |[yes](https://github.com/crtsh/ct_monitor/blob/174e0d8d4954dacd80eaf45dedd90061d7e7a6f4/ct/logList.go#L24) |[yes](https://github.com/crtsh/ct_monitor/blob/174e0d8d4954dacd80eaf45dedd90061d7e7a6f4/ct/getEntries.go#L77) |[static](https://github.com/crtsh/ct_monitor/blob/174e0d8d4954dacd80eaf45dedd90061d7e7a6f4/ct/logList.go#L75) |
|[CertStream](https://github.com/CaliDog/certstream-server?tab=readme-ov-file) |files (json), last 25 entries|[yes](https://github.com/CaliDog/certstream-server/blob/41c054704316f9ade21a0cc89db19d51e10469e6/lib/certstream/ct_watcher.ex#L165) |[no](https://github.com/CaliDog/certstream-server-python/blob/790718da384d3710e7842bd32b8367d2e142cc14/certstream/watcher.py#L143)|no |
|[CertStream](https://github.com/CaliDog/certstream-server?tab=readme-ov-file) |files (json), last 25 entries|[yes](https://github.com/CaliDog/certstream-server/blob/41c054704316f9ade21a0cc89db19d51e10469e6/lib/certstream/ct_watcher.ex#L165) |[no](https://github.com/CaliDog/certstream-server-python/blob/790718da384d3710e7842bd32b8367d2e142cc14/certstream/watcher.py#L143)|no
|[certstream server go](https://github.com/d-Rickyy-b/certstream-server-go) | n/a | [yes](https://github.com/d-Rickyy-b/certstream-server-go/blob/22cc89fc7ea2994d4d2717e5dcc5ad17a444fee7/internal/certificatetransparency/ct-watcher.go#L233) | no | [yes](https://github.com/d-Rickyy-b/certstream-server-go/blob/22cc89fc7ea2994d4d2717e5dcc5ad17a444fee7/internal/certificatetransparency/ct-watcher.go#L230)|
Contributor

Is it possible for clients to bump parallelism to a value higher than 1?

Author

Yes, it can make multiple queries to the same log in parallel when the value is set higher than 1, which is the current default. It can be adjusted here.

Underneath it uses this library.
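
As a rough illustration of what a parallelism value above 1 means in practice, the sketch below splits an index range into batches and fetches them with a pool of workers. It is written in Python for brevity and is not the code of certstream-server-go or of the library it uses; `split_ranges`, `fetch_range`, and `fetch_parallel` are hypothetical names, and `fetch_range` only simulates a request to a log.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(start, end, batch_size):
    """Split the inclusive index range [start, end] into (lo, hi) batches."""
    return [
        (lo, min(lo + batch_size - 1, end))
        for lo in range(start, end + 1, batch_size)
    ]


def fetch_range(bounds):
    """Hypothetical stand-in for one request to a log.

    A real client would issue an HTTP request here; this version only
    fabricates placeholder entries so the sketch can run offline.
    """
    lo, hi = bounds
    return [f"entry-{i}" for i in range(lo, hi + 1)]


def fetch_parallel(start, end, batch_size, parallelism):
    """Fetch [start, end] using up to `parallelism` concurrent requests."""
    ranges = split_ranges(start, end, batch_size)
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # pool.map yields results in input order, so entries stay sorted
        # by index even when requests finish out of order.
        batches = list(pool.map(fetch_range, ranges))
    return [entry for batch in batches for entry in batch]


if __name__ == "__main__":
    print(fetch_parallel(0, 9, batch_size=4, parallelism=3))
```

With `parallelism=1` the same function degrades to sequential fetching, which matches the default described above.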

Author

@Xaelp, Feb 6, 2025

I am preparing a setup to monitor all CT Logs and store ALL unique certificates to power a tool that helps fight phishing.

Of all the alternatives I compared, this library is the closest to what we need to do this efficiently.

Some other requirements will call for adjustments to it, but it's a great out-of-the-box self-hosted solution for fetching entries from logs.

Contributor

Very sorry for the delay, that fell off my radar.

The intent of the parallelism column is to tell folks running this code what features they can benefit from when they use the tool, without modifying the code. Given that changing parallelism requires changing the code, I'd rather set it to no. Are you planning on making it a flag and/or putting it in the config? That would make it easier for clients to control it.

If you use the fetcher library, then I think that we can say it supports dynamic indexes though?
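
To make the flag suggestion above concrete, here is a minimal sketch of exposing parallelism as a command-line option. It is in Python for brevity and is only an illustration: the `--parallelism` flag name and the `parse_args` helper are assumptions, not an existing option of certstream-server-go.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options for a hypothetical log fetcher."""
    parser = argparse.ArgumentParser(description="Hypothetical CT log fetcher")
    parser.add_argument(
        "--parallelism",
        type=int,
        default=1,  # mirrors the default of 1 mentioned in this thread
        help="number of concurrent requests per log (assumed flag name)",
    )
    args = parser.parse_args(argv)
    if args.parallelism < 1:
        # parser.error prints the message and exits with status 2.
        parser.error("--parallelism must be at least 1")
    return args


if __name__ == "__main__":
    print(parse_args().parallelism)
```

Keeping the default at 1 means existing deployments would behave the same unless the operator opts in to a higher value.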

Author

@Xaelp, Mar 17, 2025

Thanks for the feedback!

I am not a contributor to the https://github.com/d-Rickyy-b/certstream-server-go project, but I will open a PR there, since configurable parallelism seems like a logical improvement.

Regarding the dynamic indexes, you're totally correct. It keeps track of the tree size in each log's responses so that it doesn't miss a certificate.
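
One way to read "dynamic indexes" is that the client advances by the number of entries a log actually returned, since a log may return fewer entries than requested. The sketch below illustrates that idea in Python under that assumption; `fetch_all` and `capped_log` are hypothetical names, and `capped_log` is a fake log used only so the example runs offline.

```python
def fetch_all(get_entries, tree_size, batch_size):
    """Fetch entries 0..tree_size-1, tolerating short responses.

    `get_entries(start, end)` is a caller-supplied function (hypothetical)
    that returns the entries for the inclusive range, possibly truncated.
    """
    entries = []
    next_index = 0
    while next_index < tree_size:
        end = min(next_index + batch_size, tree_size) - 1
        batch = get_entries(next_index, end)
        if not batch:
            break  # stop rather than loop forever on an empty response
        entries.extend(batch)
        # Advance by what was actually returned, not by what was requested,
        # so that no index is skipped when the log truncates a response.
        next_index += len(batch)
    return entries


def capped_log(start, end, cap=3):
    """Fake log that never returns more than `cap` entries per request."""
    return list(range(start, min(end, start + cap - 1) + 1))


if __name__ == "__main__":
    print(fetch_all(capped_log, tree_size=10, batch_size=5))
```

A client that instead advanced by `batch_size` on every request would skip entries 3 and 4 against this fake log.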

However, there is something I want to improve there (either by forking it, or through a PR): it should also persistently cache each CT log's current tree size so that, if it restarts, it can continue from where it stopped.
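
A minimal sketch of that persistence idea, in Python and under stated assumptions: a single JSON file maps each log URL to the last processed index. The file layout and the `load_checkpoints` and `save_checkpoints` names are hypothetical and do not describe how certstream-server-go stores anything today.

```python
import json
import os
import tempfile


def load_checkpoints(path):
    """Return the saved {log_url: next_index} mapping, or {} on first run."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def save_checkpoints(path, checkpoints):
    """Write the mapping atomically so a crash cannot leave a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(checkpoints, f)
    # os.replace swaps the file in a single step on the same filesystem.
    os.replace(tmp_path, path)


if __name__ == "__main__":
    state = load_checkpoints("checkpoints.json")
    state["https://log.example/"] = state.get("https://log.example/", 0) + 100
    save_checkpoints("checkpoints.json", state)
    print(state)
```

Writing to a temporary file and renaming it is a design choice that trades a little extra I/O for never reading back a half-written checkpoint after a restart.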

Author

I've updated the dynamic index info. See https://github.com/Xaelp/certificate-transparency-community-site/blob/add-self-hosted-tools/docs/google/fetch-logs.md#self-hosted-tools.

Regarding the parallelism, shall I change it to

|Tool |Storage |Parallelism |Dynamic indexes |Backoff |
|---|---|---|---|---|
|certstream server go |n/a |yes* (requires changing code) |yes |yes |

or just set it as

|Tool |Storage |Parallelism |Dynamic indexes |Backoff |
|---|---|---|---|---|
|certstream server go |n/a |no |yes |yes |

and update it later, once the configuration is implemented?


If you know of any other tool, or spot any error in this table, please send a PR!
