Open
Labels: enhancement (New feature or request)
Description
How can we identify which components are slowing Zeno down?
For example, warcprox (https://github.com/internetarchive/warcprox) exposes a special endpoint that returns the JSON below (truncated), which reports the queued_urls of each component.
Slower components tend to accumulate the largest queues.
In Zeno, the main channels carry communication between components. Could we have a similar live report that displays the size of each channel?
{
  "role": "warcprox",
  "version": "2.9.0",
  "host": "xxx",
  "threads": 310,
  "active_requests": 14,
  "unaccepted_requests": 0,
  "load": 0.0,
  "queued_urls": 0,
  "queue_max_size": 500,
  "urls_processed": 11821434,
  "warc_bytes_written": 171147470571,
  "start_time": "2025-09-02T19:38:41.209707+00:00",
  "rates_1min": {
    "actual_elapsed": 40.26187539100647,
    "urls_per_sec": 16.889426868374233,
    "warc_bytes_per_sec": 37202.43991253774
  },
  "earliest_still_active_fetch_start": "2025-09-10T07:01:15.407696+00:00",
  "seconds_behind": 4092.204695,
  "postfetch_chain": [
    {
      "processor": "LimitCaptures",
      "queued_urls": 0
    },
    {
      "processor": "AdBlocker",
      "queued_urls": 0
    },
    {
      "processor": "MimeTypeFilter",
      "queued_urls": 0
    },
    {
      "processor": "CdxServerDedupLoader",
      "queued_urls": 1
    },
    {
      "processor": "WarcWriterProcessor",
      "queued_urls": 53
    },
    {
      "processor": "CdxServerDedup",
      "queued_urls": 22
    },
    {
      "processor": "CrawlLogger",
      "queued_urls": 0
    },
    {
      "processor": "LiveCDX",
      "queued_urls": 0
    },
    {
      "processor": "CertsWriter",
      "queued_urls": 0
    },
    {
      "processor": "RunningStats",
      "queued_urls": 0
    }
  ]
}
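To make the request concrete, here is a rough sketch of how a per-channel report could look in Go. Everything in it is an assumption for illustration: `Registry`, `Register`, `ChannelStat`, the JSON field names, and the component names in `main` are hypothetical and are not taken from Zeno's code. It only relies on the built-in `len` and `cap`, which may be called on a buffered channel while other goroutines are using it.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"sort"
	"sync"
)

// ChannelStat is a hypothetical report entry, loosely modeled on the
// per-processor "queued_urls" entries in the warcprox output.
type ChannelStat struct {
	Name     string `json:"name"`
	Queued   int    `json:"queued"`
	Capacity int    `json:"capacity"`
}

// Registry (hypothetical) maps a channel name to a probe that returns
// the channel's current length and capacity.
type Registry struct {
	mu     sync.RWMutex
	probes map[string]func() (queued, capacity int)
}

func NewRegistry() *Registry {
	return &Registry{probes: make(map[string]func() (int, int))}
}

// Register records a channel of any element type under the given name.
// The closure reads len/cap at report time, so values are a point-in-time snapshot.
func Register[T any](r *Registry, name string, ch chan T) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.probes[name] = func() (int, int) { return len(ch), cap(ch) }
}

// Snapshot returns one entry per registered channel, fullest first,
// with ties broken by name so the order is stable.
func (r *Registry) Snapshot() []ChannelStat {
	r.mu.RLock()
	defer r.mu.RUnlock()
	stats := make([]ChannelStat, 0, len(r.probes))
	for name, probe := range r.probes {
		q, c := probe()
		stats = append(stats, ChannelStat{Name: name, Queued: q, Capacity: c})
	}
	sort.Slice(stats, func(i, j int) bool {
		if stats[i].Queued != stats[j].Queued {
			return stats[i].Queued > stats[j].Queued
		}
		return stats[i].Name < stats[j].Name
	})
	return stats
}

// ServeHTTP lets the registry be mounted as a live status endpoint,
// e.g. http.Handle("/status", registry) (the path is an assumption).
func (r *Registry) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(map[string]any{"channels": r.Snapshot()})
}

func main() {
	reg := NewRegistry()

	// Placeholder channels standing in for the links between components.
	toArchiver := make(chan string, 8)
	toPostprocessor := make(chan string, 8)
	Register(reg, "archiver", toArchiver)
	Register(reg, "postprocessor", toPostprocessor)

	// Simulate a backlog in front of one component.
	for i := 0; i < 5; i++ {
		toArchiver <- fmt.Sprintf("https://example.com/%d", i)
	}

	out, _ := json.MarshalIndent(reg.Snapshot(), "", "  ")
	fmt.Println(string(out))
}
```

With a report like this, the channel with the highest `queued` value relative to its `capacity` would point at the component that is not keeping up, in the same way `queued_urls` does for warcprox. Unbuffered channels always report 0 for both values, so this only helps for buffered channels.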