Description
I have a cluster of 100 indexers (4 cores / 15 GB each), with config and build info below. The cluster is otherwise reasonably healthy, but a single index gets shard rate limited every day: the 429s start as morning traffic increases, continue throughout the day, and go away in the evening.
The architecture is Vector HTTP sinks posting into the ingest V2 API. Looking at the Vector logs, I can see that all of the rate limiting happens to the same sink/index pair.
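For reference, the failing sink looks roughly like the sketch below (the hostname, port, and input name are placeholders, not our actual values; the component id is the one from the log that follows):

```yaml
sinks:
  shared_log_to_quickwit_foo:    # component_id seen in the warning below
    type: http
    inputs:
      - shared_log               # placeholder input name
    uri: http://quickwit-indexer.internal:7280/api/v1/foo/ingest
    encoding:
      codec: json                # one JSON document per event
    framing:
      method: newline_delimited  # the ingest endpoint takes NDJSON
```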
```
2025-11-11T17:50:14.919422Z WARN sink{component_kind="sink" component_id=shared_log_to_quickwit_foo component_type=http}:request{request_id=1753}: vector::sinks::util::retries: Retrying after response. reason=too many requests
```

(This log repeats continuously; no other components report rate limiting.)
Some metrics from when the 429s are occurring:
Sometimes the control plane decides to add shards and the 429s go away:
My best guess at the moment is that we're simply running too many indexers and that per-index throughput is highly skewed: 2 indexes account for ~70% of the throughput, and the other 23 account for the rest. My plan was to reduce the indexer count to see whether the average short-term/long-term (ST/LT) ingestion throughput per node would increase enough to fix the issue. But I wanted to report my current state first, to check whether my understanding is correct and whether this issue is worth some attention. Thanks for all the work, and let me know if you need more information.
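To make the shard math concrete (the 60 MB/s figure below is hypothetical, just for illustration): with shard_throughput_limit at 5.2 MB, each shard accepts roughly that much per second, so an index's sustained ingest rate puts a floor on the number of shards it needs before persists start coming back as 429:

```
min_shards ≈ index_ingest_rate / shard_throughput_limit
           ≈ (60 MB/s) / (5.2 MB/s)
           ≈ 12 shards
```

If the control plane's scale-up lags the morning traffic ramp, the existing shards saturate and everything over the limit is rejected until new shards are added, which would match the behavior above.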
build info
```json
{
  "build": {
    "build_date": "2025-05-23T00:29:41Z",
    "build_profile": "release",
    "build_target": "aarch64-unknown-linux-gnu",
    "cargo_pkg_version": "0.8.0",
    "commit_date": "unknown",
    "commit_hash": "unknown",
    "commit_short_hash": "unknown",
    "commit_tags": [],
    "version": "0.8.0-nightly"
  },
  "runtime": {
    "num_cpus": 4,
    "num_threads_blocking": 3,
    "num_threads_non_blocking": 1
  }
}
```
ingest and indexer config
```json
{
  "ingest_api_config": {
    "max_queue_memory_usage": "2.1 GB",
    "max_queue_disk_usage": "4.3 GB",
    "replication_factor": 1,
    "content_length_limit": "10.5 MB",
    "shard_throughput_limit": "5.2 MB",
    "shard_burst_limit": "52.4 MB",
    "shard_scale_up_factor": 1.5
  },
  "indexer_config": {
    "split_store_max_num_bytes": "100.0 GB",
    "split_store_max_num_splits": 1000,
    "max_concurrent_split_uploads": 12,
    "max_merge_write_throughput": null,
    "merge_concurrency": 2,
    "enable_otlp_endpoint": true,
    "enable_cooperative_indexing": false,
    "cpu_capacity": "4000m"
  }
}
```
25 indexes, all with the same config
```yaml
# index config
version: 0.9
index_id: foo
doc_mapping:
  mode: dynamic
  field_mappings:
    - name: timestamp
      type: datetime
      input_formats:
        - rfc3339
        - unix_timestamp
        - iso8601
      fast: true
      fast_precision: milliseconds
    - name: message
      type: json
      tokenizer: raw
      fast: true
    - name: quickwit_message
      type: text
      tokenizer: default
      record: position
      fieldnorms: true
      fast:
        normalizer: lowercase
  timestamp_field: timestamp
indexing_settings:
  commit_timeout_secs: 30
search_settings:
  default_search_fields: [quickwit_message]
```
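One possible workaround, instead of shrinking the cluster, would be raising the per-shard limits. A minimal sketch, assuming these keys are settable under ingest_api in the node-config YAML the same way they appear in the effective-config dump above (I haven't verified this against the 0.8 docs):

```yaml
# Assumption: keys settable under `ingest_api`, mirroring the dump above.
ingest_api:
  shard_throughput_limit: 10MiB  # effective value above: "5.2 MB" (~5 MiB)
  shard_burst_limit: 100MiB      # effective value above: "52.4 MB" (~50 MiB)
```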