ref(cbrs): move EAP allocation policies into the routing strategy layer #7411

xurui-c · 2025-09-17T00:14:40Z

https://linear.app/getsentry/issue/EAP-79/move-eap-allocation-policies-into-the-outcomes-routing-strategy

codecov · 2025-09-19T04:13:05Z

❌ 1 Tests Failed:

Tests completed	Failed	Passed	Skipped
2976	1	2975	12

View the top 1 failed test(s) by shortest run time

tests.web.test_delete_query::test_delete_query_with_rejecting_allocation_policy

Stack Traces | 0.024s run time

Traceback (most recent call last):
  File "/.venv/lib/python3.11........./site-packages/_pytest/runner.py", line 341, in from_call
    result: TResult | None = func()
                             ^^^^^^
  File "/.venv/lib/python3.11........./site-packages/_pytest/runner.py", line 242, in <lambda>
    lambda: runtest_hook(item=item, **kwds), when=when, reraise=reraise
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11....../site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11....../site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 182, in _multicall
    return outcome.get_result()
           ^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11.../site-packages/pluggy/_result.py", line 100, in get_result
    raise exc.with_traceback(exc.__traceback__)
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 167, in _multicall
    teardown.throw(outcome._exception)
  File "/.venv/lib/python3.11....../site-packages/_pytest/threadexception.py", line 92, in pytest_runtest_call
    yield from thread_exception_runtest_hook()
  File "/.venv/lib/python3.11....../site-packages/_pytest/threadexception.py", line 68, in thread_exception_runtest_hook
    yield
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 167, in _multicall
    teardown.throw(outcome._exception)
  File "/.venv/lib/python3.11....../site-packages/_pytest/unraisableexception.py", line 95, in pytest_runtest_call
    yield from unraisable_exception_runtest_hook()
  File "/.venv/lib/python3.11....../site-packages/_pytest/unraisableexception.py", line 70, in unraisable_exception_runtest_hook
    yield
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 167, in _multicall
    teardown.throw(outcome._exception)
  File "/.venv/lib/python3.11....../site-packages/_pytest/logging.py", line 846, in pytest_runtest_call
    yield from self._runtest_for(item, "call")
  File "/.venv/lib/python3.11....../site-packages/_pytest/logging.py", line 829, in _runtest_for
    yield
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 167, in _multicall
    teardown.throw(outcome._exception)
  File "/.venv/lib/python3.11.../site-packages/_pytest/capture.py", line 880, in pytest_runtest_call
    return (yield)
            ^^^^^
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 167, in _multicall
    teardown.throw(outcome._exception)
  File "/.venv/lib/python3.11.../site-packages/_pytest/skipping.py", line 257, in pytest_runtest_call
    return (yield)
            ^^^^^
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11........./site-packages/_pytest/runner.py", line 174, in pytest_runtest_call
    item.runtest()
  File "/.venv/lib/python3.11....../site-packages/_pytest/python.py", line 1627, in runtest
    self.ihook.pytest_pyfunc_call(pyfuncitem=self)
  File "/.venv/lib/python3.11....../site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11....../site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 139, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File "/.venv/lib/python3.11.........................../site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11....../site-packages/_pytest/python.py", line 159, in pytest_pyfunc_call
    result = testfunction(**testargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../tests/web/test_delete_query.py", line 63, in test_delete_query_with_rejecting_allocation_policy
    class RejectPolicy(AllocationPolicy):
  File ".../snuba/utils/registered_class.py", line 93, in __new__
    getattr(res, "_registry").register_class(res)
  File ".../snuba/utils/registered_class.py", line 33, in register_class
    raise RegisteredClassNameCollisionError(
snuba.utils.registered_class.RegisteredClassNameCollisionError: Class with name AllocationPolicy.RejectPolicy already exists in the registry, change the config_key property in the class <class 'test_delete_query.test_delete_query_with_rejecting_allocation_policy.<locals>.RejectPolicy'> or <class 'test_storage_routing.test_routing_strategy_with_rejecting_allocation_policy.<locals>.RejectPolicy'>

To view more test analytics, go to the Test Analytics Dashboard
_{📋 Got 3 mins? Take this short survey to help us improve Test Analytics.}

xurui-c · 2025-09-22T00:27:31Z

snuba/util.py

This is a refactor of https://github.com/getsentry/snuba/blob/master/snuba/web/db_query.py, because it's also used in outcomes based strategy

xurui-c · 2025-09-22T02:35:49Z

snuba/web/db_query.py

+    if attribution_info.app_id.key == "eap":
+        return
+
+    quota_allowances: dict[str, QuotaAllowance] = {}


This is to avoid executing https://github.com/getsentry/snuba/blob/master/snuba/web/db_query.py#L848-L890, which is implemented elsewhere in the PR

And the reason why I decided to use attribution_info.app_id.key instead of team or feature is because they can also be set to ourlogs even for eap queries

if there are no allocation policies on the storage, you won't execute anything though right?

nothing would execute for eap because I added the if statement, but if i didnt do that then https://github.com/getsentry/snuba/blob/master/snuba/web/db_query.py#L848-L890 would've executed

this is a very hacky way to do this. You're depending on some metadata injected way up somewhere in the pipeline to be set to a specific value.

The much more reasonable thing would be to return early if allocation_policies is empty

xurui-c · 2025-09-22T02:39:34Z

snuba/web/db_query.py

+    if attribution_info.app_id.key == "eap":
+        return
+
+    quota_allowances: dict[str, QuotaAllowance] = {}


This is just changing the type from Any to QuotaAllowance because that's more accurate

xurui-c · 2025-09-22T02:46:46Z

snuba/web/rpc/__init__.py

            span.description = self.config_key()
-        self.routing_context = RoutingContext(timer=self._timer, in_msg=in_msg)
+        self.routing_context = RoutingContext(
+            timer=self._timer, in_msg=in_msg, query_id=uuid.uuid4().hex


This query_id is later needed for getting & updating quota allowances, so might as well just set it here. Originally query_id is randomly generated right before applying allocation policies, and it seems like the only thing that matters is that query_id stays consistent for getting & updating quota allowances, because it later gets reset in the query pipeline, and ofc for querylog

xurui-c · 2025-09-22T03:02:17Z

snuba/web/rpc/__init__.py

                    "request_invalid",
                    tags=self._timer.tags,
                )
-            elif isinstance(error, AllocationPolicyViolations):


So actually, this metric was never raised because if therre is a AllocationPolicyViolations, then we raise a QueryException: https://github.com/getsentry/snuba/blob/master/snuba/web/db_query.py#L667-L690

xurui-c · 2025-09-22T03:06:55Z

snuba/web/rpc/__init__.py

                    out = self._execute(in_msg)
            else:
-                raise AllocationPolicyViolations
+                self.metrics.increment(


I'm also open to just not having this metric because we emit more detailed ones in output_metrics

xurui-c · 2025-09-22T03:07:26Z

snuba/web/rpc/__init__.py

+                )
+                # thoughts on making AlocationPolicyViolations a subclass of QueryException? QueryOrError expects QueryException
+                raise QueryException.from_args(
+                    AllocationPolicyViolations.__name__,


this is from https://github.com/getsentry/snuba/blob/master/snuba/web/db_query.py#L682-L689

xurui-c · 2025-09-22T03:18:18Z

snuba/web/rpc/__init__.py

        output_metrics_error = None
        try:
            self.routing_decision.strategy.output_metrics(self.routing_decision)
+            if self.routing_decision.routing_context.query_result is not None or isinstance(


This is what I'm trying to do: https://github.com/getsentry/snuba/blob/master/snuba/web/db_query.py#L708-L720

xurui-c · 2025-09-22T03:21:55Z

snuba/web/rpc/__init__.py

-                    tags=self._timer.tags,
-                )
-            else:
+            elif not isinstance(error, QueryException):


This is why I wanted AllocationPolicyViolations to inherit from QueryException (and then I would raise AllocationPolicyViolations as AllocationPolicyViolations, so then this metric gets emitted for non-allocation policy violation errors, since rate limiting shouldn't count as request_error anyways

xurui-c · 2025-09-22T03:47:25Z

snuba/web/rpc/storage_routing/routing_strategies/storage_routing.py

+            ) as span:
+                request_meta = extract_message_meta(routing_context.in_msg)
+                recommendations[allocation_policy_name] = allocation_policy.get_quota_allowance(
+                    {


I checked that this is what tenant_ids is across all endpoints: https://github.com/getsentry/snuba/blob/master/snuba/web/db_query.py#L825

Maybe we should pull this into RoutingContext, but it's getting a little bloated idk

If it's only one project id we use the project id

xurui-c · 2025-09-22T04:11:58Z

snuba/web/rpc/storage_routing/routing_strategies/storage_routing.py

            )
        record_query(_construct_hacky_querylog_payload(self, routing_decision))

+    @final


copies this part:

https://github.com/getsentry/snuba/blob/master/snuba/web/db_query.py#L715-L720

xurui-c · 2025-09-22T04:15:35Z

tests/web/rpc/v1/test_storage_routing.py

                },
                "time_budget": 8000,
            },
+            "allocation_policies_recommendations": {


This is what the new querylog payload looks like;

{ "sql": "", "sql_anonymized": "", "start_timestamp": 1758491216, "end_timestamp": 1758494816, "stats": { "final": false, "cache_hit": 0, "max_threads": 0, "clickhouse_table": "na", "query_id": "dc409fabfddd452f8c842511a24dcc62", "is_duplicate": 0, "consistent": false, "can_run": true, "is_throttled": false, "strategy": "MetricsStrategy", "source_request_id": "fe40d336-153b-4257-8a3e-b9b5fec9f854", "extra_info": { "sampling_in_storage_estimation_time_overhead": { "type": "timing", "value": 19, "tags": null }, "time_budget": 8000, "sampling_in_storage_routing_success": { "type": "increment", "value": 1, "tags": { "tier": "TIER_8" } }, "sampling_in_storage_my_metric": { "type": "increment", "value": 1, "tags": { "a": "b", "c": "d" } }, "sampling_in_storage_query_timing": { "type": "timing", "value": 1.0, "tags": { "tier": "TIER_8" } } }, "clickhouse_settings": {}, "result_info": { "meta": {}, "profile": { "bytes": 420, "elapsed": 1.0 }, "stats": { "stat": 1 }, "sql": "SELECT * FROM your_mom" }, "routed_tier": "TIER_8", "allocation_policies_recommendations": { "ConcurrentRateLimitAllocationPolicy": { "can_run": true, "max_threads": 10, "explanation": {}, "is_throttled": false, "throttle_threshold": 22, "rejection_threshold": 22, "quota_used": 1, "quota_unit": "concurrent_queries", "suggestion": "no_suggestion", "max_bytes_to_read": 0 }, "ReferrerGuardRailPolicy": { "can_run": true, "max_threads": 10, "explanation": {}, "is_throttled": false, "throttle_threshold": 1000000000000, "rejection_threshold": 1000000000000, "quota_used": 0, "quota_unit": "no_units", "suggestion": "no_suggestion", "max_bytes_to_read": 0 }, "BytesScannedRejectingPolicy": { "can_run": true, "max_threads": 10, "explanation": {}, "is_throttled": false, "throttle_threshold": 1000000000000, "rejection_threshold": 1000000000000, "quota_used": 0, "quota_unit": "no_units", "suggestion": "no_suggestion", "max_bytes_to_read": 0 } } }, "status": "0", "trace_id": "", "profile": { "time_range": null, "table": "eap_items", "all_columns": [], "multi_level_condition": false, "where_profile": { "columns": [], "mapping_cols": [] }, "array_join_cols": [], "groupby_cols": [] }, "result_profile": { "bytes": 420, "progress_bytes": 0, "elapsed": 1.0 }, "request_status": "na", "slo": "na" }

volokluev · 2025-09-22T20:58:03Z

snuba/util.py

+def get_max_bytes_to_read(quota_allowances: dict[str, QuotaAllowance]) -> int:
+    """
+    alternative implementation:
+    non_zero_max_bytes = [qa.max_bytes_to_read for qa in quota_allowances.values() if qa.max_bytes_to_read > 0]
+    return min(non_zero_max_bytes) if non_zero_max_bytes else 0
+    """
+    max_bytes_to_read = min(
+        [qa.max_bytes_to_read for qa in quota_allowances.values()],
+        key=lambda mb: float("inf") if mb == 0 else mb,
+    )
+    if max_bytes_to_read != 0:
+        return max_bytes_to_read
+    return 0
+
+


this is too high up in the folder structure. It's specific to allocation policies

volokluev · 2025-09-22T20:59:08Z

snuba/web/rpc/__init__.py

+                    "request_rate_limited",
+                    tags=self._timer.tags,
+                )
+                # thoughts on making AlocationPolicyViolations a subclass of QueryException? QueryOrError expects QueryException


sure go for it

decided against it because it's suppsoed to be chained together not inheritance relationship: https://github.com/getsentry/snuba/blob/master/snuba/web/__init__.py#L20-L22

volokluev · 2025-09-22T21:19:55Z

snuba/web/rpc/storage_routing/routing_strategies/outcomes_based.py

        )

-    def _get_routing_decision(self, routing_context: RoutingContext) -> RoutingDecision:
+    def _get_combined_allocation_policies_recommendations(


other routing strategies will use this. I would put this into a utility function in the allocation policies. All this is doing is combining the results of many policies which should fall into the responsibility of the allocation policy subsystem

As discussed in person, going to put this in BaseRoutingStrategy and other strats can override it. It's up to the routing strategy to decide how to combine recommendations, whether its voting or weighted voting or all or any or whatever idk

xurui-c · 2025-09-23T22:19:20Z

snuba/web/rpc/__init__.py

+                        "experiments": {},
+                    },
+                )
+                error.__cause__ = AllocationPolicyViolations.from_args(


this is the QueryException "chaining"

seer-by-sentry · 2025-09-23T22:25:45Z

snuba/web/rpc/storage_routing/routing_strategies/storage_routing.py

+                recommendations[allocation_policy_name] = allocation_policy.get_quota_allowance(
+                    {
+                        "organization_id": request_meta.organization_id,
+                        "referrer": request_meta.referrer,
+                    },
+                    routing_context.query_id,


Potential bug: The call to get_quota_allowance is missing the required project_id, causing two critical allocation policies to systematically fail for all EAP queries.

Description: The get_quota_allowance method is called with a dictionary containing only organization_id and referrer. However, two of the three allocation policies, ConcurrentRateLimitAllocationPolicy and BytesScannedRejectingPolicy, require a project_id. This omission causes them to raise an InvalidTenantsForAllocationPolicy exception. While this exception is handled and results in a can_run=False response instead of a crash, it constitutes a significant functional bug. It means that these critical resource management policies are not being correctly applied to EAP queries, leading to all such queries being rejected due to missing tenant information rather than actual resource constraints.

Suggested fix: The project_id needs to be extracted from request_meta.project_ids and included in the dictionary passed to allocation_policy.get_quota_allowance. Since project_ids is a list, logic should be added to select the appropriate project_id from the list, likely the first one if the list is not empty.
_{severity: 0.7, confidence: 0.95}

_{Did we get this right? 👍 / 👎 to inform future reviews.}

volokluev · 2025-09-23T22:33:08Z

snuba/query/allocation_policies/__init__.py

+def get_max_bytes_to_read(quota_allowances: dict[str, QuotaAllowance]) -> int:
+    """
+    alternative implementation:
+    non_zero_max_bytes = [qa.max_bytes_to_read for qa in quota_allowances.values() if qa.max_bytes_to_read > 0]
+    return min(non_zero_max_bytes) if non_zero_max_bytes else 0
+    """
+    max_bytes_to_read = min(
+        [qa.max_bytes_to_read for qa in quota_allowances.values()],
+        key=lambda mb: float("inf") if mb == 0 else mb,
+    )
+    if max_bytes_to_read != 0:
+        return max_bytes_to_read
+    return 0
+
+


nit: this file already has a bunch of stuff in it. I would put this function it its own file

volokluev · 2025-09-23T22:35:55Z

snuba/web/db_query.py

+    if attribution_info.app_id.key == "eap":
+        return
+
+    quota_allowances: dict[str, QuotaAllowance] = {}


this is a very hacky way to do this. You're depending on some metadata injected way up somewhere in the pipeline to be set to a specific value.

The much more reasonable thing would be to return early if allocation_policies is empty

volokluev · 2025-09-23T22:36:37Z

snuba/query/allocation_policies/__init__.py

+    alternative implementation:
+    non_zero_max_bytes = [qa.max_bytes_to_read for qa in quota_allowances.values() if qa.max_bytes_to_read > 0]
+    return min(non_zero_max_bytes) if non_zero_max_bytes else 0


what is the point of this docstring?

volokluev · 2025-09-23T22:40:30Z

snuba/web/rpc/__init__.py

+            if self.routing_decision.routing_context.query_result is not None or isinstance(
+                error, QueryException
+            ):
+                self.routing_decision.strategy.update_allocation_policies_balances(self.routing_decision, error)  # type: ignore


This doesn't quite smell right to me. The routing strategy should contain this functionality within itself. it could be that output_metrics is not a good name for the method being called above but overall, there's something a routing strategy is supposed to do before the call (decide the routing tier/time window/ etc) and after the call (output metrics, update quota balances, etc). This is relying on the caller to know something three layers deep. That is a leaky abstraction

volokluev · 2025-09-23T22:43:17Z

snuba/web/rpc/storage_routing/routing_strategies/storage_routing.py

+            ) as span:
+                request_meta = extract_message_meta(routing_context.in_msg)
+                recommendations[allocation_policy_name] = allocation_policy.get_quota_allowance(
+                    {


If it's only one project id we use the project id

volokluev

good progress, a few usability questions and suggestions

volokluev · 2025-10-02T17:12:12Z

snuba/web/rpc/__init__.py

+            elif not (
+                isinstance(error, QueryException)
+                and error.exception_type == AllocationPolicyViolations.__name__
+            ):


this check is kind of ugly, is there no other way to do it?

volokluev · 2025-10-02T17:13:07Z

snuba/web/rpc/__init__.py

+        if routing_strategy_after_execute_error is not None:
+            self.metrics.increment("after_execute_failure")
            sentry_sdk.capture_message(
-                f"Error in routing strategy output metrics: {output_metrics_error}"
+                f"Error in routing strategy after execute: {routing_strategy_after_execute_error}"
            )
            if settings.RAISE_ON_ROUTING_STRATEGY_FAILURES:
-                raise output_metrics_error
+                raise routing_strategy_after_execute_error


this exception handling should probably be taken care of in the routing strategies

volokluev · 2025-10-02T17:15:04Z

snuba/web/rpc/storage_routing/routing_strategies/outcomes_based.py

+        combined_allocation_policies_recommendations = (
+            self._get_combined_allocation_policies_recommendations(routing_context)
+        )
+
        routing_decision = RoutingDecision(
            routing_context=routing_context,
            strategy=self,
            tier=Tier.TIER_1,
-            clickhouse_settings={},
-            can_run=True,
+            clickhouse_settings=combined_allocation_policies_recommendations["settings"],
+            can_run=combined_allocation_policies_recommendations["can_run"],
+            is_throttled=combined_allocation_policies_recommendations["is_throttled"],
        )
+
+        if not routing_decision.can_run:
+            return routing_decision
+


we have more than one routing strategy in use right now. Can we provide sane default behavior so that this doesn't have to be copy-pasted into every new routing strategy we use?

tests/web/rpc/v1/test_storage_routing.py

volokluev · 2025-10-02T17:18:20Z

tests/web/rpc/v1/test_storage_routing.py

+    with mock.patch.object(
+        BaseRoutingStrategy,
+        "get_allocation_policies",
+        return_value=[
+            ThrottleAllocationPolicy(
+                ResourceIdentifier(StorageKey("doesntmatter")), ["a", "b", "c"], {}
+            ),
+            ThrottleAllocationPolicyDuplicate(
+                ResourceIdentifier(StorageKey("doesntmatter")), ["a", "b", "c"], {}
+            ),
+        ],
+    ):


why mock when you can extend and override?

volokluev · 2025-10-14T21:31:57Z

snuba/web/rpc/__init__.py

+            tenant_ids={
+                "organization_id": request_meta.organization_id,
+                "referrer": request_meta.referrer,
+                **(
+                    {"project_id": request_meta.project_ids[0]}
+                    if hasattr(request_meta, "project_ids") and len(request_meta.project_ids) == 1
+                    else {}
+                ),
+            },


this seems like something that should be done in the routing strategy, not the rpc endpoint

I'd rather not set an empty tenant_ids in rpc endpoint and then populate it in routing strategy

Plus, routing strategy selector should be using the fields set in tenant_ids (ie organization_id & project_id), and routing strategy selector happens before routing strategy.

I don't think you need tenant_ids at all in the routing context. This is information can entirely be calculated from the request_meta object. A cleaner solution would be to define a tenant_ids property:

class RoutingContext: @property def tenant_ids(self): return { "organization_id": request_meta.organization_id, "referrer": request_meta.referrer, **( {"project_id": request_meta.project_ids[0]} if hasattr(request_meta, "project_ids") and len(request_meta.project_ids) == 1 else {} ) }

volokluev · 2025-10-14T23:56:20Z

snuba/web/rpc/__init__.py

+                    "request_rate_limited",
+                    tags=self._timer.tags,
+                )
+                error = QueryException.from_args(


I don't see a reason why you have to wrap it in a QueryException

A combination of:

this is an existing pattern: https://github.com/getsentry/snuba/blob/master/snuba/web/db_query.py#L682-L691.

I want this to be caught by: https://github.com/getsentry/snuba/blob/master/snuba/web/rpc/__init__.py#L424-L425

Making AllocationPolicyViolations inherit from QueryException goes against the initial design: ref(cbrs): move EAP allocation policies into the routing strategy layer #7411 (comment)

My comment is not blocking for approval but here are my thoughts in reply:

I think this pattern has always been fairly dysfunctional and we're just perpetuating it. The only thing that really expects a QueryException is the AllocationPolicy:

https://github.com/getsentry/snuba/blob/master/snuba/query/allocation_policies/__init__.py#L45

but this could easily be changed to allow any kind of errors, especially AllocationPolicyViolation(s)

You could make a RPCAllocationPolicyException(RPCException) class and raise it here and it would be caught

You don't need to inherit it from QueryException it can just be its own thing. QueryException denotes a problem with clickhouse query execution and we're now muddying its meaning to also mean the RPC request failed which it could be two differeint things

If you're not going to make the updates above, at the very least I would ask that this ugly chaining thing be abstracted into its own component rather than taking up a whole bunch of space in the RPC method. This is a core component and its functionality should be easy to understand at a glance

Heres a place that also expects a QueryException: https://github.com/getsentry/snuba/blob/master/snuba/web/views.py#L468-L471

I think for now I'll just abstract it into its own component and make a follow up pr

snuba/web/rpc/storage_routing/routing_strategies/storage_routing.py

volokluev

Approving with recommended changes so you're not blocked while I'm on PTO. Please give consideration to my suggestions though and make sure you have a rollout plan

xurui-c force-pushed the rachel/movepolicies branch 5 times, most recently from 4b54c00 to 325e6eb Compare September 19, 2025 04:05

xurui-c commented Sep 22, 2025

View reviewed changes

volokluev reviewed Sep 22, 2025

View reviewed changes

xurui-c force-pushed the rachel/movepolicies branch 3 times, most recently from 5aaa5c4 to 6731d49 Compare September 23, 2025 21:57

xurui-c changed the title ~~tryna move the allocation policies~~ ref(cbrs): move EAP allocation policies into the routing strategy layer Sep 23, 2025

xurui-c commented Sep 23, 2025

View reviewed changes

xurui-c marked this pull request as ready for review September 23, 2025 22:20

xurui-c requested review from a team as code owners September 23, 2025 22:20

seer-by-sentry bot reviewed Sep 23, 2025

View reviewed changes

xurui-c requested a review from volokluev September 23, 2025 22:26

volokluev reviewed Sep 23, 2025

View reviewed changes

xurui-c force-pushed the rachel/movepolicies branch 2 times, most recently from 48bf012 to dc0294c Compare September 30, 2025 23:36

xurui-c requested a review from volokluev October 2, 2025 17:01

volokluev reviewed Oct 2, 2025

View reviewed changes

Rachel Chen added 4 commits October 13, 2025 14:37

squash

775331e

ugly check

bd4f615

error handling

58f87e9

routing context

c7cfcda

xurui-c force-pushed the rachel/movepolicies branch from 789a200 to c7cfcda Compare October 13, 2025 21:43

Rachel Chen added 3 commits October 14, 2025 09:31

comments

602fb76

canrun

11f305a

test

5aa3a88

xurui-c requested a review from volokluev October 14, 2025 18:19

test

08e009d

volokluev reviewed Oct 15, 2025

View reviewed changes

xurui-c requested a review from volokluev October 15, 2025 18:10

comments

0282a73

xurui-c force-pushed the rachel/movepolicies branch from ad16694 to 0282a73 Compare October 15, 2025 18:14

tenant_ids

e2015f2

xurui-c force-pushed the rachel/movepolicies branch from 20288e8 to e2015f2 Compare October 15, 2025 19:40

volokluev approved these changes Oct 15, 2025

View reviewed changes

abstract

6a715e2

xurui-c merged commit 76d6e0b into master Oct 16, 2025
34 checks passed

xurui-c deleted the rachel/movepolicies branch October 16, 2025 20:29

xurui-c mentioned this pull request Oct 16, 2025

Revert "ref(cbrs): move EAP allocation policies into the routing strategy layer" #7457

Closed

xurui-c mentioned this pull request Nov 3, 2025

improvement(cbrs): replace weird chaining of QueryException with RPCAllocationPolicyException #7498

Merged

Uh oh!

ref(cbrs): move EAP allocation policies into the routing strategy layer #7411

ref(cbrs): move EAP allocation policies into the routing strategy layer #7411

Uh oh!

Conversation

xurui-c commented Sep 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov bot commented Sep 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

❌ 1 Tests Failed:

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

xurui-c Sep 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

xurui-c Sep 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

seer-by-sentry bot Sep 23, 2025

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

volokluev left a comment

Choose a reason for hiding this comment

Uh oh!

xurui-c commented Sep 17, 2025 •

edited

Loading

codecov bot commented Sep 19, 2025 •

edited

Loading

xurui-c Sep 22, 2025 •

edited

Loading

xurui-c Sep 22, 2025 •

edited

Loading