-
-
Notifications
You must be signed in to change notification settings - Fork 62
feat(cbrs): Time Window Routing #7418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
routing_context.extra_info["estimation_sql"] = res.extra.get("sql", "") | ||
return cast(int, res.result.get("data", [{}])[0].get("num_items", 0)) | ||
|
||
def _adjust_time_window(self, routing_context: RoutingContext) -> TimeWindow | None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when does this function return None?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if there is no adjustment to be made
window_length = original_end_ts - original_start_ts | ||
|
||
start_timestamp_proto = TimestampProto( | ||
seconds=original_end_ts - math.floor((window_length / factor)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this so we prioritize more recent data? and the user will paginate forwards?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct
start_timestamp: TimestampProto | ||
end_timestamp: TimestampProto | ||
|
||
def length_hours(self) -> float: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is this used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
debugging :D
ingested_items = self.get_ingested_items_for_timerange( | ||
routing_context, original_time_window | ||
) | ||
factor = ingested_items / max_items |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is how we actually shrink the time window
# TODO import these from sentry-relay | ||
class OutcomeCategory: | ||
SPAN_INDEXED = 16 | ||
LOG_ITEM = 23 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy pasta, remove
_ITEM_TYPE_TO_OUTCOME = { | ||
TraceItemType.TRACE_ITEM_TYPE_SPAN: OutcomeCategory.SPAN_INDEXED, | ||
TraceItemType.TRACE_ITEM_TYPE_LOG: OutcomeCategory.LOG_ITEM, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy pasted, remove this
column("category"), | ||
_ITEM_TYPE_TO_OUTCOME.get( | ||
in_msg_meta.trace_item_type, | ||
OutcomeCategory.SPAN_INDEXED, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is wrong, the default should not be span indexed
✅ All tests passed in 1293.61s |
This PR implements a routing strategy for routing to TIER_1 and instead of downsampling storage tiers, shrinking time windows.
Design decisions
Window Sizing Algorithm
This is the easiest thing to change in this PR (and likely it will change). At the moment, it looks up the amount of outcomes for the requested time range and shrinks the time window down assuming that the distribution of datapoints is uniform across time (which is not true). Most recent datapoints are prioritized first
Pagination
Only the TraceItemTable endpoint makes use of the recommendations of this routing strategy, the endpoint and routing strategy interact across queries. This is to facilitate a simple client side UX where all the client has to do is pass the
page_token
across their requests and not worry about anything else.Here's a diagram explaining the flow:
In order to facilitate pagination, the TraceItemTable endpoint now queries for
limit + 1
rows in order to know if there are more items in this current window or if we can move on to the next oneWhat's missing