feat: Traceloop evals #36
Conversation
return True

# PRIORITY 2: Check for explicit LLM span kind (even without messages, for compatibility)
if span_kind == "llm":
Limit this check to the Traceloop-defined span kinds - https://www.traceloop.com/docs/openllmetry/contributing/semantic-conventions#llm-frameworks
# PRIORITY 3: Detect ReAct agent/task spans by kind
# These are agent workflows that contain LLM calls
if span_kind in ["agent", "task", "workflow"]:
Ideally, evals should be limited to LLM spans.
# ------------------------------------------------------------------
# Internal helpers
# ------------------------------------------------------------------
def _is_llm_span(self, span: ReadableSpan) -> bool:
Ideally, this method should return True only for spans that are LLM calls or chat calls with the model.
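A hedged sketch of that suggestion: treat a span as an LLM span only when it represents a direct model call, using the Traceloop `traceloop.span.kind` attribute and the OpenTelemetry GenAI `gen_ai.operation.name` attribute as signals. The operation list and the standalone function shape are assumptions, not the PR's actual code:

```python
# Hypothetical replacement for _is_llm_span, operating on span attributes.
# Only spans representing a direct LLM/chat call with the model pass.
LLM_OPERATIONS = {"chat", "completion", "text_completion"}

def is_llm_span(attributes: dict) -> bool:
    # Traceloop marks direct model calls with span kind "llm"
    if attributes.get("traceloop.span.kind") == "llm":
        return True
    # OTel GenAI conventions mark model calls via gen_ai.operation.name
    return attributes.get("gen_ai.operation.name") in LLM_OPERATIONS
```

Agent, task, and workflow spans would then be excluded by construction rather than special-cased.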
# Mapping from original span_id to translated INVOCATION (not span) for parent-child relationship preservation
self._original_to_translated_invocation: Dict[int, Any] = {}
# Buffer spans to process them in the correct order (parents before children)
self._span_buffer: List[ReadableSpan] = []
Look into the span buffer logic
Remove the buffer logic if it is not required.
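If buffering turns out to be unnecessary, the processor can dispatch each span as it ends instead of accumulating it in `_span_buffer`. A simplified, buffer-free sketch; the class and attribute names are illustrative, not the PR's code:

```python
class DirectSpanHandler:
    """Buffer-free alternative: handle each span in on_end, no _span_buffer."""

    def __init__(self):
        self.evaluated = []

    def on_end(self, span_kind: str, span_name: str) -> None:
        # Evaluate LLM spans immediately; drop the rest instead of buffering.
        if span_kind == "llm":
            self.evaluated.append(span_name)
```

This avoids holding spans in memory and sidesteps the parent-before-child ordering concern entirely, at the cost of losing any batch-level view of non-LLM spans.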
# STEP 2: Check if this is an LLM span that needs evaluation
if self._is_llm_span(span):
    _logger.debug(
        "🔍 TRACELOOP PROCESSOR: LLM span '%s' detected! Processing immediately for evaluations",
Remove emojis from the log messages.
        "Failed to stop LLM invocation: %s", stop_err
    )
else:
    # Non-LLM spans (tasks, workflows, tools) - buffer for optional batch processing
Revisit this logic
This builds on top of #29.