Conversation
- Added a function to load environment variables from a .env file when not in CI, improving local test configuration.
- Refactored MLflow experiment management to cache experiments per worker process, ensuring unique experiments for parallel test execution.
- Updated integration tests to use a consistent anomaly threshold for deterministic results, enhancing test reliability.

These changes streamline the testing process and improve the management of MLflow experiments across different test sessions.
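As a rough illustration of the first bullet, a `.env` loader that is skipped in CI could look like the sketch below. It is not the code from this PR: the names `parse_dotenv` and `load_dotenv_if_local` are hypothetical, the parsing is deliberately minimal (no `export` prefix, no variable expansion), and it assumes CI is detected through the `CI` environment variable, which GitHub Actions sets to `true`.

```python
import os
from pathlib import Path


def parse_dotenv(text):
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key:
            # Drop surrounding whitespace and simple single/double quotes.
            values[key] = value.strip().strip("'\"")
    return values


def load_dotenv_if_local(path=".env", environ=None):
    """Copy .env values into the environment unless running in CI.

    Returns only the keys that were actually set. `environ` defaults to
    os.environ and is a parameter only to make the function easy to test.
    """
    env = os.environ if environ is None else environ
    # Assumption: any non-empty CI variable means "running in CI".
    if env.get("CI"):
        return {}
    env_file = Path(path)
    if not env_file.is_file():
        return {}
    loaded = {}
    for key, value in parse_dotenv(env_file.read_text(encoding="utf-8")).items():
        # Variables that are already set take precedence over the file.
        if key not in env:
            env[key] = value
            loaded[key] = value
    return loaded
```

Letting existing variables win over the file keeps an explicitly exported value from being silently overwritten by a stale local `.env`.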
…at the `_dq_info` field is not None, improving the robustness of the test by ensuring valid data is present before proceeding with further assertions.
Pull request overview
This PR refines anomaly integration testing by (1) adjusting MLflow experiment lifecycle management for parallel pytest execution and (2) making one mixed-check anomaly test less flaky by using a deterministic threshold and relaxing assertions around anomaly hard-fail behavior.
Changes:
- Cache a single MLflow experiment per xdist worker and clean it up at worker session end.
- Make `test_apply_anomaly_check_by_metadata_with_multiple_checks` deterministic by using a fixed threshold and asserting on `_dq_info` rather than always requiring `_errors`.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `tests/integration_anomaly/conftest.py` | Reworks MLflow setup/teardown to create/cache one experiment per worker and set experiment env vars per test. |
| `tests/integration_anomaly/test_anomaly_apply_checks_by_metadata.py` | Lowers threshold in a mixed-check test and shifts expectations to `_dq_info` to reduce flakiness. |
…ictionary to streamline experiment management.
- Updated the `mlflow_worker_experiment` fixture to utilize the new cache structure, improving clarity and reducing global state usage.
- Enhanced cleanup logic to ensure proper deletion of cached experiments at the end of worker sessions.

These changes improve the organization and efficiency of MLflow experiment handling in the test suite.
✅ 603/603 passed, 36 skipped, 5h11m53s total
Running from acceptance #4088
Codecov Report
✅ All modified and coverable lines are covered by tests.
Coverage Diff
| | main | #1062 | +/- |
|---|---|---|---|
| Coverage | 91.81% | 91.78% | -0.04% |
| Files | 98 | 98 | |
| Lines | 8945 | 8945 | |
| Hits | 8213 | 8210 | -3 |
| Misses | 732 | 735 | +3 |
Removed redundant Databricks and MLflow environment variable definitions from the anomaly detection workflow in GitHub Actions. This streamlines the configuration and focuses on essential variables for running integration tests.
Pull request overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.
mwojtyczka left a comment
Code Review - PR #1062: Anomaly tests, minor improvements
Overview
This PR refactors the anomaly integration test infrastructure:
- Extracts test helpers from `conftest.py` into dedicated modules (`test_helpers.py` and `test_helpers_data.py`)
- Simplifies MLflow experiment management by caching per worker process via a module-level dict
- Removes redundant env var forwarding in the CI workflow (relying on inherited env instead)
- Introduces `DETERMINISTIC_FLAG_THRESHOLD` for tests that need reliable anomaly flagging
- Makes flaky assertions more resilient by checking `_dq_info` presence rather than hard error counts
Overall this is a solid cleanup that improves test organization and reliability. A few items to address below.
Changes
These changes streamline the testing process and improve the management of MLflow experiments across different test sessions.
Linked issues
No linked issues. This PR addresses flaky and long-running tests that run on every PR.
Resolves #..
Tests
Run locally.