Anomaly tests, minor improvements by vb-dbrks · Pull Request #1062 · databrickslabs/dqx

vb-dbrks · 2026-03-09T22:13:11Z

Changes

Refactored MLflow experiment management to cache experiments per worker process, ensuring unique experiments for parallel test execution.
Updated integration tests to use a consistent anomaly threshold for deterministic results, enhancing test reliability.

These changes streamline the testing process and improve the management of MLflow experiments across different test sessions.

Linked issues

No linked issues. The motivation is flaky, long-running tests that run on every PR.

Tests

Locally run

- Added a function to load environment variables from a .env file when not in CI, improving local test configuration.
- Refactored MLflow experiment management to cache experiments per worker process, ensuring unique experiments for parallel test execution.
- Updated integration tests to use a consistent anomaly threshold for deterministic results, enhancing test reliability.

These changes streamline the testing process and improve the management of MLflow experiments across different test sessions.
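The `.env` loading described in this commit message could look roughly like the sketch below. It is not the code from this PR: the function name `load_dotenv_if_local` and its parsing rules are assumptions for illustration, and it only relies on the `CI` environment variable, which GitHub Actions sets to `true`.

```python
import os
from pathlib import Path


def load_dotenv_if_local(path: str = ".env") -> dict:
    """Load KEY=VALUE pairs from a .env file unless running in CI.

    Hypothetical helper: variables already present in the environment
    are left untouched. Returns the pairs that were actually set.
    """
    loaded = {}
    if os.environ.get("CI"):  # set by GitHub Actions, so CI runs skip the file
        return loaded
    env_file = Path(path)
    if not env_file.is_file():
        return loaded
    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")  # drop optional surrounding quotes
        if key and key not in os.environ:
            os.environ[key] = value
            loaded[key] = value
    return loaded
```

Calling such a helper at the top of `conftest.py` would leave CI behaviour unchanged, since the workflow's own environment takes precedence over anything in a local file.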

Copilot

Pull request overview

This PR refines anomaly integration testing by (1) adjusting MLflow experiment lifecycle management for parallel pytest execution and (2) making one mixed-check anomaly test less flaky by using a deterministic threshold and relaxing assertions around anomaly hard-fail behavior.

Changes:

Cache a single MLflow experiment per xdist worker and clean it up at worker session end.
Make test_apply_anomaly_check_by_metadata_with_multiple_checks deterministic by using a fixed threshold and asserting on _dq_info rather than always requiring _errors.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `tests/integration_anomaly/conftest.py` | Reworks MLflow setup/teardown to create/cache one experiment per worker and set experiment env vars per test. |
| `tests/integration_anomaly/test_anomaly_apply_checks_by_metadata.py` | Lowers threshold in a mixed-check test and shifts expectations to `_dq_info` to reduce flakiness. |

tests/integration_anomaly/conftest.py

…ictionary to streamline experiment management.
- Updated the `mlflow_worker_experiment` fixture to utilize the new cache structure, improving clarity and reducing global state usage.
- Enhanced cleanup logic to ensure proper deletion of cached experiments at the end of worker sessions.

These changes improve the organization and efficiency of MLflow experiment handling in the test suite.

github-actions · 2026-03-09T22:29:34Z

✅ 603/603 passed, 36 skipped, 5h11m53s total

Running from acceptance #4088

codecov · 2026-03-10T00:45:29Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.78%. Comparing base (c8079d5) to head (082625c).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1062      +/-   ##
==========================================
- Coverage   91.81%   91.78%   -0.04%     
==========================================
  Files          98       98              
  Lines        8945     8945              
==========================================
- Hits         8213     8210       -3     
- Misses        732      735       +3

Copilot

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.

tests/integration_anomaly/conftest.py

tests/integration_anomaly/test_helpers_data.py

.github/workflows/anomaly.yml

mwojtyczka

Code Review - PR #1062: Anomaly tests, minor improvements

Overview

This PR refactors the anomaly integration test infrastructure:

Extracts test helpers from conftest.py into dedicated modules (test_helpers.py and test_helpers_data.py)
Simplifies MLflow experiment management by caching per worker process via a module-level dict
Removes redundant env var forwarding in the CI workflow (relying on inherited env instead)
Introduces DETERMINISTIC_FLAG_THRESHOLD for tests that need reliable anomaly flagging
Makes flaky assertions more resilient by checking _dq_info presence rather than hard error counts

Overall this is a solid cleanup that improves test organization and reliability. A few items to address below.
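To make the last two points concrete, here is a toy sketch of why a fixed low threshold plus a `_dq_info` presence check is less flaky than asserting on error counts. Everything in it is hypothetical: the value given to `DETERMINISTIC_FLAG_THRESHOLD`, the `apply_toy_anomaly_check` stand-in, and the fields inside `_dq_info` are invented for illustration and do not reflect the DQX API or the real constant.

```python
import random

# Hypothetical value; the review does not state the real constant's value.
DETERMINISTIC_FLAG_THRESHOLD = 0.0


def apply_toy_anomaly_check(rows, threshold, seed=None):
    """Toy stand-in for an anomaly check with a noisy model score.

    Every row gets a `_dq_info` dict; `_errors` is only populated when
    the score reaches the threshold.
    """
    rng = random.Random(seed)
    out = []
    for row in rows:
        score = 0.6 + rng.uniform(-0.1, 0.1)  # score varies between runs
        flagged = score >= threshold
        out.append({
            **row,
            "_dq_info": {"score": score, "threshold": threshold, "is_anomaly": flagged},
            "_errors": [{"name": "anomaly"}] if flagged else None,
        })
    return out


rows = [{"id": i} for i in range(5)]
result = apply_toy_anomaly_check(rows, DETERMINISTIC_FLAG_THRESHOLD)

# Stable assertion: `_dq_info` is always attached, whatever the score.
assert all(r["_dq_info"] is not None for r in result)
```

With a threshold close to the typical score (say 0.6 in this toy), the number of rows carrying `_errors` changes from run to run, which is the kind of assertion the review describes as flaky.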

tests/integration_anomaly/conftest.py

tests/integration_anomaly/test_helpers_data.py

.github/workflows/anomaly.yml

tests/integration_anomaly/test_anomaly_apply_checks_by_metadata.py

tests/integration_anomaly/test_helpers.py

tests/integration_anomaly/conftest.py

mwojtyczka

left comments

mwojtyczka

LGTM

vb-dbrks requested a review from a team as a code owner March 9, 2026 22:13

vb-dbrks requested review from tombonfert and removed request for a team March 9, 2026 22:13

vb-dbrks requested a review from Copilot March 9, 2026 22:15

Copilot started reviewing on behalf of vb-dbrks March 9, 2026 22:15

vb-dbrks changed the title ~~Enhance test environment setup and MLflow integration~~ Anomaly tests, minor improvements Mar 9, 2026

Updated the test for applying anomaly checks by metadata to assert that the `_dq_info` field is not None, improving the robustness of the test by ensuring valid data is present before proceeding with further assertions.

7af4201

Copilot AI reviewed Mar 9, 2026

Refactor anomaly workflow environment variables

533f809

Removed redundant Databricks and MLflow environment variable definitions from the anomaly detection workflow in GitHub Actions. This streamlines the configuration and focuses on essential variables for running integration tests.

Copilot started reviewing on behalf of vb-dbrks March 11, 2026 10:18

Copilot AI reviewed Mar 11, 2026
