Skip to content

feat: add Data Quality Agent configuration and setup#373

Open
RadovanTomik wants to merge 5 commits intosamply:developfrom
RadovanTomik:feat/data_quality_agent
Open

feat: add Data Quality Agent configuration and setup#373
RadovanTomik wants to merge 5 commits intosamply:developfrom
RadovanTomik:feat/data_quality_agent

Conversation

@RadovanTomik
Copy link
Copy Markdown

Adds an opt-in option to install the Data Quality Agent as part of the BH

@RadovanTomik RadovanTomik requested a review from a team as a code owner March 3, 2026 12:37
@RadovanTomik
Copy link
Copy Markdown
Author

still need to figure out the best way to point it to Blaze without user input on start

Comment on lines +8 to +10
# Remove these 2 environment variables if you do not wish to share Data Quality Reports
REPORTING_SERVER_URL: ${DATA_QUALITY_SERVER_URL}
REPORTING_SERVER_NAME: ${DATA_QUALITY_SERVER_NAME}
Copy link
Copy Markdown
Member

@Threated Threated Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment here is rather useless since one would just not set ENABLE_DATA_QUALITY_AGENT if they don't want it (and nobody really reads the compose files).
Would it make sense to default these vars to the values you mentioned in the README or are they defaulted by the component itself?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand this correctly, setting those ENVs activates remote sharing of the quality reports. The application can be used for local reporting nevertheless. Does the application just check if the ENVs are set or does it also check if they are empty? If the latter, you could include the ENVs (and the comment) in the configuration template (bbmri.conf). That way, sites not opting in can leave them empty there.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for the comments, I have modified the documentation and setup so its clear they can be used for local data quality evaluation and also optional sharing of the metrics

README.md Outdated
DATA_QUALITY_SERVER_NAME=Central Data Quality Server of BBMRI
```

Reports are stored under `/var/cache/bridgehead/bbmri/data-quality-agent-reports/` and are accessible at `https://<your-host>/bbmri-data-quality-agent` (requires Bridgehead login).
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently not mounted as a volume in the compose file so it won't be persisted.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed and they can be accessed via the UI.

labels:
- "traefik.enable=true"
- "traefik.http.routers.data_quality_agent_bbmri.rule=PathPrefix(`/bbmri-data-quality-agent`)"
- "traefik.http.services.data_quality_agent_bbmri.loadbalancer.server.port=8080"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can see through docker inspecting the image this service runs on 8082 no?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed. Is the path prefix correct/conventional for the BH?

@Threated
Copy link
Copy Markdown
Member

Threated commented Mar 3, 2026

still need to figure out the best way to point it to Blaze without user input on start

You can just use point it to "http://bridgehead-bbmri-blaze:8080/fhir" (with or without the /fhir whatever you like) its a bbmri module so there should always be a blaze under that url without auth because its doing docker networking and not going via traefik

@RadovanTomik
Copy link
Copy Markdown
Author

still need to figure out the best way to point it to Blaze without user input on start

You can just use point it to "http://bridgehead-bbmri-blaze:8080/fhir" (with or without the /fhir whatever you like) its a bbmri module so there should always be a blaze under that url without auth because its doing docker networking and not going via traefik

added as an ENV

@RadovanTomik
Copy link
Copy Markdown
Author

Can we test the config on BBMRI-ERIC dev BH?

@RadovanTomik
Copy link
Copy Markdown
Author

Hi, @TKussel could you please take a look at the PR?

Copy link
Copy Markdown
Member

@Threated Threated left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. @TKussel do we want to replicate the ghcr image to our registry or just keep the ghcr url?

- /etc/timezone:/etc/timezone:ro
volumes:
agent-data:
driver: local No newline at end of file
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
driver: local

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better yet, remove the volume and mount a folder (see above)

Comment on lines +9 to +10
REPORTING_SERVER_URL: ${DATA_QUALITY_SERVER_URL}
REPORTING_SERVER_NAME: ${DATA_QUALITY_SERVER_NAME}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does your app handle the case where these are set to "" gracefully? Just asking because we had issues with this for our components often times.

depends_on:
- "blaze"
volumes:
- agent-data:/app/data
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use a docker volume but use a folder mount in /var/cache/bridgehead/

- /etc/timezone:/etc/timezone:ro
volumes:
agent-data:
driver: local No newline at end of file
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better yet, remove the volume and mount a folder (see above)


#### Sharing Data Quality Reports (recommended)

We strongly encourage sharing your data quality reports with the central BBMRI-ERIC quality dashboard. The reports contain only aggregated, non-patient-identifiable statistics and help the network to monitor and improve overall data quality.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
We strongly encourage sharing your data quality reports with the central BBMRI-ERIC quality dashboard. The reports contain only aggregated, non-patient-identifiable statistics and help the network to monitor and improve overall data quality.
We encourage sharing your data quality reports with the central BBMRI-ERIC quality dashboard. The reports contain only aggregated, non-patient-identifiable statistics and help the network to monitor and improve overall data quality. However, quality reporting is completely optional and opt-in.

@TKussel
Copy link
Copy Markdown
Member

TKussel commented Mar 24, 2026

Hi, @TKussel could you please take a look at the PR?

Sorry, took me a while. Thank you for continually improving the PR. I only had some nitpicks regarding the volume remaining :)

@TKussel
Copy link
Copy Markdown
Member

TKussel commented Mar 24, 2026

Looks good to me. @TKussel do we want to replicate the ghcr image to our registry or just keep the ghcr url?

We currently don't have the ghcr in Bridgehead's required endpoint list so, I replicated the latest tag to our registry: https://docker.verbis.dkfz.de/cache/bbmri-cz/data-quality-server:latest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants