Fix Pagination Bug in Github GraphQL Job Collector and add configurable page and batch size #8616
Summary
This PR fixes the pagination bug from #8615 and adds the ability to adjust the collection parameters of the GitHub GraphQL job collector through environment variables.
The bug was caused by a pagination implementation for workflow jobs in `job_collector.go` that didn't play well with the simultaneous batching of multiple workflow runs. In `getPageInfo`, the first workflow run that had `HasNextPage` set to true returned its `EndCursor`, which was then used to paginate all workflow runs in the batch. That cursor only worked for the run it came from, so all further pages of the other workflow runs in the same batch were missed.
Since some people (@ClaudioMascaro and @robaca) reported timeout issues with this plugin for large repositories with large workflows, I didn't want to take away either option. But combining pagination and batching at the same time didn't seem feasible without adding a lot of hard-to-maintain complexity (like keeping track of all produced `EndCursor` values and submitting them to the following collection, but only for those workflow runs that had more pages...).
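To make the failure mode concrete, here is a minimal sketch, not the actual code from `job_collector.go`. The types `pageInfo` and `runJobsPage` and the functions `firstCursor` and `cursorsByRun` are hypothetical names invented for illustration. `firstCursor` mimics the behaviour described above (one cursor chosen for the whole batch), while `cursorsByRun` shows the per-run bookkeeping that combining pagination and batching would require.

```go
package main

import "fmt"

// pageInfo mirrors the two GraphQL pagination fields mentioned above.
type pageInfo struct {
	HasNextPage bool
	EndCursor   string
}

// runJobsPage is a hypothetical stand-in for one workflow run's
// jobs connection inside a batched response.
type runJobsPage struct {
	RunID    int
	PageInfo pageInfo
}

// firstCursor reproduces the faulty pattern: the cursor of the first
// run with more pages is returned and would be reused for every run.
func firstCursor(batch []runJobsPage) (string, bool) {
	for _, r := range batch {
		if r.PageInfo.HasNextPage {
			return r.PageInfo.EndCursor, true
		}
	}
	return "", false
}

// cursorsByRun keeps one cursor per run that still has pages left,
// which is the extra state a combined approach would have to carry.
func cursorsByRun(batch []runJobsPage) map[int]string {
	cursors := make(map[int]string)
	for _, r := range batch {
		if r.PageInfo.HasNextPage {
			cursors[r.RunID] = r.PageInfo.EndCursor
		}
	}
	return cursors
}

func main() {
	batch := []runJobsPage{
		{RunID: 1, PageInfo: pageInfo{HasNextPage: false}},
		{RunID: 2, PageInfo: pageInfo{HasNextPage: true, EndCursor: "cursor-run-2"}},
		{RunID: 3, PageInfo: pageInfo{HasNextPage: true, EndCursor: "cursor-run-3"}},
	}
	shared, _ := firstCursor(batch)
	fmt.Println("cursor applied to the whole batch:", shared)
	fmt.Println("runs that actually need their own cursor:", len(cursorsByRun(batch)))
}
```

In this example run 3 has more pages, but its cursor is never used because only the cursor of run 2 is returned for the batch.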
I ended up implementing a configurable mode switch:
I think it doesn't make a difference rate-limit-wise (if I understood the point system correctly), since both ways would consume about the same number of points in total.
Since it was suggested in #8469 to make the batch size and page size configurable via environment variables, I also added this to give users more flexibility in finding the sweet spot for their setup.
Does this close any open issues?
Closes #8615
Other Information
Maybe it would also make sense to add these configuration options to Scope Config and thus make them configurable on a per scope config basis, but this would also need migration scripts and so on. So maybe an improvement for the future?
I will add tests as well in the following days, if this approach seems appropriate to you.