Skip to content

Conversation

@sriram-atlan
Copy link

Change description

Run tag propagation in isolation.
Stop this atlas pod running tag task from taking any other requests
introduce another endpoint /ready that will be used in readinessProbe. If tag propagation is ON, than this readiness endpoint will exclude this pod from serving web traffic.

Type of change

  • Bug fix (fixes an issue)
  • New feature (adds functionality)

Related issues

https://atlanhq.atlassian.net/browse/MLH-1239

Fix #1

Helm Config Changes for Running Tests (Staging PR)

Does this PR require Helm config changes for testing?

  • Tests are NOT required for this commit. (You can proceed with the PR.) ✅
  • No, Helm config changes are not needed. (You can proceed with the PR.) ✅
  • Yes, I have already updated the config-values on enpla9up36. (You can proceed with the PR.) ✅
  • Yes, but I have NOT updated the config-values. (Please update them before proceeding; or, tests will run with default values.)⚠️

Checklists

Development

  • Lint rules pass locally
  • Application changes have been tested thoroughly
  • Automated tests covering modified code pass

Security

  • Security impact of change has been considered
  • Code follows company security practices and guidelines

Code review

  • Pull request has a descriptive title and context useful to a reviewer. Screenshots or screencasts are attached as necessary
  • "Ready for review" label attached and reviewers assigned
  • Changes have been reviewed by at least one other contributor
  • Pull request linked to task tracker where applicable

} else {
LOG.info("TasksFetcher: No task to queue");
if (submittedCount > 0) {
LOG.info("TasksFetcher: Submitted {} tasks to the queue", submittedCount);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Task Submission and Latch Decrement Mismatch

The CountDownLatch is initialized with the total task count, but submitAll skips null tasks or exits early without decrementing it. This prevents the latch from reaching zero, causing waitForTasksToComplete to hang. Also, the new task submission retry loop lacks a maximum retry limit, risking an infinite loop if memory stays high.

Fix in Cursor Fix in Web

Response.ResponseBuilder builder;

// Return 503 if processing tasks or high memory
if (isProcessingTasks || memoryUsage > 0.75) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Inconsistent Memory Thresholds Across Components

The new /admin/ready endpoint's memory threshold for readiness is hardcoded to 0.75. This creates an inconsistency with the configurable AtlasConfiguration.TASK_MEMORY_THRESHOLD_PERCENT used by other components like TaskQueueWatcher, potentially causing different behavior between the readiness check and task processing.

Fix in Cursor Fix in Web

submitAll(tasks, latch);
LOG.info("Submitted {} tasks to the queue", tasks.size());
waitForTasksToComplete(latch);
isProcessingTasks.set(true);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic just responsible for submittiing tasks to the queue.
IMO it should never tell whether isProcessingTasks, best please for this flag would be the TaskExecutor.run() method.

Also, generally we do check RequestContext.get().getCurrentTask(); to conclude whether any task is in progress or not, may be same can be used instead the flag but this way also fine

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants