Skip to content

Conversation

xGladiate
Copy link

@xGladiate xGladiate commented Oct 12, 2025

Description:

This PR fixes the following issues with the question syncing process:

  • Only leetcode DB sync is completed when it returns
  • The cursor was updated to the nearest 200 questions instead of the last sync point.
  • Failed sync questions are forgotten during syncing. Tracks and remembers skipped questions during syncing.
  • Seed cursor only validates sync to leetcode-service instead of question-service.

Changes Made

  • Health Check for Question Service done before fetching leetcode questions into the leetcode-service.questions DB
  • The next sync now starts from the last successful sync point.
  • Database operations are strictly limited to inserts—no updates if the question already exists in the database. (it was previously not explicitly guarded)
  • Tracks and remembers and attempt to resync skipped questions during syncing.

@Copilot Copilot AI review requested due to automatic review settings October 12, 2025 16:14
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes synchronization issues in the LeetCode question seeding process by improving cursor management and database operations. It removes the "done" state tracking in favor of more robust pagination handling and optimizes the database schema for better performance.

  • Replaces "done" state with total-based cursor management to prevent re-fetching completed data
  • Changes from $set to $setOnInsert to prevent overwriting existing question data
  • Optimizes database indexes by adding unique constraint on titleSlug and improving compound indexes

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
backend-services/leetcode-backend-service/src/leetcode/seedBatch.ts Removes done state tracking, improves cursor pagination logic, and changes upsert strategy to preserve existing data
backend-services/leetcode-backend-service/src/db/model/question.ts Adds unique index on titleSlug and optimizes compound indexes for better query performance

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines 72 to 73
// setOnInsert to avoid overwriting existing entries
$setOnInsert: {
Copy link

Copilot AI Oct 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using $setOnInsert instead of $set means that existing questions will never be updated with new data from LeetCode. This could leave stale data in the database if question details change on LeetCode's side.

Suggested change
// setOnInsert to avoid overwriting existing entries
$setOnInsert: {
// set to update existing entries with latest data
$set: {

Copilot uses AI. Check for mistakes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the intended outcome, not to update the database - to prevent issue where update may overwrite update made manually.

@xGladiate xGladiate changed the title fix: Fix Question Service bugs and edge cases fix: Fix Question Service seed-batch bugs and edge cases Oct 12, 2025
@xGladiate xGladiate self-assigned this Oct 12, 2025
@Copilot Copilot AI review requested due to automatic review settings October 12, 2025 16:40
Copilot

This comment was marked as resolved.

@Copilot Copilot AI review requested due to automatic review settings October 12, 2025 16:53
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines 90 to 104
},
},
Copy link

Copilot AI Oct 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updatedAt field was removed from $setOnInsert but upsert: true still allows updates. This could lead to inconsistent timestamp behavior for existing records. Consider either adding updatedAt back or clarifying the intended behavior for record updates.

Suggested change
},
},
updatedAt: new Date(),
},
$set: {
updatedAt: new Date(),
},

Copilot uses AI. Check for mistakes.

Copy link
Author

@xGladiate xGladiate Oct 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our case, an additional

$set: {
    updatedAt: new Date(),
},

is not required since we only want to insert new records and don't want to update existing ones. Moreover, since the schema already has { timestamps: true }, MongoDB will take care of setting updatedAt when a new document is inserted.

@xGladiate xGladiate force-pushed the fix-question-edge-cases branch from d7a163c to ceaa880 Compare October 14, 2025 15:26
@Copilot Copilot AI review requested due to automatic review settings October 15, 2025 16:04
@xGladiate xGladiate force-pushed the fix-question-edge-cases branch from ceaa880 to c970774 Compare October 15, 2025 16:04
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

const cursor =
(await SeedCursor.findById(id)) ??
new SeedCursor({ _id: id, nextSkip: 0, pageSize: 200, done: false });
new SeedCursor({ _id: id, nextSkip: 0, pageSize: 200 });
Copy link

Copilot AI Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pageSize is hardcoded as a magic number. Consider defining this as a named constant at the module level for better maintainability.

Copilot uses AI. Check for mistakes.

Comment on lines +112 to +117
if (nextSkip + pageSize > total) {
// Prevent future refetching of previously fetched items
cursor.nextSkip = total;
} else {
cursor.nextSkip = nextSkip + pageSize;
}
Copy link

Copilot AI Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cursor advancement logic is duplicated from lines 67. Consider extracting this into a helper function to avoid code duplication.

Copilot uses AI. Check for mistakes.

}

if (attempt < retries) {
await delay(250 * 2 ** attempt);
Copy link

Copilot AI Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The retry delay calculation uses magic numbers. Consider defining base delay (250ms) as a named constant for better maintainability.

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant