Skip to content

Use git commit hashes instead mercurial hashes as canonical.#2658

Open
lutien wants to merge 2 commits intomozilla:masterfrom
lutien:use-git-commit-hashes-as-canonical
Open

Use git commit hashes instead mercurial hashes as canonical.#2658
lutien wants to merge 2 commits intomozilla:masterfrom
lutien:use-git-commit-hashes-as-canonical

Conversation

@lutien
Copy link
Member

@lutien lutien commented Nov 14, 2025

No description provided.

@lutien lutien requested a review from jgraham November 14, 2025 19:14
@lutien lutien force-pushed the use-git-commit-hashes-as-canonical branch from 818398b to 7512167 Compare January 14, 2026 09:25
Copy link
Member

@jgraham jgraham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I think we want to split this work up into multiple phases:

  • First start to store the new git revision for each commit in the metadata
  • Then backfill that into existing commit metadata (e.g. as a one-time job)
  • Then start to actually use the git revision instead of the hg revision.

This is generally rather high regression risk, because we might end up comparing incorrect revisions, and the type system won't help us. We could I suppose build some infrastructure so it can help us (stop passing strings around, but pass around HgSha and GitSha objects that can't be compared). But we at least need to be careful.

Comment on lines +60 to +62
api_url = https://lando.moz.tools/api
hg2git = /hg2git/firefox/
git2hg = /git2hg/firefox/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these need to be configuration? Certainly not the paths.

def hg2git(hg_hash: str) -> str:
session = requests.Session()
try:
response = tc.fetch_json(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any specific reason to use tc.fetch_json rather than urllib directly? Also if we aren't passing in a session that function will create one, so I don't think we need to create one here, except if we really have to do these repeated calls for each revision, in which case I think we probably want to create a single session to reduce the overhead a bit.

But more generally I wonder if there's a better option for converting SHAs back and forth than doing an HTTP call for each SHA we want to convert. It feels slow and fragile. There isn't some easier way to either convert many at once, or download the full table locally is there? We could cache the results, but that might be a bit high overhead…

# If there's already a patch file here then don't try to create a new one
# because we'll presumbaly fail again
raise AbortError("Skipping due to existing patch")
wpt_commit = gecko_commit.move(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Commit.move method itself isn't updated in this patch (unless I'm misreading it), so I think we're now being inconsistent about what the patch file is called.

)

@mut()
def reapply_local_commits(self, gecko_commits_landed: set[str]) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we don't use this method at all at the moment, but I think if we did it would be broken? gecko_commits_landed is built from the gecko-commit metadata which is still using the hg revision, I think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants