Conversation

@dzbarsky (Contributor, PR author)

What type of PR is this?
Feature

What package or component does this PR mostly affect?
go_repository

What does this PR do? Why is it needed?
When GOPROXY is enabled, a module fetch reduces to an HTTP download from the proxy, which Bazel is capable of doing on its own. This lets the rules play more nicely with Bazel's downloader config, the remote asset API, etc.
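For context, under the GOPROXY protocol a module zip is served at a fixed URL, which is what makes the fetch a plain HTTP download that Bazel's downloader can perform. A Python sketch of the URL mapping (the helper names here are illustrative, not part of the PR; the '!'-escaping rule is from the protocol, see `go help goproxy`):

```python
def escape_path(path):
    """Escape a module path or version for use in a proxy URL.

    The proxy protocol is case-insensitive-filesystem-safe: every
    uppercase letter is replaced by '!' followed by its lowercase form,
    e.g. github.com/Azure -> github.com/!azure.
    """
    out = []
    for ch in path:
        if ch.isupper():
            out.append("!" + ch.lower())
        else:
            out.append(ch)
    return "".join(out)

def module_zip_url(proxy, module, version):
    """URL of the module zip that ctx.download_and_extract would fetch."""
    return "{}/{}/@v/{}.zip".format(
        proxy.rstrip("/"), escape_path(module), escape_path(version))
```

For example, `module_zip_url("https://proxy.golang.org", "github.com/BurntSushi/toml", "v1.3.2")` yields a URL under `github.com/!burnt!sushi/toml/@v/`.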

Which issue(s) does this PR fix?

Fixes #

Other notes for review

@dzbarsky dzbarsky force-pushed the zbarsky/direct-download branch from 752ad5e to 78f1456 Compare June 20, 2025 16:55
auth = use_netrc(read_user_netrc(ctx), ctx.attr.urls, ctx.attr.auth_patterns),
allow_fail = True,
)
if result.success:
Member:

In which cases do we expect this to fail? Should we print a warning if it does, or fail instead and provide a way to opt into the fallback?

@dzbarsky (Contributor, PR author):

I believe native Go tooling will silently fall back when there are multiple comma-separated options in GOPROXY (as there are by default), so I was attempting to replicate that behavior. For example, GOPROXY=https://nosuchproxy.com,https://proxy.golang.org,direct should continue to work, as should GOPROXY=https://nosuchproxy.com,direct. I figured the easiest way to handle this edge case was to fall back to the full resolution chain via the native tooling, even if that retries the first (failed) proxy.

Probably worth at least a comment, but I'm happy to revise the approach if you have a better idea.

Contributor:

To give a taste of the complexity here: it's important to fall back to the next option only if the return code is specifically 404 or 410. The proxy should be able to deny fallback by returning another error code, like 403. That lets a proxy enforce usage policies without needing to serve all external modules.

The separators are `,` (the behavior above) and `|` (fall back on any error, including unavailability).
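The comma/pipe semantics described above can be sketched in Python (hypothetical helper names; this is not the PR's code, and it ignores details like URL validation and the sumdb checks that `go mod download` also performs):

```python
import re

def parse_goproxy(goproxy):
    """Yield (url, fallback_on_any_error) pairs from a GOPROXY value.

    The separator *after* an entry determines how failures of that
    entry are treated: ',' allows fallback only on 404/410, '|' allows
    fallback on any error.
    """
    entries = []
    parts = re.split(r"([,|])", goproxy)
    for i in range(0, len(parts), 2):
        sep = parts[i + 1] if i + 1 < len(parts) else None
        entries.append((parts[i].strip(), sep == "|"))
    return entries

def fetch_with_fallback(goproxy, fetch):
    """fetch(url) returns (status, body) or raises on a network error."""
    for url, any_error in parse_goproxy(goproxy):
        if url in ("off", "direct"):
            # Handed off to other machinery (e.g. `go mod download`).
            return (url, None)
        try:
            status, body = fetch(url)
        except Exception:
            if any_error:
                continue  # '|' separator: unavailability allows fallback
            raise
        if status == 200:
            return (url, body)
        if status in (404, 410) or any_error:
            continue  # not found here; try the next proxy
        # Any other code (e.g. 403) deliberately denies fallback.
        raise RuntimeError("proxy denied request with HTTP %d" % status)
    raise RuntimeError("module not found on any proxy")
```

This illustrates why peeling off only the first attempt (as the PR does) is much simpler than replicating the full chain.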

@dzbarsky (Contributor, PR author):

Ah, good to know. That reaffirms my belief that if we go down this route, we should not try to replicate the full fallback logic but defer to go mod download after the first failure, which is what this PR does.

Do you think there are cases where ctx.download_and_extract does NOT fail but we should still do a fallback?

if not pat:
return False
if pat.endswith("/..."):
return module_path.startswith(pat[:-4])
Member:

This would let foo/bar/... match foo/barbaz.
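One way to fix the boundary bug is to require the stripped prefix to end exactly at a path separator. A Python sketch of the corrected check (hypothetical helper mirroring the Starlark snippet above):

```python
def matches_prefix_pattern(pat, module_path):
    """Match a 'foo/bar/...' pattern without crossing path components.

    'foo/bar/...' should match 'foo/bar' and 'foo/bar/baz',
    but not 'foo/barbaz' (the bug pointed out above).
    """
    if not pat.endswith("/..."):
        return pat == module_path
    prefix = pat[:-len("/...")]
    return module_path == prefix or module_path.startswith(prefix + "/")
```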

@dzbarsky (Contributor, PR author):

Thanks, good catch. I'll add some tests.

@dzbarsky (Contributor, PR author):

Actually, I'll remove this for now, given #2124 (comment).

if goproxy == "off":
return None

def matches_pattern(pattern, module_path):
Member:

Could this be a top-level function? That would avoid some overhead.

return False
if pat.endswith("/..."):
return module_path.startswith(pat[:-4])
if "*" in pat:
Member:

Can this occur in any position? The logic doesn't look quite right for that.

@dzbarsky (Contributor, PR author):

Not only is this incorrect, it's also insufficient: this would need to implement all of https://pkg.go.dev/path#Match to really be correct. I'm tempted to just avoid this new codepath entirely if anything is set in GOPRIVATE/GONOPROXY; trying to handle it correctly doesn't feel worth it, at least for the first cut.
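To illustrate why a naive `"*" in pat` check is insufficient: Go's path.Match never lets `*` cross a `/`, unlike substring or fnmatch-style checks. A conservative Python approximation covering only `*` wildcards (no `?`, `[...]`, or escapes; an assumed simplification, not a full path.Match implementation):

```python
import re

def go_path_match_star(pattern, name):
    """Approximate Go's path.Match for patterns containing only '*'.

    Each '*' matches any run of non-'/' characters; everything else
    matches literally. Whole-name match, as path.Match requires.
    """
    regex = "".join("[^/]*" if ch == "*" else re.escape(ch)
                    for ch in pattern)
    return re.fullmatch(regex, name) is not None
```

Note that `go_path_match_star("*", "a/b")` is False, whereas Python's `fnmatch` would say True; this is exactly the kind of divergence that makes replicating the Go semantics risky.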

@dzbarsky dzbarsky force-pushed the zbarsky/direct-download branch from 78f1456 to 4c6288d Compare June 23, 2025 00:58
@fmeum fmeum requested review from jayconrod and linzhp June 23, 2025 20:09
@dzbarsky dzbarsky force-pushed the zbarsky/direct-download branch from 4c6288d to ef58f55 Compare June 23, 2025 20:33
@dzbarsky dzbarsky force-pushed the zbarsky/direct-download branch from ef58f55 to ecd4203 Compare June 23, 2025 20:34
@jayconrod (Contributor) left a comment:

I don't think this is safe, and I'd prefer we not go in this direction for a few reasons.

  1. At this point, go_repository should almost always be used in module mode, with version and sum set, not with sha256. I don't think there's any need to support GOPATH mode at this point. sum is in a different format than sha256: it's a hash of the uncompressed files and metadata, not of the archive itself. Bazel doesn't know how to verify these hashes, so we can't use ctx.download_and_extract.
  2. There is a lot of complex logic in go mod download that's important for security and privacy. I don't think we should bypass that.

)
result = ctx.download_and_extract(
url = url,
sha256 = ctx.attr.sha256,
Contributor:

ctx.attr.sha256 is not set in module mode. can_use_direct_download is only set when ctx.attr.version is set (L231), but we verify sha256 is not set in L233.

@dzbarsky (Contributor, PR author):

Ah right, we should snip this line, good catch. Note that the sha256 was actually not the part ensuring correctness of the download; we invoke the fetch_repo tool in -no-fetch mode, so it still computes the dirhash and verifies it against the value from go.sum.

@dzbarsky (Contributor, PR author):

> I don't think this is safe, and I'd prefer we not go in this direction for a few reasons.
>
> 1. At this point, `go_repository` should almost always be used in module mode, with `version` and `sum` set, not with `sha256`. I don't think there's any need to support GOPATH mode at this point. `sum` is in a different format than `sha256`: it's a hash of the uncompressed files and metadata, not of the archive itself. Bazel doesn't know how to verify these hashes, so we can't use `ctx.download_and_extract`.

The sha256 attribute was an oversight, but the actual correctness is verified via the golang.org/x/mod/sumdb/dirhash mechanism and compared to sum. I'm not sure if you missed that in the PR or if I'm misunderstanding what you're saying :)
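For reference, the dirhash scheme mentioned here is what go.sum records as an `h1:` value: a SHA-256 over a manifest of per-file SHA-256 hashes, not a hash of the zip archive. A Python transcription for illustration (the `files` mapping of archive-internal names to contents is an assumed test fixture):

```python
import base64
import hashlib

def hash1(files):
    """Compute an "h1:" hash the way dirhash.Hash1 does: SHA-256 over
    lines of "<hex sha256 of file contents>  <name>\n", names sorted,
    then standard base64 of the digest."""
    outer = hashlib.sha256()
    for name in sorted(files):
        inner = hashlib.sha256(files[name]).hexdigest()
        outer.update(("%s  %s\n" % (inner, name)).encode())
    return "h1:" + base64.b64encode(outer.digest()).decode()
```

This is why Bazel's built-in `sha256` verification can't replace the sum check: the hash covers extracted file contents, so fetch_repo (or equivalent) still has to compute it after extraction.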

There's still a question of what happens if the download is corrupted; the fetch_repo invocation will fail because the sum mismatches, but will the re-run get a fresh attempt, or will the corrupted download be cached? I think we need a way to invalidate it in that case.

> 2. There is a lot of complex logic in `go mod download` that's important for security and privacy. I don't think we should bypass that.

Is this the same concern you mentioned above about the complex fallback behavior (which I think this PR sidesteps by only peeling off the first attempt into Starlark), or are there other behaviors you're worried about?

@jayconrod (Contributor):

> The sha256 attribute was an oversight, but the actual correctness is verified via the golang.org/x/mod/sumdb/dirhash mechanism and compared to sum. I'm not sure if you missed that in the PR or if I'm misunderstanding what you're saying :)

I didn't get that deep into the code; my concerns are more architectural (below).

> Is this the same concern you mention above about the complex fallback behavior (which I think this PR sidesteps by only peeling off the first attempt into Starlark) or are there other behaviors you're worried about?

Yes, other behaviors, too.

I think go mod download is the safest, most correct way to do this. I worked for a few years on the team that implemented it, and there's a surprising amount of subtlety. We spent a lot of time on privacy and security safeguards, along with lots of bug fixes and standardization. People were pretty unhappy with us until all of that was built, so I want to make sure it stays intact.

Could you say more about the problem being solved? You mentioned Bazel's downloader config, but could you give more detail? I wonder if this is something that could be solved with a customized Go module proxy server. The proxy protocol is meant to be simple to implement: most of the complexity is on the client side (go mod download). It can be done with a static file server or a pull-through cache.

You mentioned remote asset API too, but I don't think that will work as long as go_repository needs to run Gazelle locally to generate BUILD files.

@fmeum (Member) commented Jun 24, 2025:

We should also take into account that --repo_contents_cache exists now and, at least in theory, could receive a remote backend implementation.

@dzbarsky (Contributor, PR author) commented Jul 1, 2025:

> Could you say more about the problem being solved? You mentioned Bazel's downloader config, but could you give more detail? I wonder if this is something that could be solved with a customized Go module proxy server. The proxy protocol is meant to be simple to implement: most of the complexity is on the client side (go mod download). It can be done with a static file server or a pull-through cache.

Taking a step back, the problem we've observed is that there are times when proxy.golang.org appears to be flaky or under high load, which manifests as go mod download experiencing timeouts that fail the repository rule and thus the build. By using Bazel's downloader to fetch these modules, we are able to redirect the downloads to an instance of BB Remote Asset, which pulls them from the underlying CAS, removing the need to hit proxy.golang.org and increasing reliability.

Of course, the same effect can be achieved by running our own module proxy server, but it's appealing to reuse the solution that's already powering download caching for packages in other languages (Python, JS, Java, etc.).

> You mentioned remote asset API too, but I don't think that will work as long as go_repository needs to run Gazelle locally to generate BUILD files.

We haven't seen any more flakes in the weeks since running with this patch, so I suspect that even though the entire repository rule execution cannot be cached, the download itself comes from the CAS and then Gazelle still runs locally.

> We should also take into account that --repo_contents_cache exists now and, at least in theory, could receive a remote backend implementation.

Yeah, perhaps some of the recent/upcoming changes can provide more holistic caching. I'm all for being able to cache the state of the repository post-Gazelle-run, as long as it invalidates correctly :)

@jayconrod (Contributor):

I think what's needed here is a service that acts as a Go module proxy but is actually a transparent pull-through cache. It could store files in the CAS using the Asset API. I'm not sure such a thing exists yet, but it would be relatively simple to implement, and architecturally, it seems like the right thing for this situation. The rules shouldn't have to change for this.

The main risk of using the CAS as storage is durability: files can get garbage collected after a while. But proxy.golang.org has some guarantees for durability, at least for open source modules.
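The pull-through cache suggested here can be sketched in a few lines. The class and its injected dependencies are hypothetical; a real implementation would back `store` with the CAS via the Asset API and put an HTTP server in front:

```python
class PullThroughProxy:
    """Serve GOPROXY-protocol paths from local storage, fetching from
    the upstream proxy on a miss and caching the result."""

    def __init__(self, store, fetch_upstream):
        self.store = store                    # any mapping; CAS-backed in practice
        self.fetch_upstream = fetch_upstream  # path -> bytes, raises on failure

    def get(self, path):
        """Serve a path like 'example.com/m/@v/v1.0.0.zip'."""
        if path not in self.store:
            self.store[path] = self.fetch_upstream(path)
        return self.store[path]
```

Because module zips for a given version are immutable, the cache never needs revalidation, which is what makes the pull-through design so simple; the durability caveat below is the main remaining concern.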
