-
Notifications
You must be signed in to change notification settings - Fork 128
asset API: loosen restrictions on qualifiers #329
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
asset API: loosen restrictions on qualifiers #329
Conversation
The spec previously stated that servers must reject Fetch requests if there were unsupported qualifiers specified. This was too strict- consider for example a scenario where the client specifies both SHA256 and SHA512 qualifiers, and the server only supports one of these. Supporting only one of these should be sufficient, and the server should be free to accept such a request.
I agree with a part of your request: the client should be allowed to specify multiple equivalent hashes (using different algorithms), and the server should be allowed to pick any one of those.
I think there are three less invasive changes that would all make your use-case possible:
In addition to any of these, it would be nice if the server could signal which qualifiers have been used (so the client can update its internal knowledge accordingly - only for checksums that were validated). |
There is no central list of valid qualifiers, only some suggested ones. Clients could specify hashes with different qualifiers instead of multiple checksum.sri qualifiers. To generalise my example, if one qualifier is "strong" (like a cryptographic hash) then all other qualifiers may be redundant and safely ignored (assuming the set of qualifiers is not inconsistent). |
I agree. That is my first suggestion (documenting the additional qualifiers in the lexicon is an optional step - you could start using them right away).
That's not what you suggest in the change of the spec. Your change allows the server to ignore any strong qualifiers as it chooses, which leaves clients to guess what validation (if any) has been performed. I want to avoid ambiguity between client and server as much as possible. Another way to achieve this is to add a new field in the request where the client indicates which qualifiers are equivalent to the client (so the server can choose one of them). |
Is this not already the case, but with whitespace separation instead of commas? The SRI spec linked from the qualifier lexicon allows for multiple qualifiers in one entry. If we're going to use SRI, I prefer the idea of sticking with the existing spec as far as possible. However, I worry that it's originally designed as a client-side validation and we're using it here server-side. I think that works sufficiently with the original MUST wording, but if it's a MAY the client doesn't actually know the validation has been performed. To me, the original issue is sufficiently well solved by the client sending multiple specifiers within the one |
Today I learned. In that case, it's probably a good idea to add a short note in the lexicon hinting at the fact that more than one checksum may be sent (separated by whitespace). EDIT: Done in #330 |
See also #301 (comment). Edit: Ah I think @malt3 linked this above. |
Looking at the protocol, we also have the digest_function field in requests, with the following note:
Since the sri only refers to values specified by the client, it sounds like this means that the server must present/compute this digest separately for the purpose of the Response population. Can we specifically associate this language for this field with the response digest field? |
https://www.w3.org/TR/SRI/#agility refers to parsing metadata handling a list of multiple sris via split a string on spaces - why would we go with commas here in disagreement with the standard?
Why would this make them interchangeable? Now you have an ordering/list problem... |
It would be extremely painful if we do that. In Bazel ecosystem, there are cases such as Forcing the sri to be the same as the response digest function would be disruptive for this ecosystem and requires build rules authors and/or users to maintain a manual mapping of integrity themselves.
Reading 3.3.3 -> 3.3.5 carefully, I think you are right.
So it sounds like you can have a payload like this: "qualifiers": [
{
"name": "checksum.sri",
"value": "sha1-abcadf sha256-123dafa sha256-sadfasd sha512-mmdmfdmf sha512-mmmrrrr"
}
] and the server should do the following:
|
This is what the specification says. However, I still think that for a client using the
So if I am building a client that is agnostic of the server (doesn't know about implementation details or configuration), the only guarantee I have is that the server verifies at least one of the provided checksums in |
Yeah, from the Client perspective, you should only pick the The only thing the client should pay attention to is to avoid using sha1 or md5. |
Painful or not, some correspondence with the Digest (and DigestFunction) in the response should be indicated. The notes in the response do allude to this, where it says that:
This implies it's completely up to the server's discretion what they want to do with it, but consider the practical case:
I agree with all of these states/procedures, per the spec. The only effective checksums are those with the strongest hash function, and any of them matching will succeed the download.
All of these things being true, the realization is that the Client in this case is not the user-agent, it is the (spec exemplified) html author adding tags. It is foregoing that user-agent role by requesting via the fetch api. The html author is no more in control of the browser user-agent than the client is here. The server's guidelines should be the spec, and it is outside of the specification to dictate how the server should implement these: there is no agnosticism permitted here, hence the digest flexibility requirement. |
Not only that. Even though the Remote Asset API is currently only used as a way to speed up downloads of remote assets, it could also be used to completely avoid the downloading of assets on the client, having the client place these files in input roots of actions "by reference". Because REv2 doesn't allow input roots containing files with different digest functions, it would be severely restrictive if the Remote Asset API required that the digest function of these requests matches the one in checksum.sri. |
Clarifying here - there's no association between the sris' hash-algos, and the requested digested function. Explicitly, they're being isolated in my request for additional language - they have nothing to do with one another. The requested digest function should only correspond to the digest function (and computed digest) in the response. Since there are other algos in the mix, I suggested that we add language to tie the two together (Request.DigestFunction -> Response.{Digest,DigestFunction}) distinct from any qualifiers, if a server chooses to implement them. |
I'm in favor of adding language for this. It appears to match everyone's expectations. This behavior is also implemented by all servers I could find so far. In addition, it would be nice to have some other field in the response to signal additional validations performed by the server (like a field specifying which part of checksum.sri was verified by the server), but I'm not convinced that this is easy to do, without being too focused on
|
Response already contains a Qualifier field for population with the utilized checksum.sri. |
That's correct, but right now it's hard to reason about it in the case of |
The language is supportive, but not prescriptive, for such filtering. I don't think any interactions with remote are wholly reliable just by [comment] text in this specification, and there's not enough wide coverage to say that a server is or is not compliant. Telling servers explicitly what they must provide hasn't been the theme of expansions here, it tends to be what the clients react with. |
I think it would be permissible by the current spec to introduce a new qualifier that is only used to convey validated checksums ( |
Trying to summarize the discussion here, with a few of my own opinions:
All of the above should be farmed off to separate PRs. Per the original request: should the server be allowed to ignore unsupported Qualifiers? IMO it should NOT. While the cited example of multiple verification methods is reasonable, allowing the server to ignore unsupported verification methods would be specifying which Qualifiers it is OK to ignore (or to ignore if there's another, equivalent, qualifier already present). I prefer a model where the server MUST respect all qualifiers, but the qualifier itself can be defined flexibly (as is the case with checksum.sri). |
The spec previously stated that servers must reject Fetch requests if there were unsupported qualifiers specified. This was too strict- consider for example a scenario where the client specifies both SHA256 and SHA512 qualifiers, and the server only supports one of these. Supporting only one of these should be sufficient, and the server should be free to accept such a request.