fix: Clarify handling of / in the specification
#687
This change clarifies the meaning of “excessive slashes” (`/`) in `namespace`, `name`, and `subpath` by adding explicit requirements for parsers:

- Namespace: Parsers must ignore all empty segments. Leading and trailing slashes are just special cases of empty segments (e.g., `/foo/` → `["", "foo", ""]`).
- Name: Parsers must ignore trailing slashes. To avoid ambiguity when multiple leading slashes appear (which some parsers might interpret as part of the `name` and others as part of the `namespace`), parsers should remove them in either case. For example:
  - `pkg:type/namespace//name` → the intended namespace is `namespace/` and the name is `name`, but some parsers might incorrectly treat the name as `/name`.
  - `pkg:type//name` → the intended namespace is empty, and the name is `name`, but some parsers might incorrectly treat the name as `/name`.
- Subpath: Parsers must apply the same rule and ignore empty segments.
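As a rough illustration of the rules above, the following Python sketch drops empty segments and then takes the last remaining segment as the name. The function names `non_empty_segments` and `split_namespace_and_name` are hypothetical and not part of the specification or of any existing PURL library. The sketch also assumes its input is only the portion of the PURL between the type and any version, qualifiers, or subpath, and it does no percent-decoding.

```python
def non_empty_segments(component: str) -> list[str]:
    """Split a component on '/' and drop every empty segment.

    Leading, trailing, and consecutive slashes all produce empty
    segments, so one filter covers all three cases.
    """
    return [segment for segment in component.split("/") if segment]


def split_namespace_and_name(path: str) -> tuple[str, str]:
    """Split the text after 'pkg:type/' into (namespace, name).

    Hypothetical helper: assumes version, qualifiers, and subpath
    have already been removed from `path`.
    """
    segments = non_empty_segments(path)
    if not segments:
        raise ValueError("a PURL requires a non-empty name")
    # The last non-empty segment is the name; the rest is the namespace.
    return "/".join(segments[:-1]), segments[-1]


# pkg:type/namespace//name
print(split_namespace_and_name("namespace//name"))  # ('namespace', 'name')
# pkg:type//name
print(split_namespace_and_name("/name"))            # ('', 'name')
```

Because both components go through the same filter, a slash that one parser would attach to the name and another to the namespace is discarded either way, so neither reading can produce `/name`.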
These changes aim to resolve ambiguities in the current specification. For example, under the existing wording, `/` “should be stripped in the canonical form,” but it is unclear whether this refers to the canonical form of the entire PURL or only the encoded `namespace`.

The revision also extends the specification by requiring parsers to strip consecutive slashes that appear within the encoded `namespace` and `subpath`.

This change addresses part of the ambiguities described in #584 by eliminating the category of “invalid but tolerated” PURLs. Instead, it shifts the responsibility to parsers, which must leniently normalize PURLs with excessive slashes into valid ones. Importantly, this normalization can be applied without requiring a full parser.