@ppkarwasz
Contributor

This change clarifies the meaning of “excessive slashes” (/) in namespace, name, and subpath by adding explicit requirements for parsers:

  • Namespace: Parsers must ignore all empty segments. Leading and trailing slashes are just special cases of empty segments (e.g., /foo/ → ["", "foo", ""]).

  • Name: Parsers must ignore trailing slashes. To avoid ambiguity when multiple leading slashes appear (which some parsers might interpret as part of the name and others as part of the namespace), parsers should remove them in either case. For example:

    • pkg:type/namespace//name → the intended namespace is namespace (written here with a trailing slash, as namespace/) and the name is name, but some parsers might incorrectly treat the name as /name.
    • pkg:type//name → the intended namespace is empty, and the name is name, but some parsers might incorrectly treat the name as /name.
  • Subpath: Parsers must apply the same rule and ignore empty segments.
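
The three rules above reduce to one operation: split on / and drop empty segments. A minimal Python sketch follows. It is not taken from the specification or from any existing parser. The helper names `non_empty_segments` and `split_namespace_and_name` are hypothetical, and the input is assumed to be the text that follows `pkg:type/` and precedes any version, qualifiers, or subpath. Percent-decoding and other subpath rules are left out.

```python
def non_empty_segments(path: str) -> list[str]:
    """Split on '/' and drop empty segments.

    Leading, trailing, and consecutive slashes all produce empty
    segments, so a single filter covers every case.
    """
    return [segment for segment in path.split("/") if segment]


def split_namespace_and_name(path: str) -> tuple[str, str]:
    """Hypothetical helper: the last non-empty segment is the name,
    everything before it is the namespace (possibly empty)."""
    segments = non_empty_segments(path)
    if not segments:
        raise ValueError("a PURL requires a non-empty name")
    return "/".join(segments[:-1]), segments[-1]


# pkg:type/namespace//name -> remainder after "pkg:type/" is "namespace//name"
print(split_namespace_and_name("namespace//name"))  # ('namespace', 'name')
# pkg:type//name -> remainder after "pkg:type/" is "/name"
print(split_namespace_and_name("/name"))  # ('', 'name')
```

Because the filter runs before the name is split off, a stray slash can never end up at the start of the name, which is the misreading described in the two examples.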

These changes aim to resolve ambiguities in the current specification. For example, under the existing wording:

  • Leading and trailing slashes / “should be stripped in the canonical form,” but it is unclear whether this refers to the canonical form of the entire PURL or only the encoded namespace.
  • At the same time, those slashes are described as not being part of the namespace.

The revision also extends the specification by requiring parsers to collapse consecutive slashes within the encoded namespace and subpath, that is, to drop the empty segments between them.

This change addresses part of the ambiguities described in #584 by eliminating the category of “invalid but tolerated” PURLs. Instead, it shifts the responsibility to parsers, which must leniently normalize PURLs with excessive slashes into valid ones. Importantly, this normalization can be applied without requiring a full parser.
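
As a rough illustration of normalizing without a full parser, the Python sketch below works on the raw string. It is only a sketch under stated assumptions: the function name `normalize_slashes` and the regular expression are hypothetical, not part of the specification. It also assumes that any literal `@`, `?`, or `#` inside a component is percent-encoded, so the first unencoded occurrence of each acts as a separator. The version and qualifiers are passed through unchanged.

```python
import re

# Assumed overall shape: scheme:path[@version][?qualifiers][#subpath]
_PURL_SHAPE = re.compile(
    r"^(?P<scheme>[^:]+):(?P<path>[^@?#]*)"
    r"(?P<version>@[^?#]*)?(?P<qualifiers>\?[^#]*)?(?:#(?P<subpath>.*))?$"
)


def _drop_empty_segments(path: str) -> str:
    """Remove leading, trailing, and consecutive slashes."""
    return "/".join(segment for segment in path.split("/") if segment)


def normalize_slashes(purl: str) -> str:
    """Hypothetical pre-processing pass: only slashes are touched."""
    match = _PURL_SHAPE.match(purl)
    if match is None:
        raise ValueError(f"not shaped like a PURL: {purl!r}")
    result = f"{match['scheme']}:{_drop_empty_segments(match['path'])}"
    result += match["version"] or ""
    result += match["qualifiers"] or ""  # slashes in qualifier values are kept
    subpath = _drop_empty_segments(match["subpath"] or "")
    if subpath:
        result += "#" + subpath
    return result


print(normalize_slashes("pkg:type/namespace//name/@1.0?k=a//b#/src//main/"))
# pkg:type/namespace/name@1.0?k=a//b#src/main
```

The output can then be handed to a stricter parser that rejects excessive slashes outright.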
