Query parameter gets overly escaped when saved to the refcache #239

@chalin

Description

To Reproduce

You can use the attached minimal htmltest-query-escape-test.zip, which contains only the following two files.

index.html:

<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>Example Link</title>
</head>
<body>
  <a href="https://grpc.io?param1=a+b&param2=Hello,+World">
    Example.com greeting
  </a>
</body>
</html>

.htmltest.yml:

StripQueryString: false

To reproduce, run:

$ htmltest -c .htmltest.yml --log-level 1 ./index.html
htmltest started at 11:53:36 on .
========================================================================
index.html
  hitting --- index.html --> https://grpc.io?param1=a+b&param2=Hello,+World
✔✔✔ passed in 356.54455ms

Expected behaviour

The refcache.json should contain the exact same URL as the HTML file, namely:

{
  "https://grpc.io?param1=a+b&param2=Hello,+World": {
    "StatusCode": 206,
    "LastSeen": "2025-02-06T11:53:37.190045-05:00"
  }
}

Actual behaviour

The refcache contains:

{
  "https://grpc.io?param1=a+b\u0026param2=Hello,+World": {
    "StatusCode": 206,
    "LastSeen": "2025-02-06T11:53:37.190045-05:00"
  }
}

Note that the & has been escaped as \u0026. I've also seen + get escaped to &#43;, but I haven't been able to reproduce that in this minimal example yet.
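
For context, the \u0026 form is what Go's encoding/json produces by default: json.Marshal escapes &, <, and > so the output is safe to embed in HTML. The following is a minimal sketch, assuming the refcache is written with encoding/json's default settings (I haven't confirmed this against the htmltest source). The helpers marshalDefault and marshalNoEscape and the map-based cache are hypothetical stand-ins, not htmltest code.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// marshalDefault encodes v with json.Marshal, which by default escapes
// &, <, and > as \u0026, \u003c, and \u003e.
func marshalDefault(v any) (string, error) {
	b, err := json.Marshal(v)
	return string(b), err
}

// marshalNoEscape encodes v with HTML escaping turned off,
// so & is written literally.
func marshalNoEscape(v any) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(v); err != nil {
		return "", err
	}
	// Encode appends a newline; trim it so both helpers are comparable.
	return string(bytes.TrimRight(buf.Bytes(), "\n")), nil
}

func main() {
	// Hypothetical stand-in for a refcache entry: URL mapped to a status code.
	cache := map[string]int{
		"https://grpc.io?param1=a+b&param2=Hello,+World": 206,
	}

	escaped, err := marshalDefault(cache)
	if err != nil {
		panic(err)
	}
	literal, err := marshalNoEscape(cache)
	if err != nil {
		panic(err)
	}

	fmt.Println(escaped) // key contains \u0026
	fmt.Println(literal) // key contains a literal &
}
```

Both outputs are valid JSON and decode to the same key, so the difference only affects how the file reads and diffs, not what a JSON parser sees. If the assumption holds, an encoder with SetEscapeHTML(false) would be one way to get the expected output.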

Versions

  • OS: macOS 12.x
  • htmltest: htmltest 0.17.0

/cc @svrnm @theletterf @tiffany76
