Description
PS: This is solved for me; please see the first comment on this issue.
🆒 Your use case
Problem Statement
When using the custom endpoints feature to define dynamic URLs, the sitemap module performs automatic URL encoding. This process appears to use a function similar to encodeURI(), which causes problems for certain URLs.
For example, when a dynamic slug contains a reserved character such as a dollar sign ($) or a colon (:), the module leaves it unencoded. However, if the slug contains a non-ASCII character (e.g., an emoji ❓), it is percent-encoded.
As a result, URLs with reserved characters reach the sitemap unencoded. Search Console rejects them with an "Invalid character" error, citing RFC 3986, RFC 3987, and the XML standard.
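To illustrate the behavior described above, here is a small sketch that runs the two standard JavaScript encoders on the slugs used in the reproduction in this issue. It only shows how encodeURI() and encodeURIComponent() differ; whether the module calls encodeURI() internally is an assumption, and `compareEncoders` is a hypothetical helper name.

```typescript
// Compare the two built-in encoders on a single slug.
// `compareEncoders` is a hypothetical helper used only for illustration.
function compareEncoders(slug: string): { uri: string; component: string } {
  return {
    // encodeURI() leaves reserved characters such as $ and : untouched.
    uri: encodeURI(slug),
    // encodeURIComponent() percent-encodes $ and :, but still leaves - and ) as-is.
    component: encodeURIComponent(slug),
  }
}

const reserved = compareEncoders('$-:)')
const emoji = compareEncoders('😅')

console.log(reserved.uri)       // $-:)
console.log(reserved.component) // %24-%3A)
console.log(emoji.uri)          // %F0%9F%98%85
console.log(emoji.component)    // %F0%9F%98%85
```

The encodeURI() results match the two URLs reported in the sitemap output, which is consistent with the module using that function or something close to it.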
Use Case
We are generating dynamic URLs from a database where slugs may contain a mix of characters, including reserved characters and Unicode. We are already using encodeURIComponent() to correctly handle all of these characters on the application side. A global option to disable the sitemap module's automatic URL encoding would allow us to pass our correctly encoded URLs directly to the sitemap, resolving this conflict and enabling successful sitemap submission to Search Console.
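The conflict can be sketched as follows, assuming the module applies something equivalent to encodeURI() to every `loc`. `moduleLikeEncode` is a hypothetical stand-in for that step, not the module's real function.

```typescript
// Assumption: the module runs an encodeURI()-like step on every `loc`.
// `moduleLikeEncode` is a hypothetical stand-in, not part of the sitemap module.
function moduleLikeEncode(loc: string): string {
  return encodeURI(loc)
}

const slug = '$-:)'

// Application side: the slug is already encoded correctly.
const appEncoded = encodeURIComponent(slug)

// Module side: encodeURI() also encodes the '%' signs, so the value is encoded twice.
const afterModule = moduleLikeEncode(appEncoded)

// Decoding once now yields the encoded form instead of the original slug.
const decodedOnce = decodeURIComponent(afterModule)

console.log(appEncoded)  // %24-%3A)
console.log(afterModule) // %2524-%253A)
console.log(decodedOnce) // %24-%3A)
```

Under that assumption, a pre-encoded URL would end up pointing at a different path than the one the application serves.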
🆕 The solution you'd like
Feature Request
Please consider adding a global configuration option to disable automatic encoding for all dynamic URLs. This would give developers control over the encoding process and prevent double-encoding issues.
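As a rough illustration of the intended behavior, the option could gate the encoding step as in the sketch below. This is not the module's actual code: `EncodingOptions` and `resolveLoc` are hypothetical names, and the use of encodeURI() is an assumption.

```typescript
// Hypothetical sketch: none of these names exist in the sitemap module.
interface EncodingOptions {
  encodeDynamicUrls: boolean
}

function resolveLoc(loc: string, options: EncodingOptions): string {
  // When the flag is off, trust the caller's encoding and pass the value through.
  return options.encodeDynamicUrls ? encodeURI(loc) : loc
}

const preEncoded = encodeURIComponent('$-:)')

const passedThrough = resolveLoc(preEncoded, { encodeDynamicUrls: false })
const encodedAgain = resolveLoc(preEncoded, { encodeDynamicUrls: true })

console.log(passedThrough) // %24-%3A)
console.log(encodedAgain)  // %2524-%253A)
```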
The proposed syntax would be:
export default defineNuxtConfig({
  sitemap: {
    encodeDynamicUrls: false,
  },
})
🔍 Alternatives you've considered
No response
ℹ️ Additional info
export default defineSitemapEventHandler(async () => {
  const tests = ['$-:)', '😅']
  const test = tests.map((slug) => {
    return {
      loc: slug,
    }
  })
  return [
    ...test
  ]
})
Above code outputs:
http://localhost:3000/%F0%9F%98%85
http://localhost:3000/$-:)
Search Console rejects the resulting sitemap.xml as invalid because:
Invalid character: $
All values in a Sitemap must be entity-escaped. Please check to make sure that your URLs follow the RFC-3986 standard for URIs, the RFC-3987 standard for IRIs, and the XML standard.
This issue is currently framed as a feature request, but the inconsistent URL encoding could be considered a bug. I am open to any alternative solutions you might propose to resolve this problem.