feat: automation of new urls added in sitemap using new slugs on wordpress - 🚧#346
amaan-bhati wants to merge 2 commits into main
Signed-off-by: amaan-bhati <amaanbhati49@gmail.com>
…press Signed-off-by: amaan-bhati <amaanbhati49@gmail.com>
Pull request overview
This PR introduces automation to keep public/sitemap.xml updated with new WordPress blog post URLs (technology/community) by adding sync scripts and a scheduled GitHub Actions workflow that opens an automated PR when changes are detected.
Changes:
- Added Node scripts to fetch post slugs from WPGraphQL and append missing `/blog/technology/*` and `/blog/community/*` entries into `public/sitemap.xml`.
- Added npm scripts to run the sitemap sync from CI.
- Added a scheduled GitHub Actions workflow to run the sync periodically and create a PR with sitemap updates.
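
The append step described above can be sketched as a pure function. This is a minimal illustration, not the code from the PR: `existingLocs`, `appendMissingUrls`, and the `{ category, slug }` post shape are hypothetical names, and the WPGraphQL fetch is left out so the logic can be exercised without network access.

```javascript
// Hypothetical helper: collect every <loc> value already present in the sitemap.
function existingLocs(sitemapXml) {
  const locs = new Set();
  for (const match of sitemapXml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g)) {
    // Ignore trailing slashes so /blog/a and /blog/a/ count as the same URL.
    locs.add(match[1].replace(/\/+$/, ""));
  }
  return locs;
}

// Hypothetical helper: append <url> entries only for posts that are not listed yet.
// Assumes each post looks like { category: "technology" | "community", slug: "..." }.
function appendMissingUrls(sitemapXml, posts) {
  const seen = existingLocs(sitemapXml);
  const additions = [];
  for (const post of posts) {
    const loc = `https://keploy.io/blog/${post.category}/${post.slug}`;
    if (seen.has(loc)) continue; // already in the sitemap (or queued earlier in this run)
    seen.add(loc);
    additions.push(`<url>\n<loc>${loc}</loc>\n</url>\n`);
  }
  if (additions.length === 0) return sitemapXml; // nothing to do: leave the file byte-identical

  const closeIdx = sitemapXml.lastIndexOf("</urlset>");
  if (closeIdx === -1) throw new Error("sitemap is missing </urlset>");
  // Insert the new entries just before the closing tag; existing content is untouched.
  return sitemapXml.slice(0, closeIdx) + additions.join("") + sitemapXml.slice(closeIdx);
}
```

Returning the input unchanged when nothing is missing means a scheduled run produces no diff, so no automated PR needs to be opened.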
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `scripts/sync-tech-urls-to-sitemap.mjs` | Adds a script to fetch technology posts from WPGraphQL and append missing sitemap `<url>` entries. |
| `scripts/sync-blog-urls-to-sitemap.mjs` | Adds a script to fetch all posts, map them to community/technology URLs, and append missing sitemap entries. |
| `public/sitemap.xml` | Updates the sitemap content with many new/changed entries. |
| `package.json` | Adds npm scripts to run the sitemap sync scripts. |
| `.github/workflows/sitemap_sync.yml` | Adds a scheduled workflow to run the sync and open an automated PR. |
| This XML file does not appear to have any style information associated with it. The document tree is shown below. | ||
| <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"> |
public/sitemap.xml is not valid XML: the first line is a plain-text message (likely copied from a browser view) and the XML prolog (<?xml version="1.0" encoding="UTF-8"?>) is missing. This will break sitemap consumers and will also fail the workflow xmllint validation; remove the plain-text line and restore the XML declaration at the top of the file.
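
A cheap guard inside the sync script could catch this kind of damage before the file is written. The sketch below is only an illustration under assumptions: `checkSitemapHead` is a hypothetical name, it inspects just the start of the file, and it is not a substitute for the `xmllint` validation in the workflow.

```javascript
// Hypothetical pre-write check: returns a list of problems found at the top of the file.
function checkSitemapHead(xml) {
  const problems = [];
  const text = xml.replace(/^\uFEFF/, ""); // tolerate a leading byte-order mark

  // The file is expected to open with an XML declaration such as
  // <?xml version="1.0" encoding="UTF-8"?>
  if (!/^<\?xml\s+version="1\.[01]"[^?]*\?>/.test(text)) {
    problems.push("missing XML declaration on the first line");
  }

  // After the declaration (if any), the root <urlset> element should come next.
  // Simplification: comments before the root element are also reported.
  const afterDecl = text.replace(/^<\?xml[^?]*\?>\s*/, "");
  if (!afterDecl.startsWith("<urlset")) {
    problems.push("unexpected content before <urlset>");
  }
  return problems;
}
```

For the file in this PR, both checks would fire: the declaration is absent and the browser's plain-text notice precedes `<urlset>`.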
| <url> | ||
| <loc>https://keploy.io/blog</loc> | ||
| <lastmod>2026-03-11T21:50:59+00:00</lastmod> | ||
| <priority>1.00</priority> | ||
| </url> | ||
| <url> | ||
| <loc>https://keploy.io/blog/technology</loc> | ||
| <lastmod>2026-03-11T21:50:59+00:00</lastmod> | ||
| <priority>0.80</priority> | ||
| </url> | ||
| <url> | ||
| <loc>https://keploy.io/blog/community</loc> | ||
| <lastmod>2026-03-11T21:50:59+00:00</lastmod> |
Many <lastmod> values appear to have been rewritten to the same sync timestamp (e.g. the /blog, /blog/technology, /blog/community entries). lastmod is intended to reflect the actual last modification time of each URL; setting it to a bulk-generated timestamp can mislead crawlers and trigger unnecessary recrawls. Prefer keeping existing lastmod values unchanged and only setting lastmod for newly added URLs based on WordPress modified/date.
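
One way to honor this is to treat entries already in the sitemap as authoritative and only add what is new. The following is a sketch under assumptions, not the PR's code: `mergeEntries` is a hypothetical name, and both inputs are assumed to be arrays of `{ loc, lastmod }` objects already parsed from the sitemap and from WordPress.

```javascript
// Hypothetical merge: existing entries win, so their lastmod values are never rewritten.
// existing: entries parsed from the current sitemap, e.g. { loc, lastmod }
// incoming: entries built from WordPress posts, with lastmod taken from modified/date
function mergeEntries(existing, incoming) {
  const byLoc = new Map(existing.map((entry) => [entry.loc, entry]));
  for (const entry of incoming) {
    // Only URLs that are not in the sitemap yet are added; known URLs keep
    // whatever lastmod they already had.
    if (!byLoc.has(entry.loc)) byLoc.set(entry.loc, entry);
  }
  return [...byLoc.values()]; // Map preserves insertion order: old entries first, new ones after
}
```

Whether a changed WordPress `modified` value should later update an existing entry is a separate decision; this sketch deliberately never does so.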
| const loc = normalizeUrl(`https://keploy.io/blog/technology/${node.slug}`); | ||
| const lastmod = toIso(node.modified) || toIso(node.date) || "2026-03-11T21:50:59+00:00"; | ||
| return { | ||
| loc, | ||
| xml: `<url>\n<loc>${loc}</loc>\n<lastmod>${lastmod}</lastmod>\n<priority>0.51</priority>\n</url>\n`, | ||
| }; |
lastmod falls back to a hard-coded timestamp. If modified/date is missing or unparsable, this will silently write a stale value and is easy to forget to update later. Prefer using the current time (or omitting <lastmod> when unknown) instead of a fixed constant.
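
A sketch of the "omit when unknown" option follows. It assumes a helper that returns `null` for missing or unparsable dates; `toIsoOrNull` and `buildUrlXml` are hypothetical names and may not match the PR's `toIso` behavior.

```javascript
// Hypothetical date helper: returns an ISO 8601 string, or null when the value
// is missing or cannot be parsed. Note: strings without a timezone are
// interpreted as local time by Date, which may or may not match WordPress.
function toIsoOrNull(value) {
  if (!value) return null;
  const parsed = new Date(value);
  return Number.isNaN(parsed.getTime()) ? null : parsed.toISOString();
}

// Hypothetical builder: emits <lastmod> only when a real date is available,
// instead of falling back to a fixed constant.
function buildUrlXml(loc, node) {
  const lastmod = toIsoOrNull(node.modified) || toIsoOrNull(node.date);
  const lastmodLine = lastmod ? `<lastmod>${lastmod}</lastmod>\n` : "";
  return `<url>\n<loc>${loc}</loc>\n${lastmodLine}<priority>0.51</priority>\n</url>\n`;
}
```

If a value is always wanted, the last fallback could instead be `new Date().toISOString()`, at the cost of recording the sync time rather than a real modification time.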
| const url = buildBlogUrl(node); | ||
| if (!url) return null; | ||
| const loc = normalizeUrl(url); | ||
| const lastmod = toIso(node.modified) || toIso(node.date) || "2026-03-11T21:50:59+00:00"; | ||
| return { | ||
| loc, | ||
| lastmod, | ||
| xml: `<url>\n<loc>${loc}</loc>\n<lastmod>${lastmod}</lastmod>\n<priority>0.51</priority>\n</url>\n`, | ||
| }; |
lastmod falls back to a hard-coded timestamp. If modified/date is missing or unparsable, this will write an arbitrary (and eventually stale) value into the sitemap. Prefer using the current time (or omitting <lastmod> when unknown) rather than a fixed constant.