XML sitemap

MetricSpot tries to fetch /sitemap.xml. The sitemap is how you tell search engines and AI crawlers which URLs on your site exist and matter.

What this check does

The check makes a GET request to https://yourdomain.com/sitemap.xml and verifies that the response is a valid sitemap or sitemap index. A separate check looks for a Sitemap: line in your robots.txt.
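
To make the idea concrete, here is a minimal sketch of what such a check could look like. It is not MetricSpot's actual implementation: the names classifySitemap and checkSitemap are hypothetical, and a lightweight root-element test stands in for full XML validation.

```typescript
// Possible outcomes of this (hypothetical) sitemap check.
type SitemapKind = "urlset" | "sitemapindex" | "invalid";

// Classify a response body by its root element.
// Assumption: a regex check is good enough for a sketch; a real
// checker would parse the XML and validate it against the schema.
function classifySitemap(body: string): SitemapKind {
  const stripped = body
    .replace(/^\uFEFF/, "")            // byte-order mark
    .replace(/<\?xml[^>]*\?>/gi, "")   // XML declaration, xml-stylesheet
    .replace(/<!--[\s\S]*?-->/g, "")   // comments
    .trim();
  const root = stripped.match(/^<([A-Za-z_][\w.-]*)[\s>]/);
  if (!root) return "invalid";
  // Require at least one <loc> so an empty shell does not pass.
  const hasLoc = /<loc>[^<]+<\/loc>/.test(stripped);
  if (root[1] === "urlset" && hasLoc) return "urlset";
  if (root[1] === "sitemapindex" && hasLoc) return "sitemapindex";
  return "invalid";
}

// Fetch /sitemap.xml from an origin such as "https://yourdomain.com".
async function checkSitemap(origin: string): Promise<SitemapKind> {
  const res = await fetch(new URL("/sitemap.xml", origin));
  if (!res.ok) return "invalid";
  return classifySitemap(await res.text());
}
```

An HTML error page served with status 200 (a common "soft 404") is classified as invalid, because its root element is neither urlset nor sitemapindex.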

Why it matters

A sitemap is the explicit signal of which URLs are part of your site. Without one, crawlers rely entirely on links to find your pages, and any URL that is hard to reach by following links (orphan pages, deeply nested category pages, recently published content) may take weeks to be discovered, or be missed entirely.

For AI crawlers like GPTBot, ClaudeBot, and PerplexityBot, the sitemap is arguably even more important: they tend to crawl more narrowly than Googlebot, so a sitemap gives them a ready-made inventory of your canonical URLs.

How to fix it

Generate a sitemap at the root of your domain. Most frameworks do this automatically:

Astro: npm install @astrojs/sitemap, then in astro.config.mjs:

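import { defineConfig } from 'astro/config';
import sitemap from '@astrojs/sitemap';

export default defineConfig({
  // The integration needs `site` to build absolute URLs.
  site: 'https://yourdomain.com',
  integrations: [sitemap()],
});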

Next.js (App Router): create app/sitemap.ts returning an array of { url, lastModified, changeFrequency, priority }.
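
A minimal sketch of such a file follows. The two routes and the BASE_URL constant are placeholders, and the local SitemapEntry type is only there to keep the example self-contained; in a real project you would more likely import the MetadataRoute type from 'next' and return MetadataRoute.Sitemap.

```typescript
// app/sitemap.ts
// Local stand-in for Next.js's MetadataRoute.Sitemap entry shape,
// defined here only so the sketch runs without the `next` package.
type SitemapEntry = {
  url: string;
  lastModified?: string | Date;
  changeFrequency?:
    | "always" | "hourly" | "daily" | "weekly"
    | "monthly" | "yearly" | "never";
  priority?: number;
};

// Placeholder domain; replace with your own.
const BASE_URL = "https://yourdomain.com";

// Next.js calls the default export and serves the result at /sitemap.xml.
export default function sitemap(): SitemapEntry[] {
  return [
    {
      url: `${BASE_URL}/`,
      lastModified: new Date("2026-05-01"),
      changeFrequency: "weekly",
      priority: 1,
    },
    {
      url: `${BASE_URL}/blog`,
      lastModified: new Date("2026-05-01"),
      changeFrequency: "daily",
      priority: 0.8,
    },
  ];
}
```

For larger sites the array would typically be built from your CMS or database rather than written by hand.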

WordPress: Yoast and Rank Math publish a sitemap index at /sitemap_index.xml, and SEOPress publishes one at /sitemaps.xml. If /sitemap.xml does not already redirect to the index, add a redirect.

Hand-rolled: a static sitemap.xml works for small sites:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yourdomain.com/</loc>
    <lastmod>2026-05-01</lastmod>
  </url>
</urlset>
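
If you would rather generate that file from a list of pages than edit it by hand, a small script is enough. The sketch below is one possible approach, not a required tool; the Page type and the buildSitemap and escapeXml helpers are hypothetical names introduced for illustration.

```typescript
// One entry per page; lastmod is optional (W3C date, e.g. "2026-05-01").
type Page = { loc: string; lastmod?: string };

// URLs may contain "&" and other characters that must be escaped in XML.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build the full sitemap.xml document as a string.
function buildSitemap(pages: Page[]): string {
  const urls = pages.map((p) => {
    const lastmod = p.lastmod
      ? `\n    <lastmod>${escapeXml(p.lastmod)}</lastmod>`
      : "";
    return `  <url>\n    <loc>${escapeXml(p.loc)}</loc>${lastmod}\n  </url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
    "",
  ].join("\n");
}

// Example: write the returned string to sitemap.xml in your web root.
const xml = buildSitemap([
  { loc: "https://yourdomain.com/", lastmod: "2026-05-01" },
  { loc: "https://yourdomain.com/pricing" },
]);
```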

After publishing, submit it once in Google Search Console → Sitemaps. Google will recrawl it automatically going forward.

Frequently asked questions

Does the sitemap need to be at /sitemap.xml exactly?

No. The Sitemap: line in robots.txt is the official discovery mechanism, so the file can live at another path. But /sitemap.xml is the convention, and many tools (including MetricSpot) probe there as a fallback.
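
As an illustration, a robots.txt declares a sitemap with a line such as "Sitemap: https://yourdomain.com/sitemap.xml", and a crawler can collect those lines with a few lines of code. The helper below is a sketch under simple assumptions (the directive name is matched case-insensitively and anything after "#" is treated as a comment); findSitemapUrls is a hypothetical name.

```typescript
// Extract every sitemap URL declared in a robots.txt body.
function findSitemapUrls(robotsTxt: string): string[] {
  const urls: string[] = [];
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    // Assumption: "#" starts a comment, so drop everything after it.
    const line = rawLine.split("#")[0].trim();
    const match = line.match(/^sitemap\s*:\s*(\S+)/i);
    if (match) urls.push(match[1]);
  }
  return urls;
}

const robots = [
  "User-agent: *",
  "Disallow: /admin/",
  "",
  "Sitemap: https://yourdomain.com/sitemap.xml",
  "sitemap: https://yourdomain.com/news-sitemap.xml  # second file",
].join("\n");

const declared = findSitemapUrls(robots);
```

The Sitemap: directive is independent of User-agent groups, so it can appear anywhere in the file and more than once.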

What about sitemap indexes?

A single sitemap file is limited to 50,000 URLs and 50 MB uncompressed. If your site exceeds either limit, split the URLs across multiple sitemap files and list those files in a sitemap index. Google accepts both formats interchangeably.
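
The splitting step can be sketched as follows. This is an illustration rather than a prescribed layout: the file names sitemap-1.xml, sitemap-2.xml and so on are an assumption, as are the helper names chunk, buildSitemapIndex, and planSitemaps. The sketch only enforces the URL count, not the file-size limit.

```typescript
// Per-file URL limit from the sitemap protocol.
const MAX_URLS_PER_SITEMAP = 50000;

// Split a list into consecutive groups of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Build the index document that points at each child sitemap.
function buildSitemapIndex(sitemapUrls: string[]): string {
  const items = sitemapUrls.map(
    (loc) =>
      `  <sitemap>\n    <loc>${loc.replace(/&/g, "&amp;")}</loc>\n  </sitemap>`
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...items,
    "</sitemapindex>",
    "",
  ].join("\n");
}

// Decide which URLs go in which file, and produce the index for them.
// Assumption: child files are named sitemap-<n>.xml at the site root.
function planSitemaps(allUrls: string[], baseUrl: string) {
  const files = chunk(allUrls, MAX_URLS_PER_SITEMAP).map((urls, i) => ({
    name: `sitemap-${i + 1}.xml`,
    urls,
  }));
  const index = buildSitemapIndex(files.map((f) => `${baseUrl}/${f.name}`));
  return { files, index };
}
```

With 120,000 URLs, for example, this yields three child files (50,000, 50,000, and 20,000 URLs) plus one index that references them.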

Should I include every URL?

Only canonical, indexable URLs. Exclude noindex pages, admin pages, search-result pages, and tracking-param variants. The sitemap is a shortlist of what you want indexed, not a dump of every URL that responds with 200.
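
Expressed as a filter, those rules might look like the sketch below. The PageRecord shape, the excluded path prefixes, and the list of tracking parameters are all assumptions made for illustration; adjust them to match how your site stores page metadata.

```typescript
// Hypothetical per-page metadata, e.g. collected from a crawl or CMS export.
type PageRecord = {
  url: string;       // the URL as served
  canonical: string; // value of the page's rel="canonical"
  noindex: boolean;  // true if the page carries a noindex directive
};

// Assumed examples of sections and parameters to keep out of the sitemap.
const EXCLUDED_PREFIXES = ["/admin", "/search"];
const TRACKING_PARAMS = ["utm_source", "utm_medium", "utm_campaign", "gclid", "fbclid"];

// True only for canonical, indexable URLs without tracking parameters.
function belongsInSitemap(page: PageRecord): boolean {
  if (page.noindex) return false;
  if (page.url !== page.canonical) return false;
  const u = new URL(page.url);
  const excluded = EXCLUDED_PREFIXES.some(
    (prefix) => u.pathname === prefix || u.pathname.startsWith(prefix + "/")
  );
  if (excluded) return false;
  if (TRACKING_PARAMS.some((key) => u.searchParams.has(key))) return false;
  return true;
}

const candidates: PageRecord[] = [
  { url: "https://yourdomain.com/", canonical: "https://yourdomain.com/", noindex: false },
  { url: "https://yourdomain.com/thanks", canonical: "https://yourdomain.com/thanks", noindex: true },
  { url: "https://yourdomain.com/search/shoes", canonical: "https://yourdomain.com/search/shoes", noindex: false },
  { url: "https://yourdomain.com/?utm_source=news", canonical: "https://yourdomain.com/", noindex: false },
];

const sitemapUrls = candidates.filter(belongsInSitemap).map((p) => p.url);
```

Of the four sample pages, only the homepage survives: the others are excluded for noindex, for being a search-result page, and for being a tracking-parameter variant.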

Last updated 2026-05-11