Semantic HTML

MetricSpot measures the ratio of meaningful HTML5 elements (article, section, nav, header, footer, main) to generic div/span. Semantic markup is how AI agents and screen readers understand page structure.

What this check does

Scans the rendered DOM and counts semantic HTML5 elements — <article>, <section>, <nav>, <header>, <footer>, <aside>, <main>, <figure>, <figcaption>, <time> — against the total element count. Pages built almost entirely of nested <div> and <span> score low; pages with a clear semantic skeleton score high.
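
To get a rough feel for this number on your own page, the count can be approximated in the browser. This is only a sketch: it assumes the score is a simple share of semantic elements among all elements, which is an illustrative guess and not MetricSpot's documented formula. The SEMANTIC list mirrors the elements named above.

```html
<!-- Hypothetical sketch: add to a test page, or paste the script body into the browser console -->
<script>
  // Element names taken from the list in this section
  const SEMANTIC = ["article", "section", "nav", "header", "footer",
                    "aside", "main", "figure", "figcaption", "time"];

  // Every element in the rendered document
  const all = document.querySelectorAll("*");
  // Only the semantic ones (comma-separated selector list)
  const semantic = document.querySelectorAll(SEMANTIC.join(","));

  // Assumed scoring: share of semantic elements among all elements
  const ratio = all.length ? semantic.length / all.length : 0;
  console.log(
    `${semantic.length} of ${all.length} elements are semantic ` +
    `(${(ratio * 100).toFixed(1)}%)`
  );
</script>
```

Because it runs after the page has rendered, the script also counts elements injected by client-side JavaScript.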

Why it matters

Semantic HTML is the difference between markup a machine can parse and markup a machine can understand. A <div class="header"> says nothing about its role to anything that doesn’t interpret your class names. A <header> is unambiguous: browsers, screen readers, search crawlers, and LLM agents can all tell what role that block plays on the page.

Three audiences read your HTML:

  • AI agents (ChatGPT, Perplexity, Google AI Overviews) extract and cite content. <article>, <main>, and <time datetime> give them explicit content boundaries and dates to pull from. Inside div soup, they have to guess.
  • Screen readers generate a landmark map of the page so users can skip directly to navigation, main content, or footer. No semantic tags means no landmarks. Pair this check with ARIA landmarks.
  • Search engines can use <article> and <main> to identify the primary content of a page, and <nav> and <footer> to identify boilerplate that should carry less weight.

The fix is cheap. The cost of not doing it compounds across every page on the site.

How to fix it

Replace generic containers with the element that describes their role.

Page chrome:

<!-- Before -->
<div class="site-header">
  <div class="nav">...</div>
</div>
<div class="main-content">...</div>
<div class="site-footer">...</div>

<!-- After -->
<header>
  <nav>...</nav>
</header>
<main>...</main>
<footer>...</footer>

Blog posts and articles: wrap each post in <article> and mark dates with <time datetime>. The datetime attribute carries the date in a machine-readable format, so AI agents and structured-data extractors (Schema.org) can read it without interpreting the visible text.

<article>
  <header>
    <h1>Post title</h1>
    <time datetime="2026-05-11">May 11, 2026</time>
  </header>
  <p>Post body...</p>
</article>

Images with captions: use <figure> + <figcaption> when the caption is part of the content (not just a hover tooltip).

<figure>
  <img src="chart.png" alt="Revenue growth 2024-2026" />
  <figcaption>Quarterly revenue, Q1 2024 through Q1 2026.</figcaption>
</figure>

When to use <section> vs <div>: use <section> for a named region of the page — one that would have a heading in an outline (Pricing, Features, FAQ). Use <div> for purely presentational grouping (a flex wrapper, a card border). If you can’t write a one-line heading for it, it’s a <div>.
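
As an illustration of that rule, here is a minimal sketch of a pricing block. The id, class names, and the aria-labelledby wiring are assumptions made for the example; the labelling is included because a <section> is only exposed as a region landmark when it has an accessible name.

```html
<!-- Named region: it has a heading, so <section> fits -->
<section aria-labelledby="pricing-heading">
  <h2 id="pricing-heading">Pricing</h2>

  <!-- Layout wrapper: no heading of its own, so <div> fits -->
  <div class="card-grid">
    <div class="card">...</div>
    <div class="card">...</div>
  </div>
</section>
```

The cards stay as <div> here because they are treated as purely visual boxes. If each card were a self-contained item with its own heading, <article> might be a better fit.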

Cross-check the result against ARIA landmarks and descriptive link text — those checks reinforce each other.

Frequently asked questions

Does it matter if I use <div role="main"> instead of <main>?

For accessibility, they’re equivalent — both expose a main landmark. For everyone else (AI agents, search crawlers, code reviewers), <main> is clearer and shorter. Use the native element. ARIA roles exist to patch elements that can’t express their role, not to replace ones that can.

Can a page have more than one <header> or <footer>?

Yes. <header> and <footer> are scoped to their nearest sectioning ancestor (<article>, <section>, <main>, or the document). A blog post can have its own <header> (title + date) inside <article>, plus the site-wide <header> at the top of the page. <main> is the exception: only one visible <main> per document.
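
A minimal sketch of that scoping, with placeholder text standing in for real content:

```html
<body>
  <!-- Site-wide header: scoped to the document -->
  <header>
    <nav>...</nav>
  </header>

  <!-- The single <main> for the page -->
  <main>
    <article>
      <!-- Post header: scoped to this <article> -->
      <header>
        <h1>Post title</h1>
        <time datetime="2026-05-11">May 11, 2026</time>
      </header>
      <p>Post body...</p>
      <!-- Post footer: also scoped to this <article> -->
      <footer>Author bio...</footer>
    </article>
  </main>

  <!-- Site-wide footer: scoped to the document -->
  <footer>...</footer>
</body>
```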

My CMS theme uses divs everywhere. Worth rewriting?

If you control the template, yes: it’s a one-time edit that improves accessibility, SEO, and AI extractability for every page that uses it. If you don’t (a hosted theme on Shopify/Squarespace), focus on the content slot where you have control: wrap posts in <article>, mark dates with <time>, use <figure> for captioned images. The chrome can only improve when the theme itself is updated.

Last updated 2026-05-11