Thin content

Flags pages where the extracted main article runs under ~150 words — what Google's helpful-content system demotes and AI agents skip entirely.

What this check does

Measures the substance of the page, not the gross word count.

MetricSpot extracts the main article using a Readability-style heuristic:

  1. Prefer <main> or <article> when present.
  2. Strip nav, header, footer, aside, cookie banners, and known chrome (skip-link targets, breadcrumbs).
  3. Score remaining blocks by text density (chars per node) and tag weight, the same approach Readability uses.
  4. Count words inside the surviving block.

If that count is under ~150 words, the page is flagged as thin. This is intentionally different from page word count, which counts everything visible (boilerplate included). A site can pass page word count with 2,000 words of nav + footer + sidebar and fail this check with 80 words of actual article.
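
The difference between the two counts can be sketched in a few lines of Python. This is a simplified illustration, not MetricSpot's actual extractor: it only covers steps 1, 2, and 4 (no density scoring), and the tag sets, the `WordCounter` class, and the `check_thin` function are hypothetical names chosen for the example.

```python
from html.parser import HTMLParser

# Assumed tag sets for this sketch; the real extractor's lists may differ.
HIDDEN_TAGS = {"script", "style"}                  # never visible text
CHROME_TAGS = {"nav", "header", "footer", "aside"}  # boilerplate to strip
MAIN_TAGS = {"main", "article"}                    # preferred containers
THIN_THRESHOLD = 150                               # approximate, per the text


class WordCounter(HTMLParser):
    """Counts visible words page-wide and inside <main>/<article>."""

    def __init__(self):
        super().__init__()
        self.depth = {"hidden": 0, "chrome": 0, "main": 0}
        self.page_words = 0   # everything visible, boilerplate included
        self.body_words = 0   # visible words outside chrome tags
        self.main_words = 0   # non-chrome words inside <main>/<article>
        self.saw_main = False

    @staticmethod
    def _kind(tag):
        if tag in HIDDEN_TAGS:
            return "hidden"
        if tag in CHROME_TAGS:
            return "chrome"
        if tag in MAIN_TAGS:
            return "main"
        return None

    def handle_starttag(self, tag, attrs):
        kind = self._kind(tag)
        if kind:
            self.depth[kind] += 1
            if kind == "main":
                self.saw_main = True

    def handle_endtag(self, tag):
        kind = self._kind(tag)
        if kind and self.depth[kind] > 0:
            self.depth[kind] -= 1

    def handle_data(self, data):
        if self.depth["hidden"]:
            return
        n = len(data.split())
        self.page_words += n
        if self.depth["chrome"]:
            return
        self.body_words += n
        if self.depth["main"]:
            self.main_words += n


def check_thin(html):
    """Return page word count, main-content word count, and a thin flag."""
    counter = WordCounter()
    counter.feed(html)
    counter.close()
    # Fall back to all non-chrome text when no <main>/<article> exists.
    main = counter.main_words if counter.saw_main else counter.body_words
    return {
        "page_words": counter.page_words,
        "main_words": main,
        "thin": main < THIN_THRESHOLD,
    }
```

Feeding it a page with a 2,000-word nav and an 80-word `<main>` yields a large `page_words`, a `main_words` of 80, and `thin` set to `True`, which mirrors the scenario described above.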

Why it matters

Two systems penalize thin content heavily:

Google’s helpful-content system. Since the 2022 update (and tightened through 2024–2025), pages judged to provide “little value, low added value, or are otherwise not very helpful” are demoted site-wide — not just the offending URL. The classic offenders are auto-generated category indexes, location pages spun from a template, and product pages with no description beyond title + price.

AI answer engines. ChatGPT, Perplexity, Claude, Google AI Overviews, and Gemini all run a substance filter before quoting. A page with 60 words of body copy is parseable but unciteable — it doesn’t contain a defensible answer. Your URL never enters the citation pool.

Thin content is also a conversion problem. Visitors who land on a 70-word page tend to bounce; visitors who land on a 600-word page with depth, FAQs, and a clear next step are more likely to convert.

How to fix it

You’re not adding filler. You’re adding substance — content a reader would screenshot. Pick the patterns that fit the page type.

Category / index pages

The default thin-content offender. The fix:

  • 2–3 short paragraphs of original commentary above the listing — what the category is, who it’s for, what to look for.
  • A short FAQ block (3 questions) addressing the common pre-purchase questions.
  • Internal links to the 3–5 most-relevant subcategory or pillar pages. See internal linking strategy.

Product / SKU pages

If your description is just specs:

  • Add a “why we carry this” or “best for” paragraph (50–100 words, written by a human).
  • Pull in 3–5 customer reviews with author attribution and rating. Mark them up with Review schema — they double as social proof and as content.
  • Include a “compatible with” or “frequently bought with” block — useful to readers, useful to search engines mapping entity relationships.
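
As a rough illustration of the Review markup mentioned above, the sketch below builds a schema.org `Review` object as JSON-LD and wraps it in a script tag. The helper names (`review_jsonld`, `to_script_tag`) and the sample product are hypothetical; the property names follow the public schema.org vocabulary, but check the current structured-data guidelines before shipping.

```python
import json


def review_jsonld(product_name, author, rating, body, best_rating=5):
    """Build a schema.org Review dict for one customer review."""
    return {
        "@context": "https://schema.org",
        "@type": "Review",
        "itemReviewed": {"@type": "Product", "name": product_name},
        "author": {"@type": "Person", "name": author},
        "reviewRating": {
            "@type": "Rating",
            "ratingValue": rating,
            "bestRating": best_rating,
        },
        "reviewBody": body,
    }


def to_script_tag(data):
    """Serialize to a JSON-LD script tag, escaping '</' so review text
    cannot close the script element early."""
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    # Sample values are made up for the example.
    review = review_jsonld("Trail Stove", "A. Reader", 4, "Boils water fast.")
    print(to_script_tag(review))
```

The markup only describes the review. The review text still has to appear visibly on the page for it to count toward the extracted main content.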

Location / service-area pages

The template-spam trap. The fix is harder because it’s labor:

  • Every page needs at least one paragraph of locally specific content. Not a generic “we serve” line with the city name swapped in, but actual details: a landmark, a local regulation, a service nuance.
  • A local case study or testimonial if you have one.
  • A unique H1 per page (not “Plumber in {City}”).

If you can’t write something unique for a city, that city doesn’t deserve a page. Consolidate into a regional hub.
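
One way to spot pages that are candidates for consolidation is to compare their text with the city name masked out. The sketch below is an assumption about how you might audit your own pages, not something MetricSpot is stated to do; the `template_similarity` function and the 0.9 cutoff are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical cutoff: pages at or above this ratio are treated as
# template spins in this sketch. Tune it against your own content.
SPUN_THRESHOLD = 0.9


def template_similarity(text_a, city_a, text_b, city_b):
    """Word-level similarity of two pages after masking the city names."""
    norm_a = text_a.replace(city_a, "{City}").lower().split()
    norm_b = text_b.replace(city_b, "{City}").lower().split()
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def looks_spun(text_a, city_a, text_b, city_b):
    """True when two location pages differ by little beyond the city."""
    ratio = template_similarity(text_a, city_a, text_b, city_b)
    return ratio >= SPUN_THRESHOLD
```

Two pages that differ only in the city name score 1.0. A page with a genuinely local paragraph scores lower, and how much lower depends on how long that paragraph is relative to the shared template.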

Blog stubs and landing pages

  • Lead with the answer-first content pattern — the question in the H1, the answer in the next 50 words.
  • Follow with the “why this matters” and “how to do it” sections — the content depth pattern.
  • End with an FAQ block (2–3 questions) that captures long-tail queries.

How MetricSpot extracts the content

If the check fires but you believe the page has substance, the extractor may have lost it. Common causes:

  • Article body inside a <div> with no semantic wrapper, sitting next to a very large sidebar — Readability sometimes picks the sidebar. Wrap the article in <main> or <article>.
  • Content rendered client-side after hydration — MetricSpot reads the server response. Pre-render or SSR the body.
  • Heavy use of <iframe> or <canvas> — those aren’t prose. If the page is a tool, that’s fine; the rule won’t apply to interactive pages once you mark them with role="application" and a brief description.
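
Before assuming the check is wrong, it can help to look at the raw server response for these three causes. The probe below is a minimal sketch under that assumption; `ExtractionProbe`, `diagnose`, and the hint wording are hypothetical and are not part of MetricSpot.

```python
from html.parser import HTMLParser


class ExtractionProbe(HTMLParser):
    """Collects a few signals from server-rendered HTML."""

    def __init__(self):
        super().__init__()
        self.has_semantic_wrapper = False  # saw <main> or <article>
        self.embed_count = 0               # <iframe> and <canvas> elements
        self.hidden_depth = 0              # >0 inside <script>/<style>
        self.words = 0                     # visible words in the response

    def handle_starttag(self, tag, attrs):
        if tag in ("main", "article"):
            self.has_semantic_wrapper = True
        elif tag in ("iframe", "canvas"):
            self.embed_count += 1
        elif tag in ("script", "style"):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if not self.hidden_depth:
            self.words += len(data.split())


def diagnose(server_html, min_words=150):
    """Return a list of hints explaining why extraction may fall short."""
    probe = ExtractionProbe()
    probe.feed(server_html)
    probe.close()
    hints = []
    if not probe.has_semantic_wrapper:
        hints.append("no <main> or <article> wrapper")
    if probe.words < min_words:
        hints.append("little text in server response; body may be client-rendered")
    if probe.embed_count:
        hints.append(f"{probe.embed_count} <iframe>/<canvas> element(s); these are not prose")
    return hints
```

Run it on the HTML your server returns before any JavaScript executes (for example, the body of a plain HTTP GET), since that is what the check reads. An empty list means none of these three causes applies.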

Pair this rule with page word count and paragraph length. Together they tell you whether the page has enough copy and whether that copy is well structured.

Frequently asked questions

What’s the actual word-count threshold?

Roughly 150 words of extracted main content. The exact number depends on the page type — a product page with rich structured data (price, ratings, availability, schema) gets more leniency than an /about page with no schema and 80 words.

Will adding an FAQ block fix every thin page?

It helps but isn’t a silver bullet. An FAQ adds substance and captures long-tail queries, but if the rest of the page is genuinely empty (a stub category, a placeholder), the FAQ alone won’t carry it. Add an FAQ plus at least one original paragraph above the fold.

My homepage is intentionally short. Should I worry?

Homepages get a partial pass — the check expects them to be navigational, and most extraction heuristics return a low count even for healthy homepages. The thin-content penalty in practice hits interior pages: category indexes, location pages, and template-generated SKUs. If your homepage is the only thin page flagged, you can usually ignore it.

Last updated 2026-05-11