readability
Thin content
Flags pages where the extracted main article runs under ~150 words — what Google's helpful-content system demotes and AI agents skip entirely.
What this check does
Measures the substance of the page, not the gross word count.
MetricSpot extracts the main article using a Readability-style heuristic:
- Prefer `<main>` or `<article>` when present.
- Strip `nav`, `header`, `footer`, `aside`, cookie banners, and known chrome (skip-link targets, breadcrumbs).
- Score remaining blocks by text density (chars per node) and tag weight, the same approach Readability uses.
- Count words inside the surviving block.
If that count is under ~150 words, the page is flagged as thin. This is intentionally different from page word count, which counts everything visible (boilerplate included). A site can pass page word count with 2,000 words of nav + footer + sidebar and fail this check with 80 words of actual article.
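The difference between the two counts can be sketched in a few lines of Python. This is a simplified illustration, not MetricSpot's actual extractor: it uses only the standard-library `html.parser`, skips the text-density scoring step, and does not detect cookie banners or breadcrumbs. The names `WordCounter`, `count_words`, and `is_thin` are hypothetical, and the 150-word threshold is the approximate figure quoted above.

```python
from html.parser import HTMLParser

INVISIBLE = {"script", "style"}                 # never counted as words
CHROME = {"nav", "header", "footer", "aside"}   # counted for the page, not for the article
MAIN = {"main", "article"}                      # preferred containers
THIN_THRESHOLD = 150                            # approximate, per the text above


class WordCounter(HTMLParser):
    """Counts visible words for the whole page and for the main content."""

    def __init__(self):
        super().__init__()
        self.depth = {"invisible": 0, "chrome": 0, "main": 0}
        self.page_words = 0      # everything visible, boilerplate included
        self.content_words = 0   # visible text outside chrome
        self.main_words = 0      # visible text outside chrome, inside <main>/<article>
        self.saw_main = False

    @staticmethod
    def _kind(tag):
        if tag in INVISIBLE:
            return "invisible"
        if tag in CHROME:
            return "chrome"
        if tag in MAIN:
            return "main"
        return None

    def handle_starttag(self, tag, attrs):
        kind = self._kind(tag)
        if kind:
            self.depth[kind] += 1
            if kind == "main":
                self.saw_main = True

    def handle_endtag(self, tag):
        kind = self._kind(tag)
        if kind and self.depth[kind] > 0:
            self.depth[kind] -= 1

    def handle_data(self, data):
        if self.depth["invisible"]:
            return
        n = len(data.split())
        self.page_words += n
        if self.depth["chrome"]:
            return
        self.content_words += n
        if self.depth["main"]:
            self.main_words += n


def count_words(html):
    """Return (page_word_count, main_content_word_count)."""
    parser = WordCounter()
    parser.feed(html)
    parser.close()
    # Fall back to all non-chrome text when there is no <main> or <article>.
    main = parser.main_words if parser.saw_main else parser.content_words
    return parser.page_words, main


def is_thin(html, threshold=THIN_THRESHOLD):
    return count_words(html)[1] < threshold
```

Running `count_words` on a page whose copy sits mostly in `nav` and `footer` returns a large first number and a small second one, which is the case this check is meant to catch.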
Why it matters
Two systems penalize thin content, hard:
Google’s helpful-content system. Since the 2022 update (and tightened through 2024–2025), pages judged to provide “little value, low added value, or are otherwise not very helpful” are demoted site-wide — not just the offending URL. The classic offenders are auto-generated category indexes, location pages spun from a template, and product pages with no description beyond title + price.
AI answer engines. ChatGPT, Perplexity, Claude, Google AI Overviews, and Gemini all run a substance filter before quoting. A page with 60 words of body copy is parseable but unciteable — it doesn’t contain a defensible answer. Your URL never enters the citation pool.
Thin content is also a conversion problem. Visitors who land on a 70-word page tend to bounce; visitors who land on a 600-word page with depth, FAQs, and a clear next step are more likely to convert.
How to fix it
You’re not adding filler. You’re adding substance — content a reader would screenshot. Pick the patterns that fit the page type.
Category / index pages
The default thin-content offender. The fix:
- 2–3 short paragraphs of original commentary above the listing — what the category is, who it’s for, what to look for.
- A short FAQ block (3 questions) addressing the common pre-purchase questions.
- Internal links to the 3–5 most-relevant subcategory or pillar pages. See internal linking strategy.
Product / SKU pages
If your description is just specs:
- Add a “why we carry this” or “best for” paragraph (50–100 words, written by a human).
- Pull in 3–5 customer reviews with author attribution and rating. Mark them up with `Review` schema; they double as social proof and as content.
- Include a “compatible with” or “frequently bought with” block, which is useful to readers and to search engines mapping entity relationships.
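As one possible way to produce the review markup, the sketch below builds a schema.org `Product` object with nested `Review` items and serializes it as JSON-LD. The function names (`review_jsonld`, `as_script_tag`) and the input shape (a list of dicts with `author`, `rating`, and `body` keys) are assumptions for illustration, as is the 5-point rating scale.

```python
import json


def review_jsonld(product_name, reviews):
    """Build a schema.org Product with nested Review items (illustrative input shape)."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product_name,
        "review": [
            {
                "@type": "Review",
                "author": {"@type": "Person", "name": r["author"]},
                "reviewRating": {
                    "@type": "Rating",
                    "ratingValue": r["rating"],
                    "bestRating": 5,  # assumed 5-point scale
                },
                "reviewBody": r["body"],
            }
            for r in reviews
        ],
    }


def as_script_tag(data):
    """Serialize the object as a JSON-LD script element."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so review text cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

The markup should mirror reviews that are actually visible on the page; the structured data supplements the on-page text rather than replacing it.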
Location / service-area pages
The template-spam trap. The fix is harder because it’s labor:
- Every page needs at least one paragraph of locally-specific content. Not “we serve {City}” but actual details: a landmark, a local regulation, a service nuance.
- A local case study or testimonial if you have one.
- A unique H1 per page (not “Plumber in {City}”).
If you can’t write something unique for a city, that city doesn’t deserve a page. Consolidate into a regional hub.
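One rough way to find location pages that differ only by city name is to mask the city and compare what remains. The sketch below is an illustration under stated assumptions, not part of the MetricSpot check: it uses Jaccard similarity over three-word shingles, and the helper names `shingles` and `similarity` are hypothetical.

```python
import re


def shingles(text, city, size=3):
    """Mask the city name, then return the set of `size`-word sequences."""
    masked = re.sub(re.escape(city), "{city}", text, flags=re.IGNORECASE).lower()
    words = re.findall(r"[a-z0-9{}']+", masked)
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b):
    """Jaccard similarity of two shingle sets: 1.0 means identical after masking."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

Two pages that score close to 1.0 after masking are the same template with a different city swapped in, which makes them candidates for consolidation into a regional hub. Any cutoff you pick is a judgment call to tune against your own pages.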
Blog stubs and landing pages
- Lead with the answer-first content pattern — the question in the H1, the answer in the next 50 words.
- Follow with the “why this matters” and “how to do it” sections — the content depth pattern.
- End with an FAQ block (2–3 questions) that captures long-tail queries.
How MetricSpot extracts the content
If the check fires but you believe the page has substance, the extractor may have lost it. Common causes:
- Article body inside a `<div>` with no semantic wrapper, sitting next to a very large sidebar: Readability sometimes picks the sidebar. Wrap the article in `<main>` or `<article>`.
- Content rendered client-side after hydration: MetricSpot reads the server response. Pre-render or SSR the body.
- Heavy use of `<iframe>` or `<canvas>`: those aren’t prose. If the page is a tool, that’s fine; the rule won’t apply to interactive pages once you mark them with `role="application"` and a brief description.
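To check a page against these three causes yourself, you can inspect the raw server response before any JavaScript runs. The sketch below is a hypothetical diagnostic rather than MetricSpot's own logic: `ExtractionDiagnostics` and `diagnose` are illustrative names, and the 150-word cutoff is borrowed from the threshold above.

```python
from html.parser import HTMLParser


class ExtractionDiagnostics(HTMLParser):
    """Collects simple signals from raw server HTML (no JavaScript is executed)."""

    def __init__(self):
        super().__init__()
        self.has_semantic_wrapper = False
        self.embed_count = 0
        self.visible_words = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("main", "article"):
            self.has_semantic_wrapper = True
        elif tag in ("iframe", "canvas"):
            self.embed_count += 1
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.visible_words += len(data.split())


def diagnose(server_html, min_words=150):
    """Return a list of hints about why extraction may have missed the article."""
    d = ExtractionDiagnostics()
    d.feed(server_html)
    d.close()
    hints = []
    if not d.has_semantic_wrapper:
        hints.append("no <main> or <article> wrapper")
    if d.visible_words < min_words:
        hints.append("little text in the server response; content may be rendered client-side")
    if d.embed_count:
        hints.append(f"{d.embed_count} <iframe>/<canvas> element(s) that carry no prose")
    return hints
```

Feed it the HTML exactly as the server returns it (for example, saved with a plain HTTP client rather than copied from browser dev tools), since the point is to see what a non-rendering crawler sees.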
Pair this rule with page word count and paragraph length. Together they tell you whether the page has enough copy and whether that copy is well structured.
Frequently asked questions
What’s the actual word-count threshold?
Roughly 150 words of extracted main content. The exact number depends on the page type — a product page with rich structured data (price, ratings, availability, schema) gets more leniency than an /about page with no schema and 80 words.
Will adding an FAQ block fix every thin page?
It helps but isn’t a silver bullet. An FAQ adds substance and captures long-tail queries, but if the rest of the page is genuinely empty (a stub category, a placeholder), the FAQ alone won’t carry it. Add an FAQ plus at least one original paragraph above the fold.
My homepage is intentionally short. Should I worry?
Homepages get a partial pass — the check expects them to be navigational, and most extraction heuristics return a low count even for healthy homepages. The thin-content penalty in practice hits interior pages: category indexes, location pages, and template-generated SKUs. If your homepage is the only thin page flagged, you can usually ignore it.
Last updated 2026-05-11