ecommerce · 7 min read · February 4, 2026 · Hakan

Technical SEO for Ecommerce: A Practical Checklist (and the Traps We Keep Fixing)

Ecommerce sites break technical SEO in unique ways. Here is our practical checklist for crawl budget, faceted navigation, JS rendering, structured data, and the traps that waste revenue.

Ecommerce sites are technically harder to get right for search engines. A 50‑page marketing site has maybe 50 indexable URLs. A mid‑size online store with 2,000 products, 30 categories, 5 filter facets, and 3 sort options can easily generate 500,000+ URL combinations, most of them duplicates or near‑duplicates that waste crawl budget and dilute ranking signals.
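To see how quickly this adds up, here is a back-of-the-envelope count for the store described above. It assumes each of the 5 facets is single-select with 5 possible values, which is an illustrative assumption rather than a number from a real catalog.

```python
# Rough count of category URL states for the example store.
# Assumption (illustrative): every facet is single-select with 5 values,
# so each facet has 6 states (5 values + "not set").
categories = 30
facets = 5
values_per_facet = 5   # assumed
sort_options = 3

filter_states = (values_per_facet + 1) ** facets      # 6^5 = 7,776
category_urls = categories * filter_states * sort_options

print(f"{category_urls:,} category URL states")       # 699,840 category URL states
```

Multi-select facets or pagination would push the number far higher; the 2,000 product URLs are a rounding error by comparison.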

We've spent the last 15 years building and fixing commerce platforms. This is the checklist we actually use when we audit ecommerce sites, plus the traps we see teams fall into repeatedly.

Image placeholder: A simple diagram showing how filters/sort explode URLs (Category × Color × Size × Brand × Sort → millions of combinations).

1. Crawl Budget and Indexation

Crawl budget is finite. Google only crawls a limited number of URLs on your site in any given period, based on how much load your server can handle and how much demand it sees for your content. If most of those crawls hit filtered category URLs nobody searches for, your new product pages can take days or weeks to get discovered and indexed.

Start by answering two questions:

1) Which URLs do you want indexed?
2) Which URL patterns should never be indexed (even if they’re crawlable)?

Practical checks:

- Sitemap hygiene: only include canonical, indexable URLs. Use separate sitemaps for categories and products if the site is large.
- Robots.txt: block obvious crawl traps (endless parameter spaces), but don’t rely on robots.txt as your only duplicate control. Blocked URLs can still be indexed as “URL-only” entries in some cases.
- Noindex rules: use meta robots (`noindex,follow`) for internal search pages, “sort by” pages, and low‑value filter combinations. Google has to crawl a page to see its noindex tag, so don’t also block those URLs in robots.txt.
- GSC Page indexing report (formerly Coverage): look for “Duplicate, Google chose different canonical than user” and “Crawled – currently not indexed” patterns. They usually map to parameter mess.
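The noindex rules above can be expressed as one small policy function, which makes them easy to test. This is a minimal sketch: the `/search` path, the parameter names, and the one-filter threshold are all hypothetical and need to match your own routing and demand data.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical URL conventions; adjust to your own routing.
SEARCH_PATH = "/search"
SORT_PARAM = "sort"
FILTER_PARAMS = {"color", "size", "brand", "price", "in_stock"}
MAX_INDEXABLE_FILTERS = 1  # assumed threshold for a "low-value combination"


def robots_directive(url: str) -> str:
    """Return the meta robots value a page at this URL should emit."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))

    if parts.path == SEARCH_PATH or parts.path.startswith(SEARCH_PATH + "/"):
        return "noindex,follow"   # internal search results
    if SORT_PARAM in params:
        return "noindex,follow"   # "sort by" views
    active_filters = FILTER_PARAMS.intersection(params)
    if len(active_filters) > MAX_INDEXABLE_FILTERS:
        return "noindex,follow"   # multi-filter combinations
    return "index,follow"


print(robots_directive("https://example.com/shoes?color=black&size=m"))  # noindex,follow
```

Keeping the policy in one place means the page template, the sitemap generator, and your tests can all call the same function instead of drifting apart.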

Common trap: teams block everything with parameters in robots.txt and think they’re done. Then Google can’t crawl enough to understand the site structure, and you lose discovery for legitimate pages. Prefer a strategy (canonicals + internal linking control), not just a blanket robots rule.

Image placeholder: Screenshot callout of GSC “Page indexing” report with the 3–4 statuses to watch.

2. Faceted Navigation and Filters (Canonicals + Parameters)

Facets are where ecommerce SEO goes to die. Filters are great for users but brutal for crawl/index unless you intentionally decide which combinations are indexable.

A good default:
- Index: category pages and a small set of “SEO facets” (e.g. “/shoes/black/” if there is real demand and unique inventory).
- Don’t index: arbitrary combinations (color+size+brand+sort), “in stock” toggles, price sliders, etc.

Implementation patterns that work:
- Canonical: filtered pages should usually canonical to the base category (unless it’s an SEO facet you intentionally index).
- Internal linking: your navigation should not create infinite crawl paths. If a filter combination is not indexable, keep it out of prominent internal link surfaces (or use nofollow selectively).
- Parameter normalization: avoid creating multiple URLs for the same state (e.g. `?color=black&size=m` vs `?size=m&color=black`). Choose one canonical ordering.
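Parameter normalization is simple to sketch. The version below picks alphabetical ordering as the "one canonical ordering"; any fixed ordering works as long as every link generator uses the same one. The function name is ours, not part of any framework.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def normalize_filter_url(url: str) -> str:
    """Sort query parameters so each filter state maps to exactly one URL."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query)   # blank values are dropped by default
    pairs.sort()                     # order by key, then by value
    return urlunsplit(parts._replace(query=urlencode(pairs)))


a = normalize_filter_url("https://example.com/shoes?color=black&size=m")
b = normalize_filter_url("https://example.com/shoes?size=m&color=black")
print(a == b)  # True
```

Apply the same function when building filter links in the frontend and when emitting canonical tags, so the two never disagree.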

Common trap: letting the frontend generate filter URLs client-side without a canonical strategy. Google will still discover them through crawl, and you’ll end up with tens of thousands of thin URLs.

Image placeholder: Table showing which facets are “Index / Noindex / Canonical-to-base”.

3. Pagination and Category Pages

Pagination is often implemented in a way that makes page 2+ either invisible to crawlers or indexable but low quality.

Checklist:
- Ensure pagination links are real `<a href>` links (not only JS handlers).
- Keep canonical on paginated pages self-referential (page 2 canonicals to page 2) unless you have a deliberate consolidation strategy.
- Make page 1 the strongest SEO landing page and ensure products deeper in pagination can still be discovered via internal linking.
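As a rough illustration of the first two checklist items, this sketch builds a self-referential canonical and plain anchor links for one paginated page. The `?page=N` parameter is an assumed URL scheme; your platform may use a path segment instead.

```python
def page_url(category_url: str, page: int) -> str:
    """Page 1 lives at the bare category URL; later pages get ?page=N (assumed scheme)."""
    return category_url if page <= 1 else f"{category_url}?page={page}"


def pagination_markup(category_url: str, page: int, last_page: int) -> dict:
    """Return the canonical tag and plain <a href> links for one paginated page."""
    links = []
    if page > 1:
        links.append(f'<a href="{page_url(category_url, page - 1)}">Previous</a>')
    if page < last_page:
        links.append(f'<a href="{page_url(category_url, page + 1)}">Next</a>')
    return {
        # Self-referential: page N points at page N, not at page 1.
        "canonical": f'<link rel="canonical" href="{page_url(category_url, page)}">',
        "links": links,
    }


print(pagination_markup("https://example.com/shoes", 2, 5)["canonical"])
# <link rel="canonical" href="https://example.com/shoes?page=2">
```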

Common trap: infinite scroll that never exposes crawlable paginated URLs.

Image placeholder: Diagram of category page: SEO intro + product grid + crawlable pagination.

4. Product Variants (Canonicals + Schema)

Variants create duplicate content by default: the same product with 10 sizes/colors becomes 10 URLs that look almost identical.

Good strategies:
- One canonical product URL (the parent), with variants handled via selectable options.
- If you must have variant URLs, canonical them to the parent unless variant content is meaningfully different (e.g. different images, different price, different inventory, different title).

Schema matters here: use Product + Offer(s) correctly (price, availability). For variants, make sure the schema reflects the selected variant’s offer when possible.

Common trap: variant URLs are indexable and also appear in sitemaps.
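One way to avoid that trap is to derive the sitemap from the same canonical decision the page template uses. A minimal sketch, where the `Variant` shape and the `has_unique_content` flag are hypothetical stand-ins for whatever your catalog exposes:

```python
from dataclasses import dataclass


@dataclass
class Variant:
    url: str
    parent_url: str
    has_unique_content: bool  # hypothetical flag: own images, title, price, etc.


def canonical_for(variant: Variant) -> str:
    """Point variants at the parent unless the variant page is meaningfully different."""
    return variant.url if variant.has_unique_content else variant.parent_url


def sitemap_urls(variants: list[Variant]) -> list[str]:
    """Only self-canonical URLs belong in the sitemap; de-duplicate, keep order."""
    return list(dict.fromkeys(canonical_for(v) for v in variants))


variants = [
    Variant("/p/tee?color=red", "/p/tee", False),
    Variant("/p/tee?color=blue", "/p/tee", False),
    Variant("/p/tee-limited", "/p/tee", True),
]
print(sitemap_urls(variants))  # ['/p/tee', '/p/tee-limited']
```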

Image placeholder: Example of parent vs variant canonical strategy.

5. JavaScript Rendering / Next.js: SSR vs ISR vs CSR

Ecommerce SEO and performance are coupled: slow pages rank worse and convert worse.

Rules of thumb for Next.js:
- Category and product pages should be server-rendered (SSR/ISR) so bots and users get meaningful HTML immediately.
- Avoid gating critical content behind client-only fetches.
- Keep HTML stable to avoid layout shift (CLS). Images and fonts are the usual offenders.

Core Web Vitals checklist:
- LCP: optimize the hero/product image, use correct sizing, and preload the LCP image.
- CLS: reserve space for images, avoid injecting banners above content after load.
- INP: reduce heavy JS on category pages (filters, analytics). Defer non-critical scripts.

Common trap: relying on client-side rendering for product grids and assuming Google will “figure it out”. It often does, but it’s slower, less reliable, and can cause partial indexing.
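A quick way to check for this is to look at the raw HTML response, before any JavaScript runs, and count the product links in it. The sketch below uses only the standard library; the `/products/` path prefix and the two HTML strings are assumptions for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in raw (unrendered) HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def product_links_in_html(html: str, prefix: str = "/products/") -> list[str]:
    """Return product links that are present without executing any JavaScript."""
    parser = LinkCollector()
    parser.feed(html)
    return [h for h in parser.hrefs if h.startswith(prefix)]


ssr_html = '<ul><li><a href="/products/red-tee">Red tee</a></li></ul>'
csr_html = '<div id="root"></div><script src="/app.js"></script>'
print(len(product_links_in_html(ssr_html)))  # 1
print(len(product_links_in_html(csr_html)))  # 0
```

If a category template returns zero product links in its initial HTML, the grid depends on client-side rendering and deserves a closer look.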

Image placeholder: Lighthouse report highlighting LCP/CLS/INP on category pages.

6. Structured Data (Product / Offer / Review / Breadcrumb)

Structured data won’t fix a broken site, but it improves eligibility for rich results and reduces ambiguity.

Minimum set for ecommerce:
- Product + Offer (price, availability, currency)
- BreadcrumbList
- Organization
- AggregateRating / Review (only if you actually show reviews)
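For reference, here is roughly what the Product + Offer and BreadcrumbList markup looks like when generated server-side. Every value below (names, SKU, URLs, price) is a placeholder, and the helper function is our own, not a library API.

```python
import json

# All values are placeholders for illustration.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Running Shoe",
    "sku": "EX-123",
    "image": ["https://example.com/img/ex-123.jpg"],
    "offers": {
        "@type": "Offer",
        "url": "https://example.com/products/example-running-shoe",
        "price": "89.00",
        "priceCurrency": "EUR",
        "availability": "https://schema.org/InStock",
    },
}

breadcrumbs = {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    "itemListElement": [
        {"@type": "ListItem", "position": 1, "name": "Shoes",
         "item": "https://example.com/shoes"},
        {"@type": "ListItem", "position": 2, "name": "Running",
         "item": "https://example.com/shoes/running"},
    ],
}


def json_ld_tag(data: dict) -> str:
    """Serialize a dict into a JSON-LD script tag."""
    # Escape "</" so a value can never close the script element early.
    payload = json.dumps(data).replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'


print(json_ld_tag(breadcrumbs)[:60])
```

Build these dicts from the same data that renders the visible price and stock status, so the markup cannot drift from what the user sees.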

Common trap: emitting invalid schema or marking up content that’s not visible on the page.

Image placeholder: JSON-LD snippet callout (Product + Offer + Breadcrumb).

7. Duplicate Content Sources (Search Pages, Sort, UTM, Sessions)

Common duplicate sources:
- internal search pages (`/search?q=...`)
- sort orders (`?sort=price_asc`)
- tracking params (`utm_*`, `gclid`)
- pagination combined with filters

Controls:
- Noindex internal search and sort pages.
- Canonical parameterized URLs to the cleanest version.
- Strip tracking params in canonicals.
- Ensure the same page isn’t accessible via multiple paths without a clear canonical strategy.
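Stripping tracking and sort parameters from the canonical can be sketched in a few lines. The parameter lists below come from the examples in this section; treat them as a starting point and extend them for your own stack.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Parameters that never change page content (assumed list; extend as needed).
TRACKING_PARAMS = {"gclid"}
TRACKING_PREFIXES = ("utm_",)
NON_CANONICAL_PARAMS = {"sort"}   # sort order should not create a new canonical


def canonical_url(url: str) -> str:
    """Drop tracking and sort parameters, plus any fragment, from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key not in TRACKING_PARAMS
        and key not in NON_CANONICAL_PARAMS
        and not key.startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit(parts._replace(query=urlencode(kept), fragment=""))


print(canonical_url("https://example.com/shoes?utm_source=news&sort=price_asc&gclid=abc"))
# https://example.com/shoes
```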

Common trap: the same product is reachable under multiple category paths, each indexable, each competing.

Image placeholder: URL patterns checklist (index/noindex/canonical).

8. Redirect and 404 Hygiene

Large catalogs change constantly. If you don’t manage redirects, you bleed authority.

Checklist:
- Keep redirects 301, single-hop.
- Avoid redirect chains from legacy slugs.
- Handle out-of-stock / discontinued products intentionally.
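Chains from legacy slugs can be flattened offline before the redirect map is deployed. A minimal sketch, assuming the map is available as a plain source-to-target dictionary:

```python
def flatten_redirects(redirects: dict[str, str]) -> dict[str, str]:
    """Rewrite every source to point straight at its final destination."""
    flat = {}
    for source in redirects:
        seen = {source}
        target = redirects[source]
        while target in redirects:          # the target is itself redirected
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = redirects[target]
        flat[source] = target
    return flat


legacy = {"/old-tee": "/tee-2023", "/tee-2023": "/products/tee"}
print(flatten_redirects(legacy))
# {'/old-tee': '/products/tee', '/tee-2023': '/products/tee'}
```

Running this as part of each catalog import keeps every legacy URL one hop from its destination and surfaces loops before they reach production.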

Common trap: redirecting everything to the homepage.

Image placeholder: Example redirect map decision tree.

9. International SEO (hreflang): If Applicable

If you run multi-country/multi-language:
- Use hreflang with correct language and region codes (language first, e.g. `en-GB`, `de-DE`).
- Ensure every locale references every other locale (bidirectional).
- Don’t mix canonicals across locales.
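The bidirectional requirement is easiest to satisfy by generating the full tag set once and emitting it on every locale version. The sketch below does that and also includes a small checker for missing return links; the function names and example URLs are ours.

```python
def hreflang_tags(locale_urls: dict[str, str]) -> dict[str, list[str]]:
    """For each locale page, list the link tags it should carry (itself plus all others)."""
    tags = [
        f'<link rel="alternate" hreflang="{code}" href="{url}">'
        for code, url in locale_urls.items()
    ]
    # Every page gets the same full set, which makes the references reciprocal.
    return {url: list(tags) for url in locale_urls.values()}


def missing_return_links(pages: dict[str, set[str]]) -> list[tuple[str, str]]:
    """pages maps a URL to the set of URLs it references via hreflang."""
    return [
        (src, dst)
        for src, targets in pages.items()
        for dst in targets
        if dst != src and src not in pages.get(dst, set())
    ]


crawled = {"/en/": {"/en/", "/de/"}, "/de/": {"/de/"}}
print(missing_return_links(crawled))  # [('/en/', '/de/')]
```

Here `/en/` points to `/de/`, but `/de/` never points back, so the pair is reported.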

Common trap: hreflang points to URLs that redirect.

Image placeholder: hreflang matrix example.

10. Measurement: GSC + Logs + Performance

Technical SEO gets easier when you measure the right things.

What we look at:
- GSC Page indexing trends: are duplicates going down?
- Crawl stats: are crawls focused on the URLs you care about?
- Server/CDN logs: what is Googlebot actually fetching? Which parameter patterns dominate?
- Performance monitoring: watch CWV for category/product templates, not just the homepage.
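The log question ("which parameter patterns dominate?") can be answered with a short script. This sketch assumes access logs in the common combined format and matches Googlebot by user agent string only; user agents can be spoofed, so a real audit should also verify the requesting IPs.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qsl

# Matches the request path and user agent in a combined-format log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def googlebot_param_patterns(lines):
    """Count Googlebot requests by the sorted set of query parameter names."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or "Googlebot" not in match["ua"]:
            continue
        query = urlsplit(match["path"]).query
        names = sorted({key for key, _ in parse_qsl(query, keep_blank_values=True)})
        counts["+".join(names) or "(no params)"] += 1
    return counts


sample = [
    '66.249.66.1 - - [04/Feb/2026:10:00:00 +0000] "GET /shoes?color=black&sort=price_asc HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [04/Feb/2026:10:00:01 +0000] "GET /shoes HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]
print(googlebot_param_patterns(sample).most_common())  # [('color+sort', 1)]
```

If patterns like `color+sort` dominate the output while clean product URLs barely appear, crawl activity is going to the wrong places.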

Common trap: only looking at rankings, not indexation + crawl patterns.

Image placeholder: A simple SEO monitoring dashboard mockup.

CTA: Want a technical SEO roadmap for your store?

If you're running Shopify, Medusa, or a headless stack (Next.js + Sanity), we can audit crawl/indexation, URL strategy, rendering performance, and structured data, then deliver a prioritized roadmap your team can execute.

If you want, send us:
- your domain
- a rough product count
- whether you use filters/facets heavily

…and we’ll tell you the first 3 technical fixes we’d make.

#technical-seo #ecommerce #nextjs #medusajs #headless-commerce

Hakan

Hakan helps ecommerce teams design and build headless commerce stacks with technical SEO, performance, and scalability baked in from day one.

Frequently asked questions

The biggest risk is crawl budget waste from URL explosions: filters, sorts, internal search, and duplicate variants generating hundreds of thousands of low-value URLs. The checklist above focuses on controlling crawl budget with robots.txt, canonicals, sitemaps, and a clear URL strategy so Google spends its time on product and category pages that can actually rank.
