Technical SEO Analysis for Ecommerce

Technical SEO problems kill rankings before anything else you optimise can help. This guide covers the structural issues most ecommerce stores share — crawl budget waste, faceted navigation, duplicate content from product variants, and Core Web Vitals failures — with specific steps to fix each one.

Crawl Budget: Stop Wasting Google's Attention

Google doesn't crawl every page on your site every day. It allocates a crawl budget, a rough limit on how many pages Googlebot will fetch in a given period. For stores with a few hundred products this rarely matters. For stores with 10,000+ SKUs, it matters a lot.

The problem is that most ecommerce platforms generate huge amounts of low-value URLs: paginated category pages (/shoes?page=47), sort and filter combinations (/shoes?sort=price&color=red&size=10), out-of-stock product pages, and internal search results. Google crawls all of these instead of your real content, so your important category and product pages get crawled less often.

What burns crawl budget on ecommerce sites

Pagination beyond page 2–3, faceted navigation filter URLs, session IDs in URLs, out-of-stock pages with no alternatives, internal search result pages (e.g. /search?q=red+shoes), and redirect chains. Block or noindex these aggressively.

Fix it at the source. In robots.txt, disallow the URL patterns that generate junk pages; this is the only directive that actually stops Googlebot fetching them. Canonical tags on filter and sort URLs consolidate indexing signals to the base category page, but Google still has to crawl a URL to see its canonical, so canonicals alone don't save crawl budget. Adding rel="nofollow" to pagination and filter links reduces discovery, though Google treats nofollow as a hint rather than a rule. Check your GSC Coverage report regularly: it shows you exactly which URLs Google is wasting time on.
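As a sketch, a robots.txt along these lines blocks the usual junk patterns. The paths and parameter names here are illustrative; match them to what your platform actually generates.

```
# Block internal search results and junk parameter URLs (patterns are examples)
User-agent: *
Disallow: /search
Disallow: /*?*sort=
Disallow: /*?*sessionid=

# Do not block filter parameters that map to real search demand
# (an indexable colour or size page needs to stay crawlable)
```

Googlebot supports the `*` wildcard in Disallow rules; test any pattern before deploying, since one bad wildcard can block real category pages.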

Faceted Navigation: The #1 Technical Problem in Ecommerce

Faceted navigation (the filter sidebars shoppers use to narrow by size, colour, brand, and price) is one of the most useful UX features on any store. It is also, almost universally, an SEO disaster when left unconfigured.

Every filter combination generates a new URL. A category with 200 products, 5 colours, 8 sizes, and 3 price ranges can produce tens of thousands of unique URL combinations. Most contain no unique content. They are all thin duplicates of the base category page. Google indexes them, splits your link equity across them, and your category page ranking drops.

Tip

The right solution depends on what you are filtering. Filters that represent real search demand, like /running-shoes/womens or /sofas/grey, should have their own indexable pages with unique content. Pure UX filters (sort order, page number, in-stock toggle) should be blocked from indexing entirely.

In practice: add a canonical tag on every filter URL pointing back to the base category page. Or, if your platform supports it, use JavaScript-only URL updates so filters change the page state without creating new server-side URLs. Shopify handles this poorly out of the box. The default faceted search app creates fully indexable URLs for every filter. You need to add canonical handling manually or via a theme modification.
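A minimal sketch of the canonical approach, assuming a /shoes category with illustrative filter parameters:

```html
<!-- Served on /shoes?color=red&size=10: point all indexing
     signals for this filter combination at the base category -->
<head>
  <link rel="canonical" href="https://www.example.com/shoes" />
</head>
```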

Use Screaming Frog to crawl your store and count URLs by template. If your category page has 12 real pages but Screaming Frog finds 4,000 URLs with that category in the path, you have a faceted navigation problem.

Product Variants and Duplicate Content

A product with 5 colour options should have one indexable page, not five. When each variant has its own URL (/blue-trainer, /red-trainer, /green-trainer) and each page has the same title, description, and content with only the colour name swapped, Google sees thin duplicate content across all five. None of them rank well.

The fix is canonical tags. Every variant URL should have a canonical pointing to the main product page. The main product page itself should be self-canonicalising. In Shopify, variant URLs are generated by default (?variant=12345). These are handled by Shopify's built-in canonical logic, but it's worth verifying, especially on themes that have been heavily customised, that the canonical tag on every variant URL actually resolves to the correct product URL.

  • Identify all product variant URL patterns with Screaming Frog.
  • Confirm each variant URL has a canonical pointing to the main product page.
  • Check that the main product URL has a self-referencing canonical.
  • Verify in Google Search Console that only the main product URL is indexed.
  • Remove or redirect any standalone variant pages that have been indexed without canonicals.
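The canonical checks in this list can be partly automated. A minimal Python sketch using only the standard library, which extracts the canonical URL from a variant page's HTML and compares it to the expected product URL (fetching the pages is left to your crawler; the URLs are illustrative):

```python
from html.parser import HTMLParser

class CanonicalParser(HTMLParser):
    """Collects the href of the page's <link rel="canonical"> tag."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag == "link":
            attrs = dict(attrs)
            if attrs.get("rel") == "canonical":
                self.canonical = attrs.get("href")

def canonical_ok(page_html: str, expected_url: str) -> bool:
    """True if the page's canonical tag resolves to the expected product URL."""
    parser = CanonicalParser()
    parser.feed(page_html)
    return parser.canonical == expected_url

# Example: a variant page whose canonical correctly points at the main product
variant_html = (
    '<html><head>'
    '<link rel="canonical" href="https://shop.example/products/trainer">'
    '</head></html>'
)
print(canonical_ok(variant_html, "https://shop.example/products/trainer"))  # True
```

Run this over every variant URL your crawler finds and flag any page where it returns False, or where no canonical tag exists at all.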

Site Speed and Core Web Vitals for Ecommerce

Google uses Core Web Vitals as a ranking signal. The thresholds are: LCP (Largest Contentful Paint) under 2.5 seconds, CLS (Cumulative Layout Shift) under 0.1, and INP (Interaction to Next Paint) under 200ms. Most ecommerce sites fail at least one of these on mobile, and the mobile assessment is the one that matters most under mobile-first indexing.

The biggest LCP culprits on ecommerce sites are unoptimised product images. A 3MB JPEG hero image on a product page will kill your LCP score. Serve WebP or AVIF, set explicit width and height attributes, use next/image or an equivalent image optimisation layer, and ensure the hero product image has fetchpriority="high" so the browser loads it first. Ahrefs' Site Audit and PageSpeed Insights both flag LCP element issues clearly.
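In markup, that combination looks roughly like this (file name and dimensions are illustrative):

```html
<!-- Hero product image: modern format, explicit dimensions so the browser
     reserves its space, and fetchpriority="high" so it is fetched early -->
<img src="/images/trainer-hero.webp"
     width="800" height="800"
     fetchpriority="high"
     alt="Blue running trainer, side view">
```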

Tip

CLS on ecommerce is usually caused by late-loading elements: a cookie banner that shifts content, a reviews widget that loads after the page paints, or dynamically injected promotional banners. Reserve space for these elements with min-height, or load them outside the main content flow.
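Reserving space can be as simple as this CSS sketch. The selector and height are assumptions; measure your widget's real rendered size:

```css
/* Hold the reviews widget's slot open before it loads, so nothing shifts */
#reviews-widget {
  min-height: 320px; /* illustrative: match the widget's typical height */
}
```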

INP is the newest metric and the hardest to fix. It measures how long the page takes to respond to user interactions. On ecommerce sites, heavy JavaScript is usually the cause, particularly third-party scripts for chat widgets, retargeting pixels, and recommendation engines. Defer all non-critical third-party scripts. Use Chrome DevTools' Performance panel to identify which scripts are blocking the main thread after page load.
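One common deferral pattern, sketched here with an illustrative widget URL, is to inject non-critical third-party scripts only once the main thread is idle rather than during page load:

```html
<script>
  // Load the chat widget when the browser is idle (URL is illustrative)
  function loadChat() {
    var s = document.createElement('script');
    s.src = 'https://chat.example-widget.com/loader.js';
    document.body.appendChild(s);
  }
  if ('requestIdleCallback' in window) {
    requestIdleCallback(loadChat);
  } else {
    setTimeout(loadChat, 3000); // fallback for browsers without the API
  }
</script>
```

The same pattern works for retargeting pixels and recommendation engines; anything that doesn't need to run before the user's first interaction can wait.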

Where to start with Core Web Vitals

Open PageSpeed Insights on your homepage, your most important category page, and one product page. Check the field data (real user data) rather than just the lab score. That is what Google actually uses. Then use Screaming Frog's integration with PageSpeed Insights to surface CWV issues across your whole site in one crawl.

Diagnosing Crawl Errors in Google Search Console

The Coverage report in Google Search Console (now labelled "Pages" under Indexing) is where crawl problems surface. It splits URLs into Indexed and Not indexed, and attaches a specific reason to every URL that was not indexed. Each reason tells you something different.

  • Soft 404s: pages that return a 200 HTTP status but show a 'no results' or 'product unavailable' message. Google sees these as valid pages. They get indexed, they pass no value, and they can quietly hurt your domain quality signals.
  • Redirect chains: URL A redirects to URL B which redirects to URL C. Each hop dilutes PageRank and slows crawling. Fix: update all internal links to point directly to the final destination URL.
  • Crawled but not indexed: Google visited the page but chose not to index it. Usually a quality signal issue: thin content, duplicate, or near-duplicate pages.
  • Discovered but not crawled: Google found the URL (via sitemap or internal link) but hasn't fetched it yet. On large sites this indicates crawl budget exhaustion. Google is queuing URLs it never gets to.

The most productive workflow: export your GSC Coverage data, run a full crawl with Screaming Frog, and cross-reference the two. GSC tells you what Google sees; Screaming Frog tells you what is actually on the site. Pages that GSC shows as soft 404s but Screaming Frog finds with real content usually have a content rendering problem, typically JavaScript that isn't loading for Googlebot.
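Once both exports are in hand, the cross-reference itself is a set comparison. A minimal Python sketch, assuming each export has already been reduced to a list of URLs (the URLs and field names are illustrative):

```python
def diff_url_sets(gsc_urls, crawled_urls):
    """Compare URLs Google reports against URLs found in a site crawl."""
    gsc, crawled = set(gsc_urls), set(crawled_urls)
    return {
        "in_gsc_not_crawled": gsc - crawled,   # candidates: orphans, stale URLs
        "crawled_not_in_gsc": crawled - gsc,   # pages Google hasn't surfaced
    }

result = diff_url_sets(
    ["https://shop.example/a", "https://shop.example/b"],
    ["https://shop.example/b", "https://shop.example/c"],
)
print(result["in_gsc_not_crawled"])  # {'https://shop.example/a'}
print(result["crawled_not_in_gsc"])  # {'https://shop.example/c'}
```

URLs in the first bucket deserve a manual check for orphaned pages or removed content; URLs in the second may simply not have been crawled yet, or may be blocked from discovery.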

Structured Data for Ecommerce

Structured data doesn't directly improve rankings, but it does influence how your pages appear in search results. For ecommerce, this matters more than in most niches because Product schema unlocks rich results with price, availability, and star ratings directly in the SERP. These boost click-through rates, particularly in competitive shopping queries where organic results sit below a wall of Shopping ads.

Every product page needs Product schema with at minimum: name, image, description, sku, brand, offers (with price, priceCurrency, availability, and url). If you have reviews on the page, add AggregateRating. Without it, your review stars won't appear in search results even if the reviews are on the page.
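A Product schema sketch in JSON-LD covering the fields above (all values are illustrative placeholders):

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Blue Running Trainer",
  "image": "https://shop.example/images/trainer.webp",
  "description": "Lightweight road-running trainer.",
  "sku": "TRN-BLU-10",
  "brand": { "@type": "Brand", "name": "ExampleBrand" },
  "offers": {
    "@type": "Offer",
    "url": "https://shop.example/products/trainer",
    "price": "89.99",
    "priceCurrency": "GBP",
    "availability": "https://schema.org/InStock"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.6",
    "reviewCount": "128"
  }
}
```

Embed it in a `<script type="application/ld+json">` tag in the page head, keep price and availability in sync with the live product data, and omit aggregateRating on pages without reviews.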

BreadcrumbList schema is underrated

Adding BreadcrumbList schema to every page tells Google your site structure and often causes breadcrumb paths to appear in search results instead of bare URLs. This improves CTR and helps Google understand the hierarchy between your category and product pages. It takes 30 minutes to implement site-wide and keeps paying off for years.
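A BreadcrumbList sketch in JSON-LD for a product three levels deep (names and URLs are illustrative; the final item can omit `item` because it represents the current page):

```json
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Home",  "item": "https://shop.example/" },
    { "@type": "ListItem", "position": 2, "name": "Shoes", "item": "https://shop.example/shoes" },
    { "@type": "ListItem", "position": 3, "name": "Blue Running Trainer" }
  ]
}
```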

  • Product schema on every product page (name, image, price, availability, sku, brand).
  • AggregateRating within Product schema on any page with reviews.
  • BreadcrumbList on all category and product pages.
  • Validate all structured data with Google's Rich Results Test.
  • Monitor rich result performance in GSC under Enhancements.
  • Check for structured data errors in GSC; missing required fields cause rich results to drop silently.
