Effective SEO strategies for large ecommerce sites.

A scale-tier decision matrix for catalog SEO at $5M, $20M, and $50M-plus revenue - the foundations to fix at each tier, the metrics that move, and the 90-day program for an existing $20M brand auditing where to start.

Three tiers. Three programs. One 90-day audit.

SEO at scale is not the same job at $5M and $50M revenue. At $5M ARR with 1,000 SKUs the work is foundational technical SEO - canonical discipline, sitemap-index structure, schema parity, mobile Core Web Vitals - and the spend is $3,000 to $8,000 per month or 30 to 60 hours of in-house time. At $20M with 5,000 to 10,000 SKUs the work shifts to category-page SEO programs treated as topical hubs, plus a content engine for upper-funnel queries, at $8,000 to $20,000 per month with a multi-discipline retainer. At $50M-plus with 25,000 to 100,000 SKUs the work is faceted-navigation governance, dynamic landing pages, crawl-budget controls, and a senior in-house SEO lead at $20,000 to $60,000 per month all-in. The five metrics that actually move at scale are organic revenue per visit by URL category, indexed-page health (ratio of indexed URLs earning impressions), crawl-waste percentage from log-file analysis, search-visibility share against named competitors, and AI Overviews citation rate. The 90-day audit for an existing $20M brand splits into days 1-30 diagnostic, days 31-60 quick-wins, and days 61-90 program design - by day 90 the brand has a documented multi-quarter roadmap, not just a backlog.

A 200-SKU store and a 50,000-SKU catalog are not the same problem.

The popular ecommerce SEO playbook on the internet was written for the 200-SKU store. Optimize your title tags, write better product descriptions, build some backlinks, ship a blog. That advice is fine at the 200-SKU tier - the catalog is small enough that Google can crawl and index every page without the merchant intervening, and the failure mode is mostly thin content rather than scale-induced waste.

At 5,000 SKUs the failure mode changes. The platform default - especially on Shopify, Magento Open Source, BigCommerce, and Salesforce Commerce Cloud - generates a long tail of low-value URL variants from faceted navigation, sort-order parameters, pagination, and session-based URLs that dilute crawl budget and produce hundreds of duplicate-content warnings in Search Console. The same store at 200 SKUs would never trip these issues; the same store at 50,000 SKUs is unworkable without governance.

At 50,000 SKUs the work becomes architectural. You cannot hand-write title tags page by page; you need patterns, automation, and review gates. You cannot rely on the homepage to internally link to every revenue page; you need a hub-and-spoke architecture with category pages as the load-bearing nodes. You cannot trust the platform default to handle filter URLs correctly; you need a documented faceted-navigation policy with a quarterly audit. The work shifts from optimization to governance.

Three structural differences separate large-ecommerce SEO from the work most retainer agencies sell.

  1. Scale changes the failure mode. The problem shifts from thin content to crawl waste and indexation pollution.
  2. Category-page SEO becomes the dominant traffic channel above 5,000 SKUs - product pages individually capture long-tail brand-plus-SKU queries, but category pages capture the broader commercial-intent queries that actually drive volume.
  3. Governance becomes a discipline. The strategy at scale is less about specific tactics and more about the framework that prevents the long tail of catalog-quality decay.

This piece is the framework. The companion piece - optimizing product pages for better SEO - drills into the PDP layer specifically. This one zooms out to the whole-catalog strategy at the three revenue tiers where the SEO program looks structurally different. Read both, and the cross-link logic gives you the depth-plus-breadth picture.

Three revenue tiers. Three programs. Different jobs.

Most ecommerce-SEO retainer pitches sell the same generic checklist regardless of the brand's revenue tier. The matrix below cuts the work into three honest programs - foundational tech SEO at $5M, category and content programs at $20M, governance and dynamic-landing-page work at $50M-plus.

Revenue tier | Catalog scale | Primary program | Typical spend / month
Tier 1 · $5M ARR | 500 - 2,000 SKUs | Foundational technical SEO | $3K - $8K (or 30-60 in-house hrs)
Tier 2 · $20M ARR | 5,000 - 10,000 SKUs | Category-page programs + content engine | $8K - $20K
Tier 3 · $50M+ ARR | 25,000 - 100,000 SKUs | Faceted-nav controls + dynamic landing pages + governance | $20K - $60K all-in

Sections 04, 05, and 06 below break each tier down by named work, named tools, and the typical mistake operators make at that tier.

Foundations first. Tactics later.

At $5M ARR with 1,000 SKUs the brand has enough traffic to be worth optimizing, enough catalog to need governance starting points, and enough cash flow to fund either a freelancer retainer or 30 to 60 hours of in-house implementation time per month. The work at this tier is foundational technical SEO - the scaffolding that has to be in place before category-page programs and content engines can compound on top of it.

Canonical discipline. Every product page has a single canonical URL pointing to itself. Every category page has a single canonical pointing to the base category, not to a filter or sort variant. Every paginated category page has canonical-to-page-1 logic with noindex on pages 2-plus. Every filter URL canonicals back to the base category. The default platform behavior on Shopify and Magento usually handles PDP and base category canonicals correctly; the filter-URL canonical logic almost always needs theme-level intervention. Google's canonical-URL documentation is the canonical reference.
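The filter-URL rule above can be sketched in a few lines. This is a minimal illustration, not any platform's built-in behavior: the parameter names in NON_CANONICAL_PARAMS are hypothetical and would need to match whatever the theme actually emits, and pagination is left out of the sketch.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; real ones depend on the platform and theme.
NON_CANONICAL_PARAMS = {"sort_by", "color", "size", "sessionid"}


def canonical_url(url: str) -> str:
    """Return the canonical target for a product or category URL.

    Filter, sort, and session parameters are dropped so every variant
    points back to the base URL. Unknown parameters are kept on purpose,
    so new ones surface in audits instead of being silently stripped.
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in NON_CANONICAL_PARAMS
    ]
    # The fragment is always dropped; it never belongs in a canonical URL.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

Keeping unknown parameters is a deliberate choice in this sketch: a stricter version would strip everything, but then a newly introduced tracking or filter parameter would never show up in a crawl report.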

Sitemap-index structure. A single 50,000-URL sitemap.xml is the wrong shape at this tier. The right shape is a sitemap index calling out separate sitemaps for products, collections, blog posts, and pages, each capped at 50,000 URLs and 50MB. Google's crawler reads the index, prioritizes the sitemaps that have updated most recently, and re-crawls the URLs inside them on a faster cadence. This is not a 2026 trick - Google's sitemap documentation has covered the pattern for over a decade - but most $5M brands are still running a single auto-generated sitemap from the platform default and missing the indexation lift.
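As a rough sketch of the shape described above, the snippet below chunks URLs at the 50,000 limit and emits a sitemap index. The /sitemaps/{type}-{n}.xml file naming is an assumption made for illustration; most platforms choose their own paths.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit in the sitemap protocol


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a URL list into sitemap-sized chunks."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(base_url, file_counts):
    """Build sitemap-index XML as a string.

    file_counts maps a content type (for example "products") to the
    number of sitemap files generated for that type.
    """
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for content_type, n_files in file_counts.items():
        for i in range(1, n_files + 1):
            entry = ET.SubElement(root, "sitemap")
            loc = ET.SubElement(entry, "loc")
            # Hypothetical naming scheme: /sitemaps/products-1.xml, and so on.
            loc.text = f"{base_url}/sitemaps/{content_type}-{i}.xml"
    return ET.tostring(root, encoding="unicode")
```

A catalog with 120,000 product URLs would produce three product sitemaps via chunk(), and the index would list those next to the collections, blog, and pages sitemaps.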

Schema parity on PDPs and category pages. Product schema with nested Offer (or AggregateOffer), Brand, and Review markup on every product page (validated against the Rich Results Test). BreadcrumbList schema on every page. Organization schema on the homepage and About page. ItemList schema on category pages where appropriate. Schema.org is the canonical reference; Google's product structured-data guidance covers the supported fields and rich-result eligibility.
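For illustration, a minimal Product JSON-LD payload could be generated as below. The function names and the choice of fields are assumptions for this sketch; the output should still be validated against the Rich Results Test, and a real implementation would add reviews, images, and ratings where the data exists.

```python
import json


def product_jsonld(name, brand, sku, url, price, currency, in_stock):
    """Build a minimal schema.org Product object with a nested Offer."""
    availability = (
        "https://schema.org/InStock" if in_stock else "https://schema.org/OutOfStock"
    )
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "sku": sku,
        "brand": {"@type": "Brand", "name": brand},
        "offers": {
            "@type": "Offer",
            "url": url,
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": availability,
        },
    }


def to_script_tag(data):
    """Serialize structured data into a JSON-LD script tag."""
    # Escape "</" so product text can never close the script element early.
    payload = json.dumps(data).replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'
```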

Mobile Core Web Vitals at the 75th percentile. LCP under 2.5 seconds, INP under 200 milliseconds, CLS under 0.1, measured at the 75th percentile from the CrUX dataset (the public Chrome User Experience Report). The metrics are documented at web.dev/articles/vitals. The single biggest LCP lever at this tier is image optimization - images at the platform default are usually too heavy and push LCP above the 2.5-second threshold on mobile. Section 9 below covers the INP discipline specifically.

Basic redirect map. Every URL change creates a 301 redirect. Every product rename, every collection URL change, every legacy URL from a prior platform - all redirected to the closest semantic match on the current site. Audit quarterly with Screaming Frog for redirect chains and 404s. The maintenance burden at this tier is manageable; the burden grows non-linearly at Tier 2 and Tier 3, which is why getting the discipline right at Tier 1 matters.
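A redirect map kept as a simple source-to-target mapping can be checked for chains and loops without a crawler. The sketch below assumes the map is available as a Python dict of paths; it complements a Screaming Frog audit rather than replacing it.

```python
def resolve_redirect(source, redirects, max_hops=10):
    """Follow a redirect map from source and return (final_url, hops).

    Raises ValueError on a loop or when the chain exceeds max_hops.
    """
    seen = {source}
    current, hops = source, 0
    while current in redirects:
        current = redirects[current]
        hops += 1
        if current in seen or hops > max_hops:
            raise ValueError(f"redirect loop or overlong chain starting at {source}")
        seen.add(current)
    return current, hops


def flatten_redirects(redirects):
    """Return a map where every source points straight at its final target."""
    return {source: resolve_redirect(source, redirects)[0] for source in redirects}
```

Any source whose hop count is above 1 is a chain; flattening the map before it is deployed keeps every legacy URL one 301 away from its destination.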

The Tier 1 anti-pattern. Spending money on content production or link building before the foundations are stable. We have audited brands at $5M ARR paying $8,000 per month for blog content and link outreach while their PDPs ship without Product schema and their filter URLs are all indexed with self-canonicals. The content and links produce some traffic; the foundational gaps quietly cap the program at 20 to 40 percent of what it would do if the foundations were sound. Fix foundations first.

Where Emani fits. The Emani case study is a clean Tier 1 example. The brand came to us at the $0 to $2M ARR pre-tier-1 stage with foundational SEO debt - default theme schema, filter URLs all indexed, no canonical discipline, no redirect map. We shipped the foundational fixes inside a Shopify Plus build, and the SEO compounded as the brand scaled to $2M MRR over the following 18 months. The work at this tier is not glamorous; it is the scaffolding that everything else compounds on.

Category pages as topical hubs. Content engine on top.

At $20M ARR with 5,000 to 10,000 SKUs the foundations should already be stable - if they are not, push the Tier 2 work back and finish Tier 1 first. The work at Tier 2 is two compounding programs running in parallel: the category-page program (PLPs treated as topical hubs that compete on broad commercial-intent queries) and the content engine (blog and supporting content that captures upper-funnel queries and feeds the conversion funnel).

Category pages as topical hubs. The default category page on most ecommerce platforms is a product grid with no descriptive copy above the grid, a generic title tag like "Women's Shoes - Brand Name," and zero topical-cluster internal linking. That page does not compete for the broader commercial-intent query "women's running shoes" against publishers, marketplaces, and review sites that have rich content above their product listings. The Tier 2 fix is to treat each top-30 category page as a topical hub: 400 to 800 words of original above-grid content covering the buying-decision factors for that category, a curated grid of the highest-revenue products, internal links to subcategory pages and to relevant blog content, and FAQ blocks that target the related question queries. The result is a category page that ranks for the head term, the question variants, and the modifier combinations that buyers actually search.

The content engine. Two to four long-form blog posts per month targeting upper-funnel commercial-intent queries identified through Search Console exports and competitor-gap analysis. The right question to answer at this tier is not "what should I write about" - it is "which of the 200 commercial-intent queries my competitors rank for that I do not yet rank for would generate the most organic revenue if I captured them." Pull the data from Ahrefs or Semrush, prioritize by the intersection of search volume, commercial intent, and competitive difficulty, and ship the content cadence against that prioritized list. Generic blog content with no commercial-intent prioritization burns content budget without producing organic-revenue uplift.

Internal linking from blog content to category pages. The compounding step that separates strong content engines from weak ones. Every blog post links to two or three relevant category pages with descriptive anchor text. Every category page links to the top-three relevant blog posts. The result is topical-cluster authority - Google reads the bidirectional linking as a signal that the brand has depth on the topic, and the category page benefits from the link equity flowing in from the blog content. Skipping this step is the most common Tier 2 mistake we see; brands ship 30 blog posts per month that never link to revenue pages and the blog ranks for informational queries while the category pages stagnate.

Schema extensions. At Tier 2 the schema work expands beyond the foundational PDP and BreadcrumbList layer. Article schema on every blog post with named author, date published, and date modified. ItemList schema on every category page rendering the top products. Site name annotations for cleaner SERP rendering. Skip FAQPage and HowTo schema - in 2023 Google restricted FAQ rich results to a narrow set of authoritative government and health sites and deprecated HowTo rich results, so shipping either schema on a commercial site does not generate rich results in 2026.

Multi-region considerations. If the brand sells in multiple countries, hreflang tags become non-optional at this tier. Use ISO 639-1 language codes plus ISO 3166-1 Alpha-2 region codes (en-US, en-GB, en-IN; never UK, always GB). The hreflang tags ship in the head of every page, not just the homepage. Google's international SEO documentation covers the implementation. Brands that ship hreflang on the homepage only and miss the per-page implementation see Google index the wrong region's URL for the wrong country's queries.
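A small generator makes the per-page requirement concrete. This is a sketch under assumptions: the alternates mapping and the function name are hypothetical, and the check only validates the shape of each code (plus the common en-UK mistake), not whether the language or region actually exists.

```python
import re
from html import escape

# Shape check only: two-letter language, optional two-letter region.
HREFLANG_RE = re.compile(r"^[a-z]{2}(-[A-Z]{2})?$")


def hreflang_tags(alternates, x_default=None):
    """Render hreflang link tags for one page.

    alternates maps hreflang codes (for example "en-GB") to the absolute
    URL of that page's regional version.
    """
    tags = []
    for code, url in sorted(alternates.items()):
        if not HREFLANG_RE.match(code) or code.endswith("-UK"):
            raise ValueError(f"invalid hreflang code: {code!r}")
        href = escape(url, quote=True)
        tags.append(f'<link rel="alternate" hreflang="{code}" href="{href}" />')
    if x_default:
        href = escape(x_default, quote=True)
        tags.append(f'<link rel="alternate" hreflang="x-default" href="{href}" />')
    return tags
```

The same full set of tags, including the self-reference, goes in the head of every regional version of the page.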

Where Big Game Sports fits. The Big Game Sports case study is a Tier 2 example. The brand operates at the $10M to $25M ARR range with a 6,000-SKU catalog of athletic merchandise. The work we shipped covered the full Tier 2 scope - category-page topical-hub conversions on the top 40 PLPs, content engine cadence at three long-form posts per month targeting commercial-intent queries, internal-linking discipline from blog to category, and the schema extensions. The 12-month organic-revenue lift came largely from the category-page program, which is the pattern at this tier - blog content captures upper-funnel demand and feeds the funnel, but the category pages capture the commercial-intent volume that converts.

Where Noble Paris fits. The Noble Paris case study is a multi-region Tier 2 example. The brand sells across the US, France, and the UK with a 4,500-SKU catalog. The Tier 2 work included the category-page program in three locales, hreflang implementation across every page, and a content engine cadence in English with selective French and UK content for region-specific commercial-intent queries. The case study illustrates the multi-region complexity that compounds at Tier 2 - hreflang errors and duplicate-content issues that did not exist at Tier 1 become program-defining at Tier 2.

Faceted-nav governance. Dynamic landing pages. Crawl-budget controls.

At $50M-plus ARR with 25,000 to 100,000 SKUs the work shifts again. Foundations are presumed in place. Category-page programs and content engines are running. The bottleneck on organic growth is now the long tail of catalog-scale problems that smaller stores never have to solve - faceted-navigation indexation pollution, dynamic landing-page generation for the long tail of demand, crawl-budget governance, and a level of operational dashboard discipline that catches issues in week one rather than month three.

Faceted-navigation governance. The combinatorial explosion problem. A category with 6 filters of 5 values each generates 46,655 possible filter URL combinations even when every filter is single-select (six states per filter including "not applied", so (5 + 1)^6 - 1), and the platform default on Shopify, Magento Open Source, BigCommerce, and Salesforce Commerce Cloud is to render every combination at a unique URL with a unique title tag - which produces an indexation-pollution disaster at scale. The governance pattern is selective indexation: index only the filter combinations that map to commercial-intent queries with measurable search volume, canonical the rest back to the base category page, and apply noindex to all multi-filter combinations and all sort-order parameters. The selective-indexation policy is documented in a faceted-nav governance sheet that the SEO lead reviews quarterly against the Search Console Performance report - which filter URLs are earning impressions, which are not, which should be promoted to indexation, which should be demoted.
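The size of the filter space depends on how filters combine, so it helps to state the counting assumption explicitly. The sketch below assumes every filter is optional and single-select, and then encodes the selective-indexation policy as a lookup. The allowlist entries and parameter names are hypothetical; in practice they would come from the governance sheet.

```python
def filter_combinations(n_filters, values_per_filter):
    """Count filtered URL states for optional, single-select filters.

    Each filter has values_per_filter options plus "not applied". The one
    state with nothing applied is the base category, so it is excluded.
    """
    return (values_per_filter + 1) ** n_filters - 1


# Hypothetical allowlist: single-filter states with measured search demand.
INDEXABLE_FILTERS = {
    frozenset({("color", "blue")}),
    frozenset({("gender", "womens")}),
}
SORT_PARAMS = {"sort_by"}  # hypothetical sort-order parameter name


def indexation_directive(params):
    """Map a category URL's query parameters to an indexation decision."""
    if SORT_PARAMS.intersection(params):
        return "noindex"
    filters = frozenset(params.items())
    if not filters:
        return "index"  # the base category page itself
    if len(filters) > 1:
        return "noindex"  # multi-filter combinations
    return "index" if filters in INDEXABLE_FILTERS else "canonical-to-base"
```

Multi-select filters grow the space far faster than this count suggests, which is one more reason the policy works from an allowlist rather than a blocklist.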

Dynamic landing pages for the long tail. Above 25,000 SKUs the long tail of search demand no longer maps cleanly to existing category pages. Buyers search for specific filter combinations ("blue running shoes size 10 women's wide") that have measurable search volume but no dedicated landing page on the site. The Tier 3 pattern is to generate dynamic landing pages programmatically for the high-search-volume filter combinations - the same canonical category-grid template but with a fixed filter applied, an above-grid copy block templated from the filter values, and the URL parameter pattern indexable. The dynamic pages capture the long-tail commercial intent that the base category cannot, without diluting the base category page itself. Done badly this is doorway-page abuse under Google's spam policies; done well, with each page providing genuine product variation and meaningful commercial intent, it is one of the highest-leverage Tier 3 moves.

Crawl-budget governance. Above 50,000 indexed pages, Googlebot's crawl behavior matters. The metric is crawl-waste percentage from log-file analysis - what percentage of Googlebot hits land on revenue pages versus filter URLs, sort parameters, 404s, redirected URLs, and orphaned legacy URLs. The right tool for the analysis is a server-log sample (one to two weeks of access logs filtered to Googlebot user-agents) processed through Screaming Frog Log File Analyzer, JetOctopus, or a custom log-analysis pipeline. The pattern is to identify the 20 to 40 percent of Googlebot's crawl budget that is wasted on non-revenue URLs and to govern those URLs aggressively - robots.txt disallow on session-based parameters, noindex on filter combinations not in the indexation policy, redirect-chain cleanup, and orphaned-URL pruning.
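For teams taking the custom-pipeline route, the core of the calculation is small. The sketch below assumes access logs in the common combined format; the URL prefixes and parameter names are hypothetical, and matching on the user-agent string alone can be spoofed, so a production pipeline would also verify Googlebot by reverse DNS.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Request line, status code, and trailing user agent of a combined-format log line.
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<ua>[^"]*)"$'
)
REVENUE_PREFIXES = ("/products/", "/collections/")  # hypothetical URL patterns
WASTE_PARAMS = {"sort_by", "sessionid", "color", "size"}  # hypothetical names
WASTE_CLASSES = {"redirect", "not_found", "parameter"}


def classify(path, status):
    """Assign one Googlebot hit to a crawl class."""
    if status in (301, 302, 307, 308):
        return "redirect"
    if status == 404:
        return "not_found"
    parts = urlsplit(path)
    if WASTE_PARAMS.intersection(parse_qs(parts.query)):
        return "parameter"
    if parts.path.startswith(REVENUE_PREFIXES):
        return "revenue"
    return "other"


def crawl_waste(log_lines):
    """Return (waste_share, counts_by_class) for Googlebot hits only."""
    counts = Counter()
    for line in log_lines:
        match = LOG_RE.search(line.rstrip())
        if match and "Googlebot" in match.group("ua"):
            counts[classify(match.group("path"), int(match.group("status")))] += 1
    total = sum(counts.values())
    wasted = sum(counts[name] for name in WASTE_CLASSES)
    return (wasted / total if total else 0.0), counts
```

Orphaned legacy URLs are not detectable from the log alone; identifying them needs a join against the current crawl or sitemap, which this sketch leaves out.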

Operational dashboard discipline. The five Tier 3 metrics tracked on a daily-refresh dashboard: organic revenue per visit by URL category, indexed-page health (ratio of indexed URLs earning at least one impression in the last 90 days), crawl-waste percentage from log-file analysis, search-visibility share against three named competitors, and AI Overviews citation rate. Section 7 below covers each metric in detail. The dashboard is the early-warning system at this tier; without it, the brand discovers issues from monthly reports two to three months after they should have been caught.

In-house ownership. At Tier 3 the optimal team structure is a senior in-house SEO lead with 2 to 4 SEOs reporting in (technical, content, program management) plus an agency or consultant on retainer for specific deep work. The agency or consultant role narrows from execution-led to strategic-input, which is the same transition every other marketing function makes between $10M and $50M revenue. The mistake operators make is staying on a generalist agency too long after the catalog has outgrown what a multi-client retainer can attend to. A Tier 3 catalog at 50,000-plus SKUs has more SEO surface area than most agencies' top three retainer clients combined; one shared account manager cannot cover the work.

The Tier 3 anti-pattern. Treating Tier 3 like a bigger version of Tier 2 - more category pages, more content cadence, more link building - and ignoring the governance and dynamic-landing-page work. The category pages and content keep producing, but the long tail of crawl waste eats the gains, and the indexation pollution from filter URLs ranks for queries that should belong to the base category. We have audited brands at $50M-plus ARR running $30,000-per-month content retainers while their faceted-navigation pollution is costing them 30 to 50 percent of what their organic program could be doing. Fix governance first.

Five metrics. Daily-refresh dashboard. Junior SEO will not look at them.

The default ecommerce-SEO dashboard tracks organic sessions and conversion rate. Useful, but lagging - by the time a session decline shows up, the cause is two to four weeks back in the data. The five metrics below are the leading indicators that move at scale and surface issues in week one.

metric 01

Organic revenue per visit by URL category

Revenue divided by organic sessions, sliced by URL pattern - product page, category page, blog post, brand page, search-results page. Surfaces which page types are pulling weight and which are dead pixels. A common Tier 2 finding: the blog drives 40 percent of organic sessions but 5 percent of organic revenue, while category pages drive 30 percent of sessions and 70 percent of revenue. The work redistribution flows from there - more investment in category pages, less in undifferentiated blog content.
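The slicing itself is a simple aggregation once sessions and revenue are exported per landing page. The sketch below assumes rows of (path, organic sessions, organic revenue) from the analytics tool; the path prefixes are hypothetical and follow a Shopify-style URL layout.

```python
from collections import defaultdict

# Hypothetical path prefixes, checked in order.
URL_CATEGORIES = [
    ("/products/", "product"),
    ("/collections/", "category"),
    ("/blogs/", "blog"),
    ("/search", "search"),
]


def url_category(path):
    """Map a landing-page path to its URL category."""
    for prefix, label in URL_CATEGORIES:
        if path.startswith(prefix):
            return label
    return "other"


def revenue_per_visit(rows):
    """Aggregate (path, sessions, revenue) rows into revenue per visit by category."""
    sessions = defaultdict(int)
    revenue = defaultdict(float)
    for path, n_sessions, amount in rows:
        category = url_category(path)
        sessions[category] += n_sessions
        revenue[category] += amount
    return {c: revenue[c] / sessions[c] for c in sessions if sessions[c]}
```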

metric 02

Indexed-page health

The share of indexed URLs that have earned at least one impression in Search Console in the last 90 days, compared with the share that have not. The latter group is crawl waste - pages indexed by Google but not earning attention. Healthy at Tier 1: 70-percent-plus of indexed URLs earning impressions. Healthy at Tier 3 with dynamic landing pages: 50-percent-plus. Below 30 percent is a governance failure.
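As a worked version of the metric, the calculation is a set intersection over two exports: the indexed URL list and the Search Console URLs with impressions. The function names and the "needs attention" label are assumptions for this sketch; the thresholds are the ones stated above, and the healthy floor is passed in because it differs by tier.

```python
def indexed_page_health(indexed_urls, urls_with_impressions):
    """Share of indexed URLs that earned at least one impression."""
    indexed = set(indexed_urls)
    if not indexed:
        raise ValueError("no indexed URLs supplied")
    return len(indexed & set(urls_with_impressions)) / len(indexed)


def health_status(ratio, healthy_floor):
    """Label a ratio against a tier's floor (0.70 at Tier 1, 0.50 at Tier 3)."""
    if ratio < 0.30:
        return "governance failure"
    # "needs attention" is a hypothetical label for the band in between.
    return "healthy" if ratio >= healthy_floor else "needs attention"
```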

metric 03

Crawl-waste percentage

From server log-file analysis - what percentage of Googlebot hits land on revenue pages versus filter URLs, sort parameters, 404s, redirected URLs, and orphaned legacy URLs. The metric requires log-file access (most managed-platform brands need to coordinate with hosting); the analysis runs through Screaming Frog Log File Analyzer or a custom pipeline. Healthy at Tier 3: under 30-percent crawl waste. Above 50 percent is a governance crisis.

metric 04

Search-visibility share by category

The brand's share of voice on the top 200 commercial-intent queries in the catalog versus three named competitors, tracked monthly. Pull the data from Ahrefs, Semrush, or a similar visibility tracker. The metric measures category-level positioning - is the brand gaining or losing ground against the same competitors month over month. Useful for diagnosing whether stagnant organic traffic is a brand-level issue or a category-level competitor moving on the same head terms.

metric 05

AI Overviews citation rate

How often the brand's pages get cited in Google AI Overviews answers for branded queries and unbranded category queries. Per Google's AI Features documentation, AI Overviews use the regular Search index - no special markup, no AI.txt, no new files. The citation rate moves with the same signals that move organic rankings - content quality, authority, structured data, and topical depth. Tracking the metric monthly surfaces which page types and which content patterns are AI-Overviews-eligible versus which are not, which informs the content engine's prioritization. AIO traffic pools into the Performance report's Web search type; there is no dedicated filter, so the analysis is comparison-based - branded queries with citation versus branded queries without, by URL pattern.

Building all five metrics requires Search Console plus analytics plus a visibility tool plus log-file access plus a custom dashboard layer. The setup is two to four weeks; the ongoing maintenance is 4 to 8 hours per month for a senior SEO. The dashboard pays for itself the first time it catches an indexation-pollution event in week one rather than month three.

Shallow over deep. Hub-and-spoke for category pages.

Site architecture is the URL structure plus internal linking pattern that determines how link equity flows from the homepage out to revenue pages. At small scale this barely matters - 100 SKUs reachable from the homepage in two clicks does the work. At 50,000 SKUs the architecture is the difference between a catalog that ranks and a catalog that buries its best pages four levels deep where Googlebot rarely revisits them.

Shallow over deep. The right pattern is a flat hierarchy - homepage to category to subcategory to product, three or at most four clicks from the homepage to the deepest revenue page. The wrong pattern is a deep hierarchy with five or six levels of nested categories where the leaf-level products are seven clicks from the homepage. Google's crawler revisits shallow pages more frequently and ranks them higher on commercial-intent queries; deep pages get crawled less and rank less.
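Click depth can be measured directly from a crawl's internal-link graph with a breadth-first search. The sketch assumes the graph is available as a mapping from each page to the pages it links to, for example from a crawler export.

```python
from collections import deque


def click_depths(links, start="/"):
    """Return the minimum number of clicks from start to every reachable page.

    links maps a page path to the list of paths it links to.
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def too_deep(depths, max_clicks=4):
    """List pages that sit more than max_clicks from the start page."""
    return sorted(page for page, depth in depths.items() if depth > max_clicks)
```

Pages missing from the result entirely are unreachable from the homepage, which is worth flagging separately from pages that are merely deep.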

Hub-and-spoke for category pages. The homepage and top-level navigation are the sitewide hub; the category pages (themselves topical hubs, as covered in Tier 2 above) are the spokes. The hub links to every top-30 category page; each category page links to its subcategory pages and to the most-relevant blog content; subcategory pages link to product pages; product pages link back to category pages and to related products. The pattern produces tight topical clustering that Google rewards and that buyers find easy to navigate.

Internal linking from authority pages to revenue pages. The homepage is the highest-authority page on most ecommerce sites; the top 5-10 category pages are usually the next tier; everything else is downstream. The internal-linking discipline is to use the authority pages to lift the revenue pages that need ranking help - new collection launches, seasonal product pushes, pages that have ranked stagnantly for 6-plus months despite the underlying product being commercially relevant. A simple internal-linking audit pulled from Screaming Frog ranks every URL by inbound internal links; the bottom 20 percent of revenue pages are usually the highest-leverage internal-linking opportunities.
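The audit described above reduces to counting inbound internal links per revenue page and sorting. The sketch assumes an edge list of (source, target) pairs, such as an inlinks export, plus a list of revenue pages; both input shapes are assumptions.

```python
from collections import Counter


def inbound_link_counts(edges, revenue_pages):
    """Count unique inbound internal links for each revenue page."""
    counts = Counter({page: 0 for page in revenue_pages})
    for source, target in set(edges):  # ignore duplicate links between two pages
        if target in counts and source != target:
            counts[target] += 1
    return counts


def weakest_pages(counts, fraction=0.2):
    """Return the bottom fraction of pages by inbound links, weakest first."""
    ranked = sorted(counts, key=lambda page: (counts[page], page))
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k]
```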

URL structure within platform constraints. Shopify URLs are constrained to /products/, /collections/, /pages/, /blogs/ - the patterns cannot be changed. Magento URLs are configurable but most stores ship with the platform default category-and-product structure. BigCommerce and Salesforce Commerce Cloud have their own constraints. The discipline is to work within the platform's constraints and optimize what is configurable - URL slugs (descriptive, hyphenated, no stop words), collection paths (one or two levels deep, not five), and the canonical pattern across filter variants.

Breadcrumbs. Every page has a breadcrumb trail rendering the path from homepage to the current page, with BreadcrumbList schema marked up at schema.org/BreadcrumbList. Google renders breadcrumbs in the SERP for most ecommerce queries; the rendering is a small visibility lift on every result. The schema is one of the few rich-result types still actively rewarded for ecommerce sites in 2026.
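A breadcrumb trail maps onto BreadcrumbList markup one item per level. The sketch below assumes the trail is available as ordered (name, URL) pairs; the function name is hypothetical.

```python
def breadcrumb_jsonld(trail):
    """Build schema.org BreadcrumbList markup from ordered (name, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }
```

The resulting dictionary can be serialized with json.dumps and embedded in a script tag of type application/ld+json alongside the visible breadcrumb.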

Mobile-first index. INP under 200ms. Tiebreaker reality.

Google's index is mobile-first - the mobile version of every page is the version Google uses for indexing and ranking, and Google announced the completion of the migration in 2023. Google's mobile-first indexing documentation covers the practical implications. For ecommerce sites the implication is that the mobile rendering of every PDP, every category page, every blog post, and every search-results page is what gets evaluated for ranking - and the mobile rendering on most ecommerce platforms is heavier and slower than the desktop rendering.

Core Web Vitals at the 75th percentile. LCP under 2.5 seconds, INP under 200 milliseconds, CLS under 0.1. The metrics are documented at web.dev/articles/vitals. The 75th-percentile measurement matters - the median rarely catches the tail behavior, and Google's evaluation uses the CrUX dataset's 75th percentile across all visitors. Measure with PageSpeed Insights for synthetic data and CrUX for field data; both are free.
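CrUX already reports the 75th percentile, but brands collecting their own real-user measurements need to compute it themselves. The sketch below uses the nearest-rank method, which is one of several percentile definitions and an assumption here; the metric keys are hypothetical names for the sample lists.

```python
import math

# "Good" thresholds: LCP and INP in milliseconds, CLS unitless.
THRESHOLDS = {"lcp_ms": 2500, "inp_ms": 200, "cls": 0.1}


def p75(samples):
    """75th percentile by the nearest-rank method."""
    if not samples:
        raise ValueError("no samples supplied")
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def passes_core_web_vitals(field_samples):
    """Return pass/fail per metric, judged at the 75th percentile."""
    return {
        metric: p75(values) <= THRESHOLDS[metric]
        for metric, values in field_samples.items()
    }
```

A page can have a comfortable median and still fail: four LCP samples of 1,800, 2,000, 2,600, and 3,000 ms give a 75th percentile of 2,600 ms, which is over the threshold.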

INP replaced FID in March 2024. Interaction to Next Paint - the time from a user's interaction (tap, click, key press) to the next visible paint after the browser processes that interaction. INP under 200ms is the threshold; above 500ms is poor. The metric replaced First Input Delay because FID only measured the first interaction; INP measures every interaction across the page-life. For ecommerce sites the INP killer is usually third-party scripts - chat widgets, analytics tags, A/B testing tools, personalization engines - that block the main thread on user interactions. The fix is third-party-script governance: defer non-critical scripts, lazy-load below-fold widgets, and audit the script payload quarterly.

Mobile-specific patterns. Image weight on mobile is usually 60 to 80 percent of total page weight; the optimization lever is responsive images with the srcset pattern, WebP or AVIF format, and lazy-loading below-fold imagery. Font loading on mobile blocks text rendering more than on desktop; the optimization is font-display: swap and self-hosted variable fonts rather than third-party CDN-loaded font stacks. Render-blocking JavaScript on mobile is the biggest LCP killer; the optimization is critical-CSS inlining, async/defer on non-critical scripts, and code-splitting for first-paint payload reduction.

Tiebreaker reality. Core Web Vitals are not the dominant ranking factor on commercial-intent queries - content relevance, topical authority, and link signals all weigh more heavily. But on competitive queries where two brands have roughly equal content and authority, Core Web Vitals are the tiebreaker. The brand at LCP 1.8s and INP 140ms ranks above the brand at LCP 3.2s and INP 320ms, all else being equal. At Tier 1 the metric matters as a foundational hygiene gate; at Tier 2 and Tier 3 it matters as a competitive separator on the head terms.

Total Blocking Time (TBT) under 200ms in lab is a useful proxy for INP, not a replacement. Time to First Byte (TTFB) is a diagnostic metric, not a Core Web Vital - useful for debugging server-side performance issues but not part of Google's page-experience signal. First Input Delay (FID) was deprecated in March 2024 and removed from CrUX in September 2024; references in older SEO content to FID are stale.

Multi-region brands. Only when relevant.

Local SEO is not a default add-on for every large ecommerce brand. It belongs in the program when the brand has physical retail locations, regional fulfillment centers, or B2B sales territories where buyers convert differently by region. Skip the section if the brand is online-only with one fulfillment hub.

Google Business Profile per physical location. Every retail store and every staffed regional fulfillment center gets a verified Google Business Profile with accurate address, hours, phone, and category. Reviews flow into the profile and influence the local-pack rankings on near-me queries. Important constraint: GBP is physical-presence-only. Brands with virtual offices or remote-only regional sales presence are not eligible for GBP at those addresses; fabricating a presence to claim GBP risks account suspension.

LocalBusiness schema only on dedicated location pages. Each physical location gets a /locations/{slug}/ page with hours, address, photos, parking guidance, and the LocalBusiness schema marked up at the page level. The schema does not belong on the homepage or general site pages - that is misuse. Schema.org's LocalBusiness type covers the supported properties.

Regional fulfillment messaging. Brands with regional fulfillment centers can win on shipping-time queries - "next-day delivery laptops Boston" - by surfacing the regional-fulfillment messaging on category pages and search-results pages, with hreflang or geographic-targeting controls if the brand serves different SKUs by region. The messaging is a conversion lift more than a ranking lift, but the SERP click-through-rate uplift on regionally-relevant queries is meaningful.

Multi-region for ecommerce-only brands. When the brand sells in multiple countries with different currencies, taxes, or product mixes, hreflang and country-specific URL paths (or country-specific TLDs, or country-specific subdomains) become non-optional. The pattern is documented in the multi-region considerations callout in Tier 2 above. The add-on at Tier 3 is country-specific structured data - currency-specific Offer schema in PriceSpecification, country-specific shipping availability, country-specific payment-method icons in checkout. Each layer of regional specificity is a small uplift; together they produce the regional ranking that a generic English-language site cannot achieve in the FR or DE market.

Ninety days. Three phases. By day 90, a documented program.

The framework for an existing $20M-plus ecommerce brand auditing where to start. Fits inside a single quarter; outputs a multi-quarter roadmap, a quick-wins list shipped, and an operational dashboard the team will own ongoing.

01

Days 1-30 · Diagnostic

Pull a full Screaming Frog crawl of the production site. Export every URL from Search Console that earned at least one impression in the last 90 days. Pull a server-log sample of Googlebot hits across a representative two-week window. Pull the baseline Core Web Vitals report at the 75th percentile from the public CrUX dataset. Document the indexation count by URL category - product, category, blog, brand, search-results, filter URLs, sort URLs, paginated URLs, orphaned legacy URLs. Surface the gaps. Map the existing site architecture, the existing schema implementation, and the existing internal-linking pattern. Output: a single diagnostic deck with the indexation count by category, the Core Web Vitals baseline, the crawl-waste percentage, the search-visibility share against three named competitors, and a prioritized issues list.
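The indexation count by URL category can be sketched as a small classifier run over the crawl or Search Console export. The path prefixes and parameter names below are assumptions (Shopify-style paths and common filter parameters); a real catalog will need its own rules.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumed parameter names; replace with the ones the platform actually emits.
FILTER_PARAMS = {"color", "size", "price", "material", "brand"}
SORT_PARAMS = {"sort_by", "sort"}
PAGE_PARAMS = {"page"}


def classify_url(url):
    """Assign one URL to a single category, checking parameters before paths."""
    parts = urlsplit(url)
    params = set(parse_qs(parts.query, keep_blank_values=True))
    path = parts.path
    if path.startswith("/search"):
        return "search-results"
    if params & SORT_PARAMS:
        return "sort"
    if params & FILTER_PARAMS:
        return "filter"
    if params & PAGE_PARAMS:
        return "paginated"
    if path.startswith("/products/"):
        return "product"
    if path.startswith("/collections/"):
        return "category"
    if path.startswith("/blogs/"):
        return "blog"
    if path.startswith("/brands/"):
        return "brand"
    return "other"


def indexation_by_category(urls):
    """Count URLs per category for the diagnostic deck."""
    return Counter(classify_url(u) for u in urls)


# Hypothetical sample of exported URLs.
sample = [
    "https://example.com/products/laptop-14",
    "https://example.com/collections/laptops",
    "https://example.com/collections/laptops?color=silver",
    "https://example.com/collections/laptops?sort_by=price-ascending",
    "https://example.com/collections/laptops?page=2",
    "https://example.com/search?q=laptop",
    "https://example.com/blogs/news/buying-guide",
]
counts = indexation_by_category(sample)
print(counts)
```

Parameters are checked before paths so that a filtered or sorted category URL is counted as a filter or sort URL, not as a clean category page.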

Output: diagnostic deck + prioritized issues list + named-tools setup (Search Console, Screaming Frog, log-file analyzer, visibility tracker)

02

Days 31-60 · Quick-wins

Ship the foundational fixes that take less than a day each. Schema corrections on PDPs and category pages (Product, Offer, Brand, BreadcrumbList, ItemList). Robots.txt cleanup - disallow session-based parameters, sort-order URLs, search-results pages with no commercial value. Canonical fixes on filter URLs - canonical back to base category, not self-canonical. Use one control per URL pattern: a URL disallowed in robots.txt is never fetched, so a canonical or noindex tag on it goes unread. Internal-linking improvements from the homepage and top-10 category pages to the bottom-20-percent-by-inbound-links revenue pages identified in the Day-1-to-30 audit. Sitemap-index restructuring - separate sitemaps for products, collections, blog posts, pages, capped at 50,000 URLs (and 50 MB uncompressed) each. Image weight sweep on the top 100 pages by traffic. Hreflang implementation if multi-region and not already shipped. Meta-tag rewrites on the top 50 pages with weak commercial-intent alignment.
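The sitemap-index restructuring can be sketched as follows: group URLs by page type, split each group at the 50,000-URL cap, and list the resulting files in one index. The file-naming scheme and function names are hypothetical.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file limit in the sitemaps.org protocol


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def plan_sitemaps(urls_by_type, base_url):
    """Return {sitemap_url: [page URLs]} with one or more files per page type."""
    plan = {}
    for page_type, urls in sorted(urls_by_type.items()):
        for n, chunk in enumerate(chunk_urls(sorted(urls)), start=1):
            plan[f"{base_url}/sitemap-{page_type}-{n}.xml"] = chunk
    return plan


def render_sitemap_index(sitemap_urls):
    """Render the sitemap index XML that points at each child sitemap."""
    entries = "\n".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</sitemapindex>\n"
    )


# Hypothetical catalog: 120,000 product URLs and 3 blog posts.
catalog = {
    "products": [f"https://example.com/products/p{i}" for i in range(120_000)],
    "blog": [f"https://example.com/blogs/news/post-{i}" for i in range(3)],
}
plan = plan_sitemaps(catalog, "https://example.com")
index_xml = render_sitemap_index(list(plan))
print(index_xml)
```

Splitting by page type also makes the Search Console sitemap report usable as a per-category indexation check, since each file covers a single URL category.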

Output: quick-wins shipped + redirect map updated + first measurable Core Web Vitals lift

03

Days 61-90 · Program design

Build the multi-quarter roadmap. The category-page program: which 30 PLPs convert to topical hubs, what the cadence is, who writes the above-grid copy, what the schema-extension pattern is. The content engine: what the upper-funnel commercial-intent queries are, what the prioritization framework is, what the cadence is, who produces it. The faceted-navigation governance policy: which filter URLs are indexed, which canonicalize to the base category, which get noindex, who reviews quarterly. The link-building program: which Tier-1 publications, which content angles, what follow-up cadence (day 5, day 12, day 20). The operational dashboard: which five metrics, how they refresh, who owns the daily check, who escalates. By day 90 the brand has a documented program covering the next four quarters, not just a backlog.

Output: multi-quarter roadmap + named-owner per workstream + operational dashboard live

If the foundations themselves are broken, the 90 days collapse to triage and program design pushes to days 91-180. The framework above assumes the brand has technical foundations broadly in place and is moving from foundations to scale. The diagnostic phase is the moment to find out which scenario applies.

SEO inside Plus builds. Tier 1 and Tier 2 catalogs.

Digital Heroes ships ecommerce SEO inside Shopify Plus engagements rather than as a standalone monthly retainer. The fit is strongest at Tier 1 and Tier 2 catalogs ($5M to $30M ARR, 1,000 to 10,000 SKUs) where the foundational technical SEO and category-page program work belongs in the build phase, not retrofitted afterwards. We're a Premier Shopify Plus partner agency, NY and Delhi headquartered, 2,000-plus stores shipped since 2017, Trustpilot 4.9 across 70-plus reviews.

This is not the right fit for every brand on this page. At Tier 3 ($50M-plus, 25,000-plus SKUs) the work belongs with a senior in-house SEO lead supported by a specialist consultant on deep work like faceted-navigation governance redesign or log-file analysis pipelines. We can play the consultant role at that tier when the in-house lead wants outside senior input on a specific deep workstream, but the program ownership belongs in-house. If the brand is at $5M and only needs the foundational work without a Plus build, a freelancer or small specialist retainer carries that work better than we do.

If you're at $5M to $30M ARR running a Shopify Plus build, migration, or replatform and you want SEO baked into the engagement at Tier 1 or Tier 2 - that's the fit. Read our Shopify Plus agency service, our Shopify development service, our web development service, and our growth strategy service for the work and the cadence. The companion piece on optimizing product pages for better SEO drills into the PDP layer that this article zooms past. The Shopify SEO services overview covers the agency-archetype frame; top skills for an ecommerce SEO specialist covers the in-house hiring rubric; common mistakes ecommerce SEO specialists make covers the patterns that disqualify a hire. Our ecommerce data-migration playbook covers the SEO-preservation work that runs alongside any platform replatform. The work here was written by Prasun Anand based on the patterns across our 2,000-plus client engagements.

Six honest answers.

What is different about SEO for large ecommerce sites versus smaller stores?

Three structural differences. First, scale changes the failure mode. A 200-SKU Shopify store can ship with default theme settings, no internal-linking plan, and no faceted-navigation strategy and still rank, because the catalog is small enough that Google can crawl and index every page on its own. A 20,000-SKU catalog cannot - the same neglect produces hundreds of thousands of low-value indexed URLs (filter combinations, sort parameters, session URLs) that dilute crawl budget and bury the high-revenue pages. Second, category-page SEO becomes the dominant traffic channel above 5,000 SKUs. Product pages individually capture long-tail brand-plus-SKU queries; category pages capture the broader commercial-intent queries that actually drive volume. Third, governance becomes a discipline. At 50,000 SKUs you cannot hand-write title tags page by page; you need patterns, automation, and review gates. The strategy at scale is less about specific tactics and more about the framework that prevents the long tail of catalog-quality decay.

What is the right SEO investment level by ecommerce revenue tier?

Honest mid-points by revenue tier. At $5M ARR with 1,000 SKUs the right investment is foundational technical SEO - canonical discipline, sitemap-index structure, schema parity on PDPs and category pages, mobile Core Web Vitals at the 75th percentile, and a basic redirect map. Typical spend is $3,000 to $8,000 per month with a freelancer or small retainer, or 30 to 60 hours of in-house implementation time at zero cash cost. At $20M ARR with 5,000 to 10,000 SKUs the right investment shifts to category-page SEO programs and a content engine - PLPs treated as topical hubs, blog content for upper-funnel queries, internal linking that lifts revenue pages. Typical spend is $8,000 to $20,000 per month with a multi-discipline retainer agency. At $50M-plus with 25,000 to 100,000 SKUs the right investment is faceted-navigation controls, dynamic landing pages, crawl-budget governance, and a senior in-house SEO lead with an agency or consultant on retainer for specific deep work. Typical spend is $20,000 to $60,000 per month all-in. The mistake operators make is over-investing in content production at $5M when the technical foundations are not yet stable, or under-investing in governance at $50M when the catalog has grown past what manual oversight can manage.

How should faceted navigation be handled on a large ecommerce catalog?

Faceted navigation - filters like color, size, price-range, material, brand applied to a category page - generates a combinatorial explosion of URLs that breaks SEO at scale unless governed. The default behavior on most ecommerce platforms is to render every filter combination at a unique URL with a unique title tag, which means a category with 6 filters of 5 values each can produce 15,625 crawlable URL variants with one value selected per filter (5^6), and 46,656 once each filter can also be left unset (6^6). Google does not have crawl budget to discover, render, and rank all of those, and the high-revenue category page itself often gets less crawl attention than the long tail of filter combinations. The governance pattern is selective indexation: index only the filter combinations that map to commercial-intent queries with measurable search volume (often color and size on apparel, price-range and material on furniture), canonicalize the rest back to the base category page, and apply noindex to all multi-filter combinations and all sort-order parameters. A senior SEO at this tier maintains a documented filter-indexation policy and a quarterly review of which filter URLs are earning impressions in Search Console. The pattern is non-trivial, and the wrong default - especially the platform default - costs ranking on the highest-revenue pages.
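One way to make a filter-indexation policy reviewable is to express it as a single function that maps a URL to a directive. The sketch below follows the three rules in the answer above; the parameter names and the allowlist are assumptions, and it reads "multi-filter" as more than one selected filter value.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

# Assumed parameter names and allowlist; the real policy comes from the
# documented filter-indexation review, not from this sketch.
FILTER_PARAMS = {"color", "size", "price", "material", "brand"}
SORT_PARAMS = {"sort_by"}
INDEXABLE_FILTERS = {"color", "size"}


def facet_directive(url):
    """Return (action, canonical_url) for a category or filter URL.

    action is "index", "canonical-to-base", or "noindex". No canonical is
    returned with noindex, to avoid sending two conflicting signals.
    """
    parts = urlsplit(url)
    params = parse_qs(parts.query, keep_blank_values=True)
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    selected = [(k, v) for k, values in params.items() if k in FILTER_PARAMS for v in values]

    # Sort orders and multi-filter combinations never index.
    if any(k in SORT_PARAMS for k in params) or len(selected) > 1:
        return ("noindex", None)
    if len(selected) == 1:
        name, value = selected[0]
        if name in INDEXABLE_FILTERS:
            # Allowlisted single filter: indexable with a self-canonical.
            return ("index", f"{base}?{urlencode({name: value})}")
        # Any other single filter consolidates to the base category.
        return ("canonical-to-base", base)
    return ("index", base)


BASE = "https://example.com/collections/sofas"
for candidate in (BASE, BASE + "?color=grey", BASE + "?brand=acme",
                  BASE + "?color=grey&size=large", BASE + "?sort_by=price"):
    print(candidate, facet_directive(candidate))
```

Keeping the policy in one function gives the quarterly review a single place to change the allowlist, and the same function can be run over a crawl export to check that templates emit the intended tags.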

What metrics should a large ecommerce SEO program track?

Five operational metrics that move at scale and that a junior SEO will not look at. One, organic revenue per visit by URL category - product page, category page, blog post, brand page - to surface which page types are pulling weight versus which are dead pixels. Two, indexed-page health - the share of indexed URLs that have earned at least one impression in the last 90 days versus the share that have not. The latter group is crawl waste. Three, crawl-budget utilization from log-file analysis - what percentage of Googlebot hits land on revenue pages versus filter URLs versus 404s versus redirected URLs. Four, search-visibility share by category - the brand's share of voice on the top 200 commercial-intent queries in the catalog versus three named competitors, tracked monthly. Five, AI Overviews citation rate - how often the brand's pages get cited in AI Overviews answers for branded queries and unbranded category queries. The first three move with technical and on-page work; the latter two move with content quality and authority signals. Tracking only organic sessions and conversion rate misses the leading indicators.
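Metrics two and three reduce to simple ratios once the exports are in hand. The sketch below assumes an indexed-URL list, a list of URLs with impressions from Search Console, and Googlebot log hits already tagged with a URL category and a status code; the input shapes and bucket names are hypothetical.

```python
from collections import Counter

REVENUE_CATEGORIES = {"product", "category"}  # assumed definition of revenue pages


def indexed_page_health(indexed_urls, urls_with_impressions):
    """Share of indexed URLs that earned at least one impression in the window."""
    indexed = set(indexed_urls)
    if not indexed:
        return 0.0
    return len(indexed & set(urls_with_impressions)) / len(indexed)


def crawl_utilization(googlebot_hits):
    """Share of Googlebot hits per bucket.

    googlebot_hits is an iterable of (url_category, status_code) pairs.
    """
    buckets = Counter()
    for category, status in googlebot_hits:
        if status == 404:
            buckets["404"] += 1
        elif 300 <= status < 400:
            buckets["redirect"] += 1
        elif category in REVENUE_CATEGORIES:
            buckets["revenue"] += 1
        elif category == "filter":
            buckets["filter"] += 1
        else:
            buckets["other"] += 1
    total = sum(buckets.values())
    return {name: count / total for name, count in buckets.items()} if total else {}


# Hypothetical inputs for illustration.
indexed = ["/a", "/b", "/c", "/d"]
with_impressions = ["/a", "/b", "/c", "/z"]
hits = [("product", 200)] * 3 + [("category", 200)] * 2 + [("filter", 200)] * 3 \
    + [("product", 404)] + [("category", 301)]

health = indexed_page_health(indexed, with_impressions)
shares = crawl_utilization(hits)
crawl_waste = 1 - shares.get("revenue", 0.0)
print(health, shares, crawl_waste)
```

Here crawl waste is taken as every hit that does not land on a revenue page; a team may prefer to exclude blog or brand pages from the waste figure.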

What is the 90-day SEO audit framework for an existing $20M ecommerce brand?

A 90-day framework that fits inside a single quarter. Days 1-30 are diagnostic. Pull a full Screaming Frog crawl, a Search Console export of every URL that earned an impression in the last 90 days, a server-log sample of Googlebot hits across a representative two-week window, and a baseline Core Web Vitals report at the 75th percentile from the CrUX dataset. Document the indexation count by URL category and surface the gaps. Days 31-60 are quick-wins. Ship the foundational fixes that take less than a day each: schema corrections on PDPs and category pages, robots.txt cleanup, canonical fixes on filter URLs, internal-linking improvements from the homepage and top categories to revenue pages that need ranking lift, sitemap-index restructuring, image weight sweep on the top 100 pages. Days 61-90 are program design. Build the multi-quarter roadmap covering the deeper category-page program, the content engine cadence, the faceted-navigation governance policy, the link-building program, and the operational dashboard the team will own ongoing. By day 90 the brand has a documented program, not just a backlog. This framework assumes the brand has technical foundations broadly in place and is moving from foundations to scale - if the foundations themselves are broken, the 90 days collapse to triage and the program design pushes to days 91-180.

Should a large ecommerce brand hire an SEO agency or build an in-house team?

Both, in sequence. At $5M to $20M with foundational SEO debt, a retainer agency is usually the right call - the work is wide enough to need multiple disciplines (technical, on-page, content, off-site) and not deep enough at any one discipline to justify a full-time in-house specialist. At $20M to $50M the optimal pattern is a senior in-house SEO lead supported by an agency or specialist consultant on specific deep work (faceted-navigation policy redesign, schema architecture, link-building campaigns). At $50M-plus the in-house team grows to 2-4 SEOs covering technical, content, and program management, with the agency or consultant role narrowing to senior strategic input rather than execution. The mistake operators make at the $20M tier is hiring an in-house SEO too early before the foundations are stable, or staying on a generalist agency too long after the catalog has outgrown what a multi-client retainer can attend to. The transition from agency-led to in-house-led ownership is the same transition every other marketing function makes between $10M and $50M revenue.

Bring the catalog scale. We'll bring the diagnostic deck.

A 30-minute catalog SEO discovery call. Named lead engineer plus SEO lead on the call, not a sales rep. Tier-mapped diagnostic returned within two business days. Our Shopify development service covers the foundational SEO and category-page work inside Plus engagements at Tier 1 and Tier 2.
