At $1M-$10M scale, a testing operations cadence keeps roughly two tests in flight at any time, with a third in build and a fourth in research. Each test runs for roughly 2-4 weeks, so a tight programme completes 8-16 tests a year. The cadence below is the one we run with clients on the growth strategy retainer.
Days 1-7. Research. Session replay review, post-purchase survey reads, GA4 funnel-drop analysis, support-ticket review for friction signals. Output is a written hypothesis document with the four hypothesis-hierarchy answers from § 04, plus a candidate variant sketch.
Days 8-21. Build. Variant implementation: client-side via the testing platform if the change is visual, server-side via Shopify Functions or middleware if it is structural. GA4 wiring (custom dimensions for test ID and variant ID, server-side eventing for ad-blocked traffic recovery) verified before launch. Pre-launch checklist: sample ratio mismatch (SRM) dry-run on QA traffic, power calculation re-confirmed, primary/secondary/guardrail metrics signed off.
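As an illustration of the power calculation on the pre-launch checklist, here is a minimal sketch using the standard normal-approximation formula for comparing two conversion rates. The function names, the 5% significance level, the 80% power, and the example traffic figures are our assumptions for the example, not values fixed by the cadence.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Visitors needed in EACH arm to detect `relative_lift` over `baseline_rate`.

    Two-sided test on two proportions, normal approximation.
    alpha and power defaults are illustrative assumptions.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_reach(n_per_arm, daily_visitors, arms=2):
    """Days of traffic needed if all eligible visitors are split across the arms."""
    return ceil(arms * n_per_arm / daily_visitors)


# Hypothetical inputs: 2.5% baseline conversion, 10% relative lift, 5,000 visitors/day.
n = sample_size_per_arm(0.025, 0.10)
days = days_to_reach(n, 5000)
print(n, days, "fits four-week cap" if days <= 28 else "exceeds four-week cap")
```

If the required days exceed the four-week cap, the usual options are to test a bolder variant (a larger expected lift) or to pick a higher-traffic page, rather than to launch an underpowered test.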
Days 22-49. Observation. Test runs untouched for 2-4 weeks. Daily SRM check (just on traffic split, not on result). Weekly check-in but no decisions until the test reaches its planned sample size or the four-week cap. If a guardrail metric breaks (page errors, refund rate spikes), the test is paused and reviewed.
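The daily sample ratio mismatch (SRM) check can be sketched as a chi-square goodness-of-fit test on visitor counts alone, which keeps it independent of the result. This is a minimal two-arm version; the function names and the 0.001 alert threshold are assumptions for the example (a strict threshold is a common convention because the check runs every day).

```python
from math import erfc, sqrt


def srm_p_value(control_n, variant_n, expected_control_share=0.5):
    """P-value of a chi-square test (1 degree of freedom) on the traffic split."""
    total = control_n + variant_n
    if total == 0:
        raise ValueError("no traffic recorded yet")
    expected_control = total * expected_control_share
    expected_variant = total - expected_control
    chi2 = ((control_n - expected_control) ** 2 / expected_control
            + (variant_n - expected_variant) ** 2 / expected_variant)
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    return erfc(sqrt(chi2 / 2))


def has_srm(control_n, variant_n, threshold=0.001):
    """True when the observed split is unlikely under the planned allocation."""
    return srm_p_value(control_n, variant_n) < threshold


print(has_srm(5020, 4980))    # small imbalance, consistent with a 50/50 split
print(has_srm(10000, 10600))  # imbalance large enough to pause and investigate
```

A flagged SRM usually points at the assignment or tracking setup (redirect loss, bot filtering, a broken variant) rather than at the hypothesis, so it is a reason to review the wiring, not to read the result.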
Days 50-60. Analysis and ship. Final read against pre-registered metric tree. Statistical significance evaluated on primary metric. Practical significance evaluated against the pre-registered threshold. Decision: ship, hold, or retest. Written post-test note added to the testing log. Next test moves from research to build; new test enters research.
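A sketch of how the two significance checks can combine into a decision. The statistical test is a pooled two-proportion z-test; the mapping from results to ship, hold, or retest is our illustrative assumption (ship when significant and at or above the practical threshold, retest when inconclusive, hold otherwise), as are the function names and default values.

```python
from math import erfc, sqrt


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no conversions (or all conversions): nothing to distinguish
    z = (conv_b / n_b - conv_a / n_a) / se
    return erfc(abs(z) / sqrt(2))


def decide(conv_a, n_a, conv_b, n_b, min_relative_lift, alpha=0.05):
    """Hypothetical decision rule; assumes the control has at least one conversion."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    relative_lift = (rate_b - rate_a) / rate_a
    significant = two_proportion_p_value(conv_a, n_a, conv_b, n_b) < alpha
    if significant and relative_lift >= min_relative_lift:
        return "ship"
    if not significant:
        return "retest"  # inconclusive at the planned sample size
    return "hold"        # real effect, but below the pre-registered threshold


# Hypothetical reads: control first, variant second, 5% minimum relative lift.
print(decide(500, 20000, 600, 20000, min_relative_lift=0.05))
print(decide(500, 20000, 510, 20000, min_relative_lift=0.05))
```

Keeping the threshold as an explicit argument makes it harder to adjust it after the result is in, which is the point of pre-registering it.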
The discipline that holds the cadence: every test has one written brief, one written post-test note, and one row in the testing log. The log is the growth-strategy team's source of truth for what's been tested, what won, and what's been learned. Without the log, repeated tests of the same hypothesis under different framing become inevitable, and the testing programme drifts into theatre.
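One way to picture a row in the testing log is sketched below. Every field name here is hypothetical. The only idea taken from the cadence is one row per test, linked to its brief and post-test note. Indexing rows by page and lever, rather than by the hypothesis wording, is our assumption about how to catch a repeat that has been reframed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LogRow:
    test_id: str
    page: str            # where the change ran, e.g. "product page"
    lever: str           # what was changed, e.g. "shipping message"
    hypothesis: str      # free-text framing from the brief
    decision: str        # "ship", "hold", or "retest"
    brief_ref: str       # pointer to the written brief
    post_test_ref: str   # pointer to the written post-test note


def prior_tests(log: List[LogRow], page: str, lever: str) -> List[LogRow]:
    """Earlier rows that touched the same page and lever, whatever the framing."""
    key = (page.strip().lower(), lever.strip().lower())
    return [row for row in log
            if (row.page.strip().lower(), row.lever.strip().lower()) == key]


# Hypothetical log entries for illustration only.
log = [
    LogRow("T-001", "Product page", "Shipping message",
           "Showing free shipping earlier lifts add-to-cart",
           "hold", "briefs/T-001", "notes/T-001"),
    LogRow("T-002", "Cart", "Trust badges",
           "Badges near checkout reduce abandonment",
           "ship", "briefs/T-002", "notes/T-002"),
]

# A new idea framed differently, but on the same page and lever as T-001.
for row in prior_tests(log, "product page", "shipping message"):
    print(row.test_id, row.decision, row.post_test_ref)
```

Running this lookup during the research stage, before the brief is written, is where it saves the most time.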
For teams running this in-house without an agency, the smallest viable stack is: one client-side testing tool (Optimizely Web, VWO, or Convert); GA4 with ecommerce events; server-side eventing via Google Tag Manager Server-Side, or a tool like Klaviyo's server-side connector for email-side measurement; and a session-replay tool for qualitative grounding. Add to that a written test brief template that covers the § 04 four questions and the § 05 metric tree.
For teams considering whether to build CRO in-house or hire it out, the broader companion piece on benefits of hiring an ecommerce development agency covers the trade-off; the technical-SEO piece on Shopify SEO services covers the upstream traffic question that conversion optimization sits on top of. Design-side considerations are covered in our web design service and the related UI/UX design service.