What is the Octo 3-Batch Test?
A supplier's production capability is measured across three independent batches taken at different points in the order lifecycle. Each batch tests a different question. All three must match the spec for the supplier to enter regular rotation.
| Batch | When | What it tests | What you compare against |
|---|---|---|---|
| 1. Paid sample | Before the master agreement is signed | Does the factory exist? Can they produce a unit at all? | Spec sheet — material, tolerances, finish, packaging |
| 2. Pilot run (10–20% of master order) | After PO is signed; before full production runs | Can the line actually produce at order-volume rate? | The approved Batch 1 sample as the golden sample |
| 3. Random pull from the 2nd or 3rd full order | After the relationship is in regular rotation | Are they cutting corners as they get comfortable? | Both Batch 1 (golden sample) and Batch 2 (pilot retain) |
Weak suppliers rarely fail at the sample. They fail at the third batch, when QA pressure relaxes, discounted raw materials enter the line, or the factory subcontracts to a partner you have never visited.
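The pass rule above can be sketched in a few lines of Python. This is a minimal illustration, not an Octo tool: the names `Batch`, `BatchResult`, and `enters_rotation` are hypothetical, and the only logic it encodes is the rule stated earlier, that all three batches must pass before the supplier enters regular rotation.

```python
from dataclasses import dataclass
from enum import Enum


class Batch(Enum):
    PAID_SAMPLE = 1   # compared against the spec sheet
    PILOT_RUN = 2     # compared against the golden sample
    RANDOM_PULL = 3   # compared against golden sample and pilot retain


@dataclass
class BatchResult:
    batch: Batch
    passed: bool
    notes: str = ""


def enters_rotation(results: list[BatchResult]) -> bool:
    """True only if all three batches were checked and none of them failed."""
    passed = {r.batch for r in results if r.passed}
    failed = {r.batch for r in results if not r.passed}
    return passed == set(Batch) and not failed


# Hypothetical supplier: clean sample and pilot, substitution found on the random pull.
results = [
    BatchResult(Batch.PAID_SAMPLE, True),
    BatchResult(Batch.PILOT_RUN, True),
    BatchResult(Batch.RANDOM_PULL, False, "cheaper yarn lot"),
]
print(enters_rotation(results))  # False
```

A supplier with only one or two recorded batches also returns `False`, which matches the idea that a missing batch is an unverified batch rather than a pass.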
Why does Batch 1 lie?
Three reasons, in order of how often we see the pattern in seller reports.
Many factories assign stronger operators to samples, especially when the sample is used to win the order. The unit you receive may not come from the rotation that runs your master order; it is often made by the senior operators the factory reserves for sales-critical work. The stitching is tighter, the prints are crisper, and the assembly tolerances are closer, because the people doing the work are the factory's best, not its average. The pattern shows up across categories: a seller report on r/EntrepreneurRideAlong described hoodie samples with crisp embroidery and printing, followed by a master order with foam plastisol showing through and stitching going lazy on the second batch.
Factories often pull better-controlled material lots for samples than for ordinary production runs. Yarn lots, plastic resin batches, electronics components: the lot used for the sample is often the lot that scored well on the factory's recent QC pulls. Production runs on the next-best lot, which will differ from the sample lot to some degree. For commodity products this gap is small. For brand-sensitive products (apparel, cosmetics, packaging), it is the gap between "looks premium" and "looks cheap."
The nominal supplier may not be the producer. A supplier you found on Alibaba may be a trading company that subcontracts production: the sample comes from one factory, the master order from a partner you have not visited. The cluster-geography mismatch from Octo's 3-Consistency Rule maps directly here: a Shenzhen "manufacturer" producing a hoodie has likely subcontracted to a Foshan or Guangzhou apparel partner, and the QC chain breaks at the handoff.
What does "golden sample" actually mean?
A golden sample is the approved Batch 1 unit that becomes the reference standard for every subsequent batch. The factory keeps one, you keep one, and your inspector (Bureau Veritas, SGS, or Intertek) compares Batch 2 and Batch 3 against the retain. Without a golden sample retained on both sides, every dispute becomes a "your word against mine" argument, because the spec sheet alone cannot capture the look, feel, and finish details that buyers are actually paying for.
The protocol that works: order Batch 1, approve in writing, sign and date both physical units (yours and the factory's), photograph both, and lock the spec sheet at the version that matched. Any drift from the golden in Batch 2 or Batch 3 is a defect — even if the spec sheet alone would not catch it.
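The protocol above is effectively a checklist, and a small record type makes it easy to see which step is still open. The sketch below is illustrative only: `GoldenSampleRecord` and its field names are hypothetical, and the requirement of at least two photos is an assumption (one per physical unit).

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class GoldenSampleRecord:
    """Hypothetical record of the Batch 1 approval steps."""
    spec_version: str = ""              # spec sheet version locked at approval
    approved_in_writing: bool = False
    signed_on: Optional[date] = None    # date written on both physical units
    buyer_unit_signed: bool = False
    factory_unit_signed: bool = False
    photos: List[str] = field(default_factory=list)  # file names, assumed one per unit

    def missing_steps(self) -> List[str]:
        """List the protocol steps that have not been completed yet."""
        checks = [
            (self.approved_in_writing, "written approval"),
            (self.buyer_unit_signed, "buyer unit signed"),
            (self.factory_unit_signed, "factory unit signed"),
            (self.signed_on is not None, "signature date"),
            (len(self.photos) >= 2, "photos of both units"),
            (bool(self.spec_version), "locked spec sheet version"),
        ]
        return [label for done, label in checks if not done]


record = GoldenSampleRecord(
    spec_version="v3",
    approved_in_writing=True,
    buyer_unit_signed=True,
    signed_on=date(2026, 1, 15),
)
print(record.missing_steps())
# ['factory unit signed', 'photos of both units']
```

An empty list from `missing_steps()` means both sides hold a signed, dated, photographed unit tied to one spec version, which is the state a Batch 2 or Batch 3 comparison relies on.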
Step 2 — Pilot run (10–20% of master order)
Once the master agreement is signed, the first run should be a pilot — 10–20% of the full order volume. This sounds expensive. It is cheaper than discovering on Batch 3 that the production line cannot hit the golden sample at scale.
The pilot tests two things the sample cannot:
- Line capacity at order rate. A factory that produced one sample in three days might run the master order through a line that hasn't been set up for the spec — different operators, different shift, different raw-material lot. The pilot reveals whether the factory's production line has actually been configured for your spec, or whether they are about to run it through a generic line.
- The factory's tolerance variance at volume. The first hundred units of the pilot will show the natural drift between operators. If the variance exceeds the spec tolerance, the factory cannot scale you. The pilot makes this visible before you commit the rest of the order.
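One common way to put a number on "variance exceeds the spec tolerance" is the process capability index Cpk from statistical process control. This is a swapped-in, standard technique and not something the 3-Batch Test itself prescribes. The sketch assumes one measured dimension with two-sided spec limits; the hem-width measurements and the limits below are made-up values for illustration.

```python
from statistics import mean, stdev


def process_capability(measurements, lower_limit, upper_limit):
    """Cpk: distance from the mean to the nearest spec limit, in units of 3 sigma."""
    mu = mean(measurements)
    sigma = stdev(measurements)  # sample standard deviation
    if sigma == 0:
        # No spread at all: capable if the single value is in spec, otherwise not.
        return float("inf") if lower_limit <= mu <= upper_limit else float("-inf")
    return min(upper_limit - mu, mu - lower_limit) / (3 * sigma)


# Hypothetical pilot data: hem width in mm, spec 20.0 +/- 0.5.
pilot_units = [20.1, 19.9, 20.2, 19.8, 20.3, 20.0, 19.7, 20.4, 20.1, 19.9]
cpk = process_capability(pilot_units, lower_limit=19.5, upper_limit=20.5)
print(f"Cpk = {cpk:.2f}")
```

A widely used rule of thumb treats a Cpk of roughly 1.33 or higher as a capable process. A value well below 1 on the first pilot units suggests the spread between operators is already consuming the whole tolerance band.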
Per published 2026 China-inspection pricing, pre-shipment inspection runs around $199–$320 per man-day at major providers (SGS, Bureau Veritas, Intertek). For a pilot run, schedule the inspection at the start of the pilot, not at the end. The variance pattern shows up in the first 50 units; catching it on day one of the pilot saves the whole order.
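For budgeting, the per-man-day range quoted above turns into a simple low/high estimate. The function name and the two-man-day example are hypothetical; how many man-days a given pilot needs depends on the product and the provider.

```python
# Pre-shipment inspection rate range in USD per man-day, as quoted above.
PSI_RATE_USD = (199, 320)


def inspection_cost_range(man_days, rate_range=PSI_RATE_USD):
    """Return (low, high) inspection cost in USD for a number of man-days."""
    low, high = rate_range
    return man_days * low, man_days * high


# Hypothetical pilot inspection lasting two man-days.
print(inspection_cost_range(2))  # (398, 640)
```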
Step 3 — Random pull from the 2nd or 3rd full order
The third batch is the one most buyers skip. The supplier has shipped twice without complaint, the relationship feels stable, and the buyer reduces inspection frequency. Octo treats this as the highest-leverage QC moment in the relationship.
The pattern: by Batch 3, some factories test the boundaries. They substitute a slightly cheaper raw material lot, run the line with a different shift, or push tolerances toward the lenient edge of the spec. Each substitution is small. Stacked, they are the difference between "supplier we trust" and "supplier we lose money on."
The fix: random-batch QC pulls on every 3rd to 5th order, conducted on the production floor before shipment, comparing units against both the golden sample and the pilot retain. Per Bureau Veritas's Supplier Audits service, random pulls are a standard supply-chain risk-management tool. The third-batch substitution pattern is common enough to plan against, especially in apparel, packaging, accessories, and private-label consumer goods.
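A pull schedule along these lines can be drawn with a few lines of Python. This is a sketch under two assumptions taken from the text: the first pull lands on the 2nd or 3rd full order, and later pulls follow at a random gap of 3 to 5 orders. The function name and the `seed` parameter are hypothetical.

```python
import random


def pick_inspection_orders(total_orders, seed=None):
    """Return the order numbers to inspect, with random gaps between pulls."""
    rng = random.Random(seed)       # seed only makes a draw reproducible for records
    order = rng.randint(2, 3)       # first pull: 2nd or 3rd full order
    picks = []
    while order <= total_orders:
        picks.append(order)
        order += rng.randint(3, 5)  # next pull: 3 to 5 orders later
    return picks


# Hypothetical relationship of 20 orders.
print(pick_inspection_orders(20, seed=7))
```

Drawing the gaps at random rather than on a fixed cadence means the factory cannot tell in advance which order will be pulled, which is the reason the pull is random in the first place.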
What 4 patterns kill repeatability?
- No golden sample retained on both sides. Batch 2 and Batch 3 disputes become unwinnable because there is no agreed reference standard.
- Pilot run skipped to "save time." The first 10–20% of the master order is the cheapest place to catch line-capacity failure. Skipping it means the failure surfaces on the full master order.
- Inspection scheduled at the end of the pilot, not the start. Variance shows up in the first 50 units. Catching it on day one of the pilot saves the order; catching it on day five means the whole pilot is at risk.
- Random-batch QC dropped after Batch 2. Seller reports suggest factories test buyer attention at Batch 3. A buyer who stops inspecting after two clean batches is signalling that QC pressure has relaxed.
A sample order tests existence. A pilot tests capacity. A random pull tests integrity. Skip any one and the supplier is not verified for production.
How does the Octo 3-Batch Test connect to the 3-Consistency Rule?
The Octo 3-Consistency Rule verifies whether a supplier can run the work — across legal entity, export record, and production capability. The 3-Batch Test verifies whether the supplier will run the work consistently — across sample, pilot, and random pull. The two frameworks operate at different stages of the relationship:
- 3-Consistency Rule runs before the PO is signed (week 0–3).
- 3-Batch Test runs mostly after the PO is signed (week 4 through ongoing rotation); only the Batch 1 paid sample comes earlier, before the master agreement is signed.
A supplier that passes the 3-Consistency Rule but fails the 3-Batch Test is a real factory that does not maintain quality at production scale. A supplier that fails the 3-Consistency Rule should not have been shortlisted in the first place. Both frameworks are needed because they catch different failure modes.
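The way the two frameworks combine can be written as a small decision function. The function name and the verdict strings are hypothetical; the logic only restates the paragraph above.

```python
def supplier_verdict(passes_consistency: bool, passes_batch_test: bool) -> str:
    """Combine the two framework outcomes into one verdict."""
    if not passes_consistency:
        # Fails before the PO stage: should not have been shortlisted.
        return "do not shortlist"
    if not passes_batch_test:
        # Real factory, but quality is not maintained at production scale.
        return "real factory, quality not maintained at scale"
    return "eligible for regular rotation"


print(supplier_verdict(True, False))
# real factory, quality not maintained at scale
```

The 3-Consistency check comes first in the function because a supplier that fails it never reaches the batch stage, so its batch result is irrelevant.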
How does the 3-Batch Test compare to other QC approaches?
| Approach | Catches | Misses |
|---|---|---|
| Sample-only verification | Whether the factory exists and can produce a unit | Production-line consistency, raw-material substitution, third-batch corner-cutting |
| AQL pre-shipment inspection (one-off) | Defects on the inspected batch | Inter-batch variance over a multi-order relationship |
| Continuous SPC monitoring | Statistical process drift inside the factory | Requires deep factory cooperation; rarely available to non-key accounts |
| Octo 3-Batch Test | Sample, pilot, and third-batch consistency together | Within-batch defect rates, which AQL inspection covers; designed to layer on top of standard AQL inspection |
The 3-Batch Test is not a substitute for AQL pre-shipment inspection. It is a layer above it. AQL tells you whether this batch matches the spec. The 3-Batch Test tells you whether the supplier matches the spec across batches.
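The difference between the two layers shows up when every batch passes its own spec check but the batches move steadily away from the golden sample. The sketch below uses made-up measurements for one dimension (nominal 20.0 mm, tolerance 0.5 mm); the function names are hypothetical. The per-unit tolerance check is a simplification and does not implement AQL sampling.

```python
from statistics import mean


def within_spec(batch, nominal, tol):
    """Per-batch check: every measured unit is inside the tolerance band."""
    return all(abs(x - nominal) <= tol for x in batch)


def drift_from_golden(batches, golden):
    """Across-batch view: mean offset of each batch from the golden sample."""
    return [mean(b) - golden for b in batches]


def is_widening(drifts):
    """True if each batch sits further from the golden sample than the last."""
    return len(drifts) >= 2 and all(
        abs(later) > abs(earlier) for earlier, later in zip(drifts, drifts[1:])
    )


# Hypothetical measurements in mm for three consecutive batches.
batches = [
    [20.00, 20.10, 20.05],
    [20.20, 20.15, 20.25],
    [20.40, 20.35, 20.45],
]
print([within_spec(b, nominal=20.0, tol=0.5) for b in batches])  # [True, True, True]
print(is_widening(drift_from_golden(batches, golden=20.0)))      # True
```

Each batch would pass a one-off inspection, yet the third sits near the lenient edge of the spec. That trend is visible only when batches are compared against a retained reference.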
How does Octo SAM apply the 3-Batch Test?
Octo SAM bakes the 3-Batch Test into the shortlist brief. Suppliers who refuse golden-sample retention, refuse pilot runs, or refuse random-batch QC pulls are not shortlisted — Octo treats those refusals as a relationship-management risk signal. SAM coordinates the inspection schedule with Bureau Veritas, SGS, or Intertek at published 2026 inspection-market pricing of around $199–$320 per man-day for pre-shipment inspection, or $220–$458 per man-day for factory audits.
A sample is not production. Production is proved across three batches. Octo SAM coordinates golden-sample retention, pilot-run inspection, and random-batch QC across every supplier in your shortlist before the second order ships.