Data Readiness Checklist for AI Demand Forecasting

Most AI demand forecasting projects that stall in the first six months share a common root cause: the data environment was not assessed before vendor selection or model configuration began. Teams discover mid-deployment that their sales history is stored at the wrong granularity, that ERP extraction produces duplicate transaction records, or that promotional events were never systematically tagged. By that point, the project timeline has slipped and the vendor's implementation team is billing for remediation work that should have been scoped upfront.

This guide is a working checklist — organized by assessment stage — for demand planning analysts, IT leads, and operations directors who need to evaluate data readiness before committing to an AI forecasting deployment. It is not a vendor evaluation framework. It addresses the data and integration conditions that must exist for any ML-based demand forecasting model to function as described, regardless of which platform you select.

Stage 1: Historical Sales Data — Depth, Granularity, and Continuity

The single most common data gap is insufficient history. Most gradient boosting and neural forecasting models require at least 24 months of clean sales history at the SKU-location level to produce reliable baseline forecasts. Some vendors will quote 12 months as a minimum, but that floor assumes relatively stable demand patterns — it rarely holds for seasonal products, new-market entries, or SKUs with high intermittency.

Minimum 24 months of point-of-sale or shipment data at SKU × ship-to location granularity — aggregated weekly or daily. Monthly-only history is often insufficient for models that need to learn within-period patterns.
Consistent SKU identifiers across the full history period. Item master changes (renumbering, merges, splits) must be mapped so the model treats a renumbered, merged, or split SKU as a continuous series, not a new item.
No unexplained gaps longer than 4 consecutive weeks in any SKU's history. Gaps caused by stockouts are common and manageable if tagged; gaps caused by data extraction failures are not.
Stockout periods identified and flagged. Zero demand during a stockout is not the same as zero demand from the market. Models trained on untagged stockout zeros systematically underforecast.
Demand data separated from supply data. Shipment records reflect what you shipped, not what customers wanted. If your only source is outbound shipments, you need a method to identify and adjust for constrained periods.
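The continuity check in this list can be sketched in a few lines. The example below is a minimal illustration, not a reference implementation: it assumes weekly buckets keyed by week start date, and the function name `longest_unexplained_gap` and the sample series are hypothetical. It treats a tagged stockout week as explained, which is one possible reading of the rule above.

```python
from datetime import date, timedelta

MAX_UNEXPLAINED_GAP_WEEKS = 4  # threshold from the checklist


def longest_unexplained_gap(sales_weeks, stockout_weeks, first_week, last_week):
    """Longest run of consecutive weeks with no sales record and no stockout tag.

    A tagged stockout week counts as explained, so it ends the current run.
    """
    observed = set(sales_weeks)
    tagged_weeks = set(stockout_weeks)
    longest = current = 0
    week = first_week
    while week <= last_week:
        if week in observed or week in tagged_weeks:
            current = 0
        else:
            current += 1
            longest = max(longest, current)
        week += timedelta(weeks=1)
    return longest


# Hypothetical series: 12 weeks, with weeks 3-8 missing from the extract.
start = date(2023, 1, 2)
end = start + timedelta(weeks=11)
sales_weeks = [start + timedelta(weeks=i) for i in (0, 1, 2, 9, 10, 11)]

untagged = longest_unexplained_gap(sales_weeks, [], start, end)
tagged = longest_unexplained_gap(
    sales_weeks, [start + timedelta(weeks=5)], start, end
)

print(untagged > MAX_UNEXPLAINED_GAP_WEEKS)  # True: fails the continuity check
print(tagged > MAX_UNEXPLAINED_GAP_WEEKS)    # False: longest unexplained run is 3
```

Running this per SKU-location series gives a list of items that need either a stockout tag or a re-extraction before they enter the training set.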

Stage 2: Data Cleanliness and Structural Integrity

Raw transactional data from ERPs and order management systems is almost never model-ready. The question is not whether your data has quality issues — it does — but whether those issues are documented, bounded, and correctable within a realistic pre-deployment window.

Duplicate and Erroneous Transactions

Run a deduplication check on your order history before any vendor demo or proof-of-concept. Duplicate order lines — common in SAP environments after system migrations or interface retries — inflate historical demand and cause the model to overforecast. A single month with 8% duplicate lines can bias a 24-month training set enough to affect safety stock calculations downstream.

Duplicate order line rate below 2% across the training history period
Returns and cancellations correctly netted against gross demand, not recorded as separate negative lines without linkage to the original transaction
Outlier demand events (e.g., one-time bulk orders, sample shipments, inter-company transfers) identified and either excluded or tagged — not left as unexplained spikes in the series
Unit-of-measure consistency verified: eaches vs. cases vs. pallets must be normalized before aggregation
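As a rough sketch of the deduplication check, the duplicate rate can be computed by counting order lines that repeat an earlier line with the same business key. The field names (`order_id`, `line_no`, `sku`, `ship_to`, `qty`) are assumptions for illustration; the right key depends on how your ERP identifies an order line, and a key that is too loose will flag legitimate repeat orders.

```python
from collections import Counter

DUPLICATE_RATE_THRESHOLD = 0.02  # 2% threshold from the checklist

# Hypothetical key: adjust to whatever uniquely identifies a line in your ERP.
KEY_FIELDS = ("order_id", "line_no", "sku", "ship_to", "qty")


def duplicate_line_rate(order_lines, key_fields=KEY_FIELDS):
    """Share of order lines that repeat an earlier line with the same key."""
    if not order_lines:
        return 0.0
    counts = Counter(tuple(line[f] for f in key_fields) for line in order_lines)
    duplicates = sum(n - 1 for n in counts.values())
    return duplicates / len(order_lines)


# Hypothetical extract: the second line was re-sent by an interface retry.
lines = [
    {"order_id": "A1", "line_no": 1, "sku": "S1", "ship_to": "DC1", "qty": 10},
    {"order_id": "A1", "line_no": 1, "sku": "S1", "ship_to": "DC1", "qty": 10},
    {"order_id": "A1", "line_no": 2, "sku": "S2", "ship_to": "DC1", "qty": 4},
    {"order_id": "A2", "line_no": 1, "sku": "S1", "ship_to": "DC2", "qty": 10},
    {"order_id": "A3", "line_no": 1, "sku": "S3", "ship_to": "DC1", "qty": 7},
]

rate = duplicate_line_rate(lines)
print(f"duplicate rate: {rate:.1%}")
print("within threshold:", rate < DUPLICATE_RATE_THRESHOLD)
```

Computing the rate per month rather than once for the whole history makes migration-related spikes easier to spot.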

Causal Variable Coverage

AI forecasting models earn their advantage over statistical baselines primarily through their ability to incorporate external and internal causal variables — promotions, pricing changes, weather, macroeconomic signals, and product lifecycle events. If those variables are not captured in your data environment, you are paying for a capability you cannot use.

Common causal variable gaps and their downstream effects on AI forecasting model accuracy
Causal Variable	Typical Source	Common Gap	Impact if Missing
Promotional events	Trade promotion management (TPM) system or manual calendar	Events logged after the fact, not before; no lift magnitude recorded	Model treats promotion spikes as noise; cannot anticipate future events
Price changes	ERP pricing tables or POS data	List price captured but net price after discounts not available	Price elasticity modeling fails; demand-price relationship distorted
Product introductions / discontinuations	Item master or PLM system	New items added to item master with no lifecycle flag; discontinued items left active	New product forecasts default to zero; discontinued items generate phantom demand
Seasonality indices	Derived from history or external calendar	Calendar events (holidays, fiscal year-end) not aligned to the demand data's calendar or timezone (for example, fiscal weeks vs. calendar weeks)	Seasonal peaks misaligned by 1–2 weeks in model output
External demand signals	Third-party data providers, POS syndicated data	Not available or not contracted	Model relies solely on internal shipment history; misses early demand shifts

Stage 3: ERP and System Integration Readiness

Data readiness is not only about what exists in your systems — it is about whether that data can be extracted reliably, at the required frequency, in a format the forecasting platform can consume. Integration failures are the second most common cause of delayed AI forecasting deployments, after data quality issues.

Extraction Feasibility

Confirm that your ERP or OMS can export order history at the required granularity (SKU × location × date) without custom development. Many SAP S/4HANA and Oracle EBS environments require custom extraction views to produce demand-level data — standard reports aggregate to the wrong dimensions.
Verify extraction performance for large history pulls. A 36-month history across 50,000 active SKU-location combinations can produce datasets exceeding 100 million rows. Extraction jobs that take 14+ hours are a deployment blocker if the forecasting platform expects daily refreshes.
Identify who owns ERP extraction access in your organization. In many mid-market and enterprise environments, the demand planning team does not have direct database access — they depend on IT, and IT has a ticket queue. This organizational dependency needs to be surfaced before the project starts, not after the first data pull fails.
Document the data refresh cadence your operations require. Weekly S&OP cycles need weekly data refreshes at minimum. If your ERP batch jobs run monthly, that is a constraint the forecasting vendor needs to know about before scoping the integration.
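A back-of-envelope estimate helps size the extraction before the first pull. The sketch below is illustrative only: the average of two order lines per SKU-location per day and the throughput of 2,500 rows per second are invented assumptions, and both function names are hypothetical. Replace them with figures measured in your own environment.

```python
def estimated_rows(sku_locations, months, lines_per_combo_per_day):
    """Approximate order-line count for a history pull at daily granularity."""
    days = months * 365 / 12
    return round(sku_locations * days * lines_per_combo_per_day)


def extraction_hours(rows, rows_per_second):
    """Wall-clock hours for a full pull at a sustained extraction throughput."""
    return rows / rows_per_second / 3600


# 36 months across 50,000 SKU-location combinations (figures from the text),
# with an ASSUMED average of 2 order lines per combination per day.
rows = estimated_rows(sku_locations=50_000, months=36, lines_per_combo_per_day=2.0)

# ASSUMED sustained throughput; measure this on a sample extract.
hours = extraction_hours(rows, rows_per_second=2_500)

print(f"{rows:,} rows, about {hours:.1f} hours for a full pull")
```

If the estimate approaches the length of the nightly batch window, incremental (delta) extraction is worth raising with IT before the vendor scopes the integration.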

Write-Back and Workflow Integration

Most AI forecasting platforms generate forecast outputs that need to be consumed by a downstream system — typically an ERP's demand planning module, an S&OP tool, or a replenishment engine. The write-back path is frequently underspecified during vendor evaluation.

Confirm whether the AI platform writes forecasts directly to your ERP's demand planning tables or exports a file that requires manual import
Identify whether your ERP's demand plan can accept probabilistic forecast outputs (e.g., P50/P80/P95 quantiles) or only point forecasts — most legacy ERP demand modules accept only point forecasts
Verify that forecast override workflows in the AI platform can be reconciled back to the ERP without creating version conflicts in the demand plan
Check whether your ERP requires forecasts at a different time bucket (monthly) than the AI model produces (weekly) — bucket conversion logic needs to be agreed before go-live
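One simple form of bucket conversion is to spread each weekly quantity evenly over its seven days and sum by calendar month. The sketch below assumes exactly that rule, which is an illustrative choice rather than a standard; fiscal calendars, working-day weighting, or a 4-4-5 pattern would each need different logic. The function name `weekly_to_monthly` is hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_to_monthly(weekly_forecast):
    """Convert {week_start_date: quantity} into {(year, month): quantity}.

    Each weekly quantity is split evenly across its 7 days, so a week that
    straddles a month boundary contributes to both months pro rata.
    """
    monthly = defaultdict(float)
    for week_start, qty in weekly_forecast.items():
        for offset in range(7):
            day = week_start + timedelta(days=offset)
            monthly[(day.year, day.month)] += qty / 7
    return dict(monthly)


# Hypothetical forecast: the first week runs from 29 Jan to 4 Feb 2024,
# so 3 of its days fall in January and 4 in February.
weekly = {
    date(2024, 1, 29): 70.0,
    date(2024, 2, 5): 35.0,
}

monthly = weekly_to_monthly(weekly)
for (year, month), qty in sorted(monthly.items()):
    print(year, month, round(qty, 2))
```

Whatever rule is chosen, the monthly totals should reconcile to the weekly totals, which makes a sum check a cheap go-live test.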

Stage 4: SKU Portfolio Segmentation

Not every SKU in your portfolio is a good candidate for ML-based forecasting. Applying a single model uniformly across all items is one of the most reliable ways to produce a deployment that looks poor in aggregate accuracy metrics — because intermittent, low-volume, and new-product SKUs will drag down performance on items where the model actually works well.

Before deployment, segment your active SKU-location combinations across at least two dimensions: demand volume and demand regularity. A basic ABC/XYZ matrix is sufficient for this purpose.

SKU segmentation guide for AI forecasting model applicability
Segment	Demand Volume	Demand Regularity	Forecasting Approach	ML Suitability
AX	High	Consistent	ML model primary	High — sufficient history, stable signal
AY	High	Variable	ML model with causal variables	Medium-High — needs promotional tagging
AZ	High	Intermittent	Croston or similar intermittent method	Low — ML often overfits noise
BX/BY	Medium	Consistent/Variable	ML model or statistical hybrid	Medium — depends on history depth
CZ	Low	Intermittent	Min/max or manual review	Low — insufficient signal for ML training
New items	Unknown	No history	Attribute-based or analogous item approach	Requires separate treatment

The proportion of your portfolio that falls into AZ and CZ segments is a direct indicator of how much of your SKU base will require non-ML treatment even after a successful deployment. For most manufacturers and distributors, that proportion is between 30% and 50% of active items — a number that should inform your ROI expectations before you sign a contract.
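A basic ABC/XYZ classification can be sketched as follows. The cut-offs are assumptions, not values from this guide: cumulative volume shares of 80% and 95% for A and B, and coefficient-of-variation limits of 0.5 and 1.0 for X and Y. Coefficient of variation is only one proxy for regularity; the share of zero-demand periods is another common choice for identifying intermittent items.

```python
from statistics import mean, pstdev


def abc_classes(volume_by_item, a_share=0.80, b_share=0.95):
    """Rank items by volume; class depends on cumulative share before the item."""
    total = sum(volume_by_item.values())
    classes, cumulative = {}, 0.0
    ranked = sorted(volume_by_item.items(), key=lambda kv: kv[1], reverse=True)
    for item, volume in ranked:
        prior_share = cumulative / total if total else 1.0
        if prior_share < a_share:
            classes[item] = "A"
        elif prior_share < b_share:
            classes[item] = "B"
        else:
            classes[item] = "C"
        cumulative += volume
    return classes


def xyz_class(series, x_max_cv=0.5, y_max_cv=1.0):
    """Classify demand regularity by coefficient of variation (std / mean)."""
    average = mean(series)
    if average == 0:
        return "Z"
    cv = pstdev(series) / average
    if cv <= x_max_cv:
        return "X"
    return "Y" if cv <= y_max_cv else "Z"


# Hypothetical weekly demand per SKU-location.
history = {
    "SKU-1@DC1": [100, 110, 95, 105, 100, 98],
    "SKU-2@DC1": [40, 5, 60, 10, 55, 20],
    "SKU-3@DC1": [0, 0, 12, 0, 0, 3],
}

abc = abc_classes({item: sum(series) for item, series in history.items()})
segments = {item: abc[item] + xyz_class(series) for item, series in history.items()}
print(segments)
```

Counting items per segment from this output gives the share of the portfolio that will need non-ML treatment.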

Stage 5: Organizational Data Ownership and Governance

Data readiness is partly a technical problem and partly an organizational one. The technical gaps are usually fixable. The organizational gaps — unclear data ownership, no defined process for maintaining promotional calendars, IT and demand planning teams with misaligned priorities — are harder to resolve and more likely to cause a deployment to stall after go-live.

Named owner for each data domain: sales history, item master, promotional calendar, pricing data. If a domain has no single accountable owner, its data quality will degrade after the initial cleanse.
Defined process for promotional event entry: who enters future promotions, at what lead time, in what system. AI forecasting models that incorporate promotional lift require future events to be entered before the forecast cycle runs — not after the fact.
Item master governance process: new item setup workflow that captures lifecycle stage (intro, growth, mature, end-of-life) and links new items to analogous historical items for initial forecasting.
Agreed data refresh SLA between IT and demand planning: documented and signed off before deployment, not assumed. The forecasting platform's performance guarantees are conditional on data arriving on schedule.
Forecast accuracy measurement ownership: who calculates MAPE, WMAPE, or bias metrics, at what granularity, and how often. Without a defined measurement process, model drift goes undetected.
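The metrics named in the last item are straightforward to compute once actuals and forecasts are aligned by period. The sketch below uses common definitions; the sign convention for bias (positive means overforecast) and the choice to skip zero-actual periods in MAPE are assumptions that your measurement owner should confirm and document.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, skipping periods with zero actuals."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts) if a != 0]
    if not errors:
        raise ValueError("MAPE is undefined when all actuals are zero")
    return sum(errors) / len(errors)


def wmape(actuals, forecasts):
    """Weighted MAPE: total absolute error divided by total absolute actuals."""
    total_actual = sum(abs(a) for a in actuals)
    if total_actual == 0:
        raise ValueError("WMAPE is undefined when total actuals are zero")
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / total_actual


def bias(actuals, forecasts):
    """Relative bias; positive values mean the forecast ran high overall."""
    total_actual = sum(actuals)
    if total_actual == 0:
        raise ValueError("Bias is undefined when total actuals are zero")
    return (sum(forecasts) - total_actual) / total_actual


# Hypothetical three-period example for one SKU-location.
actuals = [100, 80, 120]
forecasts = [110, 70, 130]

print(f"MAPE  {mape(actuals, forecasts):.3f}")
print(f"WMAPE {wmape(actuals, forecasts):.3f}")
print(f"Bias  {bias(actuals, forecasts):+.3f}")
```

WMAPE is usually the more stable of the two error metrics for low-volume items, because a small miss on a tiny actual does not dominate the average.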

Stage 6: Pre-Deployment Data Audit — Minimum Thresholds

The following table summarizes minimum conditions across the assessment dimensions above. These are not aspirational targets — they are the floor below which most ML forecasting deployments will underperform a well-tuned statistical baseline. If you cannot meet these conditions within a 90-day data remediation window, the deployment timeline should be extended, or the scope should be reduced to a subset of SKUs and locations that do meet them.

Minimum data readiness thresholds for AI demand forecasting deployment
Assessment Dimension	Minimum Threshold	Typical Remediation Effort	Deployment Risk if Unmet
History depth	24 months at SKU × location × week	Low if data exists in ERP; High if requires reconstruction from archives	Systematic underforecasting on seasonal and trend items
Stockout identification	Stockout periods flagged in 90%+ of affected SKU-weeks	Medium — requires joining inventory records to demand records	Demand signal suppressed during constrained periods; model learns wrong baseline
Duplicate transaction rate	Below 2% of order lines	Low to Medium — deduplication logic usually scriptable	Overforecasting proportional to duplicate rate
Promotional event coverage	80%+ of historical promotions tagged with start/end date and channel	High if not previously maintained — requires manual reconstruction	Promotion spikes treated as noise; future lift cannot be modeled
SKU identifier continuity	100% of item master changes mapped across history period	Medium — requires item master change log from ERP	Renamed/renumbered items appear as new items with zero history
ERP extraction latency	Data available within 24 hours of period close	Varies by ERP and IT capacity	Forecast cycle delayed; S&OP input arrives late
Causal variable availability	At least price and promotional calendar available	High if TPM system not integrated	Model operates as statistical baseline; ML advantage not realized
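The thresholds in this table can be encoded as a small scorecard check. The sketch below is a minimal illustration: the metric names are hypothetical labels for the table rows, the measured values are invented, and a missing measurement is treated as a gap.

```python
import operator

# Thresholds taken from the table above; metric names are hypothetical labels.
THRESHOLDS = {
    "history_months": (operator.ge, 24),
    "stockout_flag_coverage": (operator.ge, 0.90),
    "duplicate_line_rate": (operator.lt, 0.02),
    "promo_tag_coverage": (operator.ge, 0.80),
    "sku_mapping_coverage": (operator.ge, 1.0),
    "extraction_latency_hours": (operator.le, 24),
    "price_and_promo_available": (operator.eq, True),
}


def readiness_gaps(measured):
    """Return the metrics that are missing or fail their minimum threshold."""
    gaps = []
    for name, (passes, limit) in THRESHOLDS.items():
        if name not in measured or not passes(measured[name], limit):
            gaps.append(name)
    return gaps


# Invented audit results for illustration.
measured = {
    "history_months": 30,
    "stockout_flag_coverage": 0.93,
    "duplicate_line_rate": 0.035,
    "promo_tag_coverage": 0.60,
    "sku_mapping_coverage": 1.0,
    "extraction_latency_hours": 18,
    "price_and_promo_available": True,
}

print(readiness_gaps(measured))
```

Running the same check per SKU-location segment, rather than once for the whole portfolio, shows which subset already meets the floor.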

What to Do When You Find Gaps

Finding gaps in this assessment is normal — it is the point of doing the assessment. The question is how to sequence remediation without indefinitely delaying a deployment that has organizational momentum behind it.

Gaps That Block Deployment

Some gaps cannot be worked around. If your history is less than 12 months for the majority of your target SKUs, no vendor configuration will compensate. If your ERP cannot produce SKU-location-level data without a custom development project that will take six months, the deployment needs to wait. These are hard blockers, and treating them as soft risks to be managed during deployment is how projects end up in remediation.

Gaps That Support a Phased Approach

Most gaps fall into a different category: they affect some SKUs or some locations but not all. In these cases, a phased deployment — starting with the subset of items and locations that meet readiness thresholds — is a legitimate strategy. It lets you build organizational confidence in the model, establish measurement processes, and remediate data issues in parallel for the remaining scope.

A phased approach works best when the initial scope is large enough to be operationally meaningful. A pilot on 200 SKUs in one distribution center is a proof-of-concept, not a phased deployment. If the pilot scope is too narrow to affect a real S&OP cycle, the organizational learning you need — planner adoption, override behavior, accuracy measurement — will not happen.

Using This Checklist in Practice

This checklist is designed to be completed before vendor RFP or demo scheduling — not after. The sequence matters. Running Stage 1 through Stage 3 first gives you a realistic picture of your integration complexity, which should inform how you scope vendor conversations and what questions you ask during demos.

Stage 4 (SKU segmentation) and Stage 5 (organizational governance) are often skipped because they require cross-functional coordination rather than technical work. That is precisely why they cause the most problems post-deployment. A model that is technically sound but applied to the wrong SKU segments, or maintained by a team with no defined data governance process, will produce degrading accuracy within 6–12 months.

Document your findings from each stage in a readiness scorecard that can be shared with vendor finalists and with internal stakeholders who control the deployment budget. A written assessment creates accountability for remediation commitments and gives you a baseline against which to measure data quality improvement over time.
