

Demand forecasting that survived contact with real data
Four iterations. Three instructive failures. One pipeline in production — measured against the metric the client actually cared about.
The problem statement changed three weeks in
The client came in asking for a demand forecasting model. After two weeks auditing their warehouse data, we found 14 months of records with three undocumented schema changes and no consistent SKU identifier across systems.
We spent three weeks rewriting the problem definition before touching a model. That work — not the eventual algorithm — determined whether the project would hold up in production.
That was not a data-cleaning problem. It was a problem-statement problem. The question they needed answered was not 'what will sell next quarter' — it was 'what do we actually know, and how reliably do we know it.'
Problem definition is not billable overhead. It is the architecture decision with the longest half-life. Every hour here saved two in the pipeline.


What the failed pipelines taught us
Pipeline one assumed the data schema was stable. It wasn't. The model trained cleanly and failed immediately on live data where field names had drifted. Lesson: schema contracts before any training run.
Pipeline two solved the schema problem but ingested trailing-zero noise as a meaningful signal. Forecasts were confident and wrong. Lesson: data profiling is not optional; it is the first deliverable.
Pipeline three worked in staging. It degraded in production within six weeks because the feature store was refreshed on a different cadence than the model expected. Lesson: operational cadence is part of the architecture, not an ops footnote.
Production outcomes against real business metrics
34% reduction in overstock
Pipeline uptime: 99.1%
6-week handoff, full ownership
The client's engineering team took full ownership in six weeks. Documentation-first delivery meant no black-box dependency on our team post-launch.
Measured against the 12-month baseline the client tracked before engagement. Overstock was the metric they brought to us; it is the metric we built toward.
Over 90 days in production with zero manual intervention. Operational cadence contracts — the lesson from iteration three — held exactly as designed.
Working on a forecasting or data-quality problem? We start with the data audit, not the model.
