

The problems, the pivots, and what actually shipped.
Each project brief starts with the problem statement as written on day one. Then we show how the data changed it. Outcomes are measured in latency, cost, accuracy, and adoption — not benchmark scores.




Projects indexed by problem, not by client.
Why the model was right and the data was wrong.
Initial brief: improve inventory accuracy. Actual problem: upstream ERP exports were dropping nulls silently. Rebuilt the ingestion layer before touching the model. Result: 34% reduction in stockout events within 90 days.
Outcome: −34% stockouts, +18% forecast accuracy, zero retraining in 6 months.
The workflow nobody mapped was the one breaking the system.
Brief: automate contract review. Discovery: reviewers had three informal workarounds not in any SOP. Redesigned extraction targets around actual behavior. Adoption hit 78% in week two — no training required.
Outcome: 78% adoption week 2, review time −61%, zero rollback requests.
We shipped the second model. The first one taught us why.
First model flagged everything: alert fatigue killed adoption in week one. Spent three weeks redefining what 'anomaly' meant to the ops team. Second model shipped with a 94% true-positive rate.
Outcome: 94% true-positive rate, mean response time −47%, ops team expanded coverage to two new services.
The knowledge base was clean. The query patterns were not.
Brief: internal Q&A over proprietary docs. Problem: users asked questions the docs never anticipated. Rebuilt the chunking and retrieval layer around real query logs. Latency dropped to under 800ms at p95.
Outcome: p95 latency 780ms, answer relevance score +41%, support ticket volume −29%.
We document what we didn't ship.
Every case study here includes a section on the approach we abandoned and why. That's not humility — it's the most technically useful thing we can hand a new client before scoping begins.
