Building a Data Strategy Roadmap: From Current State to Three-Year Architecture

Most data strategy roadmaps I’ve reviewed in the past three years share the same flaw: they describe a target architecture in detail but skip the messy middle. A B2B SaaS client recently showed me a 40-slide deck mapping their future Snowflake plus dbt plus Reverse ETL stack, with zero analysis of the seven legacy pipelines that actually carried 80% of their reporting load. That gap between aspirational diagrams and operational reality is where roadmaps quietly die, usually around month nine when budgets get reviewed.

Start with an honest current-state inventory

Before drawing any future-state architecture, document what exists at the level of individual data flows, not just systems. For a mid-market company, this typically means cataloguing 30 to 80 data movements: scheduled exports from Salesforce, manual CSV uploads into the BI tool, the marketing analyst’s Python script running on her laptop. Each one has an owner, a fragility score, and a business process attached.
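One way to keep that inventory honest is to give every flow the same small record. The sketch below is only an illustration: the field names, the 1 to 5 fragility scale, and the sample entries are my assumptions, not a standard.

```python
from dataclasses import dataclass


@dataclass
class DataFlow:
    name: str
    source: str
    destination: str
    owner: str             # empty string when nobody owns the flow
    fragility: int         # assumed scale: 1 (robust) to 5 (breaks often)
    business_process: str


# Illustrative entries only, loosely based on the examples in the text.
inventory = [
    DataFlow("salesforce_scheduled_export", "Salesforce", "warehouse",
             "revops", 2, "pipeline reporting"),
    DataFlow("manual_csv_upload", "CSV file", "BI tool",
             "marketing ops", 4, "campaign reporting"),
    DataFlow("laptop_python_script", "ad platforms", "BI tool",
             "marketing analyst", 5, "campaign reporting"),
    DataFlow("orphaned_sheet", "Google Sheets", "BI tool",
             "", 3, "promo reporting"),
]


def needs_attention(flows, threshold=4):
    """Return flows that are fragile or unowned, most fragile first."""
    flagged = [f for f in flows if f.fragility >= threshold or not f.owner]
    return sorted(flagged, key=lambda f: f.fragility, reverse=True)


for flow in needs_attention(inventory):
    print(f"{flow.name}: fragility {flow.fragility}, owner {flow.owner or 'none'}")
```

A spreadsheet with the same columns works just as well. What matters is that each flow has an owner, a score, and a business process recorded.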

The useful output is a one-page heatmap showing which business capabilities depend on which data flows, colour-coded by risk. In one engagement with a Spanish retailer, this exercise revealed that their entire promotional ROI reporting depended on a single Google Sheet maintained by a contractor who had left six months earlier. No future architecture conversation made sense until that was acknowledged.
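The roll-up behind such a heatmap can be sketched in a few lines: each business capability takes the colour of its riskiest dependency. The rows, the 1 to 5 fragility scale, and the colour thresholds below are hypothetical.

```python
from collections import defaultdict

# (business capability, data flow, fragility 1-5): illustrative rows only
dependencies = [
    ("promotional ROI reporting", "contractor_google_sheet", 5),
    ("promotional ROI reporting", "pos_sales_export", 2),
    ("pipeline reporting", "salesforce_scheduled_export", 2),
    ("campaign reporting", "manual_csv_upload", 3),
]


def colour(score):
    """Map a fragility score to a heatmap colour (assumed thresholds)."""
    if score >= 4:
        return "red"
    if score == 3:
        return "amber"
    return "green"


def heatmap(rows):
    """Colour each capability by its most fragile dependency."""
    worst = defaultdict(int)
    for capability, _flow, fragility in rows:
        worst[capability] = max(worst[capability], fragility)
    return {capability: colour(score) for capability, score in worst.items()}


for capability, risk in heatmap(dependencies).items():
    print(f"{capability}: {risk}")
```

Taking the maximum rather than the average is a deliberate choice here: one broken flow is enough to break the report that depends on it.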

Current-state work also needs a cost baseline. Pull the actual annual spend on data infrastructure: licences, ETL tools, warehouse compute, BI seats, and contractor hours. Roadmap conversations get serious quickly when stakeholders see that the existing setup costs 380,000 euros per year and produces reports nobody trusts.
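The baseline itself is simple arithmetic, sketched below. Every figure in it is a placeholder I made up for illustration, including the contractor rate. None of it is the breakdown behind the 380,000 euro example.

```python
# Annual spend in euros per cost line (placeholder figures, not real data).
CONTRACTOR_HOURS = 800
CONTRACTOR_RATE_EUR = 75

annual_spend = {
    "licences": 90_000,
    "etl_tools": 45_000,
    "warehouse_compute": 110_000,
    "bi_seats": 35_000,
    "contractor_hours": CONTRACTOR_HOURS * CONTRACTOR_RATE_EUR,
}


def cost_baseline(spend):
    """Return the total and each line's share of it, in whole percent."""
    total = sum(spend.values())
    shares = {line: round(100 * amount / total) for line, amount in spend.items()}
    return total, shares


total, shares = cost_baseline(annual_spend)
print(f"Total: {total:,} EUR per year")
for line, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"  {line}: {pct}%")
```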

Define the three-year target with capability layers, not vendors

A durable data strategy roadmap describes its target architecture as capabilities first and tools second. The capability layers I use consistently are ingestion, storage and modelling, activation, governance, and intelligence. For each, define what the business needs to do in year three that it cannot do today: trigger a churn save workflow within 15 minutes of a usage signal, attribute pipeline to campaigns at the account level, run a product-led scoring model that updates daily.

Only after capabilities are agreed should you map vendors. This avoids the common pattern where someone falls in love with a tool at a conference and the roadmap bends around it. It also makes vendor swaps cheaper later, since the capability definition stays stable when, for example, you move from Fivetran to Airbyte for cost reasons.
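To make the point about portability concrete, here is a rough sketch in which the vendor is just one optional field on a capability record. The class, the field names, and the sample requirement are assumptions for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Capability:
    layer: str                    # e.g. ingestion, activation, governance
    requirement: str              # what the business must be able to do in year three
    vendor: Optional[str] = None  # filled in only after capabilities are agreed


# Hypothetical requirement, written before any tool is chosen.
ingestion = Capability(
    layer="ingestion",
    requirement="land CRM and product usage data in the warehouse daily",
)

# Vendor mapping comes second, and a later swap touches only that field.
with_vendor = replace(ingestion, vendor="Fivetran")
after_swap = replace(with_vendor, vendor="Airbyte")

print(after_swap.requirement == ingestion.requirement)
```

The requirement text survives both changes untouched, which is the property you want to be able to show in a vendor negotiation.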

Data Innovation, a Barcelona-based AI and data company that builds and operates intelligent systems where humans and AI agents work together, has documented that organisations defining their three-year architecture around capability layers rather than specific tools complete migrations roughly 40% faster and renegotiate vendor contracts with significantly more leverage, because the underlying requirements remain portable.

Sequence the roadmap in quarterly waves tied to business outcomes

A three-year plan broken into 12 quarterly waves forces realistic scoping. Each wave should deliver one observable business outcome, not a technical milestone. “Marketing can self-serve campaign attribution for paid channels” is a wave. “Implement dbt” is not, because nobody outside the data team cares.

The first four quarters usually focus on foundations: consolidating ingestion, establishing a single warehouse, retiring the most fragile legacy pipelines, and shipping two or three high-visibility use cases that prove the new stack works. Quarters five through eight add governance, semantic layers, and activation into operational tools like the CRM and ad platforms. The final year is where AI and ML use cases become realistic, because by then the data is trustworthy enough to feed them.
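The sequencing above can be written down as a small lookup, so that every wave carries its quarter, its phase, and a business-facing outcome. The phase labels summarise the text. The quarter assigned to the example wave is my assumption.

```python
from dataclasses import dataclass

# Quarter ranges and the focus of each phase, as described in the text.
PHASES = [
    (range(1, 5), "foundations"),
    (range(5, 9), "governance and activation"),
    (range(9, 13), "AI and ML use cases"),
]


def phase_for_quarter(quarter):
    """Return the roadmap phase for a quarter numbered 1 to 12."""
    for quarters, phase in PHASES:
        if quarter in quarters:
            return phase
    raise ValueError("quarter must be between 1 and 12")


@dataclass
class Wave:
    quarter: int
    outcome: str  # an observable business outcome, not a technical milestone

    @property
    def phase(self):
        return phase_for_quarter(self.quarter)


wave = Wave(3, "Marketing can self-serve campaign attribution for paid channels")
print(f"Q{wave.quarter} ({wave.phase}): {wave.outcome}")
```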

Resist the temptation to start with the AI use cases. I’ve seen three lead-scoring projects fail in the past 18 months because teams trained models on data they later discovered was 30% duplicate or stale. The roadmap sequence exists for a reason.
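A cheap check before any model training is to measure how much of the dataset is duplicate or stale. The sketch below is a minimal version under my own assumptions: records are dictionaries, duplicates are detected on a normalised key, and anything not updated within 180 days counts as stale. The field names and the threshold are hypothetical.

```python
from datetime import date


def unusable_share(records, key, updated_field, today, max_age_days=180):
    """Return the fraction of records that are duplicate or stale."""
    if not records:
        return 0.0
    seen = set()
    flagged = 0
    for record in records:
        normalised = record[key].strip().lower()
        is_duplicate = normalised in seen
        seen.add(normalised)
        is_stale = (today - record[updated_field]).days > max_age_days
        if is_duplicate or is_stale:
            flagged += 1
    return flagged / len(records)


# Illustrative lead records, not real data.
leads = [
    {"email": "ana@example.com", "updated": date(2024, 5, 1)},
    {"email": "Ana@Example.com ", "updated": date(2024, 5, 20)},  # duplicate
    {"email": "ben@example.com", "updated": date(2023, 1, 15)},   # stale
    {"email": "cai@example.com", "updated": date(2024, 4, 10)},
]

share = unusable_share(leads, "email", "updated", today=date(2024, 6, 1))
print(f"{share:.0%} of records are duplicate or stale")
```

If that number comes back anywhere near 30%, the lead-scoring wave belongs later in the sequence, not earlier.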

Build in review gates and explicit re-planning points

Three-year roadmaps that are not revisited become fiction within 12 months. Build formal review gates at the end of quarters two, four, and eight, with a defined agenda: which assumptions have changed, what did each wave actually cost versus estimate, and which planned waves should be reordered, dropped, or added.
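For the cost-versus-estimate item on that agenda, a small helper is usually enough. This sketch assumes each wave is recorded with its quarter, an estimate, and an actual cost. The data layout and the figures are illustrative.

```python
REVIEW_GATES = (2, 4, 8)  # end-of-quarter gates named in the text

# Illustrative waves only (costs in euros).
waves = [
    {"quarter": 1, "outcome": "single warehouse live", "estimate": 40_000, "actual": 50_000},
    {"quarter": 2, "outcome": "fragile pipelines retired", "estimate": 30_000, "actual": 27_000},
    {"quarter": 3, "outcome": "self-serve attribution", "estimate": 25_000, "actual": 25_000},
]


def cost_variance(waves, gate):
    """Relative cost overrun per wave completed by the given gate."""
    if gate not in REVIEW_GATES:
        raise ValueError(f"gate must be one of {REVIEW_GATES}")
    return {
        wave["outcome"]: (wave["actual"] - wave["estimate"]) / wave["estimate"]
        for wave in waves
        if wave["quarter"] <= gate
    }


for outcome, variance in cost_variance(waves, gate=2).items():
    print(f"{outcome}: {variance:+.0%} versus estimate")
```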

One pattern that works well is allocating 70% of capacity to planned roadmap waves and reserving 30% for emerging needs. Marketing teams will discover requirements you cannot predict in month one, and the roadmap needs slack to absorb them without derailing. Plans that book 100% of capacity 36 months out are signalling rigidity, not ambition.
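As a worked example of the 70/30 split, assume a team with 120 person-days of capacity in a quarter. The capacity figure and the wave estimates are made up for illustration.

```python
def capacity_split(total_capacity, planned_pct=70):
    """Split capacity into planned roadmap work and reserve for emerging needs."""
    planned = total_capacity * planned_pct / 100
    return planned, total_capacity - planned


def overbooked(wave_estimates, total_capacity, planned_pct=70):
    """True when planned waves eat into the reserve."""
    planned, _reserve = capacity_split(total_capacity, planned_pct)
    return sum(wave_estimates) > planned


planned, reserve = capacity_split(120)
print(f"Planned: {planned:.0f} person-days, reserve: {reserve:.0f} person-days")
print("Overbooked:", overbooked([40, 30, 20], total_capacity=120))
```

With 120 person-days, 84 go to planned waves and 36 stay in reserve, so three waves estimated at 90 person-days in total already eat into the slack.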

If you’re starting this exercise, the most useful first step is the current-state inventory, since it tends to surface 60 to 70% of the strategic questions on its own. Happy to compare notes if you’re working through a similar process. The patterns across industries are more consistent than most teams expect.
