Across 40 CRM audits we ran for B2B clients between 2022 and 2024, the median account had 31% of contact records missing a verified email, 22% with stale or invalid phone numbers, and 14% duplicated across at least two records. The pipeline data was worse. Roughly half of all open opportunities older than 90 days showed no update in their last_modified field, yet sales leaders were still using them in forecasts. These numbers held remarkably steady across HubSpot, Salesforce, and Dynamics implementations, regardless of company size.

Most data quality problems start at the form, not in the CRM

When we traced bad records back to their origin, around 60% came from web forms with weak or no validation. Free-text country fields, optional company names, and email inputs without MX-record checks were the usual suspects. One SaaS client was generating 400 new leads per week through Marketo forms, and 18% of them had non-standard country values, with “españa”, “Spain”, “ES”, and “Espana” coexisting in the same column.

The fix is rarely a CRM project. It is a form layer project. Replacing free-text with picklists, adding real-time email validation through a service like ZeroBounce or NeverBounce, and enforcing company name lookup via Clearbit or a similar enrichment tool removed about 70% of inbound noise within the first quarter for that client. The CRM then becomes a system of record rather than a cleanup queue.
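As a rough illustration of what replacing free text with a picklist buys you, the sketch below collapses the country variants mentioned above into a single ISO 3166-1 alpha-2 code. The alias table and the function name normalize_country are assumptions made for this example, not part of Marketo or any validation service. In a real form the lookup would be replaced by a full country picklist.

```python
import unicodedata
from typing import Optional

# Hypothetical alias table for illustration only. A production form would
# offer a full ISO 3166-1 picklist rather than cleaning free text afterwards.
COUNTRY_ALIASES = {
    "spain": "ES",
    "espana": "ES",
    "es": "ES",
}


def normalize_country(raw: str) -> Optional[str]:
    """Map a free-text country value to an alpha-2 code, or None if unknown."""
    # Decompose accented characters, then drop the combining marks so that
    # "españa" and "Espana" produce the same lookup key.
    decomposed = unicodedata.normalize("NFKD", raw.strip())
    key = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return COUNTRY_ALIASES.get(key.casefold())


if __name__ == "__main__":
    for value in ["españa", "Spain", "ES", "Espana", "Atlantis"]:
        print(repr(value), "->", normalize_country(value))
```

Values that return None are the ones worth routing to a review queue instead of letting them land in the CRM unchanged.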

Pipeline accuracy depends on stage definitions, not stage names

Every CRM we audited had stage names. Few had stage definitions that two reps would interpret the same way. When we asked five sales reps in the same team to define what “Qualified” meant, we got five different answers in 34 of the 40 audits. The downstream effect is predictable. Conversion rates between stages become unreliable, forecast accuracy drops, and marketing attribution gets blamed for problems that are actually definitional.

The teams with the cleanest pipelines shared one habit. They wrote one-sentence exit criteria for each stage, embedded those criteria as required checkboxes in the opportunity record, and reviewed them quarterly. A typical exit criterion looked like “Economic buyer identified and confirmed in writing, budget range stated, decision timeline within 90 days.” When reps cannot tick the box, the deal does not move forward. Forecast variance in those teams ran below 12%, compared to 35% or higher elsewhere.
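The required-checkbox idea can be sketched as a simple gate. This is a minimal example under stated assumptions: the stage name, the criterion keys, and the Opportunity class are hypothetical, and a real implementation would live in the CRM's own validation rules rather than in standalone code.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Hypothetical exit criteria per stage, modeled on the example criterion above.
EXIT_CRITERIA = {
    "Qualified": [
        "economic_buyer_confirmed_in_writing",
        "budget_range_stated",
        "decision_timeline_within_90_days",
    ],
}


@dataclass
class Opportunity:
    name: str
    stage: str
    checked: Set[str] = field(default_factory=set)  # boxes the rep has ticked


def missing_criteria(opp: Opportunity) -> List[str]:
    """Return the exit criteria for the current stage that are not yet ticked."""
    return [c for c in EXIT_CRITERIA.get(opp.stage, []) if c not in opp.checked]


def can_advance(opp: Opportunity) -> bool:
    """A deal leaves its stage only when every exit criterion is ticked."""
    return not missing_criteria(opp)


if __name__ == "__main__":
    deal = Opportunity("Acme renewal", "Qualified", {"budget_range_stated"})
    print(can_advance(deal), missing_criteria(deal))
```

Returning the list of missing criteria, rather than only a yes or no, gives the rep a concrete reason why the deal is blocked.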

Enrichment without governance creates new problems

Several clients had layered three or four enrichment tools on top of their CRM (ZoomInfo, Apollo, Cognism, plus internal scrapers), hoping to fill gaps. The result was usually field-level chaos: job titles updated weekly from one source and monthly from another, with no rule about which wins. We found one account with 11 different values for industry across a single contact’s history, none of which matched the parent account’s industry field.

Data Innovation, a Barcelona-based AI and data company that builds and operates intelligent systems where humans and AI agents work together, has documented that companies running more than two enrichment sources without a documented field-level priority rule see data drift accelerate by roughly 2x within six months compared to single-source setups. The lesson is not to avoid enrichment. It is to decide, per field, which source wins, when it overwrites, and when a human reviews the conflict. We typically build this as a small reconciliation layer between the enrichment APIs and the CRM, with an audit log so that when sales asks why a title changed, there is an answer.
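A reconciliation layer of this kind can be sketched in a few lines. The example below is an illustration under assumptions, not the layer we deploy: the priority table, the source labels, and the reconcile function are all hypothetical, and no vendor API is called. It shows the three decisions described above: which source wins, when it overwrites, and when a human reviews.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List

# Hypothetical field-level priority, highest-ranked source first.
FIELD_PRIORITY = {
    "job_title": ["zoominfo", "apollo", "cognism"],
    "industry": ["cognism", "zoominfo"],
}


@dataclass
class FieldValue:
    value: str
    source: str


def reconcile(record: Dict[str, FieldValue], field_name: str,
              incoming: FieldValue, audit_log: List[dict]) -> str:
    """Decide whether incoming overwrites the stored value.

    Returns "written", "kept", or "review" and always appends an audit entry.
    """
    priority = FIELD_PRIORITY.get(field_name, [])
    current = record.get(field_name)

    if incoming.source not in priority:
        decision = "review"   # unranked source: a human decides
    elif current is None:
        decision = "written"  # nothing stored yet
    elif current.source not in priority:
        decision = "review"   # e.g. a manual entry: do not overwrite silently
    elif priority.index(incoming.source) <= priority.index(current.source):
        decision = "written"  # equal or higher rank wins
    else:
        decision = "kept"     # lower-ranked source loses

    audit_log.append({
        "at": datetime.now(timezone.utc).isoformat(),
        "field": field_name,
        "old": current.value if current else None,
        "new": incoming.value,
        "source": incoming.source,
        "decision": decision,
    })
    if decision == "written":
        record[field_name] = incoming
    return decision


if __name__ == "__main__":
    record, log = {}, []
    print(reconcile(record, "job_title", FieldValue("VP Sales", "apollo"), log))
    print(reconcile(record, "job_title", FieldValue("Sales Lead", "cognism"), log))
    print(record["job_title"], len(log))
```

Because every call appends to the audit log, including the ones that change nothing, the log can answer both why a title changed and why a proposed change was rejected.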

Audits work better as a cadence than as a project

The clients who got lasting value from a CRM audit treated it as a recurring quarterly exercise, not a one-off cleanup. Two hours a quarter, a fixed checklist, and a named owner outperformed expensive six-week consulting projects every time. The checklist usually covers duplicate rates, ownership assignment, stale opportunities older than 60 days, missing required fields on closed-won deals, and lead-to-opportunity conversion by source.
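Two of the checklist items lend themselves to a small script. The sketch below assumes opportunities have been exported as dictionaries with hypothetical keys (id, stage, last_modified, and so on), and it interprets "stale" as an open opportunity whose last_modified date is more than 60 days old. The required-field list is likewise an assumption; each team would substitute its own.

```python
from datetime import date
from typing import Dict, List

STALE_AFTER_DAYS = 60  # threshold taken from the checklist above
CLOSED_STAGES = ("closed_won", "closed_lost")
# Hypothetical required fields on closed-won deals.
REQUIRED_ON_CLOSED_WON = ("amount", "close_date", "owner")


def stale_opportunities(opps: List[dict], today: date) -> List[dict]:
    """Open opportunities not modified in more than STALE_AFTER_DAYS days."""
    return [
        o for o in opps
        if o["stage"] not in CLOSED_STAGES
        and (today - o["last_modified"]).days > STALE_AFTER_DAYS
    ]


def closed_won_missing_fields(opps: List[dict]) -> Dict[str, List[str]]:
    """Map each closed-won deal id to the required fields it lacks."""
    report = {}
    for o in opps:
        if o["stage"] != "closed_won":
            continue
        # Treat None and empty strings as missing; a zero amount still counts.
        missing = [f for f in REQUIRED_ON_CLOSED_WON if o.get(f) in (None, "")]
        if missing:
            report[o["id"]] = missing
    return report


if __name__ == "__main__":
    sample = [
        {"id": "1", "stage": "proposal", "last_modified": date(2024, 3, 1)},
        {"id": "2", "stage": "closed_won", "last_modified": date(2024, 1, 1),
         "amount": 5000, "close_date": None, "owner": "ana"},
    ]
    print([o["id"] for o in stale_opportunities(sample, date(2024, 6, 30))])
    print(closed_won_missing_fields(sample))
```

Running the same script each quarter against a fresh export keeps the numbers comparable from one review to the next.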

One mid-market client cut its duplicate rate from 14% to under 3% in nine months simply by running this checklist with the RevOps lead and addressing the top issue each quarter. No new tooling, no migration, no consultants beyond the initial framework setup. The work compounded because the team understood what good looked like and stopped accepting drift as normal.

Where to start if your CRM has not been audited recently

If you have not audited in the last 12 months, the highest-leverage starting point is usually a duplicate analysis combined with a stage-definition review. Both can be done in a week with a SQL export and a couple of meetings. From there, look at form validation upstream and enrichment governance downstream. Most teams find that fixing the inputs and the definitions removes the need for most of the cleanup they thought they needed.
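For the duplicate analysis, a first pass over an export can be as simple as grouping contacts by a normalized email address. The sketch below assumes the export has been saved as CSV with id and email columns, which are hypothetical names. Matching on email alone is a deliberate simplification and will miss duplicates that differ by address.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List


def duplicate_groups(rows: List[dict], key: str = "email") -> Dict[str, List[str]]:
    """Group record ids by normalized key, keeping only groups with 2+ records."""
    groups = defaultdict(list)
    for row in rows:
        normalized = (row.get(key) or "").strip().casefold()
        if normalized:  # records with no email cannot be matched this way
            groups[normalized].append(row["id"])
    return {k: ids for k, ids in groups.items() if len(ids) > 1}


def duplicate_rate(rows: List[dict], key: str = "email") -> float:
    """Share of all records that belong to a duplicate group."""
    if not rows:
        return 0.0
    duplicated = sum(len(ids) for ids in duplicate_groups(rows, key).values())
    return duplicated / len(rows)


if __name__ == "__main__":
    # Stand-in for a real export file.
    export = io.StringIO(
        "id,email\n"
        "1,ana@example.com\n"
        "2, Ana@Example.com\n"
        "3,ben@example.com\n"
        "4,\n"
    )
    contacts = list(csv.DictReader(export))
    print(duplicate_groups(contacts))
    print(duplicate_rate(contacts))
```

The rate here counts every record in a duplicate group, so two records sharing one email contribute two to the numerator. Whichever definition a team prefers, it should stay fixed between audits so the trend is meaningful.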

If you want to compare your data quality numbers against the patterns we have seen, the team at Data Innovation is happy to share the audit checklist we use. It is a working document, and we update it each time a client surfaces a problem we have not seen before.
