Data quality is free

Typically, if we can improve the quality of data, we will have better information to support decision making. Better decisions lead to better outcomes, and those outcomes are in turn likely to generate better quality data. As the diagram illustrates, this becomes a virtuous circle where better data ultimately leads to better business results.

For most medium and large organisations, the effective management of data and systems can present significant challenges. Organisations may be unaware that their poor data and bad practices lead to inefficiencies and problems and are, therefore, an avoidable cost. Data can, in many circumstances, outlive the software that stores it and the processes and users that created it.

To paraphrase a quality management guru, Philip Crosby:

Data quality is free. It’s not a gift, but it’s free. What costs money are the unquality things – all the actions that involve not getting data quality right the first time and all the actions to correct these data quality issues.

We explore the impact of this quotation in the post “Data quality is free” and in the “Managing Data Quality” book.

So, what are the benefits of improving data quality?


The quotation above illustrates that the benefits of improving your data quality will be the removal of unnecessary costs arising from poor data. Better quality data can additionally present organisations with new areas to exploit.

| Factor | Poor Data Quality | Good Data Quality |
|---|---|---|
| Perceived data quality | Cost of resources to verify and “clean” data before it can be used to make it fit for purpose | Confidence that data can be utilised ‘as is’ |
| | Staff more likely to use and maintain local data sources, thereby degrading the organisation’s data quality | Use of business systems by staff throughout the organisation |
| Decision making | Poor data may give a false view of business situations, leading to poor decision making | Enabler for optimal decision making at strategic, tactical, and operational levels. Allows identification and exploitation of new products and services |
| Process outcomes | Poor outcomes with lack of visibility that this is the case | Optimal process outcomes |
| | Poor customer perception arising from poor service outcomes and organisational reputation | Improved customer perception |
| Performance metrics | Effort needed to remove errors from data, and potentially manipulate data from multiple sources, before performance can be understood | Easy to quickly produce accurate and trusted performance reports |
| | Overall organisational performance difficult to determine | Organisational performance easily understood |
| Development of products and services | Difficulties in understanding current performance and trends make it hard to identify viable future opportunities | Trusted data makes it easier to identify and exploit new opportunities |

Real-world examples of the impact of poor data

Poor data quality can have negative implications (financial, reputational, and other) that are sometimes highly significant. Examples include:

  • A woman bled to death after a spelling mistake meant blood intended for her during an operation was sent back.
  • According to the UK’s National Audit Office, more than three-quarters of civil service pension records (1.25m) are incomplete or incorrect, which it says has caused hardship and distress to many pensioners. The National Audit Office found:
    • Systems capabilities not being in place to deal with processes and data on payroll and pensions
    • A lack of data governance and oversight
    • A lack of any methods to track benefits and other KPIs
  • The National Health Service took the unusual step of closing down a children’s heart surgery unit at a UK hospital, after data the hospital had submitted showed that twice as many children and babies died in the unit as anywhere else in the UK. The UK media went into a frenzy; people came out of the woodwork with stories about their treatment at the hospital, with neglect and near-death experiences in abundance. Eleven days later, the unit reopened. It turned out that there were not twice as many people dying after all: the data that the hospital submitted to the NHS was late and incomplete. In fact, 35% of the expected data was missing completely, with catastrophic results.
  • Poor data quality results in duplicate and confused patient entries on NHS systems. In other words, one patient has more than one NHS number, or the same NHS number is assigned to more than one patient. This can result in incorrect and mixed medical records, missed screening requests and even cancelled operations.
  • The Metropolitan Police reportedly had to pay “around £1m” in pay-outs for raiding 900 incorrect addresses over a 3-year period.
  • In Germany, errors in internal accounting at the nationalised Hypo Real Estate meant that the German national debt was overstated by €55 billion. This was doubly embarrassing for Germany, as it had previously criticised the accuracy of accounting by the Greek Government. In an era of austerity in which their government had squabbled tirelessly for two years over a mooted €6-billion tax cut, Germans found it hard to fathom that their government was so suddenly and unexpectedly €55 billion better off. The net effect of the error being found and fixed is that Germany’s debt-to-GDP ratio will be 2.6% lower than previously thought.
  • The US Postal Service (USPS) estimated in 2013 that there were approximately 6.8 billion pieces of mail that could not be delivered as addressed. On top of the $1.5 billion that the USPS itself spent to process that mail (e.g., forwarding it, returning it, or disposing of it), and even assuming an unrealistically low average cost of $0.50 per mailing, this is likely to result in $3.4 billion per year (6.8 billion × $0.50) wasted due to incorrect address data.
  • At Maidstone and Tunbridge Wells NHS Trust, data on Clostridium difficile infections were found to be incomplete during the investigation into outbreaks of the infection between April 2004 and September 2006 (report published in 2007). The lack of timely and effective monitoring using complete data was one factor in delays in identifying the seriousness of C. difficile infection in the trust and responding to the situation. The report notes that “the significant outbreak in the autumn of 2005 was missed and the trust has acknowledged that it should have detected the rise in cases at that time”.
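
The duplicate and confused patient entries described above are the kind of problem that a simple automated check can help to surface. The following is a minimal sketch in Python, not a description of how any NHS system works: the records, the identifier format, and the function name `find_identifier_conflicts` are all invented for illustration, and matching people on name plus date of birth is a deliberately naive assumption.

```python
from collections import defaultdict

# Hypothetical records: (identifier, name, date of birth).
# All values are invented for illustration only.
records = [
    ("ID001", "Ann Smith", "1980-02-11"),
    ("ID002", "Ann Smith", "1980-02-11"),   # same person, second identifier
    ("ID003", "Raj Patel", "1975-07-30"),
    ("ID003", "Lucy Jones", "1990-12-05"),  # same identifier, different person
    ("ID004", "Tom Brown", "1962-05-19"),
]


def find_identifier_conflicts(rows):
    """Return (shared_ids, multi_id_people) for the given rows.

    shared_ids: identifiers assigned to more than one person.
    multi_id_people: people who hold more than one identifier.
    """
    people_by_id = defaultdict(set)
    ids_by_person = defaultdict(set)
    for identifier, name, dob in rows:
        # Naive matching key (an assumption): normalised name plus date of birth.
        person = (name.strip().lower(), dob)
        people_by_id[identifier].add(person)
        ids_by_person[person].add(identifier)
    shared_ids = {i: p for i, p in people_by_id.items() if len(p) > 1}
    multi_id_people = {p: i for p, i in ids_by_person.items() if len(i) > 1}
    return shared_ids, multi_id_people


shared_ids, multi_id_people = find_identifier_conflicts(records)
print("Identifiers shared by several people:", sorted(shared_ids))
print("People holding several identifiers:", sorted(multi_id_people))
```

In practice, deciding whether two records refer to the same person is much harder than this exact-match key suggests, so a check like this is best treated as a way of flagging records for human review rather than correcting them automatically.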

The cases above all illustrate the consequences of poor data quality, and most organisations will have their own examples. The benefit of improving data quality is, therefore, the removal of these negative impacts.
