Save the Planet: Recycle your rubbish
Save your space: Recycle your data
As we move from winter to the summer months, people often think about cleaning their homes and gardens, and this may include removing some of the clutter – often referred to as a ‘spring clean’. Do you need to do something similar for your data?
When was the last time you reviewed your organisation’s data needs? Are you still collecting and managing data that is no longer used? Business processes evolve over time, and so do the data requirements that support them. If additional information is needed to deliver the desired outputs, the extra effort may be partially offset by retiring data that has become obsolete.
Although the cost of storing data is relatively small (compared with storing more traditional paper-based records), your surplus data may be creating expense in other, less obvious ways; for example, more time is required to gather and manage it all.
Recent concepts such as the ‘data lake’, and the use of Big Data Analytics tools to make sense of it, are sometimes ‘sold’ on the basis that you do not need to create data schemas or understand your data, since the technology will do it for you. For some data sets this may be the case, but for many organisations this over-reliance on technology without understanding the fundamentals may be short-sighted.
Whilst having more data available to us can improve our knowledge, it can also be problematic; hence the age-old saying about ‘not seeing the wood for the trees’. We know storing data is relatively cheap, but what other costs come with holding data? The less obvious but higher costs could be around:
- Quality measurement
- Data transactions
- Access management
When I have asked an organisation or individual why they collect and manage some of their data, I’ve often received answers such as “because we can”, “because we always have” or “in case we need it in the future”. The effort of collecting and managing this ‘nice to have’ data can divert resources from assurance activities on the data that is vital to support business processes. It is far too easy to fall into the trap of holding a lot of lower-quality ‘nice to have’ data rather than a bit less data, all of it necessary and of higher quality.
As data volumes keep growing, reviewing your data requirements, and being selective about what you need to hold, can reduce the impact of that growth.
Going back to spring cleaning your home you may ask yourself a set of questions as you assess each item of clutter:
- Have I got a need for it?
- Can I store it elsewhere, e.g. in my loft, shed or garage?
- Can I sell it or give it away to another user?
- Do I have more than one of the item in question?
- Does it work?
These questions can similarly be applied to data, for example:
- Does the data support a business process?
- Can it be archived so that I can retrieve it if I subsequently find a need for it?
- Can I delete it, sell it as an asset or pass it on to a supplier or client without losing integrity, security or breaking data protection rules and legislation?
- Is the data a duplicate of another element/set? Which version is more accurate or complete?
- Is the quality good enough for use?
If any of the five questions above shows that the data being managed has little or no value, weigh its continued worth against its negative impacts, such as the confusion it creates and the cost of managing and storing it. Its removal may then be justified.
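For those who keep an inventory of their data elements, the five questions can be turned into a simple triage routine. The Python below is only a minimal sketch of that idea: the DataElement fields, the triage function and the outcome labels are all hypothetical names invented for illustration, and the answers to each question would still have to come from the people who own the business processes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataElement:
    """One item in a (hypothetical) data inventory, with the review answers."""
    name: str
    supports_process: bool        # Does the data support a business process?
    may_be_needed_later: bool     # Is there a plausible future need for it?
    duplicate_of: Optional[str]   # Name of the element it duplicates, if any
    quality_ok: bool              # Is the quality good enough for use?


def triage(element: DataElement) -> str:
    """Suggest an outcome for one element; labels are illustrative only."""
    if element.supports_process:
        if element.duplicate_of is not None:
            # Two copies of needed data: keep the more accurate or complete one.
            return "merge"
        if not element.quality_ok:
            # Needed but not fit for use: spend assurance effort here.
            return "improve"
        return "keep"
    if element.may_be_needed_later:
        # Not used today, but retrievable if a need emerges.
        return "archive"
    # No current or expected use. Deleting, selling or passing it on still
    # needs a check against data protection rules before anything happens.
    return "review for removal"


inventory = [
    DataElement("customer_email", True, True, None, True),
    DataElement("fax_number", False, False, None, False),
    DataElement("legacy_order_notes", False, True, None, True),
]

for item in inventory:
    print(item.name, "->", triage(item))
```

The ordering is a design choice rather than a rule: here, data that supports a business process is never proposed for removal, and everything else is either archived or flagged for a human decision.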
Good housekeeping of data is therefore akin to spring cleaning your home, and hopefully will improve your environment. After all, how much neater would your data set look if you didn’t have to trawl through the rubbish? And how much time and money is spent cleansing data that simply isn’t needed in the first place?