Over the last two decades of working in information management, I’ve had many opportunities to work on large-scale projects that have a heavy reliance upon high-quality, fit-for-purpose data. Back when master data projects were leading (and bleeding) edge, we didn’t know what we didn’t know (as they say). And in those early days, we also didn’t know a lot about the topic of data remediation.
When I look back at my official job titles over those years, including Data Governance Program Leadership, Director of Enterprise Master Data Management and Coach/Mentor, in nearly all cases I was “accountable for the data”, meaning the master data required to support the key business processes that made the companies run. And with this responsibility came the duty of fixing bad data, a.k.a. data remediation.
What Is Data Remediation
At its core, data remediation is an activity that’s focused on cleansing, organizing and migrating data so it’s fit for purpose or use. The process typically involves detecting and correcting (or removing) corrupt or inaccurate records by replacing, modifying or deleting the “dirty” data. It can be performed manually, with cleansing tools, as a batch process (script), through data migration or a combination of these methods.
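To make the “batch process (script)” option concrete, here is a minimal Python sketch of a cleansing pass over customer records. It is an illustration only, not a method taken from any particular project: the field names (`name`, `country`), the `COUNTRY_ALIASES` table and the `remediate` function are all hypothetical, and real rules would come from the business owners of the data.

```python
# Hypothetical alias table, for illustration only; real mappings would be
# agreed with the business owners of the data.
COUNTRY_ALIASES = {
    "usa": "US",
    "united states": "US",
    "germany": "DE",
    "deutschland": "DE",
}


def remediate(records):
    """Split raw customer dicts into (clean, rejected) lists."""
    clean, rejected, seen = [], [], set()
    for raw in records:
        # Modify: trim stray whitespace from every text field.
        rec = {k: (v.strip() if isinstance(v, str) else v) for k, v in raw.items()}

        # Remove: a record without a name cannot be repaired automatically,
        # so it is set aside for manual review.
        if not rec.get("name"):
            rejected.append((raw, "missing name"))
            continue

        # Replace: map free-text country values to one agreed code.
        country = rec.get("country") or ""
        rec["country"] = COUNTRY_ALIASES.get(country.lower(), country.upper())

        # Delete: keep only the first occurrence of each name/country pair.
        key = (rec["name"].lower(), rec["country"])
        if key in seen:
            rejected.append((raw, "duplicate"))
            continue
        seen.add(key)
        clean.append(rec)
    return clean, rejected


raw_records = [
    {"name": " Acme Corp ", "country": "usa"},
    {"name": "acme corp", "country": "US"},
    {"name": "", "country": "DE"},
    {"name": "Globex", "country": "de"},
]
clean, rejected = remediate(raw_records)
for rec in clean:
    print(rec)
for raw, reason in rejected:
    print("rejected:", reason, raw)
```

Note that the script keeps a list of rejected records and the reason for each rejection instead of silently dropping them. That list is what the Business would review, since a script can apply rules but cannot decide which of two conflicting records is the right one.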
While data remediation methods and even terminology differ, practitioners broadly agree on one fundamental best practice: cleanse data at, or as close as possible to, its source. This offers the most benefit across the process and system landscape, because the further “upstream” data is cleansed, the more “downstream” processes and systems benefit from the inherited, cleansed data.
You might ask me how we remediated bad data back in the day and how it should be done today. Truthfully, some of the efforts I remember clearly, but others I don’t. (Much like buying a home or navigating through a once- or twice-in-a-lifetime experience, it’s easy to forget the hard-won results of “learning by doing” as one moves forward to the next set of priorities and initiatives!)
What Drives Data Remediation
What I can say for certain: Data remediation is an essential requirement for companies. The need typically arises during initiatives or projects that include business process re-engineering, large-scale enterprise resource planning (ERP) or Master Data Management (MDM) implementations, or when a legacy system comes to its end of life (EOL). Each of these gives a company the opportunity to cleanse and restructure its key data.
Consider these additional factors that will drive the need for data remediation:
- Moving to a new system or environment
- Eliminating personally identifiable information (a.k.a. PII)
- Dealing with mergers and acquisitions activity
- Addressing human errors
- Remedying errors in reports
- Other business drivers
Let’s look at a specific example.
A company embarks on a major MDM initiative. Project plans are completed that encompass everything from Discovery, Blueprinting, Requirements, Design, Build and Test through to Deploy. There may even be a work stream devoted to Data Migration. But what about the actual data itself? Not the counts of data. (“We have 10,000 customer records in our old system and expect to see 10,000 customer records in the new one.”) Let’s leave that to the technologists. The Business area needs to focus on ensuring that the right data is moved to the right field and, most importantly, that it is correct and usable in the business process.
More often than not, an initiative like this will require data clean-up activities. And this is where most projects face a challenge: There’s a lack of knowledge on where to begin, uncertainty about what steps to take to ensure data is fit for use/purpose, and it’s unclear who should be accountable for the data remediation activities.
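The difference between checking counts and checking content can be shown with a short Python sketch. Everything in it is assumed for illustration: the field names, the sample rows and the deliberately simple rules in `RULES` (a five-digit postal code, a rough email pattern) stand in for whatever rules the Business actually defines.

```python
import re

# Deliberately simple, hypothetical business rules: field name -> validity check.
RULES = {
    "customer_id": lambda v: bool(v),
    "email": lambda v: bool(re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v or "")),
    "postal_code": lambda v: bool(re.fullmatch(r"\d{5}", v or "")),
}


def counts_match(source, target):
    """The technologists' check: did every record arrive?"""
    return len(source) == len(target)


def field_errors(target):
    """The Business check: is the right data in the right field?"""
    errors = []
    for row in target:
        for field_name, is_valid in RULES.items():
            value = row.get(field_name)
            if not is_valid(value):
                errors.append((row.get("customer_id"), field_name, value))
    return errors


source = [
    {"customer_id": "C1", "email": "a@example.com", "postal_code": "30301"},
    {"customer_id": "C2", "email": "b@example.com", "postal_code": "94105"},
]
# Simulated migration result: both records arrived, but for C2 the email
# and postal code landed in each other's fields.
target = [
    {"customer_id": "C1", "email": "a@example.com", "postal_code": "30301"},
    {"customer_id": "C2", "email": "94105", "postal_code": "b@example.com"},
]

print("counts match:", counts_match(source, target))
for customer_id, field_name, value in field_errors(target):
    print(f"{customer_id}: invalid {field_name!r} value {value!r}")
```

In this made-up case the record counts reconcile perfectly, yet customer C2 fails two field-level checks. That gap is exactly what count-based sign-off misses.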
Why Businesses Avoid Planning for Data Remediation
In my experience, neither the Business nor the IT side plans for the data remediation activities that are required to support their projects. Whether the project is related to MDM, business transformation (or process re-engineering) or an ERP implementation, this data work will be part of the mix. Often, this activity is assumed to be covered under data migration activities. (Data migration is the not-so-simple act of moving data from point A to point B. It may include some transformations in order to format it for the new environment or system.) However, data migration typically doesn’t include the data cleansing and validation phases that should be conducted by Business stakeholders to ensure that correct data is going into the correct field to support key processes. When the details of this required work effort are ignored, schedules can slip, which impacts the overall project timeline. Or, in the worst-case scenario, the migrated data will be inaccurate and not fit for use.
With high-quality, useful data being the desired end state for companies, why is data remediation ignored and not planned for? There are many reasons, but these are top of mind for me:
- The company is unable to plan the full set of required activities for data remediation.
- There’s a lack of established data ownership (the critical roles and responsibilities piece of every successful initiative).
- It’s deemed too hard or too costly for the company to fix its data-related issues.
- There’s fear about what will be found as the company digs deeper into its data.
No matter the reason, ownership of data remediation should be clear: It’s a business activity. The Business is the only group capable of making decisions about its data. It’s responsible for data quality, for data remediation and for defining the quality of the company’s data. (Side note: IT plays a lesser, but still important, role in helping the Business understand its data challenges and in providing the data extracts and supportive analysis that may be required in order to evaluate the data.)
So … what is your company doing to remediate its data issues? Share your thoughts below in the comments or tweet me, as I’m contemplating a follow-up post on this topic.