Last month at Enterprise Data World (EDW), I presented a topic that I’m quite passionate about — Resolving Data Definition Conflicts: A Practical Guide — to a room filled with data management professionals.
It was a good-sized crowd that seemed very engaged throughout my talk. And while I’d love to think it was my stellar delivery of the material that captured their attention, I believe they were engaged because everyone could relate to the topic. We’re all working in or with organizations that need to bring order to data.
Whatever the company size or industry, our data problems are before us and need to be addressed. And you’ll notice I said “our data,” meaning it belongs to virtually everyone in our organization and not just IT. We collectively own our data-related problems. If your job title is Data Management Consultant or Data Governance Lead, it’s obvious that data matters to you. And if your title is Marketing Manager or Head of Operations, you’re in the club, too.
Trusted and useful data should be our shared concern. And when it comes to defining data, the work warrants an enterprise-wide focus, with the business collectively driving it.
Why Data Definitions Are Critical
At First San Francisco Partners, we’re known for our expertise in the practice area of data governance. When we consult with companies on governance, we often focus their attention on the heart of the matter by addressing metadata, because companies can’t successfully govern data if they aren’t able to define it.
Sometimes, we’re called upon to educate key departments or executives on the importance of defining data. That’s because definitions drive true clarity of business purpose. Before data can be controlled, measured or optimized, we need to answer some questions: “What is ‘right’ for the data?” “What is this data?” and “Why do we need this data?”
There are data-centric questions about an organization that directly relate to its confidence in its data definitions, structures and content — questions like:
- What products are we selling?
- Who is financially responsible for which products?
- How are customers grouped?
Data is more than just ones and zeros. It’s a mechanism to reflect the organization’s reality in its systems.
Don’t Be Conflicted About Conflicting Data
Typically at an organization, a data definition or governance exercise might begin with one area working to create a data glossary or dictionary. This initiative often includes people from across the organization who then work to define a broad scope of data, from the very specific (e.g., defining what data is in a column or what a value in that column represents) to the very conceptual, like defining “customer.”
Defining data can be a challenging and conflict-filled endeavor. And this conflict can be revealing. The more conflict there is in an organization about data, the greater the impact of resolving the conflict — and the resulting implications will stretch far and wide across the organization.
If your organization has big, thorny data problems, it could be because there are fundamental disagreements about what key data is supposed to represent.
Types of Conflicting Data
At EDW, I talked about how data definition conflicts go beyond just wordsmithing documented definitions. There are situations in which people in a company just fundamentally disagree with a definition. (Perhaps you’re nodding your head in agreement that this is a common issue, as many in the EDW audience did.)
A data definition conflict is often a reflection of another conflict in the organization (e.g., a system or process conflict), and I typically see three conflict patterns. Here is what each often sounds like in an organization:
- Different Context: “That definition is close. But for our department, we mean ABC.”
- Overloaded Terms: “There are two types of this thing … except for this other thing, which is sort of a hybrid of the other two.”
- Name Conflicts: “That’s not what that term means at all. It means XYZ.”
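As a rough illustration of how these disagreements might surface in a shared glossary, here is a minimal Python sketch. The `Definition` record, the `find_conflicts` helper and the sample entries are all hypothetical and not part of any particular governance tool. It only flags terms for which departments have recorded different meanings; deciding which of the three patterns applies still takes a conversation with the people involved.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    """One department's recorded meaning for a business term (hypothetical model)."""
    term: str
    department: str
    meaning: str


def find_conflicts(definitions):
    """Return {term: {department: meaning}} for terms with more than one distinct meaning."""
    by_term = defaultdict(dict)
    for d in definitions:
        # Normalize the term so "Customer" and "customer " are treated as the same entry.
        by_term[d.term.strip().lower()][d.department] = d.meaning
    return {
        term: meanings
        for term, meanings in by_term.items()
        if len(set(meanings.values())) > 1
    }


# Illustrative sample data only.
SAMPLE = [
    Definition("Customer", "Sales", "Any party with a signed contract"),
    Definition("customer", "Finance", "Any party that has been invoiced"),
    Definition("Product", "Sales", "An item in the published catalog"),
    Definition("Product", "Finance", "An item in the published catalog"),
]

for term, meanings in find_conflicts(SAMPLE).items():
    print(f"Conflict on '{term}':")
    for department, meaning in sorted(meanings.items()):
        print(f"  {department}: {meaning}")
```

A report like this is a starting point for discussion, not a verdict: a flagged term may turn out to be a legitimate difference in context rather than an error.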
Start Big or Start Small — Just Start
When an organization begins a data definition and governance program, it’s important to set the scope and define the critical data elements that need to be addressed. This brings the highest-priority data to the forefront.
Ideally, an initiative like this would be done company-wide, but that’s often not feasible. It’s also okay to start small, as when a single (and, hopefully, influential) department begins working to address key metadata concerns. Later, when that department is able to connect its better-than-before metadata to other departments’ data, those areas receive value, too. And as they say, all we really need to know is what we learned in kindergarten: You (and your department) will see better outcomes when you share. This is also true of business and technical metadata: each is valuable in itself but gains more value when linked together.
When is the best time to start reconciling data conflicts? It’s now … or possibly never, if an organization gets in its own way. I once worked with a data governance team that sought to fix every data-related concern before it, starting from the base data on up. But their executives prioritized two key reports that needed addressing. The governance team dug in its heels and refused to narrow the scope, even when I showed them how we could address about half of their issues with a smaller, higher-priority initiative. Unfortunately, the all-or-nothing effort stalled out because of the team’s desire to fix “everything.”
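To make the idea of linking business and technical metadata concrete, here is a small Python sketch. It assumes a hypothetical glossary of business terms and a hypothetical list of column records that each name the term they implement; the table names, column names and the `link_metadata` helper are illustrative only.

```python
def link_metadata(glossary, columns):
    """Map each business term to the columns that implement it.

    Returns (linked, unlinked): a dict of term -> list of "table.column" names,
    and a list of columns that are not tied to any glossary term.
    """
    linked = {term: [] for term in glossary}
    unlinked = []
    for col in columns:
        name = f"{col['table']}.{col['column']}"
        if col["term"] in glossary:
            linked[col["term"]].append(name)
        else:
            unlinked.append(name)
    return linked, unlinked


# Hypothetical business metadata: term -> agreed definition.
business_glossary = {
    "customer": "A party that has purchased at least one product.",
}

# Hypothetical technical metadata: where each column lives and which term it maps to.
technical_metadata = [
    {"table": "crm.accounts", "column": "acct_id", "term": "customer"},
    {"table": "billing.invoices", "column": "cust_no", "term": "customer"},
    {"table": "ops.shipments", "column": "carrier_cd", "term": None},
]

linked, unlinked = link_metadata(business_glossary, technical_metadata)
print(linked)
print(unlinked)
```

Once the link exists, one department’s definition work becomes visible to every team that uses those columns, and the unlinked list shows where definitions are still missing.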
Data is Inanimate, But It’s Personal
I often hear the term “junk data” — and when I do, I try not to cringe. Many times, that junk had (or still has) a specific business purpose that you might not see. It might be misplaced and misnamed, but don’t assume that the data, the name and/or the definition is junk.
Remember, data is at the heart of how people do their jobs and is often part and parcel of our professional self-identity. No particular business usage is “wrong.” No department is “bad” for having a different definition. Everyone is trying to get through their day and do their work, and they are using the data in whatever way is needed to serve their business area’s goals.
Reconciling Data Conflicts Isn’t Easy, But It’s Essential
In our increasingly digital world, expectations for data are ever-expanding. Data definition conflicts are often based on past system and process constraints. While it can be difficult to break apart those constraints, we cannot achieve our goals if the data isn’t aligned. And there will never be a better time to dig in than right now!