Why Data Glossaries, Dictionaries and Catalogs Need to Evolve


Data glossaries, dictionaries and catalogs — really, anything that provides context for data — are the bedrock of data governance. These tools have been around for years. But for the sake of modern, effective governance, they will need to evolve, just like our world and our industry are evolving.

If we don’t change how we use these tools, we’ll have the wrong solutions generating the wrong types of output, which renders them almost worthless to our organizations. As enterprise data people, it’s up to us to keep these tools viable.


When I think about data glossaries, dictionaries and catalogs and their precarious nature, I can’t help but also think about the encyclopedias that served us so well for decades. They afforded us private, easy access to reference information — either at home, if we were fortunate enough to have a set, or at a (hopefully) nearby library.

With the advent of Wikipedia, we’ve gone from encyclopedias (centrally managed, edited and approved versions of the truth) to a very federated (and somewhat imperfect, people might argue) repository, one with so much more breadth and depth of information that it’s almost staggering. And it costs us almost nothing to access. There are obvious parallels between encyclopedias and today’s data glossaries, dictionaries and catalogs. Encyclopedias didn’t evolve, and if our tools don’t evolve they will also disappear.

Look at how organizations manage data context today. Most have localized repositories where reference information is stored: tools like data glossaries, dictionaries and catalogs, but also databases, SharePoint sites or even Excel spreadsheets. The tools are wide ranging because they include anywhere people populate this information, whether through a centralized data governance organization or through the work of data stewards and other responsible parties.

Having a repository is a much better alternative than not having one, but how many organizations can say that they’ve been wildly successful with these? In my experience working with dozens of organizations of all sizes representing a variety of industries, resounding success with centralized systems is rare, indeed.

This feels like a situation that’s poised for destruction — not unlike the encyclopedias in the old world — when what we need is more of a Wikipedia in the new world.


It’s important to focus on data value when we think about the data glossaries, dictionaries and catalogs, as data value helps us to measure their effectiveness and organizational benefits.

Here’s how I think about these tools:

  • A data glossary has more business context with less detail — something that tells us more about how it’s being used or its definition.
  • A data dictionary offers more detail, with less of the contextual surrounding, but with detail on how the information is stored (technology systems, etc.).
  • A data catalog merges the glossary and dictionary and articulates the relationship between the data items.
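To make the distinction concrete, here is a minimal sketch of the three tools as simple data structures. All of the class names, field names and the sample "Active Customer" entry are hypothetical, chosen only for illustration; real glossary and catalog products model this in far more detail.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryTerm:
    """Business context: what a term means and how it is used (less detail)."""
    name: str
    definition: str
    business_use: str


@dataclass
class DictionaryEntry:
    """Technical detail: where and how a data item is stored (less context)."""
    name: str
    system: str
    table: str
    column: str
    data_type: str


@dataclass
class Catalog:
    """Merges glossary and dictionary and records how their items relate."""
    glossary: dict[str, GlossaryTerm] = field(default_factory=dict)
    dictionary: dict[str, DictionaryEntry] = field(default_factory=dict)
    # Maps a glossary term name to the names of the dictionary entries behind it.
    links: dict[str, set[str]] = field(default_factory=dict)

    def add_term(self, term: GlossaryTerm) -> None:
        self.glossary[term.name] = term

    def add_entry(self, entry: DictionaryEntry) -> None:
        self.dictionary[entry.name] = entry

    def link(self, term_name: str, entry_name: str) -> None:
        # Only allow relationships between items the catalog already knows.
        if term_name not in self.glossary:
            raise KeyError(f"unknown glossary term: {term_name}")
        if entry_name not in self.dictionary:
            raise KeyError(f"unknown dictionary entry: {entry_name}")
        self.links.setdefault(term_name, set()).add(entry_name)

    def lookup(self, term_name: str) -> tuple[GlossaryTerm, list[DictionaryEntry]]:
        """Return the business definition together with its physical locations."""
        term = self.glossary[term_name]
        entry_names = sorted(self.links.get(term_name, set()))
        return term, [self.dictionary[n] for n in entry_names]


# Hypothetical sample data, for illustration only.
catalog = Catalog()
catalog.add_term(GlossaryTerm(
    name="Active Customer",
    definition="A customer with at least one purchase in the last 12 months",
    business_use="Retention reporting",
))
catalog.add_entry(DictionaryEntry(
    name="crm.customers.last_purchase_date",
    system="CRM",
    table="customers",
    column="last_purchase_date",
    data_type="DATE",
))
catalog.link("Active Customer", "crm.customers.last_purchase_date")

term, entries = catalog.lookup("Active Customer")
print(term.definition)
for entry in entries:
    print(f"{entry.system}: {entry.table}.{entry.column} ({entry.data_type})")
```

A single lookup answers both the business question (what does this term mean?) and the technical one (where does the data live?), which is the relationship the catalog adds on top of the glossary and the dictionary.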

If you think about driving data governance and data value through data glossaries, dictionaries and catalogs, do you see anything that’s missing? It’s that we’re lacking momentum-drivers — the process or “machinery” to get people to use these tools. In many cases, the incentive structures to use them are not well-suited to creating sustainable glossaries, dictionaries and catalogs. Often, our data governance areas have grand ambition but meager resources and a misalignment between expectations and incentivization.

When there’s a disparity between empowerment and accountability, the systems we’re trying to build will start to unravel. We need to build systems that create momentum and can grow over time, as opposed to a system that constantly needs resourcing from the outside.

The goal for any data initiative should be to create more value than the cost the organization puts into it. It’s the only hope there is to create something sustainable.


The missing piece is a data library. This isn’t an accepted term; it’s more a concept to relate to, like the encyclopedias of the past. Think of a data library as glossaries, dictionaries and catalogs plus people. People are the missing link in a system that needs to be fed by its users in order to become a tool that’s more predictable and easy to use.

The good news is that these data libraries will give us the building blocks to achieve a system that reinforces itself — a system that tells people how to find needed information. The parallel here is how we intuitively know how to look for something in a book’s table of contents, in a physical library or how we know how to use the search bar in Google.

Not only do I think glossaries, dictionaries and catalogs have merit, I think they’re an essential piece of the entire data governance story. If you’re trying to do governance without tools that match the size of the governance organization, you’ll run into trouble, because there comes a time when your ambition exceeds your toolset. You wouldn’t build a skyscraper out of wood; there’s a limit to what you can accomplish with the tool at hand.

These governance libraries provide leverage to do what we need to do. They give us scale and the ability to build greater capabilities and sustainable systems that can grow over time, without us being a central bottleneck.


Think about our libraries of today. They take energy to get to. You often have to leave the house and be motivated to go to the library. And there’s a question as to what new or different information the library offers vs. searching via the internet. We’ve become accustomed, to the point of entitlement, to getting pretty much anything we can imagine via the powerful device in our pocket.

Libraries have had to reinvent themselves. They have to offer new services, like online access, meeting spaces and a wide range of programs. Libraries have evolved into community centers and a place to meet people or take our kids. They’re more like an entertainment and learning destination and almost less about books. Because they successfully evolved, libraries are still relevant today.

Just like a library, we need to rethink our tools and look for ways to disrupt things inside our organizations.


One of the specific inhibitors to getting the most from glossaries, dictionaries and catalogs is that we try to be perfect and overthink things. (The alternative is what I call pragmatism over the pursuit of perfection.) If we can remove some of the friction and create something that’s truly valuable to our organizations, these tools will find their path.

We don’t need to micromanage the use of data in our organization. We need to orient toward value creation — making this our bright, shining beacon. We must ask ourselves how what we’re doing with these tools creates value.

Glossaries, dictionaries and catalogs create no value unless they are used and drive meaningful business improvement.


Today, speed is everything. If we’re asking people to work with tools that aren’t fast and responsive, they will ignore them. Three seconds to load a web page is an eternity. Even one second is far too long for most interfaces! We need to recognize that the standards of acceptability have become unforgiving, thanks to the phones we carry everywhere, and our data governance tools must keep up or they will be deemed archaic and unresponsive.

When we simplify our tools by making them easy to use and speedy, while providing an intuitive and asynchronous sharing of insights about the business, the tools will get more use. If we find a way to get people engaged and give them a way to voice what they know and what they care about, the tools will get more use. And when we publish information in these tools and direct people there instead of calling people for answers, the tools will get more use.

More use of data glossaries, dictionaries and catalogs means more momentum and more impact over time.


To accomplish this disruption, we also have to recognize the potential customers in an organization and persuade them to do something different. Yes, we’re in the sales game now, my fellow data professionals, technologists and stewards!

We also need to think about the data consumers — the people in our organizations who are using the data. They’re the real customers, not the people in data governance. They’re saying, “I wish I just understood the data and knew how to use it for my job, my department, my line of business.”


Making change is hard in almost any organization. Doing things to create data value is even harder. We are the data people, and we must practice what we preach by sharing the value of data. We need to measure and communicate the adoption of our tools, the results, the impact. Measurement is our key to maintaining relevancy. After all, who cares the most about glossaries, dictionaries and catalogs? It’s us, the people who are building and maintaining the tools. For everyone else in the organization, the tools are just a means to an end.

Encyclopedias didn’t know how to evolve and become Wikipedia, though I’m certain they would love to have a do-over. Other organizations have evolved with the times and are major players today — Amazon, Netflix and other disruptors.

When it comes to glossaries, dictionaries and catalogs, we need to be disruptive, too. We’re the data people, after all.


The content in this article was originally part of Anthony’s presentation for DATAVERSITY’s 2019 Enterprise Data Governance Online event.

Article contributed by Anthony Algmin. His experience includes decades of hands-on technology work, management consulting and executive roles, coupled with a passion for leading data-driven change. He is the creator of the Data Leadership Framework, which helps organizations balance their efforts across the many areas of data management, and the author of “Data Leadership: Stop Talking About Data and Start Making an Impact!”
