DIA Webinar Recap: Data Lake vs. Data Warehouse

Author: FSFP • February 20, 2017

Are you Team Data Lake or Team Data Warehouse? Which one does your organization have and actively use? Are there plans to enhance (i.e., improve, streamline and/or upgrade or replace) your data repository this year?

If you're interested in getting First San Francisco Partners' take on the "best" solution for your organization, we hope you attended our February 2 Data Insights & Analytics (DIA) webinar. Our co-hosts Kelle O'Neal and John Ladley covered the Data Lake vs. Data Warehouse topic from all angles.

Our agenda for the one-hour webinar was packed full:

Defining the Data Lake and Data Warehouse
Key differences between the Data Lake and Data Warehouse
How to optimize the Data Lake
How to optimize the Data Warehouse
Sample Data Lake and Data Warehouse architectures and use cases
How a Data Lake can solve the problems of a Data Warehouse
Key findings and takeaways

While Data Lakes and Data Warehouses are not new concepts, explaining them to someone unfamiliar with the terms is easier with a straightforward approach, like the Gartner definitions Kelle and John shared:

A Data Warehouse is a storage architecture designed to hold data extracted from transaction systems, operational data stores and external sources. The warehouse then combines that data in an aggregate, summary form suitable for enterprise-wide data analysis and reporting for predefined business needs.*

A Data Lake is a collection of storage instances of various data assets additional to the originating data sources. These assets are stored in a near-exact, or even exact, copy of the source format. The purpose of a Data Lake is to present an unrefined view of data to only the most highly skilled analysts, to help them explore their data refinement and analysis techniques independent of any of the system-of-record compromises that may exist in a traditional analytic data store.*
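To make the contrast in the two Gartner definitions concrete, here is a minimal Python sketch. It is only an illustration, not something shown in the webinar: the sample records, the field names (`region`, `amount`) and the two helper functions are all hypothetical. The lake-style load keeps each record as an exact copy of the source, while the warehouse-style load extracts known fields and aggregates them into a summary for a predefined reporting need.

```python
import json
from collections import defaultdict

# Hypothetical raw records, as they might arrive from a source system.
# Note that the optional fields differ from record to record.
raw_events = [
    '{"order_id": 1, "region": "West", "amount": 120.0, "note": "gift"}',
    '{"order_id": 2, "region": "East", "amount": 80.0}',
    '{"order_id": 3, "region": "West", "amount": 50.0, "coupon": "SPRING"}',
]


def load_into_lake(events):
    """Lake-style load: keep every record as an exact copy of the source."""
    return list(events)


def load_into_warehouse(events):
    """Warehouse-style load: parse known fields and aggregate to a summary."""
    totals = defaultdict(float)
    for line in events:
        record = json.loads(line)
        # Only the fields the predefined report needs survive this step.
        totals[record["region"]] += record["amount"]
    return dict(totals)


lake = load_into_lake(raw_events)
warehouse = load_into_warehouse(raw_events)

print(warehouse)  # {'West': 170.0, 'East': 80.0}
print(len(lake))  # 3 untouched source records, extra fields included
```

In this sketch the `coupon` and `note` fields are still available to an analyst exploring the lake, but they never reach the warehouse summary, which answers only the question it was designed for.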

We also appreciate a great analogy, like this one from Pentaho CTO James Dixon, who coined the term Data Lake more than five years ago (you can think of a Data Mart as a subset of a Data Warehouse):

Think of a Data Mart as a store of bottled water — it’s cleansed, packaged and structured for easy consumption. The Data Lake, meanwhile, is a large body of water in a more natural state. The contents of the Data Lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in or take samples.

Kelle and John highlighted several use cases that emphasized the importance of aligning business strategy with whatever solution you choose: a Data Lake, a Data Warehouse or a "best fit," like a blended model.

You can find the February 2 webinar replay and presentation material on demand at DATAVERSITY. And … it's never too early to reserve your (virtual) seat for our next call on March 2. Our topic, Descriptive, Prescriptive and Predictive Analytics, promises to be a great one for those of us who care about integrated and effective business intelligence and analytics.

*Source: Gartner Glossary – Data Warehouse and Data Lake.
