An appropriately unhinged deep dive into Kimball's dimensional modeling primer
do you watch the wheels of the business turn?! WELL DO YA?!
Ralph Kimball. You know him. You know his book. You might think it’s irrelevant, you might think it’s the Holy Grail. You might be an OBT cargo cultist and think Kimball is outdated. But have you even read his book?
Well I have, at least I’ve read most of it. And if I have to have his ideas and snark bouncing around in my head for the rest of my life, then so do you. Join me as I take you down the rabbit hole of Chapter 1, a primer on business intelligence and dimensional modeling.
🐰 Businesses may have wheels, but mine came off a long time ago
Kimball is the king of statements that seem obvious when you read them, but they probably didn’t occur to you before he pointed them out. If you had to name the most important asset of your organization, what would you say it is?
Would you get caught up in how your special SaaS product has an astronomical ROI and is totally gonna take down a decades-long incumbent?
Would you start waxing poetic about how your service that turns healthcare workers into contractors is going to fix our broken healthcare system, actually?
Or would you have the based take of Sir Ralph Kimball, that your organization’s most valuable asset is its information? Because he’s absolutely right and that is an invaluable north star for all of us to think about. Every corner of corporate America has operational systems that capture orders, sign up new customers, and log complaints. In Kimball’s mind, the users of those operational systems turn the wheels of your organization. They’re the ones that actually carry out and create data from every tiny thing a business does. When you put mission and vision statements aside, them’s the brass tacks.
We, the collection of nerdy gremlins that call ourselves data practitioners, watch those wheels turn and evaluate performance. Which wheel is going faster than others? Are any wheels stuck, and stopping their peers from turning? Which wheels need grease?
Sounds easy, right? Try not to laugh too hard, if you’ve been suffering in the data world. Instead, join me on a sunlit mountaintop where King Kimball lays out for us what the goals of data warehousing should be, in this sun-drenched ideal world.
🥲 What if we COULD get what we want?
Yes, yes, I know, the world of data is in flames, everything is a VC conspiracy, capitalism will be the downfall of us all, and perfect data systems don’t exist. Whatever. Where are your Peter Pan-like dreamer sensibilities? When’s the last time you asked yourself what you’d want if you could have anything?
Kimball is a man bold enough to ask these questions, and this is what he came up with. Chapter 1 of The Data Warehouse Toolkit really is quite validating and insightful. In it, he lays out 7 bedrock requirements for a data warehouse/BI system. That’s too many for the average reader’s attention span, so I picked 3 of my favorites.
💐 A data warehouse/BI system must make information easily accessible.
Can I get a hell yeah? He goes on to say that the data “must be intuitive and obvious to the business user, not merely the developer.” That’s the good stuff right there. Yes, I’m sure we’ve all had our fair share of frustrations with business users making weird, vague, or just downright unreasonable requests.
But we could probably reduce that kind of frustration if we took the time to think like they think, and build our data systems to “mimic a business user’s thought processes and vocabulary.”
Kimball summarizes this requirement neatly in three words: “simple and fast.”
💐 A data warehouse/BI system must present information consistently.
This requirement is sneaky. It’s not controversial on its face, but do any of you know of any data systems that do a perfect job at this? If so, slide into my DMs because I’d like to buy a ticket straight to that nirvana.
Kimball is militant about conformed dimensions[1] and master data management[2] practices that enforce this level of consistency. He’s right that both of those concepts can go a long way towards enforcing data warehouse and BI tool consistency. They’re certainly not the only methods to enforce data consistency and quality, but they are tool-agnostic and can keep you from falling into marketing traps.
Very hot take: data observability and quality tools can be helpful, but they aren’t the only way to ensure data quality and consistency.
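If “conformed dimension” still sounds like jargon, here is a minimal sketch of the idea in plain Python. Everything in it is made up for illustration: the `dim_customer`, `fact_orders`, and `fact_complaints` names, the customers, and the numbers are all hypothetical and do not come from Kimball’s book. The point is only that two different business processes look up “customer” in the same shared table, so a rollup by segment means the same thing in both reports.

```python
from collections import defaultdict

# Hypothetical conformed dimension: the one agreed-upon answer to "what is a customer?"
dim_customer = {
    1: {"customer_name": "Acme Corp", "segment": "Enterprise"},
    2: {"customer_name": "Bob's Bikes", "segment": "SMB"},
}

# Hypothetical facts from two different business processes.
# Both reference the same customer keys instead of keeping their own customer lists.
fact_orders = [
    {"customer_key": 1, "order_amount": 1200.0},
    {"customer_key": 2, "order_amount": 300.0},
    {"customer_key": 1, "order_amount": 500.0},
]
fact_complaints = [
    {"customer_key": 2, "complaint_count": 1},
    {"customer_key": 1, "complaint_count": 2},
]


def total_by_segment(facts, measure):
    """Sum a measure per customer segment, reading the segment from dim_customer."""
    totals = defaultdict(float)
    for row in facts:
        segment = dim_customer[row["customer_key"]]["segment"]
        totals[segment] += row[measure]
    return dict(totals)


orders_by_segment = total_by_segment(fact_orders, "order_amount")
complaints_by_segment = total_by_segment(fact_complaints, "complaint_count")

print(orders_by_segment)
print(complaints_by_segment)
```

Because both rollups read `segment` from the same place, “Enterprise” in the orders report is guaranteed to be the same set of customers as “Enterprise” in the complaints report. In a real warehouse this would be tables and joins rather than dicts, but the shape of the idea is the same.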
This requirement of data consistency is so real and true. But it’s also the one that gives me the biggest existential crisis. Setting up a data warehouse/BI system to enforce quality and consistency the whole way through takes time, energy, and attention to detail. So far, my experience of corporate America has been hyper-focused on “quick wins”. Even saying the phrase “quick win” makes me want to vomit.
Not everything worth doing is a quick win. Who is going to advocate for the slow wins? Am I losing my mind?
💐 The business community must accept the data warehouse/BI system to deem it successful.
You want to get punched in the stomach by a data OG? Let me hit you with this quote:
It doesn’t matter if you built an elegant solution using best-of-breed products and platforms. If the business community does not embrace the DW/BI environment and actively use it, you have failed the acceptance test.
- Kimball, The Data Warehouse Toolkit, ch. 1
Look, we’re sitting on top of our sun-dappled mountaintop here, just dreaming about the best possible world. You and I both know that no matter how much we follow beautiful, perfect processes, we’ll still have stakeholders who just want to export our dashboards to a CSV or Excel file.
It hurts a little when that happens. But maybe that’s not a bad thing. Maybe it’s a pertinent reminder that analytics work is primarily a service and enablement profession. We should take a step back every once in a while and remind ourselves that this field shouldn’t be about our fancy technical stacks (even if that’s what we have! sometimes the fancy tools Are Good, actually). It’s not about our spiffy visualizations. It’s about whether or not our business users can take a look at our spiffy visualizations and see what we see.
They need to be able to see the wheels of the organization as we see them. The squeaky wheel is only gonna get the grease if we point at it for our business homies.
🤓 Kimball’s hot takes: Chapter One
Don’t overcomplicate your BI tool.
Don’t make your business users ask “hey uh why don’t these numbers match?”
If your business users don’t like your BI system, it’s actually not their fault. You didn’t build it right.
Thanks for joining me on this unhinged deep dive into Chapter 1 of Kimball’s The Data Warehouse Toolkit. If you haven’t read this book, I actually think you should. I’d recommend Chapters 1-6 if you can’t bring yourself to do the whole thing. It is a textbook, after all.
I’ll be back next week. Like, subscribe, pet your cat, you know the drill.
[1] For example: being able to answer the question “What is a customer?” in the same way across all systems. It’s harder than you think.
[2] When you make conformed dimensions, you are engaging in Master Data Management. MDM for the real ones.
Once you establish a conformed list of Customers 😄, the next step is establishing a list of "Active customers". 😎
“If the business community does not embrace the DW/BI environment and actively use it, you have failed the acceptance test.”
Damn you woke up today and chose violence 😁 -- great stuff!