dbt’s Semantic Layer for Dummies

Feb 19, 2024

it's me i'm the dummy

7 Comments

Feb 25, 2024

This is a fantastic piece. Truly. I’ve been struggling to understand the point of the “semantic layer” and the real world example you gave of counting up the devices is super helpful. As an aside, and this obv isn’t your problem or a criticism of your piece at all, but I still feel lost as to the utility of doing all that work because it still doesn’t solve the problem of Mr. Engineer wanting to use a totally different source to answer the question. Like, if the data team has their own warehouse and the fulfillment team has their own warehouse and the org can’t get on the same page as to which one makes sense as the right source for a count of devices, why would a semantic layer sitting on top of the data team’s warehouse solve that problem? Is the idea that the business user would stop asking Mr. Engineer questions like that altogether because pivoting out the answer in their BI tool themselves is truly “self service”?


Faith Lierheimer

Feb 26, 2024

Oh dude you’re spot on that the technology would not solve that problem at all. I’d imagine with the right infrastructure (idk, Trino + SL??) you could solve a cross-warehouse issue like that. But that’s still not gonna solve the behavioral issue.

It’s definitely relevant (and I should have included this!) that I was at a 65 person series B company at the time. It’s a lot easier to build good data culture at that small of a scale.

I’m thinking about the same issue a lot. Like what are some characteristics of an org that’s ready to use an SL productively? I think it would be easier at a sub 100 person shop but I am so curious about the characteristics of an org that is suited for it. Gonna be idiosyncratic I’m sure but hey we’re data people, can’t help but look for patterns


Marco Perez

Mar 22, 2024

Feel that for large orgs, *excuse my corporate slang here*, the name of the game is managing up.

Basically communicate/signal (with smoke or otherwise) where authoritative metrics come from - and that other sources can serve their purpose, but will not be gone through with a fine-tooth comb for differences. If buy-in on this concept can be obtained, the many, many meetings required to confirm this understanding will pay dividends for years to come.

Maybe in parallel, success might be had with the other teams, as they also fervently hate having to explain why different sources don’t add up. Nothing ruins my day faster than someone with the most uppity vibe being like: why doesn’t yours match mine? Aaaaarg!

Arynn MP

Feb 19, 2024

The way you describe why a semantic layer exists by breaking it down into parts really resonates with me. It reminds me of ThoughtSpot's built-in semantic layer, aka a Worksheet. It's a lightweight semantic layer, since context is only given to tables in the worksheet with a particular join, but it's what ultimately powers all of the liveboards. And that's what helps ThoughtSpot call themselves self-service; end-users are actually querying from that curated semantic layer. :)


Faith Lierheimer

Feb 19, 2024

Totally!! Common issue, many ways to solve it :) I like thinking of the worksheets as a lightweight semantic layer

Donald Parish

Jul 21, 2024

Microsoft has a “semantic layer”. Power BI is a model based tool. You set up the relationships between facts and dims, and with a formula language called DAX, the correct numbers are generated based on the report context. No SQL generated, at least not that you can modify. See https://www.sqlbi.com/articles/power-bi-is-a-model-based-tool/

Comment deleted

Feb 20, 2024

Faith Lierheimer

Feb 20, 2024

i am once again challenged to up my unhinged vibes by the master himself 🫡

faith.facts