Part One: Host her? I barely know ’er!
Coalesce 2024 felt like going skydiving for the first time. Not my first rodeo, but definitely my first rodeo as part of Coalesce’s hosting team.
I knew I’d get to see many of my good friends from the Data Angels world again, and that was enough to let the excitement of Coalesce win out over any trepidation I had about being part of the hosting team. If there’s one thing the data industry loves to do, it’s be catty and strange on the internet over… checks notes… software tools.
Working Coalesce this year was a good experience. I like my job as a trainer, and it was really gratifying to get to train people in person. Real-time discussion on how our different training topics integrate into their daily lives is very much the bread and butter of my job.
I also got to meet users at the expert lounge, and help them think through some of their thorniest problems. I don’t know how helpful I was specifically, but I do love the chance to noodle through tricky issues with smart people.
This year, I also had the opportunity to speak on a panel of women working in data. This was a really cool experience and a chance to be candid about our day-to-day working in a male-dominated industry. This was my favorite thing to participate in as a pseudo-speaker this year. Women put up with a lot, and our professional struggles tend to be pretty universal [1].
Part Two: The data wild west
I also was lucky enough to get off hosting duty long enough to attend 3 peer exchanges this year. That was a real gift. I’m lucky to have joined the training team at the time that I did, because I benefitted from their learnings from previous conferences. The main one was compartmentalizing training days mostly away from main “conference days” so that the trainers could also attend conference events.
Peer exchanges are where Coalesce shines. The structure of a peer exchange generally involves the facilitator introducing some topic, providing either discussion questions or an activity, and then time for you and your peers at the table to discuss the topic or accomplish the activity together.
A peer exchange encouraged me to listen actively to the experiences of others, share my own experiences as pithily as possible, and try to bring differing viewpoints together into something that we could all benefit from moving forward. The structure of a peer exchange delivers exactly what I hope to get out of conferences: learning from other people in my field. It’s a facilitated, semi-structured hallway track.
I loved the peer exchanges I got to go to last year. As soon as the planning for Coalesce 2024 began, I made as much noise as I responsibly could to bring peer exchanges back. I don’t think there was ever any doubt that they were coming back; I just never miss an opportunity to be annoying.
In between conference duties, I got to attend 3 peer exchanges. They were on the future of analytics engineering, data governance, and imposter syndrome. All 3 felt like the most productive use of my time at the conference by a lot (excluding, of course, my on-the-clock time ;) ).
Data is such a sticky, wild-west, off-the-walls, and, in some ways, influencer-poisoned industry. There is a lot that is wildly exciting about working in data. Depending on where you’re at professionally, there’s tremendous room for impact and for shaping the future of how the world uses data. There are fascinating examples in corporate America of what I would consider “golden” data teams—i.e. data teams that have accomplished some really interesting products. A few that come to mind are:
Spotify’s “daylist” function, and the auto-generated playlists that come up when you search for a “mood”. I’ve searched “confident” and “romantic” on Spotify recently and it’s spun up great playlists within my taste and in-theme. Really interesting data work there.
Airbnb’s new “categories”. I think it’s interesting the way they’ve chosen to organize and re-organize the tremendous amount of information that exists on Airbnb to help consumers figure out what vibe they want, as opposed to the much more prosaic “2 bed condo” organizing method.
Apple’s photo organization. Whatever computer vision/grouping tactic they are using to let me search the word “cat” and find all my pictures of my cat at once is awesome. That, plus facial recognition that lets me search for images of my friends all at once, is a pretty neat quality of life upgrade [2].
It’s hard not to find this kind of work inspirational as a data practitioner! There are smart people out there whose well-reasoned point of view is that companies like those have golden data, and the rest of us will never have data that good. Even if that’s true, there’s always more data being generated in the world and with increasing attention to data quality, there’s a very real possibility that more companies can have access to interesting data to do even more interesting and innovative things.
That inspirational vibe, however, can quickly get lost in the sauce. Most data teams apparently operate like service teams, and apparently they hate this model (I disagree, but whatever). Most data teams are very concerned with data quality, and many are experiencing disconnects between data producers and consumers that are threatening data quality. It’s easy to find fellow data practitioners whose worst nightmare is an eldritch being asking for “a number pull real quick” and simultaneously telling you that your number does not match the one they already pulled, for some reason.
So how are we supposed to move forward in such a challenging industry if there are so many problems that every data team is facing? How can we ever get to Spotify-levels of data work coolness? How are we supposed to escape every niche SaaS tool telling us that they’ll finally solve our data team problems, no really, they will???
Well, peer exchanges (and similar events) are actually a pretty damn good start.
Part Three: Three fabulous peer exchanges
The future of analytics engineering peer exchange was a pretty damn good start. We talked through some of the results of dbt’s annual State of Analytics Engineering survey, and then we got to spend about half an hour dreaming up what we hope the future of analytics engineering will look like.
I was, perhaps, overly insistent at my peer exchange table that we treat the conversation truly as an exercise in vision-casting, and in laying out our biggest, most improbable hopes for the future. I didn’t want to start every conversation with “well this would never work, but….” or “there would be too much pushback on xyz…”. I think there’s enormous value in setting aside the real world for a little while, and dreaming about what could be. Sure, reality will always temper our plans, humble us, and make us change directions. But sometimes, every now and then, harebrained ideas end up happening. Random thoughts dreamed up in a hotel conference room become an earworm to someone with just the right tools to make it happen. Letting yourself get bogged down in pessimism isn’t the way to improve the data industry. Dreaming, and knowing what dreams are actually possible, is the way forward.
Massive thanks to Phoenix Jay & Millie Symns for facilitating this peer exchange.
The data governance co-lab peer exchange served a very similar, if more specific, purpose. We worked through a model-UN-style scenario in separate data governance councils as a vehicle to start identifying what data governance standards might transcend industry and be universally applicable. We also spent time debating how data governance decisions should be made. Does majority rule? Should ICs vote, and managers veto? Should ICs and all managers except for one vote, and one “governance lead” have veto power? If governance leads get veto power, are they also the ones on the hook when data goes wrong in some way?
I found the conversations in this peer exchange massively productive, particularly an exchange my group had around public models. dbt’s model contracts and model versions introduce new (to dbt, anyways) ways to manage changes to high-traffic models, but they don’t solve everything about changes to high-traffic models.
What standards of communication should there be around public, contracted models changing in some way? How can a team enforce sunsetting of an old model version, and eventually deleting it? What happens when you have one really stubborn stakeholder who won’t use v3 and is dead set on using v1 come hell or high water? HOW DO WE GET PEOPLE TO CHANGE?!
Sticky, sticky issues here. But issues very much worth piecing through with other humans, in real-time. Hats off to Jenna Jordan for facilitating this one.
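For the sunsetting question in particular, here is a minimal sketch of the kind of check a team could run on a schedule. It is not dbt's API and not something we built at the table: the ModelVersion class, the overdue_versions function, and the dim_customers example data are all hypothetical, and it assumes each version of a public model carries an agreed deprecation date.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class ModelVersion:
    """One version of a public model (hypothetical structure, for illustration only)."""
    name: str
    version: int
    deprecation_date: Optional[date] = None  # None means no sunset is scheduled


def overdue_versions(versions: List[ModelVersion], today: date) -> List[ModelVersion]:
    """Return versions whose deprecation date has already passed, oldest version first."""
    overdue = [
        v for v in versions
        if v.deprecation_date is not None and v.deprecation_date < today
    ]
    return sorted(overdue, key=lambda v: v.version)


# Made-up example data: v1 is past its date, v2 is not yet, v3 has no sunset.
versions = [
    ModelVersion("dim_customers", 1, date(2024, 6, 30)),
    ModelVersion("dim_customers", 2, date(2025, 1, 31)),
    ModelVersion("dim_customers", 3),
]

for v in overdue_versions(versions, today=date(2024, 11, 1)):
    print(f"{v.name} v{v.version} was due to be retired on {v.deprecation_date}")
# prints: dim_customers v1 was due to be retired on 2024-06-30
```

A list like this won't make a stubborn stakeholder move to v3, but it does turn "we should delete v1 someday" into a date someone can be held to.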
The imposter syndrome peer exchange got to the heart behind a lot of the turbulence in the data industry. What if I’m just faking it? What if my knowledge is actually completely insufficient to solve any of these problems, and it’s only a matter of time before some shadowy force realizes that, and kicks me out of the industry I enjoy being in so much?
What if, actually, many data people feel this way? And, what if we’re all doing a little bit better than we think we are? How can we strike a productive balance between the reality that every data practitioner needs to be continuously learning to succeed and the fact that over-indexing on your failures and perceived deficiencies is really holding you back?
Thanks to lots of smart heads getting in the same room together, it turns out there are lots of strategies to fight imposter syndrome. My favorites were putting together a brag book to remind yourself of times you’ve kicked ass at work, finding a study buddy to pick out one thing at a time to get better at, and creating the affirming energy for others that you want for yourself. These types of things tend to exist in a virtuous cycle!
Thanks to Lauren Benezra, Paige A.C. Berry, and Logan Cochran for facilitating this peer exchange.
Coalesce is special. The data industry and surrounding open-source communities will always have their infighting and drama. That much I have learned over my past year working in data tooling, and it’s truthfully something I have a hard time dealing with.
But coming to Coalesce, meeting up with the Data Angels, and going to peer exchanges like these remind me why I love working in data in the first place. There are a lot of messy problems out there to be solved. Thinking through many of those problems in person with other humans is an infinitely valuable way to spend time, and I’m grateful I’ve gotten to do so at this particular venue three years running now.
Thanks for reading, and for allowing me a week off to come down from Coalesce. I expect to post a good chunk of technical explainer-style blogs before the year is out, in case you don’t particularly enjoy it when I write about death and mortality 😉. See you next week.
[1] You know what they say about being vulnerable about slightly controversial topics in public: some people are gonna feel empowered and less alone, and others are going to write you a 3-page essay of feedback you didn’t ask for. Such is life!
[2] Now, obviously, each of those companies has come out with really cool data products, and each of them has engaged in some pretty fucking shady corporate behavior. This post is not an ethical review of these companies, because I think smarter people have commented on that already. Whether their other actions are shady or not, each of them has a really excellent data team that has accomplished fascinating, innovative things using data.