dbt Labs started as Fishtown Analytics, a lifestyle professional services company offering data consultancy [6]. It has since become a product business, creating the industry standard solution for data transformation, pioneering the field of analytics engineering [1], and earning a valuation of over $4 billion along the way [2].
At the heart of it all is its 100,000-member-strong community [3] - one of the largest communities in the data analytics space [4]. That community generates a massive 80% of its inbound leads [5], and its customers include the likes of JetBlue, HubSpot, Vodafone, and Dunelm [7].
In this deep dive, we take a look at the pivotal role played by community in crowning dbt Labs category kings.
✔️ Origin Story: How dbt Labs and its community got started.
✔️ Category Creation: How they have leveraged community to create a category.
✔️ Growing Pains: Why its strategy has had to evolve as it has grown.
✔️ Return on Community: What value it creates for members and the value it derives for its own business.
It’s a fascinating look at a truly community-first business - let’s get into it 🤓
Over the last decade, as companies have rushed to adopt the cloud, their data has moved into the cloud too. This has meant vast changes in the data space. The rise of cloud-based data warehouses and platforms (like Snowflake and Databricks) led to a shift in ownership from software engineers to less technical business roles [7]. The growth of data pipeline services (Fivetran, Stitch, and Airbyte) meant many data jobs could now be automated. Combine these changes with the emergence of easy-to-use SQL-based BI tools (like Looker and Metabase) and the once distinct jobs of data science, engineering, and analysis have blurred [14].
So much change in a small amount of time left data practitioners not only wondering how to make the best use of these new developments but also rethinking key aspects of the profession - who should own what and how should they now structure their teams? There was a lot to figure out, but with their early investment in community, dbt found themselves at the forefront of it all.
In 2016, Tristan Handy, Drew Banin, and Connor McArthur started a little consulting shop known as Fishtown Analytics based in Philadelphia [7]. As part of that they created an open-source project known as data build tool or dbt. The first commit to the dbt codebase was in March 2016. They spun up a Slack instance for it three months later - a month before they officially formed the company, making them quite literally community-first [11].
Yet they didn’t have huge goals for either the community or dbt at the time. dbt was merely a tool they needed to work with their consulting clients [1] and Slack was how they kept in touch with them [8].
“That tiny little group of humans ended up being the only (work-related) social interaction I had in those early days,” says Tristan, who became CEO [15].
But the community grew.
After six months there were 100 people [12]. Three and a half years later there were 4,000 [10]. This might not seem like a huge figure, but as A16Z General Partner Martín Casado explains, "You don't often get communities and network effects in infrastructure like this" [13]. They knew they were on to something and decided to go all in on dbt in late 2019 [1], becoming a product company, raising a seed round, and rebranding to dbt Labs [8].
The dbt product makes it easier for data teams to generate insights from raw data. It combines modular SQL with engineering best practices that make data transformation more reliable, automated, and faster [7]. The company follows an open-core model. There’s dbt Core, which is open source and free to use under an Apache license. dbt Labs makes money through dbt Cloud - a proprietary, commercial product with a SaaS business model. dbt Cloud provides enhanced features for enterprise teams, helping them with their security, compliance, and governance needs [7].
dbt Labs has three main go-to-market motions. There’s open source, where you can install dbt locally and use it for free. There’s a self-serve, bottom-up motion, where you let dbt Labs host dbt for you. And there’s a sales-assisted motion for Enterprise customers [1].
dbt has proven to be a product that spreads organically: one team in an organization finds success with it and it’s then quickly adopted by other teams across the business [7]. It’s a horizontal tool too. It can be used across any industry - if you have data, dbt can help you transform it [1]. In 2017, 50 companies were using dbt in production. By the time they raised an A-round in 2020, there were 1,700 using it. By March 2022, that was up to 9,000 [7] and there are now over 30,000, of which 3,800 are paying customers [3]. It has become the cornerstone of the modern data stack [9], so influential that both Snowflake and Databricks have become investors, too [10]. What’s powering this rapid growth behind the scenes is community. As Tristan says [11]:
“The community is responsible for all of our growth.”
Let’s take a closer look at how this community grew in both size and influence.
The dbt community has been a community of practice from the get-go. “When people joined, it was not just to talk about dbt,” says Tristan [8]. It was to talk about everything that was happening in the wider data science space. “Before we realized that dbt was going to be the thing,” says Tristan, “this community was actually called Analyst Collective - I still own the Analyst Collective domain name” [8].
Alongside the community, they had a newsletter known then as the Data Science Roundup. Tristan had inherited it from his position at a previous data startup, RJMetrics [8]. It was originally hosted on Revue, but it’s now on Substack and it’s still going today, over 7 years later. The newsletter would explore a new topic in the industry each week, as well as round up other interesting data-related articles from across the web, and it even included sponsor slots promoting different vendors. It wasn’t about promoting dbt, but about enabling practitioners to keep up with changes via the newsletter and then head into the community to discuss them.
As Janessa Lantz, former VP Marketing at dbt Labs says, it became “a really rich space for people to learn alongside each other” [5]. It spawned a whole set of other newsletters in response. “All these data nerds have their own Substacks, and they reference each other and have conversations in their Substacks,” Janessa says [5].
The community has a similarly vendor-agnostic element, too. “Mode Analytics and Snowflake and Databricks… all have channels within our Slack community and play a big role moderating,” says Janessa. “There's… whole sub-communities which becomes a great marketing channel for those companies” [5]. This approach built the community’s reputation as a community of practice. Tristan recalls [10]:
“We were practitioners. We were trying to solve real problems. We were using dbt as a tool in solving those problems. But we were not really talking about dbt. We were talking about the problems that we were facing in the context of doing the kinds of analytics that digital native businesses do.”
It proved to be a much-needed resource in the industry, and this is reflected in the organic growth of the community.
Three months into the business, they landed Casper as a consulting client, and Casper’s team soon joined the Slack community. Casper was in the midst of its rocketship growth and investing heavily in data. As Tristan recalls, “They had maybe a dozen data analysts or something like that. And we did this project. These dozen data analysts got exposed to dbt and they were very plugged into the overall New York tech ecosystem” [1]. Casper hosted several meetups, and word of dbt and its community began to spread [1]. “They introduced us to Kickstarter and Venmo and all of these New York-based tech companies at the time” [12]. These included teams at SeatGeek, WayUp, Betterment, Bowery Farming, Birchbox, and even JetBlue [16]. The cool thing about this is that it was happening organically, based on the passion for the dbt tool and its community. “Drew and I, who don’t live in New York, we were kind of observing all this from afar,” remarks Tristan [17].
“As people hear about dbt, people say, yeah, you should join the Slack community. And so people show up and we all help each other out, not just in using dbt, but in figuring out how to do analytics in this new way” [1].
That new way would become known as analytics engineering, the category that dbt Labs now dominates, although the term itself emerged from the community. It was coined by Michael Kaminsky, then Director of Analytics at NYC-based Harry’s, and Tristan explains how it came about [1]:
“The community created it. It was very much a conversation that was being had by everyone in the community… but Mike Kaminsky is actually the first person to write a blog post that says, what the heck is this thing that we're all talking about?”
dbt Labs rallied around the term, renaming the newsletter to The Analytics Engineering Roundup [10] and later launching The Analytics Engineering Podcast too. It soon began appearing in job descriptions and LinkedIn profiles [1]. “It has become a really important flag to plant in the ground,” says Tristan. “This community is about doing our jobs differently than people have done their jobs in the past,” so it’s important to have that shared language to bind everyone together and further their shared understanding [1].
It was the community that coined the name for this new way of working, but it’s dbt that has become the “technology that enables people to work in this way,” explains Anna Filippova, former Senior Director of Data & Community at dbt Labs. She adds that dbt “comes with a set of very strong opinions about how you should do data work, and those opinions only matter as best practices if other people are also following them” [18]. As a result, analytics engineering became intertwined with using dbt, and the two have helped each other grow through word-of-mouth.
We can see the power of that word-of-mouth success in how it spread throughout the New York tech scene. This became a consistent pattern of how dbt grew internationally, too. A couple of companies in a new city adopt the tool or new way of working and talk about it at meetups. Word spreads, many folks join the Slack community where relationships form, and requests for a city-specific channel spike [16].
This also happened in London, for example. Tristan describes how “Dylan Baker gave dbt a try at Simba Sleep” [16]. Dylan would go on to found the London dbt Meetup, which is now one of the largest in the world. From those meetups, Monzo adopted dbt and became an advocate, introducing it to other local fintech startups, including Simply Business, Receipt Bank, Landbay, and GoCardless. The same happened in San Francisco, LA, Boston, Seattle, Montreal, Chicago, Berlin, and elsewhere, too [16].
The result is the dbt community has doubled in size every year it has existed [7] and its growth has mirrored the trajectory of product adoption.
Beyond driving growth for dbt Labs, the creation of the analytics engineering category has been valuable for the community too. As Anna explains, “A big part of it… can be summed up as a sense of shared meaning and shared identity.” Putting a label on it leads to a recognition that “there are other people like you who believe in the same way of working” and this leads to “really strong associations and relationships and support networks” [18].
Members of the community have produced talks and other resources to help each other level up. They’ve collectively helped each other to navigate how to use the technology, structure teams, and enact organizational change to best leverage it [19].
For example, community member Opeyemi Fabiyi shared how when starting a new job they had “zero knowledge about dbt, zero knowledge about the modern data stack and engineering.” But they went on to “successfully implement the data stack as a solo data team” as a result of “leveraging all the support from the dbt community.” They’ve since gone on to found the dbt meetup in Nigeria to raise awareness of dbt and analytics engineering [20].
The dbt Labs community isn’t just a community of practice, but incorporates elements of product and contribution, too. Anna estimates the following breakdown: 40% practice, 40% product, 20% contribution [4].
Here’s how the community impacts the wider dbt business.
The dbt community is a source of continuous feedback for the dbt Labs product team. Early on the founders would spend a lot of time in the community Slack getting to know people, answering their questions, but also getting a clear sense of what they needed in the open source project [18]. It’s no surprise then that cofounder and former Chief Product Officer Drew Banin has racked up over 30,000 messages in the Slack community [5].
Anna says, “We're often reading thoughtful, opinionated posts about what feature we should build next” [19], and dbt Labs has invested heavily in developing its open source product. They have an engineering team, an engineering manager, and a product manager responsible solely for this area [4]. All the foundational product features are discussed in public before they work on and release something, so they’re able to get important early feedback from the community [18].
Community contribution to the product isn’t limited to feedback, though. There are hundreds of forks and code contributions too [4]. “Every time we release a new version of dbt, there are a dozen plus members of the community who literally contributed code to that release,” highlights Tristan [1].
An example of the business impact this can have is with its integration strategy. The dbt product is all about integrating with other systems. In the past 12 months, the community has helped to build over 50 adapters, drawing on 325 contributors. David Nalley, Director of Open Source Strategy for Amazon Web Services, describes this rapid development as “stunning,” highlighting that just 17% of those contributions came from dbt Labs [21].
Another area of contribution is documentation. Documentation is an important area since dbt integrates with so many products and sits on top of many different warehousing options. Anna admits that alone “we can't keep up with documenting all of those relationships” but thankfully they “get a ton of documentation contributions” and they’ve invested in making it “really easy for folks to do that and to help us” [4].
Combined, this all creates a significant competitive advantage. As Tristan describes [1]:
“There's this dynamic of the community that ends up not only being a source of growth, but a source of product improvement… This flywheel continues to accelerate, and as the product gets better, as the community gets bigger, there is this ever-increasing moat.”
“The reality is when you have a good and healthy community and the amount of word of mouth that we have around dbt, marketing gets pretty easy,” says Janessa. Most of its traffic comes in via owned channels and branded search as a result of community’s impact on word-of-mouth [1]. “A lot of our marketing is very boring,” admits Janessa - “I don't think our visual identity is super mature. I don't think our demand-gen offers are particularly cutting-edge” but “it doesn't matter when the underlying dynamics of the business are working” [5].
Its community efforts create this halo effect around its marketing efforts [5]:
“We constantly see that as we move into new tactics that they just perform better. It's not like our marketing is brilliant. It's just because people have already come in contact with dbt previously. So that's where community just creates this halo around marketing.”
For example, when they ran ads for their Coalesce conference, their ad agency saw 3x better results than it typically gets, saying “These are unprecedented results. We've never seen anything like it” [22]. They do very little account development and yet their demo content performs really well, too [5]. A lot of the hard work has already been done by community through word-of-mouth, meaning prospects “just wave their hands and say like, hey, help me buy this thing” [1].
The other thing is that they’ve seen consistent growth. Tristan says, “going back five and a half years now. It's 10% month over month, every single month” adding that “the heat that powers that organic growth is all generated in the community” [1].
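To get a feel for what that rate implies, here is a minimal back-of-the-envelope sketch. It assumes the 10% compounds every month and that "five and a half years" means 66 months. The quote doesn't say which metric is growing, so treat the result as an illustration of compounding rather than a reported figure. The function name is made up for this example.

```python
def compound_growth(monthly_rate: float, months: int) -> float:
    """Return the total growth multiple after compounding a monthly rate."""
    return (1 + monthly_rate) ** months


monthly_rate = 0.10  # "10% month over month", taken from the quote
months = 66          # assumption: "five and a half years" = 66 months

annual_multiple = compound_growth(monthly_rate, 12)
total_multiple = compound_growth(monthly_rate, months)

print(f"Growth multiple per year: {annual_multiple:.2f}x")
print(f"Growth multiple over {months} months: {total_multiple:.0f}x")
```

At 10% a month, growth comes to a little over 3x per year, which is why a steady monthly rate adds up so quickly over several years.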
It’s worth noting, though, that they don’t actively sell to their community. They might have 30,000 companies using their product, but only 3,800 customers. Yet, they don’t try to get everyone else to pay. Janessa explains their stance: those “people are the ones who are out there saying great things about dbt and we don't want to ruin that” [22]. They’ve accepted that the majority of folks who are using dbt are never going to pay dbt Labs a cent. And that’s ok. They derive value from their community in a multitude of other ways.
Let’s take a look at some of the specific programs they deploy to create value for their members and their business.
“Content has been really central for us,” remarks Tristan. In addition to its newsletter and podcast, “our blog has helped us continue to build that brand with this group of people” [1]. To further that, they’ve leaned on the community, investing in coaching people on how to write well and how to speak at events. That can take the form of “helping folks refine talks and submissions to our conference” through to its more formal Technical Writing Mentorship program [4]. The latter is a 6-week program, which guides people through the process of ultimately publishing their first post on the dbt blog [23]. Through a combination of group working sessions and 1:1 editing sessions with an assigned editor, would-be authors are guided from choosing topics through drafting, editing, graphics, publishing, and distribution. They support each other along the way via a private Slack channel, providing feedback on contributions via Google Docs [23].
The belief within dbt Labs is that “compelling writing or speaking by the community about their own work with dbt, says so much more about the project and the community than if it was a bunch of [dbt Labs’ own] advocates,” explains Janessa [4].
According to Tristan, events are how dbt Labs delivers transformational moments for its community members [1]. The value goes beyond the content to putting a focus on the people who are there. Events are also how dbt Labs reaches its international community: over 60% of its members are based outside of the U.S. [2]. Its User Groups program has over 20,000 members across 50+ groups in 29 different countries [25].
Having local user groups “allows you to connect with people that you wouldn't normally meet,” says Anna. It’s there “where people are going to make random connections and… gain opportunities they wouldn't otherwise have” [24]. Tristan adds that it’s at these meetups where people “have these moments where they get converted from, ‘yeah, I like your thing’ to ‘holy crap, this has changed my life’” [1]. And that’s enough for dbt Labs. They don’t attempt to sell to these attendees. They collect email addresses from attendees for the meetups but they don’t even add those into their CRM. “We’re never going to market to those people,” says Janessa [5].
They rarely speak at these events, either - it’s a platform for its users to talk about all the things they’re doing with dbt [5]. They do, however, organize other kinds of events that are designed to create a funnel for its revenue teams, typically targeted at “following up with prospective paying customers,” says Anna [28].
They supplement User Groups with their annual conference, Coalesce. It is focused on the practice of analytics engineering, and the talks they choose focus on helping data teams be more impactful [26].
Its inaugural conference was in 2020. While it was planned to be in-person, it ended up having to be online due to COVID [1]. They had 3,000 attendees at this first event, and that audience had grown to 7,000 in 2022 [5]. They were back to in-person in 2023 and accessibility was a key concern. The company set out to make the venue as accessible as possible for disabled attendees. That started with selecting the right venue - considering factors like flooring, elevators, hallway widths, and more - and extended to providing features like all-gender restrooms, ASL interpreters for keynote sessions, private lactation rooms, and hotel shuttle buses. They also provided an online component with live captioning, video recording, and transcripts [27], but the focus here is on getting people together “who couldn't otherwise be in the same room,” says Anna [24].
The community combines with the conference during the event. There’s an individual channel for every talk. The goal here is that “the talk becomes kind of a backdrop for a conversation,” says Tristan. “I'm never going to go to a talk again and just listen to the speaker and not also be interacting” [8].
Their conferences have proved to be especially effective in bringing the community together. “Every time we gather folks together,” says Anna, “we see a step change increase in the rate of growth of the community” [4].
Reward and recognition is an important aspect of community management, and one where there’s a balance to be struck. “If someone does something wonderfully kind, I’ll try to say thanks with a small gift. As much as possible, I’m trying to recognize, rather than incentivize,” explains Claire Carroll, the first dbt Community Manager [26]. So they often go light on swag but have instead begun to give out community awards in recognition of people making a significant impact on the global dbt community. One such person is Josh Devlin, who was the recipient of one of the first community awards at the 2023 Coalesce conference. Josh remarks, “One of my colleagues calculated that my 8,000+ messages averages out to around 6 messages per day,” showing what it takes to become a leader in such a passionate community [29].
dbt Labs also has a quarterly Spotlight program, where they highlight community members who have gone above and beyond to contribute to the community in different ways. This functions to give recognition to those members, but also model to other members the behaviors that create value for the community [30].
A key part of creating a category is making it a viable vocation. To that end, dbt Labs has invested in training and education programs, which have also been a common ask from the community [5].
This started with a two-day in-person training program called dbt Learn. They put 20 people through it before looking at how to scale it. It ended up morphing into an on-demand online program that has trained over 1,300 people as of May 2022 [5]. It has also been used as a rapid onboarding program for Enterprise customers, which has given larger orgs confidence in investing in dbt [5], and they now claim 950 certified Analytics Engineers [3]. The creation of this course was a cross-functional effort that saw the community team join up with the training, solutions architects, and sales teams [22].
All of these fantastic programs created a ton of value and the community has continued to grow. So much so that growth started to create problems for the community team. The small, intimate Slack community from the early days is very much gone. Now it’s a bustling, passionate hub of 100,000 people. Some amount of growth is good, as Anna notes [18]:
“It's really important to have a constant influx of new folks into the community to make sure that it stays healthy and there's enough humans to support one another.”
With that said, growth can be a double-edged sword. As Tristan astutely remarks, when it comes to community, “The larger it gets, the quieter you will get” - conversation doesn’t grow in line with member count [1]. The growing pains have even become something that members have shared their concerns about publicly [31]. Tristan responded in the way you’d wish every CEO would: he wrote a public response to the criticism, stepping point-by-point through each aspect, providing his reflections and a commitment to address them [32]. The challenges of operating a community at scale are not lost on the dbt Labs team.
“The larger and louder a community gets, the more challenging it can feel for someone to parse through the noise and find answers,” says Anna. They’ve also started to run into the limits of their initial tooling choices [33]. “The back scroll problem is a really significant one,” says Anna, adding “If you're not living in this channel 24/7 you're going to miss questions and people are going to feel like they're just shouting out into the void” [4].
The team’s solution to this has been to try and break the monolithic community down into micro-communities. Using Dunbar’s number as a rough guide, they’ve created channels based on interests, industries, roles, locations, tooling choices, and seniority [4]. For example, they’ve created a channel for data leaders seeking support for managing teams. “If you live in San Francisco, there's a local SF channel. If you use a particular database together with dbt, there's a channel where other people who are using the same setup are talking about their experiences,” Anna explains. They also organize coffee chats around specific channels, getting folks from a particular channel together on a virtual call [24]. The goal with this strategy is to “create more safe and targeted opportunities for people to contribute” [18] and to keep “helping people find those connections to someone” [4].
Now when a member introduces themselves, the team uses the info provided to guide them towards relevant channels rather than have them all loaded up from the outset. If they mention they work in Healthcare, the team will flag that industry channel, or if they mention they’re from the Bay Area, the team will highlight the upcoming meetup happening there [4]. The strategy also includes further investment in local meetups. “We go as far down as city-level, especially in places where we have lots of community members,” says Anna [18].
Discovering existing discussions can also be a problem at scale, especially in a chat-based platform like Slack. So they’ve added a forum-based platform for common, long-lived discussions [4]. This initially started by using GitHub Discussions [4], but it has since moved to Discourse [34].
They have also invested in improving the UX of finding these communities with the creation of a Developer Hub. That’s a single destination that pulls together access to its Slack and forum, as well as training materials and documentation. What’s more, conversations from the forum are dynamically pulled into relevant pages within their product documentation. They’ve eliminated friction in using multiple community platforms by enabling members to use a single login, either a username and password or a social login option [34].
Finally, they’ve also turned to community analytics tooling to help understand how community members are contributing across its multiple platforms. Keeping up with the activities of its many members was no longer something that individual community managers could do, so they now centrally store and share this context using Common Room [35].
As of May 2022, dbt Labs had 8 people working on community, documentation, and events [32]. Initially, the community team was a standalone function, reporting directly to the CEO [22]. The thinking here was that when community is placed within marketing, the focus becomes lead gen. If it lives in customer success, then the mission is often limited to support [22].
While the theory of this was fine, in reality, Anna says “it was actually quite difficult to collaborate” with marketing. Both would develop their own goals and while there was a lot of overlap, that functional separation meant it was difficult to co-execute on these shared goals. They’ve since evolved their structure to more of an embedded approach, with community team members supporting specific functions and tracking under their respective budgets. For example, Anna explains “We have a section of the team that rolls into marketing and is really focused on acquisition and spreading the good word, if you will, about dbt” [19].
Ultimately, this approach acknowledges that community is a “very, very, very cross-functional initiative,” says Anna, and this is how they’re trying to align most effectively with the functional areas they think community can best impact [36].
dbt Labs is a data company with data practitioner founders - they have a firm grasp of what can and cannot be measured. I was especially interested in understanding then how they approached measurement and reporting for community. It’s pretty telling that Janessa says “We're actually pretty comfortable saying we do not understand how Slack signups impact revenue” [5].
They track a lot of things, but don’t sweat what can’t be measured: If “people are talking about [the community]. It’s making a difference,” says Janessa [5].
They use Common Room as their foundational analytics platform and funnel all community activity from across different platforms into it [37]. However, they’re realistic about what can be tracked digitally, so fall back to “common sense attribution” when needed, for example, “simply having a question on the sales form” [39].
They take all of those signals and monitor three sets of KPIs that relate to the functions they support [28, 37]:
They not only measure specific metrics within each of those buckets but also the progression of folks across those buckets as they move forward in their community journey [18]. Measuring the efficiency of that journey is used as a way to understand whether or not the community is creating the right kind of value for members - seeing how many members transition from new members to highly engaged ones [36]. Another way they discern community health is by monitoring the ratio of active members to total members [5], and they do that on a per-channel basis too to understand health within specific micro-communities [18].
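As a rough illustration of the active-to-total ratio described above, here is a minimal sketch. The channel names, member IDs, and the idea of a single "was active" flag per member per channel are all assumptions made up for this example; the sources don't describe how dbt Labs or Common Room actually define or compute activity.

```python
from collections import defaultdict

# Hypothetical records: (channel, member_id, was_active_in_period)
records = [
    ("local-sf", "ana", True),
    ("local-sf", "ben", False),
    ("local-sf", "cy", True),
    ("local-sf", "dee", True),
    ("healthcare", "ana", False),
    ("healthcare", "eli", True),
    ("healthcare", "fay", False),
    ("healthcare", "gus", False),
]


def active_ratio_by_channel(rows):
    """Return {channel: active members / total members} for each channel."""
    totals = defaultdict(set)
    actives = defaultdict(set)
    for channel, member, was_active in rows:
        totals[channel].add(member)
        if was_active:
            actives[channel].add(member)
    return {ch: len(actives[ch]) / len(members) for ch, members in totals.items()}


def overall_active_ratio(rows):
    """Return distinct members active anywhere / distinct members overall."""
    everyone = {member for _, member, _ in rows}
    active = {member for _, member, was_active in rows if was_active}
    return len(active) / len(everyone)


ratios = active_ratio_by_channel(records)
overall = overall_active_ratio(records)
for channel, ratio in ratios.items():
    print(f"{channel}: {ratio:.0%} active")
print(f"overall: {overall:.0%} active")
```

Counting distinct members rather than raw rows keeps someone who belongs to several channels from being double-counted in the overall figure, while the per-channel view shows which micro-communities are quiet.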
It’s not just the community team who have access to Common Room, though. The marketing and product teams are in there as well, so they have that shared context [28]. They look closely at how the community journey maps to the product journey for insights to refine their GTM strategy [38].
They’re proactive in furthering the cross-functional understanding of community within the organization, too. Every new employee gets a talk from the community team during onboarding, explaining how the company creates and derives value from the community [18], and stressing how the success of the community and the company are intertwined [16].
You can’t cover an open-source company and not get into licensing. It’s a delicate area for any community, which can, as Tristan puts it, “drive you to the top of Hacker News. Like, not in a good way” [12].
When they decided to open source the product, their reality was very different from the one they now have. They were a lifestyle consulting company; now they’re a VC-backed unicorn under pressure to live up to a high valuation. They pledged several values, which they’ve proudly stuck to over the years. However, the “Profits are exhaust” value was quietly revised to “We are a mission-driven company” in November 2022. The commitment to open source would have been a trickier one to alter, but dbt Labs is on the record stating it is sticking with its Apache license approach [21]. A bold move, especially in the current economic and competitive environment [6]:
“Our expectation is that at least one, if not more, of the cloud platforms, will launch some sort of managed dbt service in the coming year,” says Tristan.
They’ve made this decision with the benefit of seeing how other open-source companies have faltered [12]. Nevertheless, doubling down is a strong commitment to its community-led approach.
From humble beginnings, dbt Labs has created a community and built its whole organization around it. As Anna says, “The company doesn't exist without the community.”
The dbt Labs community doesn’t just impact awareness via talks and content; it drives better outcomes for customer support and success. It drives feedback and contribution to the product and documentation, and it builds authentic discovery that drives acquisition. It’s advocacy that has turned all of this into a growth flywheel, creating an efficient way to propel the business forward [19].
As a result, dbt Labs has become the undisputed leader in the analytics engineering category. Its user base is greater than that of all its competitors combined [6].
For dbt Labs, its community and product are now inextricably linked to the wider analytics engineering category, and Tristan is clear about the importance of community in driving the category forward:
“Community hasn't just gotten us here. Community is what needs to continue to drive this ecosystem for the next decade-plus” [12].
There you go! That’s how dbt Labs is using community to create a category. For more details, check out the sources below. If you found this useful, please share it with friends and colleagues, and don't forget to subscribe. ✌