Airbyte is an open source data integration platform. It helps organizations consolidate their data in data warehouses, lakes, and databases. It moves data through an engine built around connectors that take data from a source, like Salesforce, and move it to a destination, like Snowflake. But there’s a problem: the long tail of connectors needed to cover all of a company’s data movement needs.
To solve this, Airbyte built the largest community of data engineers in the world who rallied around the goal of finally commoditizing data pipelines. In doing so they created an industry-leading catalog of connectors that enabled them to disrupt the data integration space. We'll closely examine the strategy and tactics they used to achieve this and how community contribution can help make big visions a reality.
As companies have undergone digital transformation, they use more software and create more data than ever before. Organizations worldwide used an average of 130 software-as-a-service (SaaS) applications in 2022 [1]. That means for any product based around integrating data and software, there’s a huge number of integrations that need to be built and maintained. What makes this a particularly tricky problem is that integration usage typically follows a power law distribution: a small number of integrations accounts for the majority of usage.
This was my experience at Geckoboard, a KPI dashboard product that relies on integrations to source the data it displays. The more sources a user could integrate with, the more likely they were to convert to a customer and stick around. However, a core of around 12 integrations accounted for ~80% of usage; these were for products like Google Analytics, Salesforce, and other market-dominating software. Then there was the long tail of integrations, each used by only a few customers but often just as important to those customers as the core ones.
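To make the long-tail shape concrete, here is a minimal sketch that models integration usage with a Zipf-style power law. The exponent, the catalog size of 130 (borrowed from the SaaS figure above purely for illustration), and all function names are assumptions, not Geckoboard or Airbyte data.

```python
def zipf_weights(n_integrations: int, exponent: float = 1.0) -> list[float]:
    """Return normalized usage shares where the k-th most popular
    integration gets a weight proportional to 1 / k**exponent."""
    raw = [1.0 / (rank ** exponent) for rank in range(1, n_integrations + 1)]
    total = sum(raw)
    return [weight / total for weight in raw]


def head_share(shares: list[float], head_size: int) -> float:
    """Fraction of total usage captured by the `head_size` most-used integrations."""
    return sum(shares[:head_size])


def head_size_for(shares: list[float], target: float) -> int:
    """Smallest number of top integrations whose combined share reaches `target`."""
    running = 0.0
    for count, share in enumerate(shares, start=1):
        running += share
        if running >= target:
            return count
    return len(shares)


if __name__ == "__main__":
    # Hypothetical catalog: 130 integrations, exponent 1.0 (both assumed).
    shares = zipf_weights(130, exponent=1.0)
    print(f"Top 12 of 130 carry {head_share(shares, 12):.0%} of usage")
    print(f"Integrations needed for 80% of usage: {head_size_for(shares, 0.80)}")
```

Raising the exponent concentrates more usage in the head, but the tail never disappears: every integration beyond the head still has users who depend on it, which is what makes the maintenance math hard.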
So get building, right? But as Airbyte COO John Lafleur writes, “the hardest part… is not building the connectors, it is maintaining them. That is costly,” and for any commercial organization, there comes a point at which the return on investment no longer justifies building new integrations [2].
That’s the reality Airbyte saw in the data integration space. John explains that incumbents like Fivetran “stay stuck at max… 200 connectors. They've been ages at it. And they've plateaued now.” Airbyte’s solution was to build a community around data integration and to leverage its members' contributions to build a library of thousands of connectors [3].
Airbyte turned to open source as a way to solve the long tail integration problem, building a community of data engineers who contribute by building and maintaining integrations. They went about this in three ways [3]:
They started with an inspiring grand vision around which folks could rally: “to commoditize data integration once and for all.” Others had tried to achieve this before but hadn't been able to pull it off. Airbyte was sure they could and set out to deliver by using velocity and transparency to build trust.
From the outset, Airbyte built its company in the open with a public handbook outlining its strategy, vision, and a team manual for employees. They also shared key documents, like their investment pitch decks, and created a lot of content around data integration best practices, believing that providing value upfront would also help them gain trust [6]. Moreover, when they started, they had a single Slack workspace that their team members used and that was also open to the community [3].
To help with velocity, they built a connector development kit (CDK) that cut the typical time it takes to build an integration from a couple of days to two hours and made integrations easier to maintain, too [3]. Then, as Developer Advocate Justin Chau explains, they got shipping [7]:
“From the very beginning, we committed ourselves to a release schedule of at least once a week. This approach ensured that our community was continuously engaged and excited about the project's progress”
They doubled down on this with monthly community calls and weekly open office hours attended by co-founders and user success team members, to show, not just tell, the community their progress [3].
They also invited their community to contribute to their roadmap as another powerful way to maintain momentum. By seeking feedback and incorporating suggestions from users, they were able to create a sense of ownership and foster a spirit of collaboration [7].
To support their burgeoning community, they formed the Community Assistance Team. This team is charged with the mission “to support and assist the community in finding solutions and creating connections with each other and with Airbyte” [4]. Each new member gets a personalized welcome message upon joining, along with a request for product feedback and suggestions [5]. Many new members are seeking support, so a big goal of the team is to acknowledge or respond to every question within just a few hours [14]. To do this, they invested early in a global team, with staff located in EMEA, the US, and APAC [3]. Their average 'Time to Response' is about 2.5 hours, and 'Time to Resolution' is 3.5 hours [5].
Great support, though, extends beyond the service they provide and into how they’ve architected the product. For example, they noticed that the industry lacked a standardized, enforced protocol for integrations. This meant that developers could add whatever code and functionality they wanted to their implementations, making them harder for other users to adopt and maintain. They spent a lot of time working with their community to define an open protocol that provides a standard set of components contributors can use when developing integrations [17].
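To illustrate what a standard set of components buys contributors, here is a minimal sketch of a shared connector contract. The class names, method names, and record shape are hypothetical and chosen for illustration only; they are not Airbyte's actual protocol or CDK API.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator, List

Record = Dict[str, Any]
Config = Dict[str, Any]


class Source(ABC):
    """Hypothetical contract that every source connector must satisfy."""

    @abstractmethod
    def check(self, config: Config) -> bool:
        """Return True if the config is sufficient to connect."""

    @abstractmethod
    def read(self, config: Config) -> Iterator[Record]:
        """Yield records one at a time."""


class Destination(ABC):
    """Hypothetical contract that every destination connector must satisfy."""

    @abstractmethod
    def write(self, records: Iterator[Record]) -> int:
        """Persist the records and return how many were written."""


class InMemorySource(Source):
    """Toy source that reads rows straight from its config."""

    def check(self, config: Config) -> bool:
        return "rows" in config

    def read(self, config: Config) -> Iterator[Record]:
        yield from config["rows"]


class ListDestination(Destination):
    """Toy destination that appends records to a Python list."""

    def __init__(self) -> None:
        self.rows: List[Record] = []

    def write(self, records: Iterator[Record]) -> int:
        written = 0
        for record in records:
            self.rows.append(record)
            written += 1
        return written


def sync(source: Source, config: Config, destination: Destination) -> int:
    """Move data between any source and destination that honor the contract."""
    if not source.check(config):
        raise ValueError("source configuration failed validation")
    return destination.write(source.read(config))
```

Because `sync` only depends on the two abstract contracts, any contributor's source can be paired with any destination without bespoke glue code, which is the kind of adoptability and maintainability a shared protocol aims for.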
What's more, even if a user's support experience falls short for some reason, their open source approach helps here too. If you notice an integration is broken, you don’t need to wait for anybody to fix it. You can fix the integration yourself, everyone else benefits from the fix, and you’re not at the mercy of a closed-source company's support priorities [15].
Through their maintainer program, they provide incentives to create submissions and direct folks to the most impactful PRs [6].
Contributors “have points levels. Every PR you do, every issue, every interaction you have, gives you points and levels.”
There’s a leaderboard to invite some friendly competition, and at major-level milestones, they send out swag [3]. Frequent contributors are also spotlighted in their ‘Ink-credible data people’ series of interviews [21] and they launched a surprise and delight program, sending thank you notes to folks who contribute to the community. Their larger vision for this, though, is to incentivize users to help create the connectors through a revenue-sharing model [8]. Maintainers of connectors could get a revenue share for integrations used in the paid version of Airbyte in exchange for SLAs, features, and fixes for those connectors [9]. So far, they’ve created a bounty system as part of their maintainer program to financially reward code and documentation contributors, paying people to submit the highest priority PRs and take connector quality to the next level [10].
Airbyte launched with just 6 connectors, but within 18 months had reached ~170 connectors and caught up with incumbents, a feat that took Fivetran some 8 years to accomplish [3]. Of those, 80 were created by community members. Since the introduction of their CDK, the ratio of community vs. Airbyte-created connectors has shifted further toward the community.
Many connectors came about through Hacktoberfest, the annual month-long celebration of open source projects. In the first year, by providing some rewards for new connectors being built, they added 42 new connectors [19]. Over 100 connectors were built the next year, with one community member submitting 15 of them. Cash prizes were a big incentive here. They paid out between $500 and $700 per approved PR, and $100 for issues raised [18]. In total, they gave out a combined $70,000 in prizes across all of the received submissions [13]. This was pivotal in showing that a community can, with the right incentives, drive meaningful contributions [3]. They’ve subsequently started their own month-long connector hackathon, offering swag and cash prizes for migrating connectors and producing video and written tutorials [16].
Now there are more than 350 connectors, making them the industry leader in the number of integrations [11], and they have over 600 contributors [12]. In fact, the Airbyte team is no longer building connectors at all [3] and they receive 3-5 community PRs a day [6].
“We are just focusing on the reliability and the quality of connectors. The community has taken over,” says John.
Like most open source tools, their goal is to become the industry standard and this best-in-class integration library helps them to deliver on that goal [14]. Ongoing maintenance of all these connectors remains a concern, but they’re actively trying to solve this with financial incentives as part of their maintainer program [20].
Of course, the benefit to Airbyte of a community-led approach hasn’t only been code contributions. In the first 7 months from launch, they scaled their Slack to 1,000+ members, of which 45% were active every week [5]. This has now surpassed 15,000 members, making it the largest community around data integration, providing them with a core target audience for distribution [11], not only for their open source project but for their paid offering too [3]. What’s more, with around 6,000 deployments per month, they benefit from having more sales conversations than any other provider, giving them unique market and product insights. They also use the community feedback they get to create better documentation, and improve the overall developer experience [6].
That's it! That’s how Airbyte is using community contribution to solve the long tail of integrations. For more details, check out the sources below.