Data Architect Working Group — Composed of senior data engineers from across the company. Responsible for making major architectural decisions, and conducting reviews for Midas certification (see below).
Trust is the key ingredient that ties all great relationships together. It’s the glue. The beauty of Robin’s teaching is that he provides simple yet actionable advice and reminders that we can all use in our day-to-day conversations to build trust with the people around us.
We created new communication channels to better connect the data engineering community, and established a framework for making decisions across the organization. We created the following groups to address these gaps:
Once momentum on the Data Quality initiative reached a critical point, leadership realigned the company’s limited data engineering resources to kickstart the project. This was sufficient to unblock progress on Airbnb’s most critical data; however, it became obvious that we needed to unite and substantially grow the data engineering community at Airbnb. Below are changes we made to facilitate progress.
Tables describing a similar domain are grouped into Subject Areas. Each Subject Area must have a single owner that naturally aligns with the scope of a single team. Ownership should be obvious.
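The single-owner rule can be checked mechanically. The following is a minimal sketch, assuming a hypothetical in-memory catalog that maps each table to a Subject Area and an owning team. The table names, team names, and function names are illustrative only and do not come from Airbnb's tooling.

```python
from collections import defaultdict

# Hypothetical catalog: table name -> (subject area, owning team).
# All names here are made up for illustration.
TABLES = {
    "bookings_fact": ("reservations", "team-reservations"),
    "booking_dim": ("reservations", "team-reservations"),
    "listing_dim": ("listings", "team-listings"),
    "listing_views": ("listings", "team-search"),
}


def owners_by_subject_area(tables):
    """Collect the set of owning teams for every Subject Area."""
    owners = defaultdict(set)
    for subject_area, team in tables.values():
        owners[subject_area].add(team)
    return dict(owners)


def ownership_violations(tables):
    """Return the Subject Areas that do not have exactly one owner."""
    return {
        area: sorted(teams)
        for area, teams in owners_by_subject_area(tables).items()
        if len(teams) != 1
    }


violations = ownership_violations(TABLES)
```

In this sample data, `listings` is flagged because two teams claim tables in the same Subject Area, while `reservations` passes.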
“The individuals in life that are able to either mask their agenda or shift the agenda to something altruistic will have great success at building rapport.” — Robin Dreeke
Data Engineering Leadership Group — Composed of data engineering managers and our most senior Individual Contributors. Responsible for organizational and hiring decisions.
Meanwhile, the company built Minerva, a widely adopted platform that catalogs metrics and dimensions and computes joins across these entities (among other capabilities). Given its broad capabilities and wide-scale adoption, it was obvious that Minerva should continue to play a central role in our data architecture, and that our data models should play to Minerva’s strengths.
We revamped our hiring process for data engineers, and allocated aggressive headcount towards growing our data engineering practice. We paid particular attention to bringing in senior leaders to provide direction as we make decisions that will affect the organization in the years to come. This is an ongoing effort.
To complement the distributed pods of data engineers, we founded a central data engineering team that develops data engineering standards, tooling, and best practices. The team also manages global datasets that don’t align well with any of the product teams.
For several years, Airbnb did not have an official Data Engineer role. Most data engineering work was done by data scientists and software engineers who were recruited under a variety of different monikers. This misalignment made hiring for data engineering skill sets very challenging, and created some confusion with respect to career progression. To resolve these issues, we reintroduced the role “Data Engineer” as a specialization within the ranks of the Engineering organization. The new role requires Data Engineers to be strong across several domains, including data modeling, pipeline development, and software engineering.
The company’s initial analytics foundation, “core_data”, was a star schema data model optimized for ease-of-use. It was built and owned by a central team, and incorporated numerous sources — often across different subject areas. This model worked extremely well in 2014; however, it became more and more difficult to manage as the company grew. Based on this learning, it was clear that our future data model should be designed thoughtfully and avoid the pitfalls of centralized ownership.
The next step was to align on a common set of architecture principles and best practices to guide our work. We provided comprehensive guidelines for data modeling, operations, and technical standards for pipeline implementation, which are discussed below.
We also committed to a decentralized organizational structure composed of data engineering pods reporting into product teams (as opposed to a single centralized Data Eng org). This model keeps data engineers aligned with the needs of consumers and the direction of product, while maintaining a critical mass of engineers (three or more) in each pod. Team size is important for providing mentorship and leadership opportunities, managing data operations, and smoothing over staffing gaps.