Engineering—October 20, 2021

How to assemble a cross-functional data quality team

Smash your data silos and improve data quality across your organization by assembling a cross-functional team to own data planning.

When was the last time you had to make a decision that would impact multiple teams across your organization? Maybe it was choosing which internal tool your company would use for project management (and finally putting an end to the Kanban vs. Gantt debate). Perhaps it was whether to have a pumpkin carving contest or costume fashion show at your Zoom Halloween party. Regardless of the choice, whenever a decision impacts multiple parties, it is always advisable for everyone involved to have input in the process. Otherwise, the needs of one or more teams are likely to go unmet.

Just as multiple teams share a project management tool, multiple teams rely on customer data to make decisions and do their jobs effectively. Marketers need insight into their customers’ preferences and interests to build audiences for personalization. Data scientists need large data sets to train predictive models with machine learning. Product managers need to understand how customers use apps and websites in order to continuously improve digital product experiences. Since each of these teams has its own needs with regard to customer data, it is critical that they all have input in a collaborative data planning process. However, consistently translating each team’s requirements for data into clear technical specifications, without sacrificing data quality in the process, can be quite challenging.

For the rest of this piece, we’re going to look at a blueprint for taking a cross-functional approach to handling customer data within an organization. This approach offers a variety of benefits, including breaking down data silos, smoother communication around data requirements, and ultimately, a higher level of data quality for the teams that depend on it. A quick note before we proceed: while the best practices we discuss here could potentially benefit the handling of any data, we will specifically use the term “data” to refer to customer data.

What is a cross-functional approach to handling data, and what does it look like within an organization?

Taking a cross-functional approach to handling data entails emphasizing collaboration between teams on data strategy, collection, validation and activation, rather than leaving these responsibilities up to individual teams. To illustrate how this works, it is useful to demonstrate what it is not.

Most organizations today do not take a cross-functional approach to handling their data. In these cases, individual teams often have end-to-end ownership of the data that enters the systems they use, like email solutions, analytics tools, customer service platforms, etc. These teams are responsible for relaying data requirements to engineers, at which point developers translate these requirements into production code. When data starts flowing into the systems where it will be activated, performing quality assurance and validating that the data actually fulfills its requirements typically falls on the end users as well.

This is often the approach to handling data that materializes within organizations by default. While it is quite common in the real world, even among large enterprise companies with a longstanding data practice, it can result in a variety of data- and process-related problems. Most significantly, relying on each individual team to communicate its data requirements to engineers is likely to result in data quality problems. Each handoff is an opportunity for either a misinterpretation or an implementation error, which will ultimately result in poor-quality data making its way into downstream systems. Additionally, this structure leads to the formation of data silos, which make it difficult for data users across the organization to communicate and collaborate around data effectively.

In contrast to this, a cross-functional approach to data offers an alternative to one-to-one relationships between data consumers and technical implementers, and alleviates many of the problems that result from this fragmented structure. Because this is a relatively nascent outlook on handling data, there are no hard-and-fast rules about how to do cross-functional data the right way. However, as a solution provider that partners with clients who face these challenges at enterprise scale, we at mParticle have some insight into the most effective ways of creating this transformation.

Andy Wong, a senior leader on mParticle’s Solutions Consulting team, has deep experience helping our clients develop robust data infrastructure, and partnering with enterprise organizations through processes including CDP implementation, data planning sessions, and identity strategy development. Recently, Andy shed some light on a specific structure that organizations can adopt to start reaping the benefits of cross-functional collaboration around data.

“What I’ve seen work best,” Andy says, “is when companies create a centralized function in the organization that manages data collection and data quality. Data consumers should relay their requirements to that central customer data team, and that team should be responsible for aligning the requirements, communicating the designs cross-functionally, and managing the rollouts and changes as needed.”

According to Andy, this centralized group should ideally include individuals from different data-driven teams throughout an organization, including Marketing, Product, and Data Science/BI, as well as the app developers and data engineers with knowledge of the technical requirements around implementing these data requests as code. This group should communicate closely with individual teams around their data requirements, oversee the translation of these requirements into technical specifications, and ensure that data quality standards are met.

One tool is key to this group’s ability to fulfill these needs, and that is a data plan––a centralized document that keeps different teams aligned on information related to the strategy and details of the company’s data collection efforts, including (but certainly not limited to):

What specific data points do we need to achieve our business objectives?
What naming conventions and data types should we use in our data schema?
What checks are required to validate incoming data?
How can we ensure that the data we collect complies with privacy legislation?
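
To make the questions above more concrete, here is a minimal sketch of how a data plan might be expressed and checked in code. Everything in it is an assumption made for illustration: the event names, the snake_case naming convention, the `EventSpec` structure, and the `validate_event` helper are hypothetical and do not describe mParticle's product or any particular team's plan.

```python
from dataclasses import dataclass
import re

# Hypothetical naming convention: lowercase snake_case for event names.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


@dataclass(frozen=True)
class EventSpec:
    """One planned event: its name, its attributes and their types, and why it is collected."""
    name: str
    attributes: dict  # attribute name -> expected Python type
    purpose: str      # the business objective this event serves


# Illustrative plan entries; the events and attributes are made up for this sketch.
DATA_PLAN = {
    spec.name: spec
    for spec in [
        EventSpec("add_to_cart",
                  {"product_id": str, "price": float, "quantity": int},
                  "measure cart conversion"),
        EventSpec("email_opened",
                  {"campaign_id": str},
                  "evaluate campaign messaging"),
    ]
}


def validate_event(name: str, attributes: dict) -> list:
    """Return a list of violations; an empty list means the event conforms to the plan."""
    if not SNAKE_CASE.match(name):
        return [f"event name '{name}' breaks the naming convention"]
    spec = DATA_PLAN.get(name)
    if spec is None:
        return [f"event '{name}' is not in the data plan"]
    errors = []
    # Every planned attribute must be present and have the planned type.
    for attr, expected in spec.attributes.items():
        if attr not in attributes:
            errors.append(f"missing attribute '{attr}'")
        elif not isinstance(attributes[attr], expected):
            errors.append(f"attribute '{attr}' should be {expected.__name__}")
    # Anything the plan does not mention is flagged rather than silently accepted.
    for attr in attributes:
        if attr not in spec.attributes:
            errors.append(f"unplanned attribute '{attr}'")
    return errors
```

In this sketch, each entry records a purpose alongside its schema, so the answer to "why do we collect this?" lives next to the answer to "what shape should it have?". A real plan would likely be stored in a shared, versioned document rather than in application code.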

While this single group forms the backbone of effective cross-functional data planning, it is still critical for individual teams to have an effective process for identifying what data to collect. In the next section, we’ll outline a framework that can help data end users articulate data requirements that ensure the data they receive is actionable and impactful.

Taking an outcome-oriented approach to data planning

The ultimate purpose of collecting data is to serve real world use cases, whether that is delivering personalized messaging to customers across touchpoints, reducing friction within user journeys, or identifying new market opportunities with business intelligence. Considering this, teams should look to these use cases, and the broader objectives they will help achieve, as their north star when deciding what data to collect.

To help anchor these discussions and ensure that internal planning around data requirements remains outcome-oriented, here are some questions that each team can consider:

Team	Questions
Marketing	What additional customer data could help us measure our KPIs? What engagement data might give us more insight into the effectiveness of our messaging? What data would have helped us more thoroughly evaluate the last campaign we ran? What additional information about our customer might help us incorporate more personalization in our messaging?
Product	How extensive are our current customer profiles? Where are our gaps in how we understand our customers? How well do we understand the ease/difficulty of common customer journeys? What information would fill the gaps here? How well do we understand how our customers are using specific features? What data do we need to drive greater adoption of our next feature?
Data Science/BI	What types of analysis do we perform most commonly? What types of analysis do we have a hard time performing due to data limitations? Can we anticipate a need to perform any specific type of predictive analytics in the next quarter/half/year? What data sets will we need to complete these? Are we using any third-party data sources whose quality is suspect? Could we replace these with first-party data sets?
Engineering	Are there certain types of data points that marketing/product/BI teams ask us to update or implement on a regular basis? What business objectives will this data serve? Are there ways we can make this implementation flexible to future data needs and save work at a later date?

In addition to helping teams ensure that their data aligns with practical use cases, taking an outcome-oriented approach to developing data requirements will also help teams adhere to data minimization. Outlined in the GDPR, this principle states that companies should collect only what is “adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed”––in other words, maintain as small a data footprint as possible.

Data minimization is considered best practice for guiding data collection efforts, and for good reason. First and foremost, it helps companies ensure that they are prioritizing their customers’ privacy. Additionally, while it may seem counterintuitive, being selective about what data points to collect is a benefit to data-driven teams, as it enforces the need to be very deliberate and strategic in the data planning process.
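
One way to picture data minimization in practice is a filter that only lets through attributes with a documented purpose. The sketch below is a simplified illustration under stated assumptions: the `ATTRIBUTE_PURPOSES` mapping, the attribute names, and the `minimize` helper are all hypothetical, and real compliance work involves far more than an allowlist.

```python
# Hypothetical mapping from each collected attribute to the purpose that justifies it.
ATTRIBUTE_PURPOSES = {
    "product_id": "product recommendations",
    "product_category": "audience segmentation",
    "campaign_id": "campaign evaluation",
}


def minimize(payload: dict) -> tuple:
    """Split a raw payload into (kept, dropped) by whether each attribute has a documented purpose."""
    kept = {key: value for key, value in payload.items() if key in ATTRIBUTE_PURPOSES}
    dropped = sorted(key for key in payload if key not in ATTRIBUTE_PURPOSES)
    return kept, dropped


# Made-up payload: two attributes have a purpose on record, two do not.
raw = {
    "product_id": "sku-1",
    "product_category": "shoes",
    "device_contacts": ["..."],
    "precise_location": (40.7, -74.0),
}
kept, dropped = minimize(raw)
```

Returning the dropped attribute names, rather than discarding them silently, gives the central data team something to review: either the attribute gets a purpose added to the plan, or the collection code that sends it gets removed.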

Benefits of cross-functional data planning

By establishing a central group to oversee a data plan, and providing individual teams with clear guidelines on how to identify and communicate their data needs, organizations can help ensure that the data flowing into downstream systems is as complete and actionable as possible from the outset. This approach has many potential benefits, and can possibly prevent several costly pitfalls as well. Here are some examples:

Better data quality

Teams can only make good decisions with good data. Above all, the practices outlined above will provide individual teams with access to accurate, actionable, and comprehensible data that they can use to make decisions effectively and confidently. Additionally, a higher level of data quality will help teams accelerate the time-to-value of their data, since decision making will no longer be delayed by second guessing or last minute ad hoc quality assurance. Furthermore, when data quality is consistently high, teams can realize the full potential of their activation tools, and feel confident using them across a wider variety of use cases.

Breaking down data silos

When data stakeholders throughout an organization consolidate around a data plan as a single source of truth, both end users and technical implementers can easily align on a common terminology and nomenclature used to refer to that data across the business. Establishing a broadly understood language around customer data has many benefits, like reducing the likelihood of communication errors between technical and non-technical teams, and facilitating further collaboration on ways to leverage data among end users.

Closer alignment between technical teams and end users

When marketing and product teams collaborate with the engineers who will implement the data plan, technical and non-technical stakeholders tend to develop better communication, which leads to fewer mistakes and greater efficiency. For instance, when engineers appreciate the business cases that the data plan will serve, they will be able to implement data collection more efficiently, and anticipate ways to avoid having to update tracking code as the company’s apps scale. Likewise, when marketing and product teams have insight into the considerations around implementing data collection, they can take technical feasibility into account when developing a data plan.

Lowering the burden on developers

Bringing each data stakeholder into the data planning process reduces the chances that any one team will lack mission-critical data points after the plan has been implemented. This cuts down on data reiteration cycles, and reduces demand on the engineering teams tasked with implementing the data plan. When technical teams are relieved of these cycles, they can spend more time focusing on building core products and features rather than making additions and updates to data collection code.

Maintain a culture of data-informed decision making

Taking the time to assemble cross-functional teams to collaborate on data plans is a great way to make sure that your organization maintains a culture of data-informed decision making. Encouraging teams to get together to talk about data––to identify what works, what doesn’t, where the gaps lie and what opportunities are present––reinforces the very positive habit of leveraging data to make decisions on the individual, team, and organizational levels. Cross-functional data planning isn’t just a great way to reap the practical benefits of complete, high-quality data sets––it’s a Pavlovian experiment in habit reinforcement.

Identify new use cases for data

If one phrase perfectly summarizes the benefits of bringing multiple teams together to collaborate on data planning, it’s the old adage that “The whole is greater than the sum of its parts.” If individual teams only think about their own data needs and pitch their specific use cases to the data owner in isolation, the organization misses out on the opportunity to uncover creative new use cases for data that can only emerge from breaking down functional silos.

For instance, maybe the marketing team is collecting a set of behavioral data and using this to segment customers into audiences based on the product categories they have purchased in the past. When the Product Manager in charge of in-app personalization hears about this use case and the audiences that marketing is using, that person may get the inspiration to use these same audiences to deliver product recommendations directly to users in the company’s apps and websites. This collaboration not only uncovered a new creative way to leverage customer data, but maximized the efficiency of the data plan as well, since these two teams will be able to use the same data set for these use cases.
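
The scenario above can be sketched in a few lines: one set of audiences, built once from purchase data, serves both the marketing and the product use case. The purchase records, function names, and category labels here are invented for illustration and are not drawn from any real implementation.

```python
from collections import defaultdict

# Made-up purchase events: (customer_id, product_category).
PURCHASES = [
    ("c1", "shoes"),
    ("c2", "outerwear"),
    ("c1", "outerwear"),
    ("c3", "shoes"),
]


def build_audiences(purchases):
    """Group customers into audiences keyed by the product categories they have purchased."""
    audiences = defaultdict(set)
    for customer_id, category in purchases:
        audiences[category].add(customer_id)
    return dict(audiences)


def marketing_recipients(audiences, category):
    """Marketing use case: who should receive a campaign about this category?"""
    return sorted(audiences.get(category, set()))


def recommended_categories(audiences, customer_id):
    """Product use case: which categories should in-app recommendations draw on for this user?"""
    return sorted(c for c, members in audiences.items() if customer_id in members)


# Both teams read from the same audiences, so the data is collected and modeled only once.
audiences = build_audiences(PURCHASES)
```

Here `marketing_recipients(audiences, "shoes")` and `recommended_categories(audiences, "c1")` answer different questions from the same underlying structure, which is the efficiency gain the example describes.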

When data stakeholders come together to discuss their current and future needs for customer data, organizations become more creative and efficient with the way they use this resource. As this set of integration use cases demonstrates, some of the most impactful ways to activate data leverage cross-functional tools and workflows.

Streamline your data planning process with a Customer Data Platform

Taking a cross-functional approach to developing your data plans is a good strategy regardless of how you manage the rest of your customer data workflow, from implementing collection code to transferring data to where it needs to be. Handling data planning with DIY internal tools and workflows can be daunting, however, as data plans encompassing multiple touchpoints can often include hundreds or even thousands of data points. Furthermore, because data needs are constantly evolving, data plans need to be flexible and capable of adapting to the changing needs and priorities of the stakeholders who rely on customer data.

Data Planning is one of the core customer data challenges that mParticle’s CDP addresses. Teams that place mParticle at the heart of their customer data infrastructure can leverage an ecosystem of tools for data planning and related functions. If these examples have gotten you interested in a purpose-built data planning tool, here’s a post that walks through how to build a data plan in mParticle, and a video that demonstrates visualizing and debugging customer data in real time.

Author: Sean Ryan, Technical Writer