How to plan your integrated data layer
Learn how to develop and implement a data layer instrumentation plan that aligns with your business and technical goals and provides you with a centralized source of clean customer data to power product and marketing initiatives.
Customer data is one of the most valuable assets available to companies, but making this data accessible and actionable for the teams and tools that rely on it can be a challenge. Gaining significant insight into your customers' preferences, experiences, and expectations requires a well-instrumented data layer, such as a customer data platform. Using a customer data platform to collect, transform, and orchestrate your data provides you with the scalable foundation you need for your growth and analytics stack.
This guide will take you through the process of planning and instrumenting your customer data layer from start to finish, including:
- Identifying the types of user data you need to collect
- Choosing a collection approach and organizational structure
- Mapping data to the schema of your customer data platform and integrated tools
- Integrating data from all of your sources
Following these best practices will help you develop an instrumentation strategy that aligns with your business and technical goals and provides your entire organization with a comprehensive, democratized source of clean customer data that can be used to power marketing and product initiatives.
Step 1: Understand your KPIs and use cases
Before instrumenting your data layer or analytics tools, you need to understand what you want to accomplish with your customer data and what data you need to support those goals. A robust, well-planned data management strategy can empower all facets of your business, but it's important to keep in mind that the data that is relevant for your marketing team may not be the same data that is relevant for your product or engineering team.
Defining the types of data you need to collect about your users ensures your data management strategy will map to the business and technical needs of your entire company.
Step 2: Develop your data plan
Developing your data plan is arguably the most important step in instrumenting effective measurement and analysis of your customer data; any analysis you perform will be entirely dependent on the dataset you collect. There are generally two approaches to data collection, top-down and bottom-up, that can be used in conjunction to flesh out your strategy.
Approaches to data collection
Top-down
The top-down approach focuses on using high-level business goals, like KPIs and use cases, to determine what user data you collect. At this stage, you’ll need to dig deeper into your use cases to determine the specific data points needed to properly measure their performance and metrics.
Bottom-up
The bottom-up approach uses granular data to drive your strategy and should be used in parallel with the top-down approach. Bottom-up data management strategy planning requires fully exploring your digital properties and databases to understand all available data about the user and any actions that are possible for the user to perform. This low-level, granular view will allow you to uncover areas of interest that you may not have considered when thinking about your data from the “top-down” perspective. From there, you can determine which granular data points are valuable for collection and analysis.
Both approaches require that you survey the full suite of digital properties because some types of customer data may only be available from an app or website, while other data may be stored server-side in your company’s databases, for example.
Types of data to collect
There are typically two main categories of data that you can collect on your users through your digital properties: user data and event data. User data refers to data about your customers while event data refers to data on customers’ actions; both of these types of data are important when creating a well-rounded data management strategy.
User data describes identities and attributes of your users that should be maintained and persisted in their user profile. For most companies, the most basic user attributes to collect will include demographic data on the user’s age, gender, and location, but can be expanded to include more specific attributes like membership level, lifetime value to your company, or opt-in status for marketing communications. These attributes allow you to analyze your user base to gain insights and build segments to perform specific workflows.
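To make this concrete, here is a minimal sketch of a user profile record combining a persistent identifier with the kinds of attributes described above. The field names (`membership_level`, `marketing_opt_in`, and so on) are illustrative, not drawn from any specific platform's schema.

```python
def build_user_profile(user_id, **attributes):
    """Combine a persistent user ID with a dict of profile attributes."""
    return {"user_id": user_id, "attributes": dict(attributes)}

# Example: demographic attributes plus business-specific ones
profile = build_user_profile(
    "user-123",
    age=34,
    location="Berlin",
    membership_level="gold",
    marketing_opt_in=True,
)
```

Because attributes live in a single dict keyed by the persistent user ID, new attributes can be added over time without changing the profile's shape.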
If your business operates in the EU, you may also need to capture and manage your users’ consent status to adhere to the General Data Protection Regulation (GDPR) standards. Consent parameters for a user function very similarly to user attributes, in that they describe a persistent property of that user, based on their consent response. Robust customer data platforms, like mParticle, can capture and maintain consent status for individual users, allowing companies to action and filter based on a user’s consent status. Consent collection and management is something that requires thorough planning and consideration from your organization with input from legal and privacy teams to ensure compliance.
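Since consent functions like a persistent per-user property, it can be modeled alongside other attributes. The sketch below shows one hypothetical way to store per-purpose consent states with a timestamp; real platforms define their own consent schemas, so treat the field names here as assumptions.

```python
from datetime import datetime, timezone

def record_consent(profile, purpose, consented):
    """Store a per-purpose consent state, with a timestamp, on a user profile.
    Field names are illustrative; consult your platform's consent schema."""
    consents = profile.setdefault("consents", {})
    consents[purpose] = {
        "consented": consented,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    return profile

user = {"user_id": "user-123"}
record_consent(user, "marketing_email", True)
record_consent(user, "behavioral_ads", False)
```

Keeping each purpose as its own entry lets downstream tools filter users by a specific consent state rather than a single all-or-nothing flag.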
Event data describes behaviors performed by your users, typically within your digital properties. These events can describe explicit actions such as navigating through an app, watching a video, performing a search, or completing a purchase. They can also describe passive actions, such as loading a particular screen or completing a call to your servers from the client.
There are many ways to track event data, so it’s helpful to pick a methodology and stick to it through your instrumentation. For example, when tracking a user’s navigation through your app, you could log events for each individual tap/swipe that takes them to a new page, or you could log an event when a new page loads. You may even want to track both, but the important thing is to be consistent.
You’ll also need to decide how granular your event tracking should be. For example, if you have a sign-in page on your app or website, you likely want to track sign-in events as a key data point, but you could increase the granularity of event data to include sign-in attempts and failures prior to a successful sign-in. This type of granular tracking is particularly useful for funnel analysis and A/B testing when you want to understand precise user behaviors.
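The sign-in example above can be sketched as a small event log that a funnel analysis could consume. The event names ("Sign In Attempt", "Sign In Success", etc.) are hypothetical choices for illustration.

```python
def log_event(events, name, **attrs):
    """Append an event with optional custom attributes to an event log."""
    events.append({"event_name": name, "attributes": attrs})

events = []
log_event(events, "Sign In Attempt", method="email")
log_event(events, "Sign In Failure", method="email", reason="bad_password")
log_event(events, "Sign In Attempt", method="email")
log_event(events, "Sign In Success", method="email")

# Granular tracking makes funnel metrics trivial to compute:
attempts = sum(1 for e in events if e["event_name"] == "Sign In Attempt")
successes = sum(1 for e in events if e["event_name"] == "Sign In Success")
```

With only a single "Sign In" event you could count completions, but not the failed attempts that preceded them; the extra granularity is what makes the attempt-to-success conversion rate measurable.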
Taxonomy of events
Another important consideration when creating your data plan is the taxonomy that will be used for your events. There are an infinite number of ways to define the nomenclature of your data, but there are a few best practices that will help you keep your data scalable and usable.
Understand how the data will be used
Create a data model that your team will be able to use and understand. Keeping in mind who will be using the data after it’s captured is critical because their understanding of the data model will directly impact how well they are able to analyze the data. Ideally, every team member that will come into contact with your customer data should understand what each data point represents so that when the time comes to analyze or report on that data, they know how to find the right insights.
Naturally, most users will want to create event and attribute names that make sense in plain language, rather than coded names that will require translation. To improve legibility and organization, we recommend adding prefixes to event names so they can be easily grouped by module.
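As a quick illustration of the prefix convention, here is a sketch that groups event names by their module prefix. The event names and the underscore separator are assumptions for the example, not a required convention.

```python
# Hypothetical event names following a "<module>_<action>" convention
EVENT_NAMES = [
    "checkout_start_payment",
    "checkout_confirm_order",
    "search_submit_query",
    "search_filter_results",
]

def group_by_prefix(names, sep="_"):
    """Group event names by the module prefix before the first separator."""
    groups = {}
    for name in names:
        prefix = name.split(sep, 1)[0]
        groups.setdefault(prefix, []).append(name)
    return groups

groups = group_by_prefix(EVENT_NAMES)
```

Anyone browsing the taxonomy can then scan one module's events at a time instead of an undifferentiated alphabetical list.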
Keep event names abstract
Maintaining a user-friendly taxonomy means you need to build in scalability. Keep your data model scalable by keeping your event names abstract and adding more detailed information in the custom attributes of the event. As you add new content to your digital properties, the new values associated with that content will expand the set of metadata that you’ve captured, rather than inflate the set of unique event names.
One example of this type of abstraction can be applied to product views on an ecommerce site. Ideally, the event name could be abstracted to something like “View Product”, while the details of the product itself would be captured in custom attributes such as “Product ID,” “Category,” and “Product Description.” This taxonomy ensures that your data is easy to find and read.
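The "View Product" abstraction can be sketched as follows; the attribute keys mirror the ones named above, and the function itself is purely illustrative.

```python
def view_product_event(product_id, category, description):
    """Build an abstract product-view event.

    The event name stays constant; product details live in custom
    attributes, so adding new products never adds new event names."""
    return {
        "event_name": "View Product",
        "custom_attributes": {
            "Product ID": product_id,
            "Category": category,
            "Product Description": description,
        },
    }

event = view_product_event("sku-981", "Footwear", "Trail running shoe")
```

Whether the catalog has ten products or ten thousand, analysts still query a single "View Product" event and segment by attribute.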
Step 3: Understand how your data will map in your integrated tools
Now that you have developed your data plan, you’ll need to understand how that data will map to the schema of your customer data platform and integrated tools. This means familiarizing yourself not only with how events are tracked within the platform but also how your customers’ identities are managed.
Customer identity resolution
To get the most accurate and useful insights from your data, you need to understand how your customers’ identities are resolved and managed within your data analytics tool or CDP. Different tools will manage identity differently, so it’s important to understand your specific tool’s nuances.
When considering your tool’s identity resolution capabilities, you should ask yourself:
- What persistent identity types does the tool support? (custom IDs, email, social logins, etc.)
- What are the minimum user identifiers required for data to be accepted?
- How does the tool handle multiple users on the same device?
- How does the tool resolve the identity of a single user across multiple devices?
- How does the tool associate anonymous actions with actions from subsequent logged-in users?
- How does the tool maintain user attributes/properties against the customer’s profile?
Understanding the answers to these questions will guide how you send data and report analysis from your integrated tools.
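To illustrate one of the questions above, here is a toy sketch of associating anonymous, device-level events with a known customer after login. Real CDPs use far more sophisticated matching; this only demonstrates the device-ID-to-customer-ID aliasing concept, and all names are hypothetical.

```python
class IdentityResolver:
    """Toy resolver mapping device IDs to known customer IDs."""

    def __init__(self):
        self.aliases = {}  # device_id -> customer_id

    def login(self, device_id, customer_id):
        # After sign-in, alias this device to the known customer
        self.aliases[device_id] = customer_id

    def resolve(self, event):
        # Known devices resolve to the customer; others stay anonymous
        device_id = event["device_id"]
        return self.aliases.get(device_id, f"anon:{device_id}")

resolver = IdentityResolver()
event = {"device_id": "dev-1", "event_name": "View Product"}
before_login = resolver.resolve(event)
resolver.login("dev-1", "cust-42")
after_login = resolver.resolve(event)
```

Note that production identity resolution must also answer the harder questions above, such as multiple users sharing one device, which this sketch deliberately ignores.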
At an event level, each tool will also have its own unique event architecture that may differ from other tools. Some tools have dedicated event types for events like ecommerce actions, media interactions, or navigation with unique parameters and attributes built for that specific type of event data. Before implementing any code in your apps, determine which events in your data plan should be mapped to a specific event type.
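This mapping exercise can be captured as a simple lookup from your data plan's event names to a tool's event types. The type names here ("commerce", "media", "navigation") are illustrative stand-ins, not any vendor's actual schema.

```python
# Hypothetical mapping from data-plan event names to a tool's
# dedicated event types; unmapped events fall back to a generic type.
EVENT_TYPE_MAP = {
    "View Product": "commerce",
    "Complete Purchase": "commerce",
    "Play Video": "media",
    "Screen View": "navigation",
}

def classify_event(name):
    """Return the tool-specific event type for a data-plan event name."""
    return EVENT_TYPE_MAP.get(name, "custom")
```

Maintaining this mapping in your data plan, before any code is written, keeps developers from guessing which event type each tracked action should use.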
Step 4: Instrument data collection across your data sources
Once you’ve developed a comprehensive data plan and mapped it to the architecture of your customer data platform or analytics tool, it’s time to instrument the data collection across your digital properties. This will involve familiarizing your development team with the data collection SDKs or server APIs of the chosen tool.
Implement across relevant digital properties
If you choose to collect data directly from your apps and sites, as many companies do, you’ll need to add the data collection SDK to those properties. You may be directly implementing an analytics SDK, or you may be piping your data through a customer data platform, like mParticle. Either way, you’ll want to thoroughly review the documentation for the SDK to understand how it can be configured and how the methods should be properly called.
There are some areas of SDK implementation that can be more complex than others, so we’ve listed a few below that may require additional attention and testing.
- SDK initialization and configuration
- User sign-in and sign-out
- Ecommerce actions
- User attribute updates
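The first two areas above, initialization and sign-in/sign-out, can be sketched with a schematic stand-in for an analytics client. The method names (`identify`, `reset`, `track`) are hypothetical and not any specific vendor's API; consult your SDK's documentation for the real calls.

```python
class AnalyticsClient:
    """Schematic stand-in for an analytics SDK client."""

    def __init__(self, api_key):
        # Initialization: configure the client before any tracking calls
        self.api_key = api_key
        self.current_user = None
        self.queue = []

    def identify(self, user_id):
        # User sign-in: attach subsequent events to a known user
        self.current_user = user_id

    def reset(self):
        # User sign-out: clear identity so later events are anonymous
        self.current_user = None

    def track(self, name, **attrs):
        self.queue.append(
            {"event": name, "user": self.current_user, "attrs": attrs}
        )

client = AnalyticsClient("test-key")
client.identify("user-123")
client.track("Screen View", screen="Home")
client.reset()
client.track("Screen View", screen="Login")
```

The common bug this pattern guards against is tracking before initialization, or failing to reset identity on sign-out, which attributes one user's events to another.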
Instrument server-to-server jobs
In addition to client-side data from your digital properties, you may have customer data that sits in your company’s servers or databases. Many customer data platforms and tools will expose server APIs that will allow you to transmit event and user data directly to their system.
When orchestrating the server-to-server jobs, you should be familiar with the service’s API endpoints and how to construct requests to those endpoints. Many platforms will have multiple endpoints with specific uses, so be sure that you are using each endpoint for the correct purpose and type of data.
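As a sketch of request construction, the function below assembles a hypothetical server-to-server event upload. The endpoint URL, header names, and body schema are placeholders; every platform defines its own, so check the API reference for your tool.

```python
import json

def build_event_request(api_key, user_id, events):
    """Assemble a hypothetical server-to-server event upload request.

    The endpoint, auth scheme, and payload shape here are illustrative
    placeholders, not any real platform's API."""
    return {
        "url": "https://api.example.com/v1/events",  # placeholder endpoint
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"user_id": user_id, "events": events}),
    }

request = build_event_request(
    "secret-key",
    "user-123",
    [{"event_name": "Complete Purchase", "attributes": {"total": 42.50}}],
)
```

Centralizing request construction in one helper makes it easier to switch endpoints, for example between an events endpoint and a user-profile endpoint, without scattering URL and header logic across your jobs.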
While creating and executing on a data layer instrumentation plan can seem daunting, following the steps outlined in this guide will help you create a scalable data foundation that will benefit your customers and your business. Taking the time to consider what business and technical goals you want to accomplish, what data you need to accomplish your goals, and how you will collect and store that data from your digital properties will ensure your data layer maps to your business’ needs and enable you to gain insight into your customer base.
To learn more about how mParticle can serve as the foundation for your integrated data layer, you can explore our documentation here.