Data strategy | September 28, 2021

Improve data quality with mParticle’s data planning infrastructure

Data planning is the foundation of any mature data strategy. But without the ability to translate a data plan into views conducive to each team's needs, it's hard to turn planning into action. Learn how mParticle's data planning infrastructure can help you get more out of your data plan.

We’re living in a data-driven economy. Today, the Cannes Lions Grand Prix are awarded to the brands, like mParticle customer Burger King, that design the most innovative data-driven experiences, and the governance of customer data is a primary focus of international regulators and Fortune 100 companies alike.

Although it has never been easier for companies to collect customer data, there is a big gap between the few organizations that are able to use their data to drive results and the many that aren’t. One key trait that separates the winning group is the prioritization of data quality. 

Customer data that is inaccurate, inconsistent across tools, or incomplete can lead to wasted advertising budget, mis-targeted customer experiences, and hours of tedious debugging for developers, all of which restrict growth. No matter how much data you have, it’s useless if your teams don’t trust it.

Having a process in place to protect data quality, on the other hand, enables you to build rich customer profiles, seamlessly integrate data across tools, and power real-time customer experiences. The foundation of any robust data quality system is data planning. 

In this article, we’ll discuss how planning helps you improve data quality and explain how to create a plan in mParticle. We’ll then walk through two examples of how you can translate your data plan into views conducive to different stakeholders.

What is data planning?

Data planning is a cross-departmental exercise in which key stakeholders come together to establish the data that will be collected, what it will look like, where it will live, and how it will be used. Data planning helps guide developers’ and marketers’ data-related tasks while also promoting cross-org collaboration and data minimization. Your data planning exercise can be completed in a good ol’ fashioned spreadsheet or using a purpose-built data quality solution within your Customer Data Platform.

But if data planning is so simple, why don’t all teams just get it over with and enjoy great data quality for life?

Once a data plan is created, it’s often cumbersome for different teams to apply it because each team needs the plan in a distinct format. Developers need to be able to get the plan into their IDE so that it can guide them through event implementation. Marketers need visibility into how events and attributes are organized so that they can build audience segments and design campaigns. Furthermore, every company is different, and the process that works for a marketing team at one company may not work for the marketing team at another.

mParticle’s data planning infrastructure enables teams across your company to turn your data plan into action by making it easy to download plans as JSON and translate them into any view needed. This allows each team to access your universal data plan in the view that suits them. Let’s walk through a couple of examples.

Simplify data plan implementation for developers

Once a data plan has been finalized, it’s up to developers to instrument the events listed in the plan. It’s important for this process to be efficient and error-free, as developer resources are precious and errors at the implementation stage can lead to data quality woes throughout the data pipeline.

When data plans are stuck in a spreadsheet or a closed-off SaaS tool, developers are forced to jump between their IDE and another location to copy and paste event names – a tedious process.

Here’s how developers can use mParticle’s data planning infrastructure to simplify data plan implementation.

Step 1: Create a Data Plan in mParticle

mParticle Data Plans allow you to define multiple data plans within the mParticle UI. Each Data Plan contains self-identifying information like an ID and version number, as well as information describing each of the plan’s data points including event names, event attributes, and the expected data types. Data Plans serve as an interface for marketers, product managers, developers, and analysts to collaborate on defining the customer data that is important to the business.

Step 2: Download your Data Plan

Once you’ve created a Data Plan in your mParticle workspace, you can download it as a JSON file directly from the plan version editor. The resulting JSON can be updated in a text editor or programmatically via your own custom scripts.
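
As an illustration of working with the downloaded file programmatically, here is a minimal Python sketch that reads a plan and lists each event with its attributes and expected types. The sample JSON is a hypothetical, trimmed-down plan, and the field layout (`version_document.data_points`, `match.criteria.event_name`, and the nested `custom_attributes` schema) is an assumption about the export format; check it against your own file before relying on it.

```python
import json

# Hypothetical, trimmed-down data plan. The field layout is an assumption
# about the export format; a real download will contain more fields.
SAMPLE_PLAN_JSON = """
{
  "data_plan_id": "example_plan",
  "version": 1,
  "version_document": {
    "data_points": [
      {
        "match": {
          "type": "custom_event",
          "criteria": {"event_name": "Add To Cart"}
        },
        "validator": {
          "definition": {"properties": {"data": {"properties": {
            "custom_attributes": {"properties": {
              "sku": {"type": "string"},
              "price": {"type": "number"}
            }}
          }}}}
        }
      },
      {
        "match": {
          "type": "custom_event",
          "criteria": {"event_name": "Search"}
        },
        "validator": {
          "definition": {"properties": {"data": {"properties": {
            "custom_attributes": {"properties": {
              "query": {"type": "string"}
            }}
          }}}}
        }
      }
    ]
  }
}
"""


def summarize_plan(plan):
    """Map each event name to a dict of {attribute: expected type}."""
    summary = {}
    for point in plan.get("version_document", {}).get("data_points", []):
        name = point.get("match", {}).get("criteria", {}).get("event_name")
        if name is None:
            continue  # skip data points that are not named events
        # Walk down to the attribute schema, tolerating missing levels.
        node = point.get("validator", {}).get("definition", {})
        for key in ("properties", "data", "properties",
                    "custom_attributes", "properties"):
            node = node.get(key, {})
        summary[name] = {
            attr: spec.get("type", "any") for attr, spec in node.items()
        }
    return summary


if __name__ == "__main__":
    # For a real file: plan = json.load(open("path/to/your_plan.json"))
    plan = json.loads(SAMPLE_PLAN_JSON)
    for event, attrs in summarize_plan(plan).items():
        print(event, attrs)
```

A summary like this is a convenient intermediate form: the same dictionary can feed a code generator, a validation check, or a report for non-technical stakeholders.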

Alternatively, you can use the mParticle Command Line Interface (CLI) to fetch it and store it in your project. More information on how to do that is available in this post.

Step 3: Instrument event collection based on your data plan

With your data plan stored in your project as a JSON object, you can begin to implement data collection. mParticle’s Data Planning Snippet SDK helps streamline data collection and prevent errors in the process. This SDK ingests individual data points from your Data Plan and translates them into executable code. It is available inside an interface in which you can paste your Data Plan and generate event collection code.

If you want to set up code completion in your IDE, mParticle developer tools, such as Smartype, are available to simplify event collection and enforce the rules of your Data Plan. Smartype automatically translates your data plan into usable libraries with a single CLI command. Once these typesafe libraries are in your project, Smartype provides autocomplete and linting features that make it virtually impossible to collect data that does not conform to your Data Plan. This post goes into detail on how to set up Smartype and use it for this purpose.
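
To make the idea of conforming to a Data Plan concrete, the sketch below checks an event against a plan at runtime. This is not Smartype and not an mParticle API; it is a plain-Python illustration of the rule being enforced, and the `PLAN` mapping, the type table, and the `find_violations` helper are all hypothetical names chosen for this example.

```python
# Hypothetical plan summary: event name -> {attribute: expected JSON type}.
PLAN = {
    "Add To Cart": {"sku": "string", "price": "number"},
    "Search": {"query": "string"},
}

# Mapping from JSON schema type names to Python types (illustrative subset).
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
}


def find_violations(event_name, attributes, plan=PLAN):
    """Return a list of human-readable problems; empty means conformant."""
    expected = plan.get(event_name)
    if expected is None:
        return [f"unplanned event: {event_name}"]

    problems = []
    for attr, value in attributes.items():
        if attr not in expected:
            problems.append(f"unplanned attribute: {attr}")
            continue
        type_name = expected[attr]
        py_type = JSON_TYPES.get(type_name)
        if py_type is None:
            continue  # unknown type name: nothing to check
        # bool is a subclass of int in Python, so reject it for non-booleans.
        wrong_type = not isinstance(value, py_type) or (
            isinstance(value, bool) and type_name != "boolean"
        )
        if wrong_type:
            problems.append(
                f"{attr}: expected {type_name}, got {type(value).__name__}"
            )
    return problems


if __name__ == "__main__":
    print(find_violations("Add To Cart", {"sku": "A1", "price": "9.99"}))
```

A runtime check like this catches errors only after the code runs. The appeal of generated, typed libraries is that the same mistakes surface earlier, in the editor or at compile time.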

Simplify audience segmentation for marketers

A second application of a data plan is as a reference document for marketers using data for growth initiatives, such as audience segmentation. When creating audiences, it’s helpful for marketers to know which attributes are available for each event in their data set, as well as what those data points mean, so that they can create compelling customer experiences and optimize spend.

As data planning documents are set up for the purpose of cross-team collaboration, they are often not designed primarily for marketers’ needs. To support marketers, mParticle certified solution partner Human37 designed a workflow that allows clients to use the mParticle data planning infrastructure to download a data plan and transform it into a matrix more conducive to audience building and other data-centric marketing initiatives.

Step 1: Create a Data Plan in mParticle

As with the previous use case, this workflow begins by creating a Data Plan in mParticle.

Step 2: Download your Data Plan

Once a Data Plan is created in mParticle, it is stored as a JSON file. You can download a plan version as a JSON file directly from the plan version editor. The resulting JSON can be updated in a text editor or programmatically via your own custom scripts.

Alternatively, you can use mParticle’s Data Planning API to download a specific data plan from mParticle programmatically. More information on how to do that is available in the docs here.

Step 3: Install dependencies and run the script

Following Human37's documentation here, change into the project folder with `cd h37-mparticle-data-plan` and install the dependencies with `pip install -r requirements.txt`.

Finally, run the script with `python main.py`. By default, this generates a CSV file named `matrix.csv`.

The matrix generated will consist of three main elements:

  1. Rows representing events, ordered alphabetically
  2. Columns representing attributes, ordered alphabetically
  3. An “X” at the intersection of rows and columns if that event-attribute combination exists in your mParticle data plan

Human37 has open sourced a basic version (v1) of the Python script on their GitHub account. This allows you to create your very own analyst matrix for your mParticle implementation. You can access the documentation here.
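
To show what such a matrix looks like in practice, here is a small, self-contained Python sketch that builds one from an event-to-attributes mapping. It is not Human37's script and makes no claim about how theirs works; the `build_matrix` and `to_csv` helpers and the sample input are hypothetical, and the layout simply follows the three elements described above (events as rows, attributes as columns, an "X" at each existing combination).

```python
import csv
import io


def build_matrix(event_attributes):
    """Build matrix rows from a mapping of event name -> attribute names."""
    events = sorted(event_attributes)
    attributes = sorted(
        {attr for attrs in event_attributes.values() for attr in attrs}
    )
    rows = [["event"] + attributes]  # header row
    for event in events:
        present = set(event_attributes[event])
        rows.append(
            [event] + ["X" if attr in present else "" for attr in attributes]
        )
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical input; in practice this would be derived from the
    # downloaded data plan JSON.
    sample = {
        "Search": ["query"],
        "Add To Cart": ["sku", "price"],
    }
    print(to_csv(build_matrix(sample)))
```

Sorting both axes keeps the output stable between runs, which makes it easier to compare matrices across plan versions.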

Improve data quality by turning your data plan into action

A data plan is the first, and critical, step towards protecting data quality across your organization. But without the ability to effectively communicate your plan to different audiences across your company, the exercise will leave you no better off than when you began.

mParticle’s data planning infrastructure enables you to turn planning into action by making it easy to download your data plans from the mParticle UI and translate them into views conducive to each team’s needs. 

To learn more about mParticle’s Data Planning offering, you can see the docs here.

To check out the mParticle platform, you can explore our platform demo here.
