Engineering | November 02, 2018

10 Critical data infrastructure capabilities

Instead of focusing on core data management challenges, many Customer Data Platforms are focused on the application of data. Learn about the 10 critical components of modern data infrastructure.


A few years ago no one had heard of customer data platforms (CDPs), but now it seems everyone wants to be one as customer engagement has become an integral part of reaching business goals. As Forrester analyst Joe Stanhope writes in a new report (Forrester subscription required), “Brands need a modern data fabric to support high stakes customer engagement.”

The problem, according to Stanhope, is that the CDP category is now a mile wide, with many vendors only an inch deep. Rather than solve core data management challenges, as should be the category’s primary intent, the majority of companies currently billing themselves as “customer data platforms” are, instead, focused on the application of data, not the enablement of other people’s data applications. They are platforms only in a loose sense of the term.

Therein lies the confusion. If every company that ingests and stores customer-level data, and exposes profile information to other systems, is a CDP, then almost every martech vendor can call themselves one (and will). As Stanhope writes, “The lack of structure and go-to-market rigor in the CDP market today makes it difficult for marketers to understand potential benefits, identify prospective vendors, and make the business case to invest.”

Customer data infrastructure: Foundational vs app-centric

All that being said, there is a subset of the CDP market that differs substantially from conventional martech in its focus and approach, and this is the relatively small group of CDPs addressing core infrastructure challenges.

Although Stanhope calls these “data-pipes-oriented” CDPs, we prefer the term “foundational CDPs” because it contrasts easily with the many flavors of app-centric CDPs (which Stanhope in his report refers to as “measurement-oriented”, “automation-oriented”, and “orchestration-oriented” vendors).

Foundational CDPs like mParticle are not “yet another tool” in the stack. Rather, they focus on addressing upstream challenges associated with data collection and hygiene, identity resolution, pipeline management, and enrichment, improving the data that’s available to all the other tools in your stack while replacing none of them. In other words, foundational CDPs are the foundation of the house, not the windows or doors or trimming.

The buyers of these solutions are highly technical, and the applications they enable are not limited to “marketing” or “advertising”; they also include a number of product and engineering use cases.

What constitutes a foundational CDP

At a minimum, a company’s customer infrastructure should meet the following criteria:

1. Flexible, easy-to-use data ingestion and export tools

Why should you care: Every organization has a unique data infrastructure; a foundational CDP should let you choose the APIs, SDKs and other tools that best fit your setup. It should be able to standardize data across these systems while accounting for nuanced differences.

What to beware: Look out for CDPs with oversimplified data transfer tools; file uploads and exports are easy to implement, but don’t give you the benefits of real-time data processing.

2. Ingest and process data in real time

Why should you care: Personalized, timely customer communications require data to be available downstream for activation within seconds.

What to beware: Watch out for vendors who oversimplify data ingestion with tools like CSV file uploads; their solutions probably do not allow for truly real-time data processing.

3. Ability to handle huge, fluctuating data volumes

Why should you care: Sudden spikes in data volume from special events and promotions can easily overwhelm an infrastructure that is not prepared to handle them. For a foundational CDP to support you at high-priority marketing times, it must be built to scale quickly.

What to beware: Any vendor who recommends an on-premises setup, or a private cloud.

4. Data transformations for specific inputs and outputs

Why should you care: Data hygiene is a problem for even the most technically sophisticated organizations. The ability to fix inconsistent tagging, standardize data across ingestion points, or customize naming and formats for a specific downstream system is a core requirement for enterprise data usability.

What to beware: One-off manual transformations, or bolted-on MDMs.
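To make this concrete, here is a minimal sketch of what rule-based transformations might look like. It is an illustration under stated assumptions, not mParticle's implementation: the mapping tables, the `analytics_tool` destination, and both function names are hypothetical.

```python
# Hypothetical lookup tables; names and values are illustrative only.
CANONICAL_EVENT_NAMES = {
    "add_to_cart": "add_to_cart",
    "addtocart": "add_to_cart",
    "add to cart": "add_to_cart",
}

# Assumed per-destination naming requirements.
DESTINATION_EVENT_NAMES = {
    "analytics_tool": {"add_to_cart": "AddToCart"},
}


def standardize(event: dict) -> dict:
    """Map inconsistently tagged event names onto one canonical name."""
    key = event["name"].strip().lower().replace("-", "_")
    return {**event, "name": CANONICAL_EVENT_NAMES.get(key, key)}


def for_destination(event: dict, destination: str) -> dict:
    """Rename a canonical event to match one downstream system's format."""
    overrides = DESTINATION_EVENT_NAMES.get(destination, {})
    return {**event, "name": overrides.get(event["name"], event["name"])}


web_event = standardize({"name": "Add-To-Cart", "source": "web"})
ios_event = standardize({"name": " AddToCart ", "source": "ios"})
outbound = for_destination(web_event, "analytics_tool")
```

Because the rules live in shared tables and not in one-off scripts, the same cleanup applies to every ingestion point, and each destination's naming is changed in a single place.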

5. Sophisticated, customizable identity resolution

Why should you care: A foundational CDP should be able to stitch the user journey together across channels, accounting for variances in the types of identities available from each source. The solution should have built-in flexibility and easy-to-use tools to address your consumer privacy and regulatory needs.

What to beware: Look out for vendors that use probabilistic matching and claim it gives them broader reach. Also look out for vendors whose product can’t be customized to the identities you have available and the rules you have for tracking anonymous vs. known users.
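As a rough illustration of deterministic (as opposed to probabilistic) matching, the sketch below resolves incoming identities against known profiles in a configurable priority order. The `IdentityResolver` class, the identity types, and the priority list are assumptions made for this example; a production system would also need to handle conflicts, merges, and privacy rules.

```python
import itertools

# Assumed priority order, strongest identifier first. Customize per business.
IDENTITY_PRIORITY = ["customer_id", "email", "device_id"]


class IdentityResolver:
    """Hypothetical deterministic resolver: exact matches only, no guessing."""

    def __init__(self):
        self._index = {}  # (identity_type, value) -> profile_id
        self._next_id = itertools.count(1)

    def resolve(self, identities: dict) -> int:
        profile_id = None
        # Look for an exact match, trying the strongest identifier first.
        for id_type in IDENTITY_PRIORITY:
            value = identities.get(id_type)
            if value is not None and (id_type, value) in self._index:
                profile_id = self._index[(id_type, value)]
                break
        if profile_id is None:
            profile_id = next(self._next_id)  # no match: new profile
        # Link any identities not yet seen, without overwriting existing links.
        for id_type, value in identities.items():
            if value is not None:
                self._index.setdefault((id_type, value), profile_id)
        return profile_id


resolver = IdentityResolver()
anonymous = resolver.resolve({"device_id": "d1"})
logged_in = resolver.resolve({"device_id": "d1", "email": "a@example.com"})
new_device = resolver.resolve({"email": "a@example.com", "device_id": "d2"})
stranger = resolver.resolve({"device_id": "d3"})
```

Here the anonymous session, the later login, and the second device all resolve to the same profile because each shares an exact identifier with an earlier event, while the unrelated device gets a profile of its own.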

6. Intelligence to handle identities for all downstream applications

Why should you care: Different personalization practices can require different identity information (anonymous vs. known, email vs. CRM identifier). A foundational CDP should be able to limit the identities it forwards to only what's needed (accounting for each partner’s PII requirements), and to pass data enriched with all relevant identities from the user profile.

What to beware: Vendors who can handle only certain known identifiers and no anonymous ones, or vice versa.
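One way to picture this is a per-partner allow-list applied to the full profile before forwarding. The partner names, identity types, and `identities_for` helper below are hypothetical and stand in for whatever configuration a real platform exposes.

```python
# Hypothetical allow-lists: which identity types each partner may receive.
PARTNER_IDENTITIES = {
    "email_service": ["email"],
    "ad_network": ["device_id"],  # assumed to accept no PII
    "crm": ["customer_id", "email"],
}


def identities_for(partner: str, profile_identities: dict) -> dict:
    """Return only the identities this partner is configured to receive."""
    allowed = PARTNER_IDENTITIES.get(partner, [])
    return {k: profile_identities[k] for k in allowed if k in profile_identities}


profile = {"customer_id": "c-42", "email": "a@example.com", "device_id": "d1"}
to_ad_network = identities_for("ad_network", profile)
to_crm = identities_for("crm", profile)
```

Defaulting to an empty allow-list means an unconfigured partner receives no identities at all, which is the safer failure mode when PII is involved.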

7. Bi-directional integrations with downstream (execution) systems

Why should you care: A foundational CDP should offer bi-directional data flows that give you performance insights back from downstream (executional) tools. That way, for example, you can flag a user who converts through one channel (say, a marketing email) not to be targeted via another channel (say, Facebook ads); or, you can flag a user who never responds to one channel (e.g. never opens emails) to be sent messages via another channel (e.g. push notifications).

What to beware: Vendors that tout “machine-learning” or have black box tools that automatically choose the best downstream services but do not offer bi-directional integrations so that you can access all the data, and build and test your own models against theirs.
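The two examples above (suppressing ads after a conversion, and switching from email to push) can be sketched as a small feedback loop. This is a simplified illustration: the `ChannelFeedback` class, the outcome labels, and the threshold of three unopened emails are all assumptions, not a description of any vendor's product.

```python
class ChannelFeedback:
    """Hypothetical store for outcomes reported back by downstream tools."""

    def __init__(self, unopened_threshold: int = 3):
        self.unopened_threshold = unopened_threshold
        self.converted = set()
        self.unopened_emails = {}  # user_id -> consecutive unopened emails

    def record(self, user_id: str, channel: str, outcome: str) -> None:
        if outcome == "converted":
            self.converted.add(user_id)
        elif channel == "email" and outcome == "sent":
            self.unopened_emails[user_id] = self.unopened_emails.get(user_id, 0) + 1
        elif channel == "email" and outcome == "opened":
            self.unopened_emails[user_id] = 0

    def channels_for(self, user_id: str) -> list:
        channels = ["email", "facebook_ads"]
        if user_id in self.converted:
            channels.remove("facebook_ads")  # already converted: stop paying for ads
        if self.unopened_emails.get(user_id, 0) >= self.unopened_threshold:
            channels.remove("email")  # unresponsive to email: try push
            channels.append("push")
        return channels


feedback = ChannelFeedback()
feedback.record("u1", "email", "converted")
for _ in range(3):
    feedback.record("u2", "email", "sent")
```

None of this is possible with one-way integrations, because the conversion and open events originate in the downstream tools and have to flow back before they can influence targeting elsewhere.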

8. A standardized system to build new integrations

Why should you care: As the martech and adtech ecosystems expand and you build out your stack, you'll want to know that your CDP can continue to support your data needs, no matter what the future may hold.

What to beware: Vendors that don’t have standardized APIs to build into.

9. Fine-grained data controls that can be updated in real time

Why should you care: Whether for privacy and regulatory reasons, data volume optimization or experimentation, you should have a range of methods available to start and stop forwarding. You should be able to control forwarding by event and by user (based on their profile attributes).

What to beware: Vendors that can control data only at the event level, without the flexibility to exclude users with certain attributes. Also, beware of vendors who say they can selectively forward data for GDPR but can’t customize the setup for different purposes, e.g. marketing vs. analytics.
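A minimal sketch of such controls, assuming a hypothetical `ForwardingRule` with an event block-list, user-attribute exclusions, and a declared purpose that is checked against each user's consent. The field names and the `consented_purposes` key are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ForwardingRule:
    """Hypothetical rule describing what one destination may receive."""
    destination: str
    purpose: str  # e.g. "marketing" or "analytics"
    blocked_events: set = field(default_factory=set)
    excluded_attributes: dict = field(default_factory=dict)  # attribute -> excluding value


def should_forward(rule: ForwardingRule, event_name: str, user: dict) -> bool:
    # Event-level control: drop events this destination should never see.
    if event_name in rule.blocked_events:
        return False
    # User-level control: drop users whose profile matches an exclusion.
    attributes = user.get("attributes", {})
    if any(attributes.get(k) == v for k, v in rule.excluded_attributes.items()):
        return False
    # Purpose-level control: forward only if the user consented to this purpose.
    return rule.purpose in user.get("consented_purposes", set())


ads_rule = ForwardingRule(
    destination="ad_network",
    purpose="marketing",
    blocked_events={"password_reset"},
    excluded_attributes={"is_employee": True},
)
analytics_only_user = {"attributes": {}, "consented_purposes": {"analytics"}}
full_consent_user = {"attributes": {}, "consented_purposes": {"analytics", "marketing"}}
employee = {"attributes": {"is_employee": True}, "consented_purposes": {"marketing"}}
```

Since each rule is plain data evaluated per event, updating a rule changes forwarding behavior for the next event without redeploying any tracking code.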

10. Long-term, granular data storage

Why should you care: You should be able to build out your database and choose the best vendors for each service with confidence that previously collected data will still be available for activation no matter what.

What to beware: Vendors that can’t replay your historical data, and who don’t consider future-proofing your data architecture a primary goal.

Let use cases lead the way

A foundational CDP is no replacement for a data management strategy or a data warehouse. However, once you have a strategy and baseline architecture, a foundational CDP can add significant agility and resilience to your infrastructure, as well as reduce its operating costs (particularly when it comes to building and maintaining integrations) and improve governance.

Use the principles above to check whether your existing infrastructure meets the demands of modern business. If not, ensure that whatever CDP you choose offers these capabilities.

To learn more about how mParticle can support your data infrastructure, you can access our documentation here.

Thanks to Serena Xu and Justin McManus for their assistance with this blog post.
