How to use a Customer Data Platform with your data warehouse
Data warehouses enable critical insights, and speed of data collection and stability of warehousing are important to their performance. Learn how you can use a CDP with your data warehouse to improve functionality and take action on business intelligence.
The Customer Data Platform (CDP) space has grown significantly in the last few years, with many brands beginning to implement CDPs as the foundational infrastructure of their growth stack.
When initially learning about CDPs, however, some may find themselves asking, “What’s the big deal, doesn’t our data warehouse already do that?!”
The confusion lies in the fact that both systems ingest data from multiple sources and allow stakeholders across various teams to access that data. A closer look, however, reveals that data warehouses and CDPs are fundamentally different tools and that they are not mutually exclusive. In fact, they can be used together to unlock numerous use cases.
What is a data warehouse?
As defined by Amazon Web Services (AWS), a data warehouse is a central repository of information that can be analyzed to make more informed decisions. Data warehouses collect processed data from transactional systems, relational databases, and other sources on a regular cadence (often not in real time) and organize it into databases.
Marketers, product managers, and data scientists use applications such as business intelligence (BI) tools and SQL clients to access and analyze data within the data warehouse. The value of data warehouses is in their ability to collect, organize, and store large amounts of data in a way that is easily accessible to these applications’ reports, dashboards, and analytics queries.
Benefits of using a data warehouse include:
- Better access to data for informed decision making
- Consolidated data from many sources
- Historical data analysis
- Data quality, consistency, and accuracy
- Separation of analytics processing from other “upstream” systems, such as transactional systems, which improves the performance of all systems
Examples of leading data warehouses are Amazon Redshift, Snowflake, and Google BigQuery.
What does a typical data warehouse architecture look like?
Data warehouse architecture is often broken into three tiers. The top, most accessible tier is the front-end client that presents results from BI tools and SQL clients to users across the business. The middle tier is the Online Analytical Processing (OLAP) server that is used to access and analyze data. The bottom tier is the database server where data is loaded and stored. Data in the bottom tier is kept in either hot storage (such as SSDs) or cold storage (such as Amazon S3), depending on how frequently it needs to be accessed.
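To make the three tiers concrete, here is a minimal sketch in Python. It assumes an in-memory SQLite database as a stand-in for the warehouse's database server, and the table, column, and function names are hypothetical. A production warehouse such as Redshift, Snowflake, or BigQuery would look quite different, but the division of responsibilities is similar.

```python
import sqlite3

# Bottom tier: the database server where data is loaded and stored.
# An in-memory SQLite database stands in for a real warehouse (assumption).
def build_bottom_tier():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (region TEXT, order_month TEXT, revenue REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            ("EMEA", "2024-01", 120.0),
            ("EMEA", "2024-02", 80.0),
            ("APAC", "2024-01", 200.0),
        ],
    )
    return conn

# Middle tier: analytical processing that aggregates the stored data.
def revenue_by_region(conn):
    rows = conn.execute(
        "SELECT region, SUM(revenue) FROM orders GROUP BY region"
    ).fetchall()
    return dict(rows)

# Top tier: the front-end client that presents results to business users.
def render_report(totals):
    return "\n".join(
        f"{region}: {total:.2f}" for region, total in sorted(totals.items())
    )

if __name__ == "__main__":
    print(render_report(revenue_by_region(build_bottom_tier())))
```

Keeping the tiers in separate functions mirrors the point of the architecture: the storage layer can change without the reporting layer needing to know.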
What is a Customer Data Platform?
A Customer Data Platform (CDP) is a centralized data infrastructure that collects a company’s customer data from across sources, validates it against an established data plan, ties it to persistent customer profiles, and connects that data with the tools and systems used to drive growth.
CDPs support developers, product managers, and marketers by making it much easier to collect customer data in real time, improve the quality of that data, and get that data to external tools and systems where it can be used for customer engagement, analytics, and more. With a CDP in place, developers can spend less time working on vendor implementations and managing third-party code, and product managers and marketers can access the real-time data they need, where they need it.
The benefits of a Customer Data Platform include:
- Increased access to real-time customer data for non-technical stakeholders
- Improved customer data quality throughout the tools and systems that are being used to drive growth
- Simplified data governance processes and increased data security
- Faster data activation for better data-driven personalization across channels
- Fewer engineering hours spent working on vendor implementations and managing third-party code
What does a typical CDP architecture look like?
CDPs collect first-party, individual-level customer data from across your business's digital touchpoints and servers (mobile app, website, OTT, S2S data feeds, and more) via API connections and/or SDK implementations. This data is then processed and standardized (transformation, enrichment, validation) to make it easy to integrate with external tools and systems. As data is collected, a real-time view of incoming data is available within the UI so that users across your organization can monitor activity. Customer data is then stored for the long term in different data repositories depending on the type of data and the intended purpose. Functions such as profile lookups, data quality management, audience segmentation, and data connection are available within the CDP’s UI, enabling users to activate customer data.
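The flow described above (collect, validate against a data plan, tie to a profile, forward to connected tools) can be sketched in a few lines of Python. This is an illustrative toy, not how mParticle or any other CDP is implemented. The `MiniCDP` class, the `DATA_PLAN` structure, and the event fields are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field

# Hypothetical data plan: allowed event names and their required attributes.
DATA_PLAN = {
    "purchase": {"order_id", "amount"},
    "page_view": {"url"},
}

@dataclass
class Profile:
    customer_id: str
    events: list = field(default_factory=list)

class MiniCDP:
    def __init__(self, data_plan):
        self.data_plan = data_plan
        self.profiles = {}      # customer_id -> Profile
        self.destinations = []  # callables that receive clean events
        self.rejected = []      # events that failed validation

    def connect(self, destination):
        self.destinations.append(destination)

    def ingest(self, event):
        # Standardize the event name before validating it.
        name = str(event.get("name", "")).strip().lower()
        attrs = event.get("attributes") or {}
        required = self.data_plan.get(name)
        # Reject events that are unplanned, incomplete, or anonymous.
        if (
            "customer_id" not in event
            or required is None
            or not required.issubset(attrs)
        ):
            self.rejected.append(event)
            return False
        clean = {
            "customer_id": event["customer_id"],
            "name": name,
            "attributes": dict(attrs),
        }
        # Tie the event to a persistent customer profile.
        profile = self.profiles.setdefault(
            clean["customer_id"], Profile(clean["customer_id"])
        )
        profile.events.append(clean)
        # Forward the validated event to every connected destination.
        for destination in self.destinations:
            destination(clean)
        return True
```

In this sketch a destination is just a callable, so `cdp.connect(some_list.append)` is enough to observe what would be forwarded. Events that fail validation are kept in `rejected` rather than silently dropped, which is one simple way to make data quality problems visible.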
How can a CDP be used with your data warehouse?
CDPs and data warehouses are not mutually exclusive. While data warehouses provide a system for long-term data storage and analysis, CDPs provide an infrastructure for real-time data connectivity. One valuable use case is exporting clean, consistent customer data from your CDP to your data warehouse, where it can then be queried directly for historical analysis. This provides you with automated data export, advanced filtering and compliance, and data replays for faster and more stable data warehousing. For example, mParticle allows you to forward incoming customer data and load historical data to data warehouses such as Snowflake, Amazon Redshift, and Google BigQuery via packaged integrations.
Here is a data architecture diagram that connects a CDP and a data warehouse and details the use cases supported within each system.
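As a rough illustration of the export pattern, the Python sketch below loads filtered events into a warehouse table and then queries them for historical analysis. It assumes an in-memory SQLite database in place of Snowflake, Redshift, or BigQuery, and it does not use any mParticle API. The table schema and function names are hypothetical. Replays are made safe here by keying rows on an event ID, which is one possible design rather than a description of how any packaged integration works.

```python
import sqlite3

def create_warehouse():
    # In-memory SQLite stands in for a real data warehouse (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "event_id TEXT PRIMARY KEY, customer_id TEXT, "
        "name TEXT, event_date TEXT, amount REAL)"
    )
    return conn

def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

def export_batch(conn, events, allowed_names):
    """Load events into the warehouse and return how many rows were added."""
    rows = [
        (e["event_id"], e["customer_id"], e["name"],
         e["event_date"], e.get("amount", 0.0))
        for e in events
        if e["name"] in allowed_names  # filter before data leaves the CDP
    ]
    before = count_rows(conn)
    # INSERT OR IGNORE skips event IDs that already exist, so replaying
    # the same batch does not create duplicates.
    conn.executemany("INSERT OR IGNORE INTO events VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return count_rows(conn) - before

def monthly_revenue(conn):
    # Historical analysis: purchase revenue grouped by calendar month.
    return conn.execute(
        "SELECT substr(event_date, 1, 7) AS month, SUM(amount) "
        "FROM events WHERE name = 'purchase' "
        "GROUP BY month ORDER BY month"
    ).fetchall()
```

Calling `export_batch` twice with the same events adds rows only the first time, which is the property that makes a replay after a failed load safe in this sketch.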
Additionally, mParticle’s Kafka integration allows you to stream customer data to Kafka, giving downstream systems and applications event data forwarding, advanced filtering and compliance, distributed event notifications, and event sourcing. mParticle can also subscribe to real-time event data with the Kafka Feed. Once events are collected into mParticle from Kafka, they can be used to support marketing and product initiatives.
If you’re using a BI tool to access and analyze the data within your data warehouse, many CDPs will allow you to export query data from your BI tools into your CDP through cloud feed integrations. Once this data has been ingested into your CDP, it can be used to support marketing and product initiatives. For example, mParticle’s Looker Feed integration allows you to send results from Looker to mParticle where they’re stored as user attributes and can be used to power audience segmentation, calculated attributes, data filtering and more.
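The idea of turning query results into user attributes can be sketched as follows. This Python example is a simplified illustration and does not reflect the Looker Feed's actual payload format or mParticle's API. The row shape, the `customer_id` key, and both function names are assumptions made for the example.

```python
def apply_query_results(profiles, rows, key="customer_id"):
    """Merge each query result row into the matching profile's attributes."""
    for row in rows:
        attributes = profiles.setdefault(row[key], {})
        # Every column except the key becomes a user attribute.
        attributes.update({k: v for k, v in row.items() if k != key})
    return profiles

def build_audience(profiles, predicate):
    """Return the sorted IDs of customers whose attributes match."""
    return sorted(
        customer_id
        for customer_id, attributes in profiles.items()
        if predicate(attributes)
    )

if __name__ == "__main__":
    profiles = {"c1": {"plan": "free"}}
    # Rows as they might come back from a BI query (hypothetical shape).
    rows = [
        {"customer_id": "c1", "lifetime_value": 540.0},
        {"customer_id": "c2", "lifetime_value": 40.0},
    ]
    apply_query_results(profiles, rows)
    high_value = build_audience(
        profiles, lambda a: a.get("lifetime_value", 0) >= 500
    )
    print(high_value)
```

Because the results are merged into existing profiles rather than replacing them, attributes already collected in real time (such as `plan` here) sit alongside the warehouse-derived values and can be combined in a single audience rule.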
To learn more about how mParticle makes it easier to connect your customer data to the tools and systems you're using to drive growth, including your data warehouse, explore our documentation.