Engineering

How we reduced our S3 spend by 65% with block-level compression

As our customer base and platform offerings expanded over recent years, so did the cost of storing our clients’ data. We implemented a block-level compression solution that reduced our S3 spend by 65% without impacting the customer experience or client data.

Sean Ryan – October 06, 2023

BigQuery vs. Redshift: Which cloud data warehouse is right for you?

The data warehouse is the source of truth for your business's data. Choosing the right solution is critical. This article explains how BigQuery and Redshift compare in factors such as performance, security, and cost so that you can select the right warehouse for your needs.

January 09, 2023

Kinesis vs. Kafka: Comparing performance, features, and cost

In this article, we compare two leading streaming solutions, Kinesis and Kafka. We focus on how they match up in performance, deployment time, fault tolerance, monitoring, and cost, so that you can identify the right solution for your streaming needs.

January 26, 2023

What the heck is reverse ETL?

Reverse ETL is a process in which data is delivered from a data warehouse to the business applications where non-technical teams can put it to use. By piping data from a data warehouse to downstream business systems, reverse ETL tools fill the gap between data storage and activation.

Sean Ryan – January 13, 2023

Snowflake vs. Redshift: Which Data Warehouse Is Better for You?

Learn how popular data warehouse providers Snowflake and Redshift compare in maintenance requirements, pricing, structure, and security so that you can understand which solution is right for your team.

November 29, 2022

Snowflake vs. BigQuery: What are the key differences?

Learn more about the differences between two popular data warehouse solutions, Snowflake and Google BigQuery, and understand how to identify which is right for your team.

November 29, 2022

How we improved performance and scalability by migrating to Apache Pulsar

We recently made a significant investment in the scalability and performance of our platform by adopting Apache Pulsar as the streaming engine that powers our core features. Thanks to the time and effort we spent on this project, our mission-critical services now rest on a more flexible, scalable, and reliable data pipeline.

November 17, 2022

New ways to understand in-app behavior with Apple iOS 16

With the latest updates to iOS and Xcode, Apple has introduced changes to its operating system and developer environment that give engineers and product teams creative new ways to uncover user behavior.

Sean Ryan – October 14, 2022

How does Azure work? An explanation of Microsoft’s cloud platform

Learn more about cloud platform Microsoft Azure and how it fits into your data infrastructure.

September 27, 2022

How does Snowflake work? A simple explanation of the popular data warehouse

Learn more about what Snowflake is and how it fits into your data stack.

September 27, 2022

Enhancements to mParticle’s developer tools make it easier to collect data and ensure quality at the source

mParticle makes it easy for engineers to accurately collect customer data by translating data schemas into production-ready code.

Sean Ryan – September 07, 2022

How we reduced Standard Audience calculation time by 80%

mParticle’s Audiences feature allows customers to define user segments and forward them directly to downstream tools for activation. Thanks to our engineering team’s recent project to optimize one of our audience products, mParticle customers will be able to engage high-value customers with even greater efficiency.

June 01, 2022

The engineer’s guide to working with marketers

While developers don’t readily admit it, working with marketers can sometimes be a pain. But when engineers and marketers collaborate effectively on data, amazing things can happen. We’ve assembled this guide to provide engineers with a roadmap for effectively working with their colleagues in marketing and making friends out of frenemies.

Sean Ryan – May 20, 2022

Harveer Singh leads Western Union’s digital transformation with data

Chief Data Architect Harveer Singh creates the data and tech roadmap that will help the iconic company emerge as a leader in fintech.

May 16, 2022

Developer Deep Dive: mParticle Sample Apps

Recently, a cross-functional squad of engineers, PMs and designers at mParticle assembled to produce a labor of love––sample applications. These sample apps help developers implement our SDK in Web, iOS, and Android environments and understand the value of mParticle. Here’s the nuts-and-bolts story behind what they built, the technical choices they made while building these apps, and what they learned along the way.

Sean Ryan – May 05, 2022

Implement a CDP with ease using mParticle's sample applications

Developers rarely look forward to integrating third-party systems into their projects. The learning curve to understand vendor platforms is time-consuming and diverts attention away from more interesting product initiatives. Our sample applications address this problem by helping developers understand how mParticle works on various platforms and providing production-quality, copy/paste-ready code to implement our CDP with ease.

Sean Ryan – April 13, 2022

How we cut AWS costs by 80% while usage increased 20%

How do you replace a tire while driving on the highway? This is what it felt like to re-architect the engine behind one of our most heavily used and relied upon products, the mParticle Audience Manager. Here's how we optimized this critical piece of our architecture and positioned it to play a key role in the next phase of our growth, all while customer adoption and usage steadily increased.

Yuan Ren – March 25, 2022

Data quality vital signs: Five methods for evaluating the health of your data

It’s simple: Bad data quality leads to bad business outcomes. What’s not so simple is knowing whether the data at your disposal is truly accurate and reliable. This article highlights metrics and processes you can use to quickly evaluate the health of your data, no matter where your company falls on the data maturity curve.

Sean Ryan – March 21, 2022

How to choose the right foundation for your data stack

If you’re relying on downstream activation tools to combine data events into profiles, don’t. You’ll end up with fragmented and redundant datasets across systems. Enriching each data point before it is forwarded downstream will prevent this problem, but not all customer data infrastructure solutions deliver this capability.

Sean Ryan – March 02, 2022

Clear costs: How we used data aggregation to understand our Cost of Goods Sold

Understanding how our costs are allocated at the level of individual customers and services is important to us. However, the major cloud providers do not readily provide this information, so to obtain it, our data engineering team had to get creative. This case study describes how we built a custom library that combines data housed in disparate sources to acquire the insights we needed.

Matt Phillips – February 16, 2022

Smartype Hubs: Keeping developers in sync with your Data Plan

Implementing tracking code based on an outdated version of your organization's data plan can result in time-consuming debugging, dirty data pipelines, and misguided decisions. mParticle's Smartype Hubs helps your engineering team avoid these problems by importing the latest version of your Data Plan into your codebase using GitHub Actions.

Sean Ryan – February 11, 2022

A simpler way to implement and maintain video analytics code

Video analytics are essential to maximizing the impact and value of video content. For technical teams, however, capturing this data can often be more challenging than collecting other user events. In this article, we’ll show how mParticle’s Media SDK simplifies this process for engineering teams, and provides data stakeholders with actionable user insights.

Sean Ryan – February 01, 2022

Prevent data quality issues with these six habits of highly effective data

Maintaining data quality across an organization can feel like a daunting task, especially when your data comes from a myriad of devices and sources. While there is no one magic solution, adopting these six habits will put your organization on the path to consistently reaping the benefits of high quality data.

Sean Ryan – December 15, 2021

How to implement an mParticle data plan in an eCommerce app

This sample application allows you to see mParticle data events and attributes displayed in an eCommerce UI as you perform them, and experiment with implementing an mParticle data plan yourself.

November 16, 2021

What does good data validation look like?

Data engineers should add data validation processes in various stages throughout ETL pipelines to ensure that data remains accurate and consistent throughout its lifecycle. This article outlines strategies and best practices for doing this effectively.

November 11, 2021

Should you be buying or building your data pipelines?

With demand for data increasing across the business, data engineers are inundated with requests for new data pipelines. With few cycles to spare, engineers are often forced to decide between implementing third-party solutions and building custom pipelines in-house. This article discusses when it makes sense to buy, and when it makes sense to build.

Joey Colvin – November 10, 2021

Three threats to customer data quality (and how to avoid them)

In this video, Jodi Bernardini, a Senior Solutions Consultant at mParticle, lays out three major threats standing in the way of customer data quality, and offers advice on how organizations can address them.

Sean Ryan

Ask an mParticle Solutions Consultant: What is data quality?

In this video, Andy Wong, a senior leader on mParticle’s Solutions Consulting team, discusses what data quality means, why it is important to prioritize, and the benefits of creating a centralized data planning team to oversee data quality.

Sean Ryan

When to use a data lake vs data warehouse

Giving teams access to high-quality data is important for business success. The way in which this data is stored affects cost, scalability, data availability, and more. This article breaks down the difference between data lakes and data warehouses, and provides tips on how to decide which to use for data storage.

Joey Colvin – November 04, 2021

How Reverb optimized their data workflows at scale and gave users the rockstar treatment

With mParticle at the heart of their data stack, the world’s largest online music marketplace said goodbye to burdensome ETL pipelines, slashed their data maintenance workload, and unlocked new opportunities to build data-driven features into their product.

Sean Ryan – February 19, 2024

How to assemble a cross-functional data quality team

Smash your data silos and improve data quality across your organization by assembling a cross-functional team to own data planning.

Sean Ryan – October 20, 2021

What is data integrity and why does it matter for customer data?

Integrity is a good quality. Just as you want the people around you to have integrity, you want the data on which you base strategic decisions to have high integrity too. That sounds good, but what does it mean for data to have integrity, and why is this so important? In this post, we’ll explore this broad and nuanced concept, define what it means in the context of customer data, and learn a strategy to ensure your customer data maintains high integrity throughout its lifecycle.

September 14, 2021

Debug customer event collection in real time

If you are responsible for implementing data tracking plans across your apps and websites, you’re probably familiar with how tedious and time-consuming it can be to track down data collection bugs when they pop up. This video walks through how you can use mParticle’s Live Stream to simplify your team’s testing and debugging cycles.

Sean Ryan – September 09, 2021

Everything you need to know about data integrations

Data integrations are ubiquitous throughout the SaaS ecosystem. But not all data integrations are created equal. This article walks through the different types of integrations commonly available and provides tips on how to choose the right integration types for your use cases.

Joey Colvin – September 02, 2021

How do CDPs benefit engineers?

Customer Data Platforms (CDPs) have traditionally been thought of as tools that benefit marketers and product managers. But from simplifying data collection to enabling data-driven feature development, CDPs have far-reaching value for engineers as well. Learn more about the benefits of CDPs for technical teams.

Sean Ryan – August 24, 2021

What is a data plan, and why is it important to have one?

"Wait, why do we need this data again?" "Was that attribute supposed to use snake or camel case?" Data tracking plans keep everyone in your organization aligned on your data efforts, from the high-level strategy to the nittiest, grittiest details.

Sean Ryan – August 20, 2021

What is a UUID?

The challenge of identifying data shared between systems dates back to the advent of networked computing. One of the earliest solutions to this problem, the Universally Unique Identifier (UUID), is still in wide use today. Here, we’ll explore this ever-present data identifier in detail.

July 12, 2021

How we improved our core web vitals by migrating to Gatsby

By migrating the architecture of this website to Gatsby, we were able to double key core web vitals, increase our accessibility rating by 50%, and boost our SEO scores from 80 to 100.

Sean Ryan – May 18, 2021

What is Gatsby?

Gatsby is an open-source framework that combines functionality from React, GraphQL and Webpack into a single tool for building static websites and apps. Owing to the fast performance of the sites it powers, impressive out-of-the-box features like code splitting, and friendly developer experience, Gatsby is fast becoming a staple of modern web development.

May 11, 2021

What is data engineering?

The quantity and complexity of the data that companies deal with is constantly increasing. While Data Scientists analyze and generate actionable insights from data, they cannot do this effectively with data that suffers from poor quality. Data Engineering roles exist in companies to build data pipelines, transform data into useful formats and structures, and ensure quality and completeness in data sets.

Sean Ryan – April 08, 2021

Python and SQL: Complementary tools for complex challenges in data science

While Data Scientists today have an ever-expanding list of toolkits, languages, libraries and platforms at their disposal, two mainstays––Python and SQL––are likely to remain staples of data analysis for years to come. Here, we’ll look at the role these languages play in the rapidly evolving field of Data Science.

Sean Ryan – March 31, 2021

The value of a universal customer ID across your tech stack

Teams across industries are striving to create a 360-degree customer view. But if that view isn't seamlessly integrated with the tools and systems throughout the tech stack in real time, growth teams aren't able to use it to drive results. Learn more about how you can implement a universal ID and make it available across the stack.

Joey Colvin – March 17, 2021

Relational vs. Non-Relational Databases

What are the key differences between these two main categories of databases, and how do you select the right type of database for different use cases?

Sean Ryan – March 16, 2021

Data enrichment and machine learning: Maximizing the value of your data insights

Data enrichment and machine learning are two techniques that can enhance the ability of your customer data to drive personalized experiences. While there is some overlap in the end goal of both approaches to enhancing data value, there are significant differences in the time, resources, and overhead they each require.

Sean Ryan – February 23, 2021

How to stop endless data shipping cycles

Engineers should ship products, not data. Product managers and marketers should experiment with data, increase personalization, and improve experiences. With a permanent data infrastructure, these goals are not mutually exclusive.

Sean Ryan – February 17, 2021

Capture page navigation events in a React Application

In a single-page application, understanding which pages your customers visit and the journeys they take through your website can be challenging. Here, we’ll look at a scalable and maintainable strategy for tracking page navigation events in a React application.

Sean Ryan – February 08, 2021

Track User Events in Single-Page Applications

Owing to their fast load times and smooth user experiences, Single-Page Applications (SPAs) are now an extremely popular design pattern for developing websites. While building your site as an SPA offers clear advantages for your customers, it places challenges in the way of collecting robust analytics on user behavior.

Sean Ryan – January 26, 2021

APIs vs. Webhooks: What’s the difference?

An API (Application Programming Interface) enables two-way communication between software applications driven by requests. A webhook is a lightweight API that powers one-way data sharing triggered by events. Together, they enable applications to share data and functionality, and turn the web into something greater than the sum of its parts.

Sean Ryan – January 07, 2021

What is data orchestration?

Data orchestration is an automated process in which a software solution combines, cleanses, and organizes data from multiple sources, then directs it to downstream services where various internal teams can put it to use. The purpose of data orchestration is to help a company make its data as useful and versatile as possible.

Sean Ryan – December 15, 2020

Smartype Generate: Translate any JSON schema into data collection libraries for web, iOS and Android

mParticle’s Smartype is a platform-agnostic tool that can help every engineering team ensure data quality and consistency. Learn how to use Smartype to translate any JSON schema into custom data collection libraries for iOS, Android, and Web platforms.

Sean Ryan – November 30, 2020

CDPs vs. Data Lakes: What’s the difference, and can you use both?

CDPs and Data Lakes differ in the insights they surface, the users they serve, and the overall value they deliver. When used together, though, they are a powerful duo that can help your organization leverage historical and real-time customer data to the fullest extent.

Sean Ryan – November 10, 2020

CDP vs Data Warehouse: What's the difference?

Learn the differences between a CDP vs data warehouse, and how you can use both in tandem to configure an architecture that makes sense for your data and business needs.

Joey Colvin – August 18, 2022

Future-proof your customer data strategy: Get ready for iOS 14 privacy updates

There are significant changes coming to iOS relating to user privacy, tracking transparency, and specifically the use of the iOS advertising identifier (IDFA). Since the announcement, mParticle has been collaborating with some of the largest consumer brands in the world to holistically achieve a balance between adhering to compliance obligations and ethical data collection policies to protect consumer choice, while also delivering personalized and relevant information to people globally.

Sam Dozor – September 16, 2020

Improve mobile app performance with SDK abstraction

Implementing third-party SDKs in your mobile app allows Marketers and Product Managers to get data into the tools they love, but unstable third-party code can impact mobile performance and drain engineering resources. Learn how you can get high quality customer data to your team's favorite tools without having to manage excess third-party code.

Joey Colvin – August 31, 2020

Generate in-warehouse predictive audiences

Learn how Jayant Subramanian, data science intern, developed a proof-of-concept machine learning pipeline for predicting user behaviors from data pre-processing to model training and beyond using Snowflake and Apache Airflow.

Jayant Subramanian – January 07, 2021

How a CDP supports customer data security

The trust between a customer and brand is the foundation of a strong customer relationship. Part of maintaining that trust is sound customer data management and security. Learn how a Customer Data Platform helps you secure your customer data pipeline so that you can build trust throughout the customer journey.

Joey Colvin – July 08, 2020

Test in production with mParticle and Split

Testing with production data allows you to release features with more efficiency and greater confidence, but doing it successfully requires good testing control and data management processes. Learn more about using mParticle and Split feature flags to simplify testing in production.

Joey Colvin – July 02, 2020

Smartype: Proper event collection at run time

Smartype is a data quality product that translates any data model into type-safe code to help developers ensure proper event collection at run time. Smartype generates personalized SDKs, based on any data model, providing automated code completion and improving data collection and quality at scale. Now available in beta.

Shabih Syed – May 13, 2020

The difference between CDPs, DMPs, and CRMs

Discover the distinctions between these three very different martech solutions and which use cases each one is best suited to.

Joey Colvin – December 11, 2019

How JetBlue improved their mobile customer experience

Learn how JetBlue uses mParticle to understand how customers experience the app on an individual basis, identify points of friction that affect customers' satisfaction, and test and deploy tools efficiently without adding third-party code that could impact end-user functionality.

Abril McCloud – October 15, 2019

Cognetik and mParticle: Simplify mobile tags

Learn how using Cognetik and mParticle can help you simplify mobile tagging and reduce vendor overhead.

Matt Alexander – December 19, 2018

10 Critical data infrastructure capabilities

Instead of focusing on core data management challenges, many Customer Data Platforms are focused on the application of data. Learn about the 10 critical components of modern data infrastructure.

Abril McCloud – November 02, 2018

How Bleacher Report automates their data pipeline

Learn how the team at Bleacher Report uses mParticle to automate their data pipeline, leading to better insight, reduced storage costs, and less engineering time spent on non-core development.

August 01, 2018

Get ready for iOS 12

In just a few days, Apple will be releasing iOS 12. Is your app ready to support this new software? With a mobile data layer like mParticle, you could be ready to support iOS 12 in as little as one day.

Tricia Prashad – September 10, 2018

Best practices for deploying a mobile messaging platform

How to assemble a best in class tech stack

How to implement a best in class stack...the right way

Al Harnisch – June 06, 2018

Learn how Postmates unifies data to deliver a world-class customer experience

Postmates is an on-demand delivery platform with the largest delivery fleet in over 45 major US cities. Unlike traditional delivery services, Postmates can power local, on-demand logistics from any store or merchant for a variety of products.

How to avoid the SDK tax

SDKs have made it simpler for companies to connect their customer data across analytics, marketing, and BI tools, but it often comes at the price of increased dependencies and can affect end-user experience. Learn how using a single-point API can save you from this fate.

David Spitz – October 26, 2017

ScyllaDB Migration: How to design high throughput and low latency NoSQL deployments

Yuan Ren, Head of Data Science at mParticle, discusses our ScyllaDB migration and how to process 50 billion monthly messages via NoSQL deployments.

Yuan Ren – August 24, 2017

Location services best practices

Many mobile apps can greatly improve their users’ experience by making use of information on users’ whereabouts. A hotel app…

Dalmo Cirne – December 02, 2014

5 tips for integrating SDKs the right way

Learn how to integrate SDKs into your mobile data strategy holistically with these five tips from our head of SDK engineering, Sam Dozor.

Abril McCloud – October 18, 2017

Introducing Support for Cordova on Android and iOS

With mParticle's server-based architecture, your Cordova app will perform better with our single lightweight SDK than it would with a series of other SDKs.

Sam Dozor – June 07, 2017

Introducing support for Unity on Android and iOS

We are excited to announce that mParticle now supports Unity on Android and iOS!

Sam Dozor – May 10, 2017

The App Gap: Why Customer Data Platform installations fail

David Spitz – February 14, 2017

Hyperloglog Algorithm: A must-know for data scientists

Hyperloglog (HLL), a powerful streaming algorithm, helps mParticle deliver real-time analytics products. Learn why it's a must-know for data scientists.

Yuan Ren – October 23, 2014

Make the right mobile architecture decisions

David Spitz – August 30, 2016

Behind the script: Building a Roku SDK

mParticle's Sam Dozor walks us through building our open source Roku SDK and the many eccentricities of the platform that made the experience so unique.

Sam Dozor – February 09, 2017

How to use native iOS and Android services in your hybrid app

Here's how you build a single web app and either deploy it for a browser, or wrap it in a hybrid app using the mParticle solution.

Sam Dozor – February 03, 2015

The real reason mobile DMPs don’t make sense

Omnichannel marketers & publishers need an omnichannel solution.

Michael Katz – July 08, 2015

Choosing the right SDK for your mobile app

With the array of mobile app services for analytics, attribution, marketing automation, and so on, it can be difficult to choose the right SDK for your mobile app.

Sam Dozor – October 10, 2014

App marketing using audience targeting

By applying good business sense and using technologies such as mParticle, app marketing to mobile users via audience targeting doesn't have to be a chore.

Paul Mander – April 14, 2015

The problem with app updates: Communication breakdown

While apps continue to make improvements from the time they launch, app stores don’t offer them a means to share progress within app updates.

Coby Berman – August 12, 2015

Which mobile SDK should I put in my app?

It’s easy to see how people can arrive at this question. However, asking which mobile SDK you should put in your app is entirely the wrong question to ask.

Paul Mander – June 30, 2015

Preparing for iOS 9 – NSURLSession

Starting with mParticle SDK 4.2.0, NSURLSession becomes the class of choice for sending data to and receiving data from our platform.

Dalmo Cirne – September 08, 2015

SDKs for mobile marketers 101

Software Development Kits (SDKs) are widely used by mobile developers. Learn what an SDK is and why it’s needed when working with different service providers.

Coby Berman – February 10, 2015