Growth | June 16, 2021

Probabilistic vs deterministic: Which method should you be using for identity resolution?

The way you build your customer profiles can have big consequences for marketing strategy, data privacy, and customer relationships. Learn about the difference between probabilistic and deterministic identity models and how to determine which method you should be using.


It used to be much easier to get to know your customers. They waited in line to check out in your store and exchanged pleasantries as they approached the register to pay for their goods. As you built relationships with loyal customers, you could let them know when a new product was in stock or when something was on sale.

Today, doing business is not so simple. The majority of customer engagements have moved from stores to websites and apps, and in-store payments are being replaced by digital point-of-sale systems.

Despite this transition, both brands and customers desire to retain certain aspects of the traditional customer experience. Customers appreciate feeling that brands demonstrate a basic understanding of their preferences and interests, especially when it comes to fraud prevention and customer service. Brands want to notify customers when a relevant new product becomes available, and aspire to personalize the user experience to each customer's interests. They also need to track customer identity in order to fulfill data subject requests and practice responsible data governance.

The challenge, however, is that identifying your customers in today’s digital-first, cross-device world is more complex than it was in a brick-and-mortar store. Customers engage across multiple touchpoints throughout the customer journey, and they’re not always logged in on every device. To understand who’s interacting with your brand and deliver relevant experiences based on that information, you need to be able to resolve cross-device data into unified customer profiles.

Two identity resolution methods have emerged to help brands accomplish this: probabilistic modeling and deterministic matching.

What is a probabilistic model?

Probabilistic modeling ties engagements made by a single user across multiple devices to a unified customer profile by using predictive algorithms to link information such as IP address, operating system, location, wifi network, and behavioral data to an individual at a given confidence level. 

For example, if an anonymous customer is browsing a specific product on two separate devices that are connected to the same wifi network, a probabilistic modeling algorithm may tie that engagement data to a single user profile. If, however, a third device becomes active on the same wifi network and begins browsing a completely different product category, that engagement data may be tied to a separate user profile.
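The example above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular vendor's model works: real systems typically use trained predictive models rather than a fixed table, and the signal names, weights, and threshold below are assumptions made up for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceSession:
    """Signals observed for one anonymous device (hypothetical fields)."""
    device_id: str
    ip_address: str
    wifi_network: str
    product_category: str


# Assumed weights (in points out of 100) for each matching signal.
SIGNAL_WEIGHTS = {"ip_address": 30, "wifi_network": 30, "product_category": 40}

# Assumed confidence level required before two devices are linked.
LINK_THRESHOLD = 80


def match_score(a: DeviceSession, b: DeviceSession) -> int:
    """Add up the weights of every signal on which the two sessions agree."""
    return sum(
        weight
        for signal, weight in SIGNAL_WEIGHTS.items()
        if getattr(a, signal) == getattr(b, signal)
    )


def likely_same_user(a: DeviceSession, b: DeviceSession,
                     threshold: int = LINK_THRESHOLD) -> bool:
    """Link two devices only when the score reaches the threshold."""
    return match_score(a, b) >= threshold


phone = DeviceSession("phone", "203.0.113.7", "home-wifi", "running shoes")
tablet = DeviceSession("tablet", "203.0.113.7", "home-wifi", "running shoes")
laptop = DeviceSession("laptop", "203.0.113.7", "home-wifi", "kitchen appliances")

print(match_score(phone, tablet), likely_same_user(phone, tablet))  # 100 True
print(match_score(phone, laptop), likely_same_user(phone, laptop))  # 60 False
```

The phone and tablet agree on every signal and are linked, while the laptop shares only the network signals and stays a separate profile. Lowering the threshold would link more devices at the cost of more false matches.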

Probabilistic identity resolution


The draw of probabilistic modeling is that it allows you to build customer profiles without collecting any personally identifiable information (PII) such as email, name, and phone number from the customer. This makes it easier to increase the scale of your database, build profiles for top-of-funnel prospective customers, and extend the reach of your campaigns.

The downside of probabilistic modeling is that there is a margin for error. Predictive algorithms will never be accurate 100% of the time, and a database riddled with inaccurate customer profiles can lead to manual identity management for your developers, wasted paid media spend for your marketing team, and poor experiences for your customers.

What is a deterministic model?

Deterministic matching, on the other hand, leverages first-party data that has been provided by customers to unify device-level data to unique customer profiles with 100% confidence. Device-level engagement is linked only when common PII has been shared, prioritizing the accuracy of your customer profiles.

For example, if a customer engages with your brand on her mobile phone and tablet, and signs in to her account on both devices using a common email, device-level engagement data from both platforms will be linked to a common customer profile. If, however, the same customer continues browsing on her laptop and does not log in with any identifiable information, engagement data from that channel will not be unified to the profile, even if she is browsing the same product category.
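A minimal sketch of this matching rule, assuming email is the only shared identifier, might look like the following. The class and method names are hypothetical and are not part of any mParticle API; the point is simply that devices are linked only when they present the same identifier.

```python
from typing import Dict, Optional, Set


class DeterministicResolver:
    """Links devices to a profile only when they share a known email."""

    def __init__(self) -> None:
        self.devices_by_email: Dict[str, Set[str]] = {}
        self.email_by_device: Dict[str, str] = {}

    def record(self, device_id: str, email: Optional[str] = None) -> str:
        """Record an engagement and return the profile key it resolves to."""
        if email is not None:
            # Normalize so that trivial formatting differences still match.
            key = email.strip().lower()
            self.devices_by_email.setdefault(key, set()).add(device_id)
            self.email_by_device[device_id] = key
        known = self.email_by_device.get(device_id)
        # Without a shared identifier the device is never merged.
        return f"profile:{known}" if known else f"anonymous:{device_id}"


resolver = DeterministicResolver()
print(resolver.record("phone", "ana@example.com"))     # profile:ana@example.com
print(resolver.record("tablet", " Ana@Example.com "))  # profile:ana@example.com
print(resolver.record("laptop"))                       # anonymous:laptop
```

The laptop stays anonymous no matter how similar its browsing looks, which is the trade-off this approach makes in favor of accuracy.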

Deterministic identity resolution


The benefit of deterministic matching is the accuracy of the profiles that are created. By only linking device-level activity when there is a common identifier shared, deterministic resolution helps you build the foundation for a high-quality customer database. Furthermore, leading identity resolution tools will allow you to control how and when profiles are merged based on the nuances of your customer journey. Deterministic profiles can be leveraged to send personalized email, app messaging, and retargeting campaigns with high confidence.

The downside of deterministic matching is that it will only merge device-level activity when a common identifier is shared, and not when the activity merely appears likely to come from the same customer. For this reason, deterministic matching does not provide the same scalability as probabilistic modeling and may be less effective at building profiles for top-of-funnel prospective customers, from whom you’ve collected less identifiable information.

Which method should you be using today?

The way you build customer profiles will ultimately depend on your use cases and data governance policy. Probabilistic modeling allows you to build profiles for customers from whom you have collected little zero- and first-party data, but it does have a margin for error. For that reason, probabilistic profiles have often been used for 1:Many use cases, when you want to deliver a single experience to a broad audience and there is relatively little consequence for bad matches. By contrast, using profiles that are only likely to be accurate to power 1:1 customer communications, such as email, can quickly result in bad customer experiences.

For those 1:1 use cases, deterministic profiles are very useful. Because they are built based on linked PII, deterministic profiles allow you to communicate directly with customers with increased confidence and efficiency. 
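One way to express this split is a simple eligibility check that gates each channel on how a profile was matched. The sketch below is illustrative only: the channel names, the `Profile` fields, and the 0.7 confidence floor are assumptions, and your own data governance policy would determine the real rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    profile_id: str
    match_type: str    # "deterministic" or "probabilistic"
    confidence: float  # treated as 1.0 for deterministic matches


# Assumed 1:1 channels, where a bad match reaches an individual directly.
ONE_TO_ONE_CHANNELS = {"email", "app_message"}


def eligible(profile: Profile, channel: str, min_confidence: float = 0.7) -> bool:
    """Decide whether a profile may be used for a given channel."""
    if channel in ONE_TO_ONE_CHANNELS:
        # 1:1 communications require a deterministic match.
        return profile.match_type == "deterministic"
    # 1:Many channels tolerate probabilistic matches above a confidence floor.
    return profile.match_type == "deterministic" or profile.confidence >= min_confidence


known = Profile("p1", "deterministic", 1.0)
inferred = Profile("p2", "probabilistic", 0.85)

print(eligible(known, "email"))          # True
print(eligible(inferred, "email"))       # False
print(eligible(inferred, "display_ad"))  # True
```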

Within the last few years there have been changes in the ways in which brands can collect customer data. From the introductions of GDPR and CCPA, to the limitation of third-party cookie tracking, to the introduction of Apple’s App Tracking Transparency (ATT) Framework, consumers have been given more control over how their data is collected and used by the brands they do business with.

For brands to control their own destiny, they need to break their addiction to third-party cookies and develop their own first-party database. 

Many of the identifiers that probabilistic systems rely on, such as device IDs and third-party cookies, are becoming more difficult for brands to collect as Google and Apple empower consumers with the right to decide how that information is shared. The effectiveness of a probabilistic model depends on the quality and breadth of the data that it is provided. Without access to a diverse supply of device-level data that can be added to an identity graph, the accuracy of probabilistic customer profiles is called into question. 

As brands are forced to shift away from third-party data, they need to lay the foundation for a first-party data strategy and prioritize building trust with their customers. Embracing a deterministic approach as the core of your identity strategy will allow you to build high-quality customer profiles based on the information your customers provide to you directly so that you can provide personal experiences. Once you have your deterministic foundation in place, it is still possible to leverage probabilistic modeling on the periphery of your infrastructure to power certain use cases.

mParticle’s IDSync provides you with the ability to build deterministic customer profiles with data that is unified from across devices. These profiles can be activated programmatically across your digital properties with mParticle’s Profile API, connected to your favorite marketing automation, analytics, and experimentation tools, and exported to your data warehouse for long-term storage via mParticle’s 300+ integrations.

For more, read the mParticle docs on IDSync.
