Validate stitching
The goal of identity stitching (or simply, stitching) is to elevate the suitability of an event dataset for cross-channel analysis. This elevation is achieved when every row in the dataset contains the highest-order identity that is available. The elevation then allows you to:
- Create person-centric reports, while not leaving out anonymous people.
- Connect multiple devices to a single person.
- Connect a person across channels.
This article outlines analysis methods to measure the elevation on one or more newly created stitched datasets and to provide confidence that stitching is delivering these benefits.
The analysis methods involve Data view component settings that are typically accessible to admins. The methods also require analysts, who work in an Analysis Workspace project, to create calculated metrics and visualizations.
While these analysis methods can be used for both field-based stitching and graph-based stitching, some elements may not be present in the dataset, especially in a graph-based stitching scenario. These missing elements can make it difficult to calculate lift directly in Analysis Workspace.
Data view prerequisites
For the stitching validation measurement plan, ensure that all the required dimensions and metrics from your stitched dataset are defined in a data view. Verify that both the stitchedID.id and stitchedID.namespace.code fields are added as dimensions. While the stitched dataset is an exact copy of the original dataset, the stitching process adds these two new columns to the dataset:
- Use stitchedID.namespace.code to define a Stitched Namespace dimension. This dimension contains the namespace of the identity that the row was elevated to, for example Email or Phone, or the namespace that the stitching process falls back to, such as ECID.
- Use stitchedID.id to define a Stitched ID value dimension. This dimension contains the raw value of the identity, for example a hashed email, a hashed phone number, or an ECID. This value is used together with Stitched Namespace.
Furthermore, you need to add two stitching metrics that are based on the presence of values in a dimension.
- Use the field that contains the Person ID from the stitched dataset to configure a metric that defines whether a Person ID is set. Add this Person ID even if you are using graph-based stitching, because the Person ID helps to establish a baseline. If the Person ID is not contained in the dataset, your baseline is 0%.
  In this example, personalEmail.address serves as the identity and is used to create the _Email set metric.
- Use the stitchedID.namespace.code field to create an Email stitched namespace metric. Ensure that you specify Include/Exclude values in the component settings, so that you only consider values of the namespace you are trying to elevate rows of data to.
  - Select Set include/exclude values.
  - Select If all criteria are met as the Match.
  - Specify Equals email as the Criteria to select events that have been elevated to the Email namespace.
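Both metrics come down to counting rows that satisfy a condition. The following is a minimal Python sketch of that counting logic, assuming each event is a flat dictionary keyed by the field paths named in this article. The sample rows and the function names email_set and email_stitched_namespace are hypothetical, and the sketch does not represent how Customer Journey Analytics evaluates data view components internally.

```python
# Hypothetical sample rows: field names follow the article, values are invented.
events = [
    {"personalEmail.address": "a@example.com", "stitchedID.namespace.code": "email"},
    {"personalEmail.address": None, "stitchedID.namespace.code": "email"},
    {"personalEmail.address": None, "stitchedID.namespace.code": "ECID"},
    {"personalEmail.address": "b@example.com", "stitchedID.namespace.code": "email"},
]


def email_set(rows):
    """Count rows in which the Person ID field holds a value."""
    return sum(1 for row in rows if row.get("personalEmail.address"))


def email_stitched_namespace(rows, namespace="email"):
    """Count rows whose stitched namespace equals the target namespace.

    The exact-match comparison mirrors the Equals criteria; whether the
    product compares case-sensitively is not stated in the article.
    """
    return sum(1 for row in rows if row.get("stitchedID.namespace.code") == namespace)


print(email_set(events))                 # 2
print(email_stitched_namespace(events))  # 3
```

In this invented sample, one row without an email address was still elevated to the email namespace, which is the effect the validation is meant to surface.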
Stitched dimensions
With both of these dimensions added to the data view, use Freeform tables in Analysis Workspace to check the data that each dimension has.
In the Stitched Namespace dimension table, you typically see two rows for each dataset. One row represents the events for which the stitching process had to use the fallback method (ECID). The other row shows the events associated with the desired identity namespace (email).
For the Stitched ID dimension table, you see the raw values that are coming from the events. In this table, the values alternate between the persistent ID and the desired Person ID.
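As a rough illustration of what these two Freeform tables summarize, the following Python sketch groups a few hypothetical events by stitched namespace and counts the distinct stitched IDs per namespace. The rows and helper names are invented for illustration. In practice, the breakdown is produced by Analysis Workspace, not by code like this.

```python
from collections import Counter, defaultdict

# Hypothetical stitched rows; only the two columns added by stitching are shown.
events = [
    {"stitchedID.namespace.code": "email", "stitchedID.id": "hash-a"},
    {"stitchedID.namespace.code": "email", "stitchedID.id": "hash-a"},
    {"stitchedID.namespace.code": "ECID", "stitchedID.id": "ecid-1"},
    {"stitchedID.namespace.code": "email", "stitchedID.id": "hash-b"},
]


def events_by_namespace(rows):
    """Number of events per stitched namespace (one table row per namespace)."""
    return Counter(row["stitchedID.namespace.code"] for row in rows)


def distinct_ids_by_namespace(rows):
    """Number of distinct stitched ID values seen under each namespace."""
    ids = defaultdict(set)
    for row in rows:
        ids[row["stitchedID.namespace.code"]].add(row["stitchedID.id"])
    return {namespace: len(values) for namespace, values in ids.items()}


print(events_by_namespace(events))        # Counter({'email': 3, 'ECID': 1})
print(distinct_ids_by_namespace(events))  # {'email': 2, 'ECID': 1}
```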
Device-centric or Person-centric reporting
When you create a connection, you have to define which field or identity is used for the Person ID. For instance, on a web dataset, if you choose a device ID as the Person ID, you create device-centric reports and lose the ability to join this data with other offline channels. If you select a cross-channel field or identity, for example email, you lose any unauthenticated events. To understand this impact, you need to determine how much of the traffic is unauthenticated and how much is authenticated.
- Create a calculated metric named Unauthenticated events over total, and define its rule in the rule builder.
- Create a calculated metric named Email authentication rate, based on the _Email set metric that you defined earlier, and define its rule in the rule builder.
- Use the Unauthenticated events over total calculated metric, together with the Email authentication rate calculated metric, to create a Donut visualization. The visualization shows how many events in the dataset are unauthenticated and how many are authenticated.
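The exact rule definitions are not reproduced in this text. One plausible reading, shown here as a hedged Python sketch, is that Email authentication rate divides the _Email set count by the total event count, and Unauthenticated events over total is its complement. Both formulas, the function names, and the sample counts are assumptions for illustration.

```python
def email_authentication_rate(email_set, total_events):
    """Share of events that carry the Person ID (assumed: _Email set / events)."""
    if total_events == 0:
        return 0.0
    return email_set / total_events


def unauthenticated_events_over_total(email_set, total_events):
    """Share of events without the Person ID (assumed: complement of the above)."""
    if total_events == 0:
        return 0.0
    return (total_events - email_set) / total_events


# Hypothetical counts: 250 of 1,000 events have an email address set.
print(f"{email_authentication_rate(250, 1000):.0%}")          # 25%
print(f"{unauthenticated_events_over_total(250, 1000):.0%}")  # 75%
```

Under these assumptions the two values always add up to 100%, which is why they work as the two segments of a donut.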
Stitching identification rates
You want to measure the identification performance before and after stitching. To do so, create three additional calculated metrics:
- A Stitched authentication rate calculated metric that calculates the number of events where the stitched namespace is set to the desired identity over the total number of events. When you set up the data view, you created an Email stitched namespace metric that included a filter to count only events that have a namespace set to email. The calculated metric uses this Email stitched namespace metric to indicate what percentage of the data has the desired identity.
- A Percent increase calculated metric that calculates the raw difference between the stitched identification rate and the current identification rate.
- A Lift calculated metric that calculates the change of the stitched identification rate relative to the current identification rate.
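The arithmetic behind these three calculated metrics can be sketched in Python as follows. The function name stitching_metrics, the dictionary keys, and the sample counts are hypothetical. The formulas follow the definitions given in this article: percent increase as the stitched rate minus the current rate, and lift as that difference divided by the current rate. Returning None for lift when the baseline is 0% is a choice made for this sketch, because the ratio is undefined in that case.

```python
def stitching_metrics(email_set, email_stitched_namespace, total_events):
    """Compute current rate, stitched rate, percent increase, and lift from counts."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    current = email_set / total_events
    stitched = email_stitched_namespace / total_events
    percent_increase = stitched - current  # raw difference between the two rates
    # Lift is relative to the baseline, so it is undefined when the baseline is 0.
    lift = percent_increase / current if current > 0 else None
    return {
        "current_rate": current,
        "stitched_rate": stitched,
        "percent_increase": percent_increase,
        "lift": lift,
    }


# Hypothetical counts: 250 events had an email set, 500 were elevated to email.
result = stitching_metrics(250, 500, 1000)
print(f"{result['stitched_rate']:.0%}")     # 50%
print(f"{result['percent_increase']:.0%}")  # 25%
print(f"{result['lift']:.0%}")              # 100%
```

The None case corresponds to a dataset that does not contain the Person ID at all, where the baseline is 0% and only the stitched rate and the raw increase are meaningful.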
Conclusion
If you combine all data in an Analysis Workspace Freeform table, you can start to see the impact and value that stitching provides, including:
- Current authentication rate: The baseline of the number of events that already had the correct Person ID over the total number of events.
- Stitched authentication rate: The new number of events that have the correct Person ID over the total number of events.
- Percent increase: The raw increase, calculated as the stitched authentication rate minus the baseline current authentication rate.
- Lift: The percent change over the baseline current authentication rate.
The key takeaway from this article is that this type of stitching validation and analysis helps you to:
- Build a comprehensive custom view of authentication effectiveness by comparing current versus stitched rates.
- Measure the improvement clearly through percent increase and lift metrics.
- Identify the true impact of implementing stitching on user authentication.
- Create a standardized way to communicate authentication performance across teams.
- Make data-driven decisions about authentication strategy and optimization.
These metrics together give stakeholders a complete picture of how Customer Journey Analytics stitching affects authentication success rates and overall person identification performance.