Network_userid and domain_userid


#1

I have a broad question. If network_userid is set by the third-party cookie and domain_userid is set by the first-party cookie, and you are setting both, then a user who gets a network_userid of XXXX and a domain_userid of YYYY should show YYYY as the domain_userid every time you search for network_userid XXXX, right?

What I see in my database is that if I search for domain_userid = YYYY, then network_userid always equals XXXX, but if I search for network_userid = XXXX, I get many different domain_userids. They all have the same IP address and behaviors, and this is key to identifying users for multi-touch attribution of campaigns.

Any insight would be great. Thanks.


#2

Hi @sevenm,

Taking this as a “broad question”, what you really want to know is how to identify the user journey in terms of multiple marketing touches.

There are a few posts on this forum already that address this topic in one way or another; I reference them below. But first, let me elaborate on domain_userid and network_userid.

The domain_userid is a UUID which is generated by the JavaScript tracker and stored in a (first-party) cookie. Because it’s stored in a cookie, it’s not 100% reliable. If a user deletes their cookies, or the cookie expires, a new domain_userid will be generated (which will make them look like a new user). Also, if the user visits from another browser or computer, it will look like a totally different user.

The network_userid is an ID stored in a third-party cookie and applies to the Clojure Collector (the cookie is set against the domain of the collector). It is typically used when site visitors need to be uniquely identified across multiple different domains (e.g. on a content or ad network). In a nutshell, the Clojure Collector receives events from the Snowplow JavaScript tracker, sets/updates a third-party user tracking cookie, and returns the pixel to the client. The ID in this third-party user tracking cookie is stored in the network_userid field in Snowplow events. However, many browsers block third-party cookies by default (e.g. Safari & Firefox).
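The set/update step described above can be sketched roughly as follows. This is only an illustration in Python, not the Clojure Collector's actual code: the cookie name `sp` and the function `handle_pixel_request` are assumptions made for the example.

```python
import uuid

# Assumption: the third-party cookie is called "sp"; the real name depends
# on the collector configuration.
COOKIE_NAME = "sp"


def handle_pixel_request(request_cookies):
    """Return (network_userid, cookies_to_set) for one tracker request.

    If the browser sent the collector's cookie back, the existing ID is
    reused; otherwise a fresh UUID is generated. Either way the cookie is
    set again so that its expiry is refreshed.
    """
    network_userid = request_cookies.get(COOKIE_NAME) or str(uuid.uuid4())
    return network_userid, {COOKIE_NAME: network_userid}
```

A browser that blocks third-party cookies never sends the cookie back, so under this sketch it would receive a new network_userid on every request.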

The IP address is also not a reliable identifier because a single user can have many IP addresses and many users can have the same one (e.g. one office) - it’s mainly used for the geo IP lookup and as a possible input in an identity stitching process (see below).

That’s why we provide additional identifiers: for example, the user_id, the user_fingerprint (aka browser fingerprint), and perhaps the user_ipaddress.

The user_fingerprint is generated once with each page load unless the user explicitly calls the setUserFingerprint method. It takes the useragent, the screen dimensions and colour depth, the timezone, the existence of session storage and local storage, and the list of plugins as inputs, and uses the murmurhash function to convert those into the final fingerprint.
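To make the idea concrete, here is a rough Python sketch of hashing a list of browser properties with a 32-bit MurmurHash3. It is not the tracker's code: the `###` separator, the default seed, the order of the inputs, and the function names are all assumptions for illustration, so the values it produces will not match real user_fingerprint values.

```python
def murmur3_32(data, seed=0):
    """Pure-Python 32-bit MurmurHash3 (x86 variant) of a bytes object."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    mask = 0xFFFFFFFF
    h = seed & mask
    n = len(data)
    rounded = n - (n % 4)

    def mix_k(k):
        # Scramble one block before it is folded into the hash state.
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask
        return (k * c2) & mask

    for i in range(0, rounded, 4):
        h ^= mix_k(int.from_bytes(data[i:i + 4], "little"))
        h = ((h << 13) | (h >> 19)) & mask
        h = (h * 5 + 0xE6546B64) & mask

    tail = data[rounded:]
    if tail:
        h ^= mix_k(int.from_bytes(tail, "little"))

    # Finalisation: fold in the length and avalanche the bits.
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


def user_fingerprint(components, seed=0):
    """Join the browser properties (hypothetical separator) and hash them."""
    joined = "###".join(str(c) for c in components)
    return murmur3_32(joined.encode("utf-8"), seed)
```

Because the inputs are properties of the browser rather than of the person, two users with identical setups would collide, and one user changes fingerprint when they switch browser or install a plugin.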

You can have a process in SQL that creates a map/graph between different identifiers. For example, if a user logs in from different browsers, you’ll see the same user_id appear on two different domain_userid values. The same can happen if cookies are deleted. It’s then a reasonable assumption that all events belonging to these two domain_userid values actually belong to the same user_id (even the events where the user_id is not set, e.g. when the user is not logged in). This is what we call the identity stitching process.
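The mapping step can be sketched in a few lines. The example below uses Python over in-memory rows rather than SQL, and is only a minimal illustration: `build_identity_map`, `stitch`, and the `stitched_user_id` key are hypothetical names, and events are assumed to be dicts carrying `domain_userid`, `user_id`, and `collector_tstamp`.

```python
def build_identity_map(events):
    """Map each domain_userid to the most recent non-empty user_id seen on it."""
    mapping = {}
    for event in sorted(events, key=lambda e: e["collector_tstamp"]):
        if event.get("user_id"):
            mapping[event["domain_userid"]] = event["user_id"]
    return mapping


def stitch(events, mapping):
    """Attach a stitched ID to every event, including anonymous ones.

    Cookies that were never seen with a user_id fall back to their
    domain_userid, so they remain separate users.
    """
    return [
        {**e, "stitched_user_id": mapping.get(e["domain_userid"], e["domain_userid"])}
        for e in events
    ]
```

Taking the most recent user_id per cookie is one possible design choice; shared devices, where several people log in on the same cookie, would need a more careful rule.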

Yali wrote a good post on this a while back: Identifying users (identity stitching) which should clarify the topic more.

More specifically on touch attribution models, here is the link to Yali’s tutorial: First and last touch attribution models in SQL [tutorial]. You will see that domain_userid is the identifier used there to track marketing touches by users.
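As a rough illustration of the idea (this is not the SQL from the tutorial), first and last touch per domain_userid could be computed like this in Python. It assumes each event is a dict with `domain_userid`, `collector_tstamp`, and `mkt_source`, and that events without a `mkt_source` are not marketing touches; the function name is hypothetical.

```python
def first_and_last_touch(events):
    """Return {domain_userid: {"first_touch": ..., "last_touch": ...}}."""
    touches = {}
    for event in sorted(events, key=lambda e: e["collector_tstamp"]):
        source = event.get("mkt_source")
        if not source:
            continue  # not a marketing touch
        uid = event["domain_userid"]
        if uid not in touches:
            touches[uid] = {"first_touch": source, "last_touch": source}
        else:
            touches[uid]["last_touch"] = source
    return touches
```

Swapping domain_userid for a stitched identifier would attribute touches across cookies rather than per cookie.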

Yet another tutorial on campaign tracking is here: Web traffic driven campaign tracking with Snowplow [tutorial]

Hopefully, the above is useful.


#3

@ihor am I correct in understanding that, despite its shortcomings, domain_userid is still the main tracker ID for user tracking/attribution (barring declared IDs such as a login)?
Mainly because network_userid is third-party (3P) and prone to being blocked?

Thanks


#4

@sachinsingh10,

They all have their shortcomings, which is why we provide different means of identifying the user. A combination of these values will give you the best result.

If you had to pick a single property, the right choice depends on the actual apps/environments you are tracking. The likely “best” candidate would be domain_userid.


#5

Hi @ihor,
What is confusing me is that the first-party cookie and the network_userid have the same value. I note that the networkUserId key-value pair in the event logs holds the same value that is stored in the _sp cookie on the domain specified in the collector config.

Question - what is the location of domain_userid in the logs and local storage?

Thanks in advance.


#6

Hi @sachinsingh10!

That’s a little surprising. The domain_userid and network_userid are UUIDs that are generated independently of each other. How did you reach this conclusion?


#7

Hi @sevenm,

One new idea here: if your Snowplow tracker is badly configured, so that the cookie is stored under a particular path rather than at the root of your website (an old version of sp.js, pre-2.5, or a problem with the cookie path config parameter), you will get similar results: users with the same network_userid but different domain_userid values (in effect you would be tracking many apps, as with multi-app tracking).
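One way to check for this in your data is to list, for each network_userid that has several domain_userid values, which part of the site each domain_userid was seen on. The Python sketch below is only an illustration: the function names are hypothetical, and events are assumed to be dicts with `network_userid`, `domain_userid`, and `page_urlpath`.

```python
from collections import defaultdict


def first_segment(path):
    """Reduce '/shop/cart' to '/shop'; the site root stays '/'."""
    parts = [p for p in (path or "").split("/") if p]
    return "/" + parts[0] if parts else "/"


def split_network_ids(events):
    """For network_userids with more than one domain_userid, return
    {network_userid: {domain_userid: sorted top-level paths}}."""
    by_network = defaultdict(lambda: defaultdict(set))
    for e in events:
        segment = first_segment(e.get("page_urlpath"))
        by_network[e["network_userid"]][e["domain_userid"]].add(segment)
    return {
        nuid: {duid: sorted(paths) for duid, paths in duids.items()}
        for nuid, duids in by_network.items()
        if len(duids) > 1
    }
```

If each domain_userid under one network_userid is confined to its own path prefix, a path-scoped cookie is a plausible explanation; if they all span the same paths, deleted or expired cookies are more likely.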

If you observe duplicate UUIDs, verify that you are not overwriting, breaking, or mangling the RNG. This is the only way to have 100% control over UUIDs…