Taking this as a “broad question”, what you really want to know is how to identify the user journey across multiple marketing touches.
A few posts already published on this forum address this topic in one way or another; I link to them below. But first, let me elaborate more on
domain_userid will be generated (which will make them look like a new user). Also, if the user visits from another browser or computer, it will look like this is a totally different user.
network_userid field in Snowplow events. However, many browsers block 3rd party cookies by default (e.g. Safari & Firefox).
The IP address is also not a reliable identifier because a single user can have many IP addresses and many users can have the same one (e.g. one office) - it’s mainly used for the geo IP lookup and as a possible input in an identity stitching process (see below).
That’s why we provide additional identifiers. For example, the user_fingerprint (aka browser fingerprint), and perhaps the
user_fingerprint is generated once with each page load unless the user explicitly calls the setUserFingerprint method. It takes the useragent, the screen dimensions and colour depth, the timezone, the existence of session storage and local storage, and the list of plugins as inputs and uses the murmurhash function to convert those into the final fingerprint.
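To make the fingerprinting idea concrete, here is a minimal Python sketch of hashing a handful of browser signals with a 32-bit MurmurHash3. It is not the tracker's actual code: the signal names, the sort order, the "|" separator, the seed, and the helper names murmur3_32 and browser_fingerprint are all assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit integer left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """Pure-Python MurmurHash3 (x86, 32-bit variant)."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    n = len(data)
    full = n - (n % 4)

    # Body: mix in the input four bytes at a time (little-endian).
    for i in range(0, full, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32

    # Tail: mix in the remaining 1 to 3 bytes, if any.
    tail = data[full:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k

    # Finalisation: fold in the length and avalanche the bits.
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def browser_fingerprint(signals: dict, seed: int = 0) -> int:
    """Hash the signals in a fixed (sorted-key) order. Hypothetical helper."""
    parts = [str(signals[key]) for key in sorted(signals)]
    return murmur3_32("|".join(parts).encode("utf-8"), seed)


# Hypothetical signal values, mirroring the inputs listed above.
signals = {
    "useragent": "Mozilla/5.0 (X11; Linux x86_64)",
    "screen": "1920x1080x24",
    "timezone": "Europe/London",
    "session_storage": True,
    "local_storage": True,
    "plugins": "pdf-viewer;widevine",
}
fingerprint = browser_fingerprint(signals)
```

Two browsers with identical settings produce the same value, which is why the fingerprint works better as one input to identity stitching than as an identifier on its own.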
You can have a process in SQL that creates a map/graph between different identifiers. For example, if a user logs in from different browsers, you’ll see the same user_id appear on 2 different domain_userid values. The same can happen if cookies are deleted. It’s then a reasonable assumption that all events belonging to these 2 domain_userid values actually belong to the same user_id (even the events where the user_id is not set, e.g. when the user is not logged in). This is what we call the identity stitching process.
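As a rough illustration of such a mapping, the sketch below runs the SQL against an in-memory SQLite database from Python. The table layout and sample rows are made up; only the domain_userid and user_id column names come from Snowplow events. It also assumes each domain_userid maps to a single user_id; on a shared device MAX() would pick one arbitrarily, so a real process needs a rule for that case.

```python
import sqlite3

stitch_db = sqlite3.connect(":memory:")
stitch_db.executescript("""
CREATE TABLE events (event_id INTEGER, domain_userid TEXT, user_id TEXT);
INSERT INTO events VALUES
  (1, 'cookie-a', NULL),     -- browsing before login
  (2, 'cookie-a', 'alice'),  -- logged in on browser A
  (3, 'cookie-b', 'alice'),  -- logged in on browser B
  (4, 'cookie-b', NULL),     -- logged out again on browser B
  (5, 'cookie-c', NULL);     -- a visitor who never logs in
""")

STITCH_SQL = """
WITH id_map AS (
  -- One row per cookie that has ever been seen with a login.
  SELECT domain_userid, MAX(user_id) AS user_id
  FROM events
  WHERE user_id IS NOT NULL
  GROUP BY domain_userid
)
SELECT e.event_id,
       -- Fall back to the cookie ID when no login was ever observed.
       COALESCE(m.user_id, e.domain_userid) AS stitched_user_id
FROM events AS e
LEFT JOIN id_map AS m ON m.domain_userid = e.domain_userid
ORDER BY e.event_id
"""

stitched = stitch_db.execute(STITCH_SQL).fetchall()
```

Events 1 to 4 all resolve to 'alice', including the ones recorded while logged out, while event 5 keeps its cookie ID because no login was ever seen for it.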
Yali wrote a good post on this a while back, Identifying users (identity stitching), which should clarify the topic further.
More specifically on touch attribution models, here is the link to Yali’s tutorial: First and last touch attribution models in SQL [tutorial]. You can see there that domain_userid is the identifier used to track marketing touches by users.
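For a flavour of what first and last touch per domain_userid looks like, here is a small sketch, again using SQLite from Python. It is not the query from the tutorial: the touches table and its rows are invented, it uses only the Snowplow fields domain_userid, collector_tstamp and mkt_source, and it assumes no two touches for the same user share a timestamp.

```python
import sqlite3

touch_db = sqlite3.connect(":memory:")
touch_db.executescript("""
CREATE TABLE touches (domain_userid TEXT, collector_tstamp TEXT, mkt_source TEXT);
INSERT INTO touches VALUES
  ('cookie-a', '2016-01-01 10:00:00', 'google'),
  ('cookie-a', '2016-01-03 09:30:00', 'newsletter'),
  ('cookie-a', '2016-01-05 18:15:00', 'facebook'),
  ('cookie-b', '2016-01-02 12:00:00', 'twitter');
""")

TOUCH_SQL = """
WITH bounds AS (
  -- Earliest and latest marketing touch per cookie.
  SELECT domain_userid,
         MIN(collector_tstamp) AS first_ts,
         MAX(collector_tstamp) AS last_ts
  FROM touches
  WHERE mkt_source IS NOT NULL
  GROUP BY domain_userid
)
SELECT b.domain_userid,
       f.mkt_source AS first_touch,
       l.mkt_source AS last_touch
FROM bounds AS b
JOIN touches AS f
  ON f.domain_userid = b.domain_userid AND f.collector_tstamp = b.first_ts
JOIN touches AS l
  ON l.domain_userid = b.domain_userid AND l.collector_tstamp = b.last_ts
ORDER BY b.domain_userid
"""

attribution = touch_db.execute(TOUCH_SQL).fetchall()
```

Running the same query over stitched user IDs instead of raw domain_userid values would attribute touches across browsers and devices, at the cost of depending on the quality of the stitching.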
Yet another tutorial on campaign tracking is here: Web traffic driven campaign tracking with Snowplow [tutorial]
Hopefully, the above is useful.