Tracking the user ID and session ID on AMP pages?

Hi,
When I send tracking from an AMP page using the amp-analytics type="snowplow" tag, is there a way to also send the user ID and session ID?
I am having trouble determining which user sent the events. For example, if a user clicks a button and then scrolls down, how can I know that it's the same user? (Right now the events I send carry no user_id or session_id.)
One more question: I read that there are two types of events, pageview and unstructEvent, but none for page pings. I managed to create an event that works like a ping, but it also fires when the user is not on that specific tab. Is there an option to create a ping event that works the same way as on non-AMP pages?

Thanks,
Adam.

Hi @adambatso - AMP is quite a restricted environment and our current AMP implementation is rather minimal - just the page view and structured event types; no user tracking or sessionization.

You’ll find the implementation here:

I’m not aware of any way of getting what you’re looking for into AMP, but if you discover anything please let us know…

Hi Alex, thanks for your answer.
Are you aware of any plan to expand the AMP support?
Without the user ID or session ID we can't really use this tracking.

Thanks,
Adam.

I know Simo Ahava has written a couple of guides on how to access the GA cookie/Client ID in AMP pages, to help with this issue within GA, but I'm not sure how feasible it is to do something similar with the Snowplow tracker.

Thanks @jrpeck1989,

Accessing the Snowplow cookies from AMP pages “out-of-band” is an interesting idea.

Client-side sessionization is unfortunately a “wontfix” in Google AMP:

https://github.com/ampproject/amphtml/issues/1612

However Snowplow gives you the atomic data to build sessionization downstream, in Redshift or similar.
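
As a rough illustration of what "sessionization downstream" could look like, here is a minimal Python sketch. It assumes each event already carries some stable per-user key (which key is actually available on AMP events is exactly the open question in this thread) and it uses a 30-minute inactivity timeout. The function name, the tuple layout and the timeout are all assumptions made for the example, not part of Snowplow.

```python
from datetime import datetime, timedelta
from itertools import groupby

# Assumed inactivity timeout; pick whatever matches your own session definition.
SESSION_GAP = timedelta(minutes=30)


def sessionize(events):
    """Assign a per-user session index to each event.

    `events` is an iterable of (user_key, timestamp) tuples. Both the tuple
    layout and the existence of a usable `user_key` are assumptions.
    Returns a list of (user_key, timestamp, session_index) tuples sorted by
    user and time, with session_index starting at 1 for each user.
    """
    result = []
    ordered = sorted(events, key=lambda e: (e[0], e[1]))
    for user_key, user_events in groupby(ordered, key=lambda e: e[0]):
        session_index = 1
        previous_ts = None
        for _, ts in user_events:
            # A gap longer than the timeout starts a new session.
            if previous_ts is not None and ts - previous_ts > SESSION_GAP:
                session_index += 1
            result.append((user_key, ts, session_index))
            previous_ts = ts
    return result
```

In a warehouse such as Redshift the same idea would normally be expressed in SQL with window functions; the Python version is only meant to show the logic.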

On user IDs - there is an AMP client ID which we could be setting:

However, this is not a UUID and it is behaviorally different from our own client-set domain_userid, so we would need to load this into a different ‘slot’.

I am leaning towards us adding a maximalist AMP context to all AMP events with all the available information that doesn’t map onto our other contexts:

Ticket added:

https://github.com/snowplow/snowplow/issues/2998

What is the plan for the AMP tracker?
What we can track on AMP pages is very minimal.

First I checked the amphtml project itself:
https://github.com/ampproject/amphtml/blob/00e53e2c7b797dbe40898ce20d3e9bb271dfd52f/extensions/amp-analytics/0.1/vendors.js#L960
I think it would be good to add as many parameters and placeholders as possible for the tracking pixel here.

Maybe it's also possible to create some context, if there is a way to create base64-encoded JSON strings.
Is anyone here already involved in the amphtml project?

The AMP Tracker could definitely do with some love from users who have committed deeply to the AMP platform! As above:

I am leaning towards us adding a maximalist AMP context to all AMP events with all the available information that doesn’t map onto our other contexts:

A PR into Iglu Central with this new schema would be super-appreciated:

https://github.com/snowplow/iglu-central/pulls

If it’s possible to provide any kind of bridge into the richer capabilities of the Snowplow JS Tracker from AMP (e.g. sessionization, automatic event types), then that would be super-interesting too - perhaps somebody with a good grasp of AMP’s internals could share their thoughts on the feasibility of this?

So we will try to give it some love :wink:
I have already created a feature request in the amphtml project, but I'm not sure I can spend enough time on it.
Maybe someone can join in.

https://github.com/ampproject/amphtml/issues/7984

A short update: I have started extending the AMP Snowplow tracker. It's not tested yet.

https://github.com/ecoron/amphtml/commit/69a77bc5ff3fd13feab752056bd90d2d584eb31c

The clientId and sessionId need some more magic. The values should come from the cookie via clientId(sp_id_XXXX).
The hassle is that we must split the cookie value string to get the domain_uid and session_id.
But this is not so easy in amphtml. Maybe it's possible to use the amp-bind extension, which has some whitelisted String functions.

Happy to announce that the changes were finally accepted and merged into master. They might be available in the next amphtml release.

Hey nice, great work @ecoron!

There is one more PR on amphtml, in which the ClientID scope is added under the name

_sp_id

In this case the value should be taken from this cookie.
But since there are cookie names like

_sp_id.1234

it would be nice to make the cookie name configurable. This could be done in the amp-analytics code block as an additional var.

Then we come to another issue: the cookie value is a concatenation of domain_uid, timestamps, session ID … and there is no way to break it into single values on the amphtml side. Maybe this could be done in the enrich/shredding process? What would be the best way?
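
To make the splitting problem concrete, here is a minimal Python sketch of what a downstream step might do with the raw cookie value. It assumes the value is dot-separated in the order domain user ID, creation timestamp, visit count, current timestamp, last-visit timestamp, session ID. That field order is an assumption and should be checked against the JS Tracker version in use. `IdCookie` and `parse_id_cookie` are hypothetical names, and the sample value in the usage note is made up.

```python
from typing import NamedTuple, Optional


class IdCookie(NamedTuple):
    domain_userid: str
    session_id: Optional[str]


def parse_id_cookie(value: str) -> IdCookie:
    """Split a raw ID cookie value into the two fields discussed here.

    Assumed (not confirmed in this thread) field order:
    duid . created_ts . visit_count . now_ts . last_visit_ts . session_id
    UUIDs contain hyphens but no dots, so splitting on "." is safe for them.
    """
    parts = value.split(".")
    domain_userid = parts[0]
    # Older cookies may carry fewer fields, so the session ID is optional.
    session_id = parts[5] if len(parts) > 5 and parts[5] else None
    return IdCookie(domain_userid, session_id)
```

For example, calling `parse_id_cookie` on `"4b2d8d5c-7a11-4a3e-9f0e-2f6c1d3e5a77.1490000000.3.1490003600.1490001800.0c9b6f0e-1d2a-4c3b-8e4f-5a6b7c8d9e0f"` would give the first UUID as `domain_userid` and the last one as `session_id`.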

Hey @ecoron - I think you are referring to this PR:

https://github.com/ampproject/amphtml/pull/9440/files

I am confused as to why this PR was merged:

  • The Snowplow cookie takes the form ${user-configurable prefix}.${domain_hash}, not simply _sp_id - so this PR won’t pick up most (all?) domain user IDs
  • The Snowplow cookie contains other metadata, not just the duid - so this PR would send invalid domain user IDs into the pipeline

Are you okay to open a new PR to remove the invalid addition of the duid, or do you want us to do it?

Thanks,

Alex

Hi,

You are right, this will not fetch most of the origin domain_uids.

If I understand correctly, the first argument of CLIENT_ID() is the scope or namespace on the amphtml side, and the third argument is the fallback cookie name; if the third isn't provided, the first is used. This is described here:

I think it would be good to make the fallback cookie name configurable like:

<amp-analytics type="snowplow" id="snowplow">
<script type="application/json">
{
  "vars": {
    "collectorHost": "d3rkrsqld9gmqf.cloudfront.net",
    "appId": "amp-examples",
    "cookieName": "_sp_id.1234"
  },

But then the issue with the other metadata still exists.

For now we should be able to get the clientID of the AMP pages as the duid. But there will be a gap between the duid on AMP pages and the duid on origin-domain pages. I can make a new PR for the cookie name configuration; maybe someone could create an issue for that.

It would also be nice if someone has ideas on how to proceed with the metadata issue.
Maybe projects with amphtml pages could store the domain_uid in an additional cookie without the other metadata.

Hey @ecoron - thanks for the detailed follow-up!

A couple of thoughts:

Cookie name

I think the cookieName idea makes sense. I might use cookieNameWithHash because in the JS Tracker, cookieName just means "_sp_id". I would not try and retrieve the cookie if the cookieNameWithHash argument is not provided, as it’s impossible to predict the cookie name.

Cookie value

On the metadata issue - you cannot continue to use the duid field because what you are sending in is not a duid - it is just a cookie value. My gut feel is that this problem is bound to come up again with another field, so it would be worth:

  1. Defining an amp_context in Iglu Central that contains the cookie value as a field (and we can add other fields to this if other fields need post-processing)
  2. Writing an AMP enrichment which works on the amp_context and post-processes its contents for Snowplow (e.g. extracting the duid from the amp_context.cookieValue)
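
To illustrate the second step, here is a minimal Python sketch of the post-processing logic only. It is not a real Snowplow enrichment. It assumes a hypothetical amp_context represented as a dict with a `cookieValue` string field (the field name comes from the proposal above; no such schema exists yet), and it only accepts the first dot-separated segment as a duid if it parses as a UUID. Anything else, such as an AMP client ID, is dropped rather than passed into the pipeline as an invalid duid.

```python
import uuid
from typing import Optional


def extract_duid(amp_context: dict) -> Optional[str]:
    """Return a domain user ID from a hypothetical amp_context, or None.

    `amp_context` and its `cookieValue` field are assumptions based on the
    proposal in this thread.
    """
    cookie_value = amp_context.get("cookieValue")
    if not cookie_value:
        return None
    # The duid is assumed to be the first dot-separated segment.
    candidate = cookie_value.split(".", 1)[0]
    try:
        # uuid.UUID raises ValueError for anything that is not a UUID;
        # str() normalizes accepted values to lowercase hyphenated form.
        return str(uuid.UUID(candidate))
    except ValueError:
        return None
```

Returning `None` for unparseable values is a design choice for the sketch: the raw value would still be available in the context for later inspection, while the duid field stays clean.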

Does that make sense?

Hey @alex @ecoron. As far as I can see, this issue has not been resolved yet. We tested the integration of the AMP tracker and found that it breaks the pipeline due to invalid duids. (I guess AMP is returning the standard AMP client IDs if the _sp_id cookie cannot be found?)

As I understand it, that effectively means the AMP tracker is broken and unusable. I cannot find any open issue or fix related to this.

As the thread is about one year old, I wonder whether we are really the only ones who have tried to use Snowplow for AMP tracking since then, or whether there is a fix or workaround I failed to find?
