RFC: Google Tag Manager Server Side and Snowplow

This Request For Comment (RFC) covers our recent thinking on how GTM SS and Snowplow could fit together. Whilst we’ve been doing some thinking on what would be most helpful for users of Snowplow who are also interested in leveraging GTM Server Side, we haven’t reached any conclusions - so we haven’t built anything yet.

However, we understand some users would like to leverage GTM SS, Snowplow and other systems together so we’re opening this Request For Comment (RFC) and would love to understand your use cases for GTM SS and your preferred method of deployment, as this will help us figure out the best possible approach.

We believe there are a few possibilities that are worth exploring further:

GAv4 client side → GTM → Snowplow Client in GTM that claims the GAv4 protocol requests and forwards to the Snowplow collector

  • This would be useful for a user that has implemented GA and wants to adopt Snowplow without reinstrumenting tracking on web.
  • However, there are significant limitations: much of the rich data that the Snowplow JS tracker collects client side wouldn’t be captured (e.g. time on page, scroll depth, page view ID).
  • It might also be hard to use e.g. entities to collect rich data in a consistent way across event types without significant data transformation in GTM.

Snowplow script client side → GTM → Snowplow Client in GTM that claims the request and forwards to Snowplow collector

  • This seems like the simplest approach, and it creates the opportunity to build other clients in GTM that forward the Snowplow data to other endpoints. It is a good solution if you want to send Snowplow data to other locations.
  • However, it might not be that easy to work with the Snowplow data in GTM.

Snowplow enriched data → GTM (via e.g. PubSub using our Analytics SDKs)

  • This would effectively pipe Snowplow processed data from the enriched stream into GTM SS so it could be forwarded to other systems.
  • This might be an effective approach for e.g. relaying Snowplow data to other systems.
  • It has the benefit that the data would be well structured and easy to work with (all JSON, and all schemas would have been validated). You would also be working with processed data, so would have all the enrichments applied and e.g. PII redacted if appropriate.
  • However, it is not the intended use case for GTM SS.

so if I understood correctly, GTM SS would be useful here mostly for pushing around data across other points, something like what Segment does?

My guess is the 2nd option, where sp.js calls GTM and a tag with a Snowplow client forwards the data, seems most reasonable, unless we could set up the GAv4 client to more or less cover all aspects of current Snowplow capabilities, which is probably a bit of work.

Are there any good examples of GTM SS usage so far? I mean, are there premade clients already that could be used, to a similar degree as we have in GTM JS (e.g. FB conversion tracking and the like)?

thanks for the response @evaldas

so if I understood correctly, GTM SS would be useful here mostly for pushing around data across other points, something like what Segment does?

I think there might be some possibility for that. From the above options I think there’s a couple of strategies, but being able to have different clients claim the requests and forward the events on to the respective platforms is certainly something we’d like to consider. One thing we’d like more feedback on is if this is something people are using GTM SS for, to send events to multiple vendors (segment style).

This is an interesting introduction to clients and how they work.

Is there any good examples of GTM SS usage so far?

There is a gallery for Clients now. There’s not too much going on there at the moment, but I think that’d be a good place to start, and it’s where we’d see a Snowplow client finding itself.

thanks for the references @PaulBoocock, the link has the best explanation of the GTM SS workflow I’ve seen so far. I’m not too certain whether multiple-vendor fan-out support is something that’s widespread in the Snowplow community, as it’s not trivial to do without any extra tools. Though if there were support for that, maybe some use cases would emerge. One example comes to mind: for conversion tracking, it might be enough to track a conversion once and then send it to all other targets like FB, GA and so on, which could lead to more consistent data.

When I look at current support, it’s a little questionable how much Google is planning to invest in this, as I don’t see Google Ads tag support, unless it’s part of GA4. Also, in contrast to client-side tracking, which is relatively simple to understand, the server-side flavour introduces more technical hurdles, and it’s a question whether, if adoption is low, it will stick around as long as client-side GTM has, or remain just a small project. Maybe someone who has already tried using it, or has used similar approaches, can chime in; it would be interesting to read.

So to conclude: in theory I kind of see the benefits of having this supported, but it’s just a question of whether there will be enough push/demand for it. Maybe once third-party cookies become totally obsolete, this could be something that replaces them.

Hello, I’m currently in the process of trying to implement method #2, but am a bit confused on some of the details. I am hoping someone might be able to help or link to some related documentation that I haven’t been able to find. Please let me know if this would be better suited to be its own post!

I’m confused about how to achieve the last part:

Snowplow script client side → GTM → Snowplow Client in GTM that claims the request and forwards to Snowplow collector

I have the Snowplow script running in the client-side container, with it sending requests to my server-side container. I have also created a custom Snowplow client to receive these requests in my server-side container (which it is doing correctly). However, the part that I’m confused about is how the requests are then supposed to be sent to the collector. I assume a custom tag template should be set up on the server-side container to receive the event from the custom Snowplow client and forward it to the collector, but how? What sort of Snowplow tracker should we be using for this, and how do we make it accessible in our server-side tag? The only possibility I see is using the API, but this isn’t currently available to me as a Try-Snowplow user.

Thanks in advance!

Hey all

Perhaps I can clarify a bit.

SGTM comprises Clients (listeners that respond to certain types of HTTP requests, and then do something) and Tags (scripts typically designed to dispatch outgoing requests when a Client triggers them).

So while you could indeed use a Client to send data to the vendor directly, typically the Client should be built so that it parses all the useful information from the incoming request into a generic event data object which can then be used to send the data to any endpoint, not just the one corresponding with the Client.

This leads to the development options for Snowplow. In my opinion it should support both.

A Snowplow Client needs to be built which pulls in the payload generated by the SP JS tracker. The Client parses the information into a generic event object which can then be used to fire the Snowplow tag, and/or UA tag, and/or GA4 tag, and/or Facebook tag etc. (this is the power of SGTM).

A Snowplow Tag needs to be built so that it grabs the event data generated by the Snowplow Client (or a GA4 Client or a UA Client etc.) and sends it to the SP pipeline.

In other words, both the Client and the Tag need to be built so that they can work with each other but also with other vendors’ Clients and Tags.

This is totally doable! The key content in the Snowplow payload from the JS tracker is remarkably similar (semantics-wise) to what GA4, UA and Facebook digest. This is why these multi-purpose streams are possible with the right kind of engineering.
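To make the Client-to-event-object idea concrete, here is a plain JavaScript sketch of the kind of mapping a Snowplow Client could perform. This is deliberately not the sandboxed SGTM template API (real Client templates use require'd helpers such as claimRequest and runContainer); the field names (e, url, page, uid) come from the Snowplow tracker protocol, and the x-sp-payload key is a hypothetical convention for carrying the raw payload through to a Snowplow Tag.

```javascript
// Sketch (assumptions noted above): map one Snowplow tracker-protocol event
// into a generic event data object that any downstream tag could consume.
function snowplowToEventData(spEvent) {
  // Snowplow uses short codes for event types; expand them to readable names.
  const eventNames = {
    pv: 'page_view',
    pp: 'page_ping',
    se: 'structured_event',
    ue: 'unstructured_event',
  };
  return {
    event_name: eventNames[spEvent.e] || spEvent.e,
    page_location: spEvent.url,
    page_title: spEvent.page,
    user_id: spEvent.uid,
    // Hypothetical: keep the raw payload so a Snowplow Tag can forward it losslessly.
    'x-sp-payload': spEvent,
  };
}

const ev = snowplowToEventData({ e: 'pv', url: 'https://example.com/', page: 'Home' });
console.log(ev.event_name); // page_view
```

A GA4 or Facebook tag would then read the generic fields (event_name, page_location, ...), while a Snowplow tag could prefer the raw payload if present.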

Fields and parameters that are idiosyncratic to Snowplow can be mapped from custom parameters or HTTP headers, if using e.g. GA4 to collect the data to the Server container.

For an idea how this all works together, check out this guide I wrote for using a GA4 web tag to feed data into a Facebook server-side GTM tag: Facebook Conversions API Using GA4 Web Tags And A GTM Server | Simo Ahava's blog. The principle would be the same here. The Facebook Tag would of course work best with the Facebook Client, but it can be engineered to work with other Clients as well.

Ultimately, the idea is to reduce the number of outgoing streams from the browser (or app or other connected internet device). Instead of sending both a GA4 payload AND a Snowplow payload from the browser, only one needs to be sent as the Server container can then “fan” it out to the other.

@PaulBoocock - happy as always to work with you on building the SGTM templates!

Simo


Thanks @simoahava, that helped clarify the overall process for me a lot.

I am still confused about how you would actually send the event from the SGTM tag template to your snowplow collector. I see now the API I linked in my earlier comment is completely unrelated. The Facebook Conversions API Tag example obviously uses the Facebook Conversions API to accomplish this; what is the Snowplow equivalent?

@PaulBoocock Should I be able to send POST requests to my collector endpoint (https://Google_[some-number].try-snowplow.com) with a payload of data structured as described here? I played around a bit with trying this and couldn’t get it to work. Assuming this is the way to go, a basic example would be amazing. If I’m completely off base is there some other way of accomplishing this? TIA!

Hi @Laura_Henn

With SGTM tag templates, you can generate an HTTP request which the server then dispatches.

The tag needs to map the fields in the Event Data object (generated by the Client) into e.g. GET request parameters (e.g. &param1=value1&param2=value2) or into a POST request body (e.g. JSON.stringify({param1: value1, param2: value2})), so it really just depends how the Collector is configured and what types of requests it digests.
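As a plain JavaScript illustration of that mapping (a real SGTM tag template would use the sandboxed sendHttpRequest helper to dispatch the result; these two functions just show the two request shapes):

```javascript
// Sketch: turn a Client-generated event data object into either a
// GET query string or a JSON POST body, as described above.
function toQueryString(eventData) {
  return Object.entries(eventData)
    .map(([k, v]) => encodeURIComponent(k) + '=' + encodeURIComponent(v))
    .join('&');
}

function toPostBody(eventData) {
  return JSON.stringify(eventData);
}

const eventData = { param1: 'value1', param2: 'value2' };
console.log(toQueryString(eventData)); // param1=value1&param2=value2
console.log(toPostBody(eventData));    // {"param1":"value1","param2":"value2"}
```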

This part is particularly interesting for the use cases we’re aiming at. It’s likely where we will start with our efforts I think.

It also feels like we could achieve that slightly strange option #3 above using a specific Client for the enriched stream. We could build a client for the enriched stream which parses and packages the data up for others to consume, safe in the knowledge that it’s been validated and enriched by Snowplow. I’m just not sure how useful it would be, though. I guess the object passed between tags needs to be relatively “standard”, so we might have to strip quite a bit out, which likely doesn’t make it much more useful than the original data the Snowplow JS Tracker sends anyway.

Do you see the Snowplow JS Tracker being a good source for this, as it’s quite a rich dataset (and extensible), or do you think GAv4 would be a better initial source for a Snowplow tag? I think I’m just wondering where I should start my initial effort: with the Snowplow tag or the client.

Thanks for the offer :slight_smile: I’m hoping to spike something out the week after next, so I’ll reach out once I’ve had a little tinker to see if I’m heading in the right direction.


Yes, that should work. The Snowplow JavaScript Tracker will generate events which conform to the tp2 protocol, so it’d probably be better to use that rather than trying to generate your own payloads, or at least to use the payloads from the JavaScript Tracker in the browser as inspiration.
If you’re looking at sending from within the Container, then yes, constructing your own payloads which conform to tp2 and POSTing them is the right approach.
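A rough sketch of what such a tp2 POST could look like, to the best of my understanding of the tracker protocol (the /com.snowplowanalytics.snowplow/tp2 path and the payload_data schema URI are from the Snowplow tracker protocol; COLLECTOR_URL is a placeholder for your own collector endpoint, and the schema version should be checked against the current docs):

```javascript
// Sketch (assumptions noted above): build a tp2-style request description.
// A real SGTM tag would hand this to the sandboxed sendHttpRequest helper.
const COLLECTOR_URL = 'https://collector.example.com'; // placeholder

function buildTp2Request(events) {
  return {
    url: COLLECTOR_URL + '/com.snowplowanalytics.snowplow/tp2',
    method: 'POST',
    headers: { 'Content-Type': 'application/json; charset=UTF-8' },
    body: JSON.stringify({
      // payload_data wraps an array of tracker-protocol events.
      schema: 'iglu:com.snowplowanalytics.snowplow/payload_data/jsonschema/1-0-4',
      data: events,
    }),
  };
}

// A minimal page-view event in tracker-protocol form:
const req = buildTp2Request([
  { e: 'pv', url: 'https://example.com/', page: 'Home', p: 'web', tv: 'js-3.1.0' },
]);
console.log(req.url); // https://collector.example.com/com.snowplowanalytics.snowplow/tp2
```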


I think the most useful option is to start with the Client. Snowplow could well be a candidate for the mythical first-party library that becomes the (almost) sole driver of web- / app-based hits to the Server container, which then fans it out to other vendors such as GA and Facebook.

Naturally the Client itself is fairly useless without a Snowplow tag, but if you start with the Client you’ll establish the architecture for what the tag will eventually digest.

Another thing to think about is where to add the option for overrides. If you take a look at the Google Clients, for example, they don’t really let you touch the incoming stream or the event data object they’re mapped to. I personally think that’s a good idea – the Client should generate the event data object, and then it’s up to the tags to map / create / delete the fields and parameters parsed for the outgoing request.

But some manipulation options might be useful, such as the option to obfuscate source IP and User Agent (and any other potentially fingerprintable surfaces).
