JS Tracker Edge Analytics: Aggregating Page Ping events client side

Version 2.13.0 of the Snowplow JavaScript Tracker introduces edge analytics. In particular, activity tracking gains a new callback function, so it is now possible to receive the activity tracking (page ping) events in the browser and aggregate them into a single event.

2.13.0 is available on GitHub releases.

Activity Tracking is a feature of the Snowplow JavaScript tracker that sends events to a Snowplow collector at specified time intervals. These events can be aggregated in a data modelling step to calculate page activity, from length of time on page to how far the user scrolled through the page.

However, one downside of this technique is the large number of events that must be sent to achieve good levels of accuracy. With version 2.13.0 it is now possible to do this aggregation client side to reduce, or potentially eliminate, page ping events. You will need to do some work to aggregate the information and then send it to the Snowplow collector. Below we look at one example of how you might achieve this.

The Snowplow JavaScript tracker now exposes a new function called enableActivityTrackingCallback (See docs). Using this callback means that each time a page ping event would normally be sent, your callback function will be executed with the page activity information instead.

// Initialise the tracker; 'beacon' helps events survive page unload.
window.snowplow('newTracker', 'sp', '<<collectorUrl>>', {
    appId: 'my-app-id',
    eventMethod: 'beacon',
    contexts: {
        webPage: true,
        performanceTiming: true
    }
});
// Running aggregate of all page pings seen for this page view.
// Offsets start at 0, which suits pages that load unscrolled.
var aggregatedEvent = {
    pageViewId: null,
    minXOffset: 0,
    maxXOffset: 0,
    minYOffset: 0,
    maxYOffset: 0,
    numEvents: 0
};
// Called every 10 seconds of activity (after a 10 second minimum visit)
// in place of sending a page ping.
window.snowplow('enableActivityTrackingCallback', 10, 10, function (event) {
    aggregatedEvent = {
        pageViewId: event.pageViewId,
        minXOffset: aggregatedEvent.minXOffset < event.minXOffset ? aggregatedEvent.minXOffset : event.minXOffset,
        maxXOffset: aggregatedEvent.maxXOffset > event.maxXOffset ? aggregatedEvent.maxXOffset : event.maxXOffset,
        minYOffset: aggregatedEvent.minYOffset < event.minYOffset ? aggregatedEvent.minYOffset : event.minYOffset,
        maxYOffset: aggregatedEvent.maxYOffset > event.maxYOffset ? aggregatedEvent.maxYOffset : event.maxYOffset,
        numEvents: aggregatedEvent.numEvents + 1
    };
});
// Send one aggregated event when the page unloads.
window.addEventListener('unload', function() {
    window.snowplow('trackSelfDescribingEvent', {
        schema: 'iglu:com.acme_company/page_unload/jsonschema/1-0-0',
        data: {
            minXOffset: Math.max(0, Math.round(aggregatedEvent.minXOffset)),
            maxXOffset: Math.max(0, Math.round(aggregatedEvent.maxXOffset)),
            minYOffset: Math.max(0, Math.round(aggregatedEvent.minYOffset)),
            maxYOffset: Math.max(0, Math.round(aggregatedEvent.maxYOffset)),
            // Each callback represents one 10 second heartbeat.
            activeSeconds: aggregatedEvent.numEvents * 10
        }
    });
});
window.snowplow('trackPageView');

Note: For this technique of sending on page unload to work reliably, we recommend initialising the Snowplow JavaScript Tracker with eventMethod: 'beacon' and/or stateStorageStrategy: 'cookieAndLocalStorage' (if navigating to a page that also contains the JS Tracker). The page unload technique will not work for Single Page Applications (SPA); there you would need to send the aggregated event to the Snowplow collector on navigation within your application.

In this example, we have introduced a new event schema called page_unload. This would need to be uploaded to your Iglu repository and will likely need to look something like this:

{
        "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
        "description": "Schema for an aggregated activity tracking event sent on page unload",
        "self": {
                "vendor": "com.acme_company",
                "name": "page_unload",
                "format": "jsonschema",
                "version": "1-0-0"
        },
        "type": "object",
        "properties": {
                "minXOffset": {
                        "type": "integer",
                        "minimum": 0
                },
                "maxXOffset": {
                        "type": "integer",
                        "minimum": 0
                },
                "minYOffset": {
                        "type": "integer",
                        "minimum": 0
                },
                "maxYOffset": {
                        "type": "integer",
                        "minimum": 0
                },
                "activeSeconds": {
                        "type": "integer",
                        "minimum": 0
                }
        },
        "required": ["minXOffset","maxXOffset","minYOffset","maxYOffset","activeSeconds"],
        "additionalProperties": false
}

If you have any other thoughts about this feature, other edge analytics ideas, or examples of where you have utilised it, we would love to hear from you!

Paul - this is a brilliant idea and a really nice execution. Anything that moves towards taking page_pings out of events and into their own context is a win in my book!

Is it worth publishing a com.snowplowanalytics.snowplow/page_unload/jsonschema/1-0-0 schema in Iglu Central to use by default, while allowing users to override this schema in the tracker?

We thought about publishing a new schema, but there are so many different use cases for this new callback concept (on page_unload, on page navigation in a SPA, batching or aggregating page pings and sending on a less frequent timer) that we didn’t feel there was a definitive schema to publish.

In the end leaving this open for users of this functionality to describe their own event types based on their usage of the data in the callback felt like the best approach.

This is brilliant, thank you! Just a heads up, there is a small copy paste error:

This row is a duplicate and one should refer to maxXOffset instead:

maxYOffset: aggregatedEvent.maxYOffset > event.maxYOffset ? aggregatedEvent.maxYOffset : event.maxYOffset

Cheers

Fixed the original post, good spot!

Hi @PaulBoocock,

While testing I’ve realized that both minOffset (X and Y) values will always be 0 in the code example above, if I’m not mistaken.
This would be right most of the time, of course, as a new page will be loaded with YOffset = 0. However, when reloading pages that are already scrolled, or when opening links that have anchors, there would be a non-zero initial minYOffset.
I believe the problem lies in the initial minOffset values, which are set to 0. However, I don’t know how to initialize them differently. Does Snowplow offer an API to get the initial scroll offsets? I could probably use standard JavaScript functions to get the values…

Cheers
Andreas

This is correct. The example we gave was a simplified one that works for the majority of page views, but if you’re navigating with an anchor tag then you’ll want to initialise the values to something other than 0.

Here is the code we run in the JS tracker to populate the values; you could use the same snippet to do so when your pages load:

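// documentAlias and windowAlias are the tracker's internal references to
// document and window; substitute those when using this outside the tracker.
function getPageOffsets() {
    var iebody = (documentAlias.compatMode && documentAlias.compatMode !== "BackCompat") ?
        documentAlias.documentElement :
        documentAlias.body;
    return [iebody.scrollLeft || windowAlias.pageXOffset, iebody.scrollTop || windowAlias.pageYOffset];
}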

This returns an array where [0] is the x offset and [1] is the y offset.
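
One way to handle the reload and anchor cases is to seed the aggregate from the current scroll position instead of 0. The following is only a minimal sketch, not tracker code: createAggregate and mergePing are hypothetical helper names, and it assumes the callback event carries the same minXOffset, maxXOffset, minYOffset and maxYOffset fields used in the original example.

```javascript
// Hypothetical helpers (not part of the Snowplow API).
// Seed min and max with the scroll position at page load, so a page that
// starts scrolled does not report a minimum offset of 0.
function createAggregate(initialX, initialY) {
    return {
        pageViewId: null,
        minXOffset: initialX,
        maxXOffset: initialX,
        minYOffset: initialY,
        maxYOffset: initialY,
        numEvents: 0
    };
}

// Fold one activity callback event into the aggregate, returning a new object.
function mergePing(aggregate, event) {
    return {
        pageViewId: event.pageViewId,
        minXOffset: Math.min(aggregate.minXOffset, event.minXOffset),
        maxXOffset: Math.max(aggregate.maxXOffset, event.maxXOffset),
        minYOffset: Math.min(aggregate.minYOffset, event.minYOffset),
        maxYOffset: Math.max(aggregate.maxYOffset, event.maxYOffset),
        numEvents: aggregate.numEvents + 1
    };
}

// Browser wiring; skipped when no tracker is present on the page.
if (typeof window !== 'undefined' && typeof window.snowplow === 'function') {
    var aggregatedEvent = createAggregate(window.pageXOffset || 0, window.pageYOffset || 0);
    window.snowplow('enableActivityTrackingCallback', 10, 10, function (event) {
        aggregatedEvent = mergePing(aggregatedEvent, event);
    });
}
```

Depending on when your script runs, the browser may not yet have applied the anchor scroll, so you may want to read the offsets after the page has finished loading.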

Hi Paul,

Thank you for your post, this is a great idea to cut down on the number of requests sent.

I’ve tried implementing this but I don’t see my requests being sent when the page is closed. I am using the “unload” event listener just like you do in your example, but it doesn’t seem to send a request when the page is closed. Could it be that the request takes too long to send? Or is there something else that I’m doing wrong?

I am using eventMethod: 'beacon' in my tracker configuration.

Thank you,

Munichdev

Hi @munichdev ,

This thread is a bit outdated. Nowadays it is recommended to avoid using the unload event listener, see here.

Instead, it is better to use the visibilitychange event. In our docs, there is a code snippet that shows how to do that, see here.
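
For reference, a minimal sketch of that approach under a few assumptions: onPageHidden is a hypothetical helper name, the document object is passed in so the logic can be exercised outside a browser, and flush stands for whatever function sends your aggregated event (for example, the trackSelfDescribingEvent call from the original post).

```javascript
// Hypothetical helper (not part of the Snowplow API).
// Calls flush() each time the page becomes hidden: tab switch, navigation,
// tab close, browser minimise, or app switch on mobile.
function onPageHidden(doc, flush) {
    function handler() {
        if (doc.visibilityState === 'hidden') {
            flush();
        }
    }
    doc.addEventListener('visibilitychange', handler);
    // Return a function that removes the listener again.
    return function stop() {
        doc.removeEventListener('visibilitychange', handler);
    };
}

// Browser wiring; skipped when there is no document or tracker.
if (typeof document !== 'undefined' &&
    typeof window !== 'undefined' &&
    typeof window.snowplow === 'function') {
    onPageHidden(document, function () {
        // Replace with the code that sends your aggregated event.
    });
}
```

Because the handler can fire several times per page view, the sending function should tolerate being called more than once.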

Hi Matus,

Thank you for your quick answer!

I was using visibilitychange at first, and I saw it in your docs. But for our use case, we are trying to send an event when the user closes the page, not just when they hide it. Is this possible another way?

Without knowing when the user closed the page, you can’t identify if the user was dormant on a page for a while since page pings wouldn’t be fired. Am I understanding this correctly? So you couldn’t really get the total time the user was on a page, only the time when they loaded the page until their last interaction on the page. I guess this would be good enough for most scenarios, but I assumed there was an option to fire an event when the page was closed.

Thanks again,
Munichdev

I think the visibilitychange event is useful exactly for recognizing when the user closed the page. Reading from the MDN Web Docs:

This event fires with a visibilityState of hidden when a user navigates to a new page, switches tabs, closes the tab, minimizes or closes the browser, or, on mobile, switches from the browser to a different app.

So I think that should cover your use case too, right?

Hi Matus,

Yes, it somewhat covers our use case. Unfortunately it doesn’t distinguish between something that hides the page, such as switching tabs, and an event that closes the page.

But thank you for your help.

Best regards,
Munichdev

Hi Munichdev,

More than one page_ping event per page_view (e.g. due to a tab switch) is not an issue, since you need to aggregate the events in the data model anyway on pageViewId (web page context). We implemented the logic with onVisibilityChange and it works. Keep in mind: in the case of an SPA, onRouteChange is needed as an additional trigger, and aggregatedEvent must be reset.
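
Roughly, the flush-and-reset logic could look like the sketch below. This is an illustration rather than our production code: createPingAggregator and onRouteChange are hypothetical names, the payload fields follow the page_unload schema from the original post, and the aggregate is seeded from the first callback event instead of 0.

```javascript
// Hypothetical helper (not part of the Snowplow API). intervalSeconds should
// match the heartbeat passed to enableActivityTrackingCallback; send is any
// function that delivers the payload to the collector.
function createPingAggregator(intervalSeconds, send) {
    var state = null; // null means there is nothing to send

    return {
        // Fold one activity callback event into the running aggregate.
        add: function (event) {
            if (state === null) {
                state = {
                    minXOffset: event.minXOffset,
                    maxXOffset: event.maxXOffset,
                    minYOffset: event.minYOffset,
                    maxYOffset: event.maxYOffset,
                    numEvents: 0
                };
            }
            state.minXOffset = Math.min(state.minXOffset, event.minXOffset);
            state.maxXOffset = Math.max(state.maxXOffset, event.maxXOffset);
            state.minYOffset = Math.min(state.minYOffset, event.minYOffset);
            state.maxYOffset = Math.max(state.maxYOffset, event.maxYOffset);
            state.numEvents += 1;
        },
        // Send the aggregate if there is one, then reset. Returns true if sent.
        flush: function () {
            if (state === null) {
                return false;
            }
            var payload = {
                minXOffset: Math.max(0, Math.round(state.minXOffset)),
                maxXOffset: Math.max(0, Math.round(state.maxXOffset)),
                minYOffset: Math.max(0, Math.round(state.minYOffset)),
                maxYOffset: Math.max(0, Math.round(state.maxYOffset)),
                activeSeconds: state.numEvents * intervalSeconds
            };
            state = null;
            send(payload);
            return true;
        }
    };
}

// Possible wiring (onRouteChange is a placeholder for your router's hook):
//   var pings = createPingAggregator(10, function (data) {
//       window.snowplow('trackSelfDescribingEvent', {
//           schema: 'iglu:com.acme_company/page_unload/jsonschema/1-0-0',
//           data: data
//       });
//   });
//   window.snowplow('enableActivityTrackingCallback', 10, 10, pings.add);
//   document.addEventListener('visibilitychange', function () {
//       if (document.visibilityState === 'hidden') { pings.flush(); }
//   });
//   onRouteChange(function () { pings.flush(); window.snowplow('trackPageView'); });
```

Flushing before the next trackPageView should keep the aggregated event attached to the page view it describes, assuming the web page context is enabled.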

Hope that helps.
David

Hi David,

I think you’re right, we have to do some aggregation regardless, so visibility change seems to be the best route.

And thanks for the SPA information! I’ll keep a note of that for the future.

Best regards,
Munichdev