We’ve recently set up a Snowplow system to replace an older, custom-built tracking system. It has mostly been implemented by following the Snowplow documentation. I am on the data science side and still getting up to speed with it, and I am curious about the specific point in the title.
With our legacy system, each session (session_id) had a boolean ‘status’ or ‘active’ flag, with:
- 1 = session is open (i.e., still capturing events)
- 0 = session is closed
I depend on this for a few models and analytics needs and want a similar understanding of the Snowplow data: is there a way to determine the state of a session (domain_sessionidx) in Snowplow? If so, would this be based on some other column/variable in the atomic.events data table (e.g., etl_tstamp), or would we need to modify our Snowplow sessions to include such a flag? We want to know this as soon as possible, so simply reviewing at an arbitrary later time would not be ideal.
A real analytics use case: we currently have analytics reports that summarize session behaviour for our marketing team. Some of these focus specifically on sessions that have closed/ended, to help the team understand how web visits are doing. Are people spending less time per session or more? Which visits contain a specific marketing event? And so on.
A real data processing use case: we have a rules-based system for notifications, some of which are based on a completed/closed session. For example, notify a representative when a visit includes a desired event (e.g., the visit contained a specific page view and surpassed a desired time on page). This should only happen once the session is closed, as it is specifically a follow-up.
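To make the two use cases concrete, here is a minimal sketch of how the legacy flag and the notification rule could be approximated. It rests on assumptions that are not confirmed anywhere above: that a session can be treated as closed once no event has arrived for 30 minutes, that the events for one session are available as plain dicts, and that time on page has already been derived into a hypothetical `seconds_on_page` field. The helper names `session_status` and `should_notify` are made up for illustration and are not part of Snowplow.

```python
from datetime import datetime, timedelta

# Assumption: a session counts as closed after this much inactivity.
SESSION_TIMEOUT = timedelta(minutes=30)


def session_status(events, now, timeout=SESSION_TIMEOUT):
    """Mimic the legacy flag: 1 = session open, 0 = session closed.

    `events` is a non-empty list of dicts for a single session, each with
    a `collector_tstamp` datetime.
    """
    last_seen = max(e["collector_tstamp"] for e in events)
    return 1 if now - last_seen < timeout else 0


def should_notify(events, now, target_page, min_seconds_on_page):
    """Fire the follow-up rule only once the session is considered closed."""
    if session_status(events, now) == 1:
        return False  # still open: wait before following up
    return any(
        e["page_urlpath"] == target_page
        # `seconds_on_page` is a hypothetical, pre-derived field.
        and e["seconds_on_page"] >= min_seconds_on_page
        for e in events
    )


t0 = datetime(2024, 1, 1, 12, 0, 0)
events = [
    {"collector_tstamp": t0, "page_urlpath": "/pricing", "seconds_on_page": 45},
    {"collector_tstamp": t0 + timedelta(minutes=5), "page_urlpath": "/home", "seconds_on_page": 10},
]
open_flag = session_status(events, t0 + timedelta(minutes=10))
closed_flag = session_status(events, t0 + timedelta(minutes=40))
```

The drawback of this approach is exactly the one raised above: the flag can only flip to closed after the timeout has elapsed, so the closed state is known with a delay at least as long as the chosen threshold.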