Snowplow Bounce Rate - Tableau


#1

Hi all,

We’re using Tableau to visualise Snowplow web data, and I want to create a simple visualisation of bounce rate.

Does anyone know how to create a calculated metric for the number of bounced sessions i.e. sessions with just a single event? I’ve read Yali’s article on SQL queries for Snowplow data, and the methodology does make sense, however I am not sure how to calculate the number of bounces. Once I have that, I think the calc will be very straight forward:

SUM(Bounces)/SUM(Sessions)

Where Sessions is a calculated metrics like the following:

COUNTD([Domain Userid]+"-"+STR([Domain Sessionidx]))

If anyone can help me crate a calc for bounces it’d be massively appreciated!

Thanks


#2

Have you created a persistent sessions table similar to what the Snowplow team does using SQL Runner? If so, there should be a column for event_count that you can use for counting how many sessions only have an event_count=1

If not, you’d need to use a SQL query on the events table that aggregates (counts) the number of events (grouped by domain_userid and session_idx) and then you could calculate bounce rate off of that view with Tableau.

The Snowplow team has a basic SQL recipe that creates a daily bounce rate view (off the atomic.events table) which looks like this:

-- Bounce rate by day
CREATE VIEW recipes_basic.bounce_rate_by_day AS
	SELECT
		DATE_TRUNC('day', time_first_touch) AS "Date",
		SUM(bounces)::REAL/COUNT(*) as "Bounce rate"
	FROM (
		SELECT
			domain_userid,
			domain_sessionidx,
			MIN(collector_tstamp) as "time_first_touch",
			COUNT(*) as "number_of_events",
			CASE WHEN count(*) = 1 THEN 1 ELSE 0 END AS bounces
		FROM "atomic".events
		WHERE collector_tstamp > current_date - integer '31'
		GROUP BY 1,2
	) v
	GROUP BY 1
	ORDER BY 1;

#3

Thanks for this.

I have seen this SQL, but I was just curious of any Calculated Field inside Tableau that would generate Bounces as a measure, so I could do the calculation in Tableau.


#4

Hi,

First of all, bounce is usually calculated through page views (google analytics), and not through total events. So I don’t agree with the count(*).
You can create a view like this:

create view atomic.events_w_bounce as 
select e.*, case when a.domain_sessionid is not null and e.event = 'page_view' then 1 else 0 end as bounce
from atomic.events e
left join (
select domain_sessionid, count(distinct event_id) as page_views
from atomic.events
where event = 'page_view'
group by 1
having count(distinct event_id) = 1
) a
on e.domain_sessionid = a.domain_sessionid;

#5

Bounce is a session level metric and the count(*) from the snowplow query is pulling from a subquery that has grouped the events into user sessions, so it should work

Any interaction event other than the first pageview will constitute a non-bounce, and that’s consistent with GA’s definition: https://support.google.com/analytics/answer/1009409?hl=en


#6

It still says “single-page session”. You may interpret it as you like, it’s fine. Just think about things like scrolling- if a user went into to the page, scrolled and closed the browser, would you consider it as bounce? The user did perform an action (scrolled).
If you wouldn’t count it as bounce, I would recommend you adding a tag to the events of “Interaction/ Non Interaction”, and count only the interaction events when calculating bouncing.
And another thing, you should still go for count(distinct event_id) instead of count(). We see sometimes events are written twice. You can check it by running:
select event_id, count(
) as n
from atomic.events
where collector_tstamp>=getdate()::date - 30
group by 1
having count(*)>1
order by 2 desc
limit 100

And see if you have any results. If not, then forget about it :slight_smile:


#7

Ahh, I see what you’re saying about the count(*) with duplicate events. We run a series of event deduplication queries on our tables but it is more robust to count distinct for sure.

Agree on the interaction/non-interaction event tag. Scrolling/reading the page for some amount of time is definitely in the grey area for some of us.


#8

Depending on your definition of bounce rate, you may want to exclude page_ping events from your SQL query.

OP is asking specifically about “the number of bounced sessions i.e. sessions with just a single event”. In that case you’ll need to add event != 'page_ping' to the WHERE clause of the query. Otherwise, the first ping on a page will increase the event count and that user will not be counted as bounced.

Of course, a visitor who only visits one page but stays on it for 5 mins could be seen as highly engaged, depending on what you’re looking to measure. Also, the time between pings is something Snowplow users set by themselves, so keep that in mind when deciding whether to exclude page_pings or not.