Customer Funnel based on pageviews?


#1

Hello all

This might be more of a generic question and not specific to Snowplow but I just wanted to pick your brains on setting up a funnel report based on Page visits and views. (current page & previous page combinations)

If you have any experience with this or if there is anything the Snowplow data that can help with this, please let me know. For example, I have yet to look into and understand the PageView Index.

What I’m looking into is flows between pages. For example. from a Product Page -> how many people went to Basket, how many Exited, how many went to another product, etc…

Each Page View will be an opportunity for the visitor to either follow the funnel (as we intended it) and move to the Basket or do something different such as go back to homepage, exit, etc. I’ve got a feeling that this sort of report will give different insight vs a funnel report where 100% of sessions start at Page A and 2% end up on Page Z.

Very interested to hear your thoughts.


#4

Hi @Woody,

Thanks for the question. This is something you can do with Snowplow, and there are some atomic.events fields that will help with this. It looks like you’ve got the standard page view tracking live, which is great. Here’s the cannonical data model, which can be a useful resource that might help with exploring your data - it gives you a definition of all the standard fields.

If I understand what you’ve described, you’re looking to count domain_userids (or userids if you’re only looking at logged in users) for page A first, then count them for page B only if the previous page was page A (then iterate this for the whole funnel).

You can achieve this using the refr fields alongisde the page fields (which of these is most useful depends on your website’s urls - my starting point would be the page_urlpath and refr_urlpath.

The refr fields will tell you what the last page was for that user, the page fields describe the current.

There are a few ways to build the query. You could try using a CASE statement, then a count distinct users per step, for example.

The query would look something like this:


WITH step1 AS(

SELECT 

  domain_userid,
  CASE 
    WHEN page_urlpath = 'product/something' THEN 'product'
    WHEN page_urlpath = 'conversion/step1' AND refr_urlpath = 'product/something' THEN 'intended_conversion_step_1'
    WHEN page_urlpath = 'conversion/step1' AND refr_urlpath = 'conversion/step2' THEN 'intended_conversion_step_2'
  END AS buckets,

  FROM atomic.events
)

SELECT 

  buckets,
  count(DISTINCT domain_userid)

FROM step1
GROUP BY 1

You can amend the case statement to use a LIKE statement if that’s more appropriate.

Let us know how you get on!


#5

Thanks Colm! I’ve started going through the data and am now validating some of the numbers.

Thanks very much for the Cannonical page! Do you know why this one “page_view_in_session_index” isn’t in the list? It might help me with my funnel as it tell me where in a session the page most appears…


#6

Disco is an excellent piece of software for this (though a bit pricey). You can try out an evaluation licence with Snowplow data (or any other clickstream data for that matter) to see how it works for you.

See this guide on how to get it up and running with Snowplow and GA 360 data - https://www.iglooanalytics.com/blog/path-analysis-in-google-analytics-360.html


#7

Do you know why this one “page_view_in_session_index” isn’t in the list?

Are you running the web model? This field is calculated in the web model by assigning a number to each page_view within the session in this line.

The cannonical event model above describes the atomic data (ie the data as it arrives in Redshift) - this will always live in the atomic schema. The web-model is a data modeling process which aggregates the data to a page_view, session, and user level. This will live wherever you have set up the web model to output the data (we usually use a derived schema).

While I don’t have any field-by-field documentation of the web model, you can see the SQL in its entirety here.

Hope this helps!


#8

Thanks again Colm, that helped a lot! Much appreciated.

I setup several different Looks and the data coming out of them is great! The only downside is that for each step in the funnel, I need to setup several different Looks.

E.g. to find the progress from PageA to PageB, I need two Looks:
Look1 = Total pageviews on PageA (e.g 100k)
Look2 = Total pageviews on PageB where PageA was the referral (e.g 3k)

Look2 / Look1 tells me the progress rate, 3%.

The next challenge is to do this for several pages within the same Look… :slight_smile: I guess a fancy query will be able to do this so I need to dust of my 101 book…

All the best,
Woody


#9

Glad to help @Woody.

Doing this kind of path analysis can get tricky in SQL alright. As @jrpeck1989 suggested above there are tools to help with this. In addition to Disco, there’s Indicative, an awesome tool for this kind of visualisation (not sure on price though), and Neo4j have an opensource option and is designed for this kind of thing, but would take some work in getting Snowplow data in. Here’s a blog that takes you through doing this, and analysing the data.

Do let us know if you come up with some SQL magic or some other solution to this.

Best,
Colm