Calculating realtime metrics from mobile events

A bit of context:
In the project I’m working on, we already calculate some metrics (in “realtime”) from web events generated by a Snowplow web tracker. Let’s focus on AvgTimeSpent.

We are using Spark Structured Streaming to perform those calculations. In those jobs, events with the same pageview_id are grouped into 1-minute windows. That pre-aggregation is then used to build 5-minute windows, which are finally used to calculate the metrics.

Here is a simplified example:

Incoming events:
ts       ev_type   page  pv_id
12:00:00 page_view /home 0001
12:00:10 page_ping /home 0001
12:00:20 page_ping /home 0001
12:00:22 page_view /home 0002
12:00:32 page_ping /home 0002

1 minute aggregation:
ts       page  time_spent pv_id
12:00:00 /home 20         0001
12:00:00 /home 10         0002

Average time spent:
ts       page  avg_time_spent
12:00:00 /home 15
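
For reference, the simplified example above can be reproduced in a few lines of plain Python. This is only a sketch of the logic, not the actual Spark job: it assumes the time spent on a pageview is the gap between the first and last event sharing a pv_id, and the function names are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

# The incoming events from the example: (ts, ev_type, page, pv_id)
events = [
    ("12:00:00", "page_view", "/home", "0001"),
    ("12:00:10", "page_ping", "/home", "0001"),
    ("12:00:20", "page_ping", "/home", "0001"),
    ("12:00:22", "page_view", "/home", "0002"),
    ("12:00:32", "page_ping", "/home", "0002"),
]


def to_seconds(ts):
    """Convert an HH:MM:SS string to seconds since midnight."""
    t = datetime.strptime(ts, "%H:%M:%S")
    return t.hour * 3600 + t.minute * 60 + t.second


def time_spent_per_pageview(rows):
    """Assumed rule: time spent = last event ts - first event ts per pv_id."""
    bounds = {}
    for ts, _ev_type, page, pv_id in rows:
        s = to_seconds(ts)
        first, last = bounds.get((page, pv_id), (s, s))
        bounds[(page, pv_id)] = (min(first, s), max(last, s))
    return {key: last - first for key, (first, last) in bounds.items()}


def avg_time_spent(per_pageview):
    """Average the per-pageview time spent for each page."""
    by_page = defaultdict(list)
    for (page, _pv_id), spent in per_pageview.items():
        by_page[page].append(spent)
    return {page: sum(v) / len(v) for page, v in by_page.items()}


per_pageview = time_spent_per_pageview(events)
averages = avg_time_spent(per_pageview)
print(per_pageview)  # {('/home', '0001'): 20, ('/home', '0002'): 10}
print(averages)      # {'/home': 15.0}
```
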

Calculating these metrics is possible because the web tracker sends page pings, so a 5-minute window has enough information to determine how many pageviews occurred and how much time a user spent on a page in that interval.

Since the mobile trackers have no support for page pings, I wonder if anyone has calculated similar realtime metrics for mobile events.

I’m not looking for a solution specific to Spark Structured Streaming; we could just as well be using Flink or something else.

I’m more interested in the approach used to perform the calculation. Things like:
- Which events did you configure on the mobile tracker?
- How do you use those events to calculate your metrics?

Any help is appreciated.

screen_view events would be the event type I’d initially focus on; you can think of them in much the same way as a page_view. This is an unstructured event, but you can filter on the event_name field rather than event.
I think how to work out activity on a screen is quite app-specific, as behaviour can be very different between apps. Do users switch screens a lot, or do they scroll a lot? If they scroll, you might want to introduce a custom event schema that captures scroll depth at certain scroll thresholds or at fixed time intervals (like page_pings do). With that, the calculation should be quite similar to what you do for the web.

One other consideration on mobile is that app switching happens quite frequently for many applications and many users. Capturing the foreground and background events can be really useful here: they tell you when your user has stopped being active in your app and when they return. This could certainly become part of your calculation, although event order might be more important here (you’re going to want to order by a timestamp).
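
As a rough illustration of why ordering matters, here is a minimal Python sketch that sums the time an app spent in the foreground. The event names application_foreground and application_background, the timestamps (plain seconds) and the sample data are assumptions for the example, and so is the rule that any non-background event marks the app as active.

```python
# Hypothetical event names; adjust to whatever your tracker actually emits.
BACKGROUND = "application_background"

# (timestamp in seconds, event_name), deliberately out of order
events = [
    (90, "application_foreground"),
    (0, "screen_view"),
    (120, "application_background"),
    (30, "application_background"),
    (100, "screen_view"),
]


def foreground_seconds(rows):
    """Sum the intervals between becoming active and the next background event."""
    total = 0
    active_since = None  # start of the currently open foreground interval
    last_ts = None
    for ts, name in sorted(rows):  # order by timestamp first
        last_ts = ts
        if name == BACKGROUND:
            if active_since is not None:
                total += ts - active_since
                active_since = None
        elif active_since is None:
            # Any other event means the user is (back) in the app.
            active_since = ts
    if active_since is not None:
        # Interval still open: count it up to the last event seen.
        total += last_ts - active_since
    return total


active = foreground_seconds(events)
print(active)  # 60
```

Without the sort, the background event at 30 would arrive after the one at 120 and the intervals would be paired up incorrectly.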

Lastly, I’m not sure active time on screen for mobile is as useful as active time on page is on the web. Active time in session might be a better metric; you could group together all the events for a session ID (in the session context) to achieve this. Then you’d take all the screen views, foreground and background events and calculate how long a user has been in session in your app.
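
A minimal sketch of that grouping in Python might look like the following. The tuple layout, the sample data and the choice to measure a session as the span between its first and last event are all assumptions for illustration; in practice the session ID would come from the session context attached to each event.

```python
# (session_id, timestamp in seconds, event_name), made-up sample data
events = [
    ("s1", 0, "screen_view"),
    ("s1", 40, "screen_view"),
    ("s1", 100, "application_background"),
    ("s2", 200, "screen_view"),
    ("s2", 260, "screen_view"),
]


def session_durations(rows):
    """Span between the first and last event seen for each session ID."""
    bounds = {}
    for session_id, ts, _name in rows:
        first, last = bounds.get(session_id, (ts, ts))
        bounds[session_id] = (min(first, ts), max(last, ts))
    return {sid: last - first for sid, (first, last) in bounds.items()}


durations = session_durations(events)
avg_time_in_session = sum(durations.values()) / len(durations)
print(durations)            # {'s1': 100, 's2': 60}
print(avg_time_in_session)  # 80.0
```

Because the result only depends on the earliest and latest timestamp per session, this version does not need the events to arrive in order, which may make it easier to run in a streaming window.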