Device Classification Library - End of Life


#1

Hey everyone!

There seems to be an issue that could be affecting Snowplow device classification reporting, and I am hoping someone here might have some insight.

An external tool called user-agent-utils is imported by Snowplow to classify device type:

Parses user agent strings and returns a hash of info. More about UA string standards: https://developer.mozilla.org/en-US/docs/Web/HTTP/Browser_detection_using_the_user_agent

This is what user-agent-utils parses against - an array of UA strings from different platforms:

Device type is hardcoded into each entry, e.g. DeviceType.COMPUTER

However, user-agent-utils reached EoL with version 1.20, which was last updated in May 2016: https://github.com/HaraldWalker/user-agent-utils#eol-warning

Snowplow is currently using 1.19.

So if this array isn’t updated, the UA parser will fall through to the unknown entries for newer devices and we lose those devices from our reporting:

	UNKNOWN_MOBILE(	Manufacturer.OTHER,null, 3, "Unknown mobile", new String[] {"Mobile"}, null, DeviceType.MOBILE, null ),
	UNKNOWN_TABLET(	Manufacturer.OTHER,null, 4, "Unknown tablet", new String[] {"Tablet"}, null, DeviceType.TABLET, null ),
	UNKNOWN(		Manufacturer.OTHER,null, 1, "Unknown", new String[0], null, DeviceType.UNKNOWN, null );
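
To make the failure mode concrete, here is a minimal sketch of alias-based matching with a hardcoded device type per entry. It is not the user-agent-utils source: the class name `DeviceClassifierSketch`, the `Platform` entries, the `classify` method and the sample UA strings are all made up for illustration, and the real library's matching rules may differ.

```java
import java.util.Locale;

// Illustrative sketch only; NOT the user-agent-utils source.
class DeviceClassifierSketch {

    enum DeviceType { COMPUTER, MOBILE, TABLET, UNKNOWN }

    // Hypothetical entries: each one hardcodes its UA aliases and device type.
    // Order matters, because the first matching entry wins.
    enum Platform {
        WINDOWS(new String[] {"Windows NT"}, DeviceType.COMPUTER),
        IPAD(new String[] {"iPad"}, DeviceType.TABLET),
        UNKNOWN_MOBILE(new String[] {"Mobile"}, DeviceType.MOBILE),
        UNKNOWN_TABLET(new String[] {"Tablet"}, DeviceType.TABLET),
        UNKNOWN(new String[0], DeviceType.UNKNOWN);

        final String[] aliases;
        final DeviceType deviceType;

        Platform(String[] aliases, DeviceType deviceType) {
            this.aliases = aliases;
            this.deviceType = deviceType;
        }
    }

    // Returns the device type of the first entry with an alias contained in
    // the UA string (case-insensitive), or UNKNOWN when nothing matches.
    static DeviceType classify(String userAgent) {
        if (userAgent == null) {
            return DeviceType.UNKNOWN;
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        for (Platform platform : Platform.values()) {
            for (String alias : platform.aliases) {
                if (ua.contains(alias.toLowerCase(Locale.ROOT))) {
                    return platform.deviceType;
                }
            }
        }
        return DeviceType.UNKNOWN;
    }

    public static void main(String[] args) {
        // A platform the hardcoded list knows about.
        System.out.println(classify("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"));
        // A made-up newer platform that no entry covers.
        System.out.println(classify("ExampleOS/2.0 (SmartDisplay; rv:1.0)"));
    }
}
```

Running `main` prints `COMPUTER` for the first string and `UNKNOWN` for the second. With a frozen list, any platform released after the last update can only land in one of the catch-all entries.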

Has anyone here come up with a solution to keep their device classification up to date? A different UA library or something else altogether?

Thanks!


#2

There’s the ua parser enrichment which uses a different library:

Thanks for bringing this up, though. I need to switch my data models to use the fields in the context table generated by ua-parser rather than the fields in atomic.events, which are generated by user-agent-utils.

Tip: when joining the atomic.events table with the atomic.com_snowplowanalytics_snowplow_ua_parser_context_1 table, join on both event_id = root_id and collector_tstamp = root_tstamp to avoid cartesian-product hell from duplicate rows that share the same event_id.
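
To show why the second join key matters, here is a rough in-memory sketch of the two join conditions. Everything in it is hypothetical: the `JoinKeySketch` class, the simplified `Event` and `UaContext` row types, and the sample rows are made up, and it assumes the duplicates share an event_id but have different collector_tstamp values. Rows that are identical on both columns would still multiply.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: an in-memory nested-loop join over made-up rows.
class JoinKeySketch {

    // Simplified stand-in for a row of atomic.events.
    static final class Event {
        final String eventId;
        final String collectorTstamp;

        Event(String eventId, String collectorTstamp) {
            this.eventId = eventId;
            this.collectorTstamp = collectorTstamp;
        }
    }

    // Simplified stand-in for a row of the ua_parser context table.
    static final class UaContext {
        final String rootId;
        final String rootTstamp;
        final String deviceFamily;

        UaContext(String rootId, String rootTstamp, String deviceFamily) {
            this.rootId = rootId;
            this.rootTstamp = rootTstamp;
            this.deviceFamily = deviceFamily;
        }
    }

    // Joins on event_id = root_id and, when matchTstamp is true,
    // also on collector_tstamp = root_tstamp.
    static List<String> join(List<Event> events, List<UaContext> contexts,
                             boolean matchTstamp) {
        List<String> rows = new ArrayList<>();
        for (Event e : events) {
            for (UaContext c : contexts) {
                boolean idMatches = e.eventId.equals(c.rootId);
                boolean tstampMatches = e.collectorTstamp.equals(c.rootTstamp);
                if (idMatches && (!matchTstamp || tstampMatches)) {
                    rows.add(e.eventId + " @ " + e.collectorTstamp
                            + " -> " + c.deviceFamily);
                }
            }
        }
        return rows;
    }

    // Made-up data: "e1" appears twice with different collector timestamps.
    static List<Event> sampleEvents() {
        return Arrays.asList(
                new Event("e1", "2017-01-01 00:00:01"),
                new Event("e1", "2017-01-01 00:00:05"),
                new Event("e2", "2017-01-01 00:00:07"));
    }

    // One context row per event row, carrying the same id and timestamp.
    static List<UaContext> sampleContexts() {
        return Arrays.asList(
                new UaContext("e1", "2017-01-01 00:00:01", "iPhone"),
                new UaContext("e1", "2017-01-01 00:00:05", "iPhone"),
                new UaContext("e2", "2017-01-01 00:00:07", "Other"));
    }

    public static void main(String[] args) {
        // Each duplicated "e1" event matches both "e1" context rows.
        System.out.println(join(sampleEvents(), sampleContexts(), false).size());
        // Adding the timestamp pairs each event with its own context row.
        System.out.println(join(sampleEvents(), sampleContexts(), true).size());
    }
}
```

With this sample data, the id-only join returns 5 rows for 3 events, while the join on both keys returns 3. In SQL terms, the second case corresponds to the condition described in the tip: event_id = root_id AND collector_tstamp = root_tstamp.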