Integrating Adjust with Snowplow to add mobile attribution data to the rest of your event data (2/2)


#1

This is part 2 of a guide to integrating Adjust attribution data into your Snowplow data set. Part 1 is available here.

5. Extending the setup for reattribution events

To extend the setup for reattribution events you work through the same process as for installation events:

5.1 Setup the callback

E.g. by creating the following callback for a reattribution event:

http://mycollector.mydomain.com/com.snowplowanalytics.iglu/v1?schema=iglu%3Acom.adjust.snowplow%2Freattribute%2Fjsonschema%2F1-0-0&app_id={app_id}&app_name={app_name}&app_name_dashboard={app_name_dashboard}&store={store}&tracker={tracker}&tracker_name={tracker_name}&last_tracker={last_tracker}&last_tracker_name={last_tracker_name}&network_name={network_name}&campaign_name={campaign_name}&adgroup_name={adgroup_name}&creative_name={creative_name}&impression_based={impression_based}&is_organic={is_organic}&gclid={gclid}&click_attribution_window={click_attribution_window}&impression_attribution_window={impression_attribution_window}&reattribution_attribution_window={reattribution_attribution_window}&inactive_user_definition={inactive_user_definition}&adid={adid}&idfa={idfa}&android_id={android_id}&android_id_md5={android_id_md5}&mac_sha1={mac_sha1}&mac_md5={mac_md5}&idfa-android-id={idfa||android_id}&idfa-or-gps-adid={idfa||gps_adid}&idfa_md5={idfa_md5}&idfa_md5_hex={idfa_md5_hex}&idfv={idfv}&gps_adid={gps_adid}&gps_adid_md5={gps_adid_md5}&win_udid={win_udid}&win_hwid={win_hwid}&win_naid={win_hwid}&win_adid={win_adid}&match_type={match_type}&reftag={reftag}&referrer={referrer}&user_agent={user_agent}&ip_address={ip_address}&click_time={click_time}&engagement_time={engagement_time}&installed_at={installed_at}&installed_at_hour={installed_at_hour}&created_at={created_at}&reattributed_at={reattributed_at}&connection_type={connection_type}&isp={isp}&city={city}&country={country}&language={language}&device_name={device_name}&device_type={device_type}&os_name={os_name}&api_level={api_level}&sdk_version={sdk_version}&os_version={os_version}&environment={environment}&tracking_enabled={tracking_enabled}&timezone={timezone}&time_spent={time_spent}&lifetime_session_count={lifetime_session_count}&deeplink={deeplink}&fb_campaign_group_name={fb_campaign_group_name}&fb_campaign_group_id={fb_campaign_group_id}&fb_campaign_name={fb_campaign_name}&fb_campaign_id={fb_campaign_id}&fb_adgroup_name={fb_adgroup_name}&fb_adgroup_id={fb_adgroup_id}&tweet_id={tweet_id}&twitter_line_item_id={twitter_line_item_id}&label={label}

5.2 Create a corresponding jsonschema for the reattribution event

E.g.:

{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "description": "Schema for Adjust reattribute event",
  "self": {
    "vendor": "com.adjust.snowplow",
    "name": "reattribute",
    "format": "jsonschema",
    "version": "1-0-0"
  },

  "type": "object",
  "properties": {
    "app_id": {
      "type": "string"
    },
    "app_name": {
      "type": "string",
      "maxLength": 1024
    },
    "app_name_dashboard": {
      "type": "string",
      "maxLength": 1024
    },
    "store": {
      "type": "string",
      "maxLength": 1024
    },
    "tracker": {
      "type": "string",
      "maxLength": 1024
    },
    "tracker_name": {
      "type": "string",
      "maxLength": 1024
    },
    "last_tracker": {
      "type": "string",
      "maxLength": 1024
    },
    "last_tracker_name": {
      "type": "string",
      "maxLength": 1024
    },
    "network_name": {
      "type": "string",
      "maxLength": 1024
    },
    "campaign_name": {
      "type": "string",
      "maxLength": 1024
    },
    "adgroup_name": {
      "type": "string",
      "maxLength": 1024
    },
    "creative_name": {
      "type": "string",
      "maxLength": 1024
    },
    "impression_based": {
      "type": "string",
      "maxLength": 1024
    },
    "is_organic": {
      "type": "string",
      "maxLength": 1024
    },
    "gclid": {
      "type": "string",
      "maxLength": 1024
    },
    "click_attribution_window": {
      "type": "string",
      "maxLength": 1024
    },
    "impression_attribution_window": {
      "type": "string",
      "maxLength": 1024
    },
    "reattribution_attribution_window": {
      "type": "string",
      "maxLength": 1024
    },
    "inactive_user_definition": {
      "type": "string",
      "maxLength": 1024
    },
    "adid": {
      "type": "string",
      "maxLength": 1024
    },
    "idfa": {
      "type": "string",
      "maxLength": 1024
    },
    "android_id": {
      "type": "string",
      "maxLength": 1024
    },
    "android_id_md5": {
      "type": "string",
      "maxLength": 1024
    },
    "mac_sha1": {
      "type": "string",
      "maxLength": 1024
    },
    "mac_md5": {
      "type": "string",
      "maxLength": 1024
    },
    "idfa-android-id": {
      "type": "string",
      "maxLength": 1024
    },
    "idfa-or-gps-adid": {
      "type": "string",
      "maxLength": 1024
    },
    "idfa_md5": {
      "type": "string",
      "maxLength": 1024
    },
    "idfa_md5_hex": {
      "type": "string",
      "maxLength": 1024
    },
    "idfv": {
      "type": "string",
      "maxLength": 1024
    },
    "gps_adid": {
      "type": "string",
      "maxLength": 1024
    },
    "gps_adid_md5": {
      "type": "string",
      "maxLength": 1024
    },
    "win_udid": {
      "type": "string",
      "maxLength": 1024
    },
    "win_hwid": {
      "type": "string",
      "maxLength": 1024
    },
    "win_naid": {
      "type": "string",
      "maxLength": 1024
    },
    "win_adid": {
      "type": "string",
      "maxLength": 1024
    },
    "match_type": {
      "type": "string",
      "maxLength": 1024
    },
    "reftag": {
      "type": "string",
      "maxLength": 1024
    },
    "referrer": {
      "type": "string",
      "maxLength": 1024
    },
    "user_agent": {
      "type": "string",
      "maxLength": 1024
    },
    "ip_address": {
      "type": "string",
      "maxLength": 1024
    },
    "click_time": {
      "type": "string",
      "maxLength": 1024
    },
    "engagement_time": {
      "type": "string",
      "maxLength": 1024
    },
    "installed_at": {
      "type": "string",
      "maxLength": 1024
    },
    "installed_at_hour": {
      "type": "string",
      "maxLength": 1024
    },
    "created_at": {
      "type": "string",
      "maxLength": 1024
    },
    "reattributed_at": {
      "type": "string",
      "maxLength": 1024
    },
    "connection_type": {
      "type": "string",
      "maxLength": 1024
    },
    "isp": {
      "type": "string",
      "maxLength": 1024
    },
    "city": {
      "type": "string",
      "maxLength": 1024
    },
    "country": {
      "type": "string",
      "maxLength": 1024
    },
    "language": {
      "type": "string",
      "maxLength": 1024
    },
    "device_name": {
      "type": "string",
      "maxLength": 1024
    },
    "device_type": {
      "type": "string",
      "maxLength": 1024
    },
    "os_name": {
      "type": "string",
      "maxLength": 1024
    },
    "api_level": {
      "type": "string",
      "maxLength": 1024
    },
    "sdk_version": {
      "type": "string",
      "maxLength": 1024
    },
    "os_version": {
      "type": "string",
      "maxLength": 1024
    },
    "environment": {
      "type": "string",
      "maxLength": 1024
    },
    "tracking_enabled": {
      "type": "string",
      "maxLength": 1024
    },
    "timezone": {
      "type": "string",
      "maxLength": 1024
    },
    "time_spent": {
      "type": "string",
      "maxLength": 1024
    },    
    "fb_campaign_group_name": {
      "type": "string",
      "maxLength": 1024
    },
    "fb_campaign_group_id": {
      "type": "string",
      "maxLength": 1024
    },
    "fb_campaign_name": {
      "type": "string",
      "maxLength": 1024
    },
    "fb_campaign_id": {
      "type": "string",
      "maxLength": 1024
    },
    "fb_adgroup_name": {
      "type": "string",
      "maxLength": 1024
    },
    "fb_adgroup_id": {
      "type": "string",
      "maxLength": 1024
    },
    "tweet_id": {
      "type": "string",
      "maxLength": 1024
    },
    "twitter_line_item_id": {
      "type": "string",
      "maxLength": 1024
    },
    "label": {
      "type": "string",
      "maxLength": 1024
    }
  },
  "additionalProperties": true
}

5.3 Implement the corresponding SQL table definitions and jsonpaths

Do this using Schema Guru e.g. the SQL table definition by executing the following from the root of your schema registry:

$ /path/to/schema-guru-0.6.2 ddl --with-json-paths schemas/com.adjust.snowplow/reattribute

The generated SQL file:

CREATE TABLE IF NOT EXISTS atomic.com_adjust_snowplow_reattribute_1 (
    "schema_vendor"                    VARCHAR(128)  ENCODE RUNLENGTH NOT NULL,
    "schema_name"                      VARCHAR(128)  ENCODE RUNLENGTH NOT NULL,
    "schema_format"                    VARCHAR(128)  ENCODE RUNLENGTH NOT NULL,
    "schema_version"                   VARCHAR(128)  ENCODE RUNLENGTH NOT NULL,
    "root_id"                          CHAR(36)      ENCODE RAW       NOT NULL,
    "root_tstamp"                      TIMESTAMP     ENCODE LZO       NOT NULL,
    "ref_root"                         VARCHAR(255)  ENCODE RUNLENGTH NOT NULL,
    "ref_tree"                         VARCHAR(1500) ENCODE RUNLENGTH NOT NULL,
    "ref_parent"                       VARCHAR(255)  ENCODE RUNLENGTH NOT NULL,
    "adgroup_name"                     VARCHAR(1024)  ENCODE LZO,
    "adid"                             VARCHAR(1024)  ENCODE LZO,
    "android_id"                       VARCHAR(1024)  ENCODE LZO,
    "android_id_md5"                   VARCHAR(1024)  ENCODE LZO,
    "api_level"                        VARCHAR(1024)  ENCODE LZO,
    "app_id"                           VARCHAR(4096) ENCODE LZO,
    "app_name"                         VARCHAR(1024)  ENCODE LZO,
    "app_name_dashboard"               VARCHAR(1024)  ENCODE LZO,
    "campaign_name"                    VARCHAR(1024)  ENCODE LZO,
    "city"                             VARCHAR(1024)  ENCODE LZO,
    "click_attribution_window"         VARCHAR(1024)  ENCODE LZO,
    "click_time"                       VARCHAR(1024)  ENCODE LZO,
    "connection_type"                  VARCHAR(1024)  ENCODE LZO,
    "country"                          VARCHAR(1024)  ENCODE LZO,
    "created_at"                       VARCHAR(1024)  ENCODE LZO,
    "creative_name"                    VARCHAR(1024)  ENCODE LZO,
    "device_name"                      VARCHAR(1024)  ENCODE LZO,
    "device_type"                      VARCHAR(1024)  ENCODE LZO,
    "engagement_time"                  VARCHAR(1024)  ENCODE LZO,
    "environment"                      VARCHAR(1024)  ENCODE LZO,
    "fb_adgroup_id"                    VARCHAR(1024)  ENCODE LZO,
    "fb_adgroup_name"                  VARCHAR(1024)  ENCODE LZO,
    "fb_campaign_group_id"             VARCHAR(1024)  ENCODE LZO,
    "fb_campaign_group_name"           VARCHAR(1024)  ENCODE LZO,
    "fb_campaign_id"                   VARCHAR(1024)  ENCODE LZO,
    "fb_campaign_name"                 VARCHAR(1024)  ENCODE LZO,
    "gclid"                            VARCHAR(1024)  ENCODE LZO,
    "gps_adid"                         VARCHAR(1024)  ENCODE LZO,
    "gps_adid_md5"                     VARCHAR(1024)  ENCODE LZO,
    "idfa"                             VARCHAR(1024)  ENCODE LZO,
    "idfa_android_id"                  VARCHAR(1024)  ENCODE LZO,
    "idfa_or_gps_adid"                 VARCHAR(1024)  ENCODE LZO,
    "idfa_md5"                         VARCHAR(1024)  ENCODE LZO,
    "idfa_md5_hex"                     VARCHAR(1024)  ENCODE LZO,
    "idfv"                             VARCHAR(1024)  ENCODE LZO,
    "impression_attribution_window"    VARCHAR(1024)  ENCODE LZO,
    "impression_based"                 VARCHAR(1024)  ENCODE LZO,
    "inactive_user_definition"         VARCHAR(1024)  ENCODE LZO,
    "installed_at"                     VARCHAR(1024)  ENCODE LZO,
    "installed_at_hour"                VARCHAR(1024)  ENCODE LZO,
    "ip_address"                       VARCHAR(1024)  ENCODE LZO,
    "is_organic"                       VARCHAR(1024)  ENCODE LZO,
    "isp"                              VARCHAR(1024)  ENCODE LZO,
    "label"                            VARCHAR(1024)  ENCODE LZO,
    "language"                         VARCHAR(1024)  ENCODE LZO,
    "last_tracker"                     VARCHAR(1024)  ENCODE LZO,
    "last_tracker_name"                VARCHAR(1024)  ENCODE LZO,
    "mac_md5"                          VARCHAR(1024)  ENCODE LZO,
    "mac_sha1"                         VARCHAR(1024)  ENCODE LZO,
    "match_type"                       VARCHAR(1024)  ENCODE LZO,
    "network_name"                     VARCHAR(1024)  ENCODE LZO,
    "os_name"                          VARCHAR(1024)  ENCODE LZO,
    "os_version"                       VARCHAR(1024)  ENCODE LZO,
    "reattributed_at"                  VARCHAR(1024)  ENCODE LZO,
    "reattribution_attribution_window" VARCHAR(1024)  ENCODE LZO,
    "referrer"                         VARCHAR(1024)  ENCODE LZO,
    "reftag"                           VARCHAR(1024)  ENCODE LZO,
    "sdk_version"                      VARCHAR(1024)  ENCODE LZO,
    "store"                            VARCHAR(1024)  ENCODE LZO,
    "time_spent"                       VARCHAR(1024)  ENCODE LZO,
    "timezone"                         VARCHAR(1024)  ENCODE LZO,
    "tracker"                          VARCHAR(1024)  ENCODE LZO,
    "tracker_name"                     VARCHAR(1024)  ENCODE LZO,
    "tracking_enabled"                 VARCHAR(1024)  ENCODE LZO,
    "tweet_id"                         VARCHAR(1024)  ENCODE LZO,
    "twitter_line_item_id"             VARCHAR(1024)  ENCODE LZO,
    "user_agent"                       VARCHAR(1024)  ENCODE LZO,
    "win_adid"                         VARCHAR(1024)  ENCODE LZO,
    "win_hwid"                         VARCHAR(1024)  ENCODE LZO,
    "win_naid"                         VARCHAR(1024)  ENCODE LZO,
    "win_udid"                         VARCHAR(1024)  ENCODE LZO,
    FOREIGN KEY (root_id) REFERENCES atomic.events(event_id)
)
DISTSTYLE KEY
DISTKEY (root_id)
SORTKEY (root_tstamp);

COMMENT ON TABLE atomic.com_adjust_snowplow_reattribute_1 IS 'iglu:com.adjust.snowplow/reattribute/jsonschema/1-0-0';

and the generated jsonpath file:

{
    "jsonpaths": [
        "$.schema.vendor",
        "$.schema.name",
        "$.schema.format",
        "$.schema.version",
        "$.hierarchy.rootId",
        "$.hierarchy.rootTstamp",
        "$.hierarchy.refRoot",
        "$.hierarchy.refTree",
        "$.hierarchy.refParent",
        "$.data.adgroup_name",
        "$.data.adid",
        "$.data.android_id",
        "$.data.android_id_md5",
        "$.data.api_level",
        "$.data.app_id",
        "$.data.app_name",
        "$.data.app_name_dashboard",
        "$.data.campaign_name",
        "$.data.city",
        "$.data.click_attribution_window",
        "$.data.click_time",
        "$.data.connection_type",
        "$.data.country",
        "$.data.created_at",
        "$.data.creative_name",
        "$.data.device_name",
        "$.data.device_type",
        "$.data.engagement_time",
        "$.data.environment",
        "$.data.fb_adgroup_id",
        "$.data.fb_adgroup_name",
        "$.data.fb_campaign_group_id",
        "$.data.fb_campaign_group_name",
        "$.data.fb_campaign_id",
        "$.data.fb_campaign_name",
        "$.data.gclid",
        "$.data.gps_adid",
        "$.data.gps_adid_md5",
        "$.data.idfa",
        "$.data.idfa-android-id",
        "$.data.idfa-or-gps-adid",
        "$.data.idfa_md5",
        "$.data.idfa_md5_hex",
        "$.data.idfv",
        "$.data.impression_attribution_window",
        "$.data.impression_based",
        "$.data.inactive_user_definition",
        "$.data.installed_at",
        "$.data.installed_at_hour",
        "$.data.ip_address",
        "$.data.is_organic",
        "$.data.isp",
        "$.data.label",
        "$.data.language",
        "$.data.last_tracker",
        "$.data.last_tracker_name",
        "$.data.mac_md5",
        "$.data.mac_sha1",
        "$.data.match_type",
        "$.data.network_name",
        "$.data.os_name",
        "$.data.os_version",
        "$.data.reattributed_at",
        "$.data.reattribution_attribution_window",
        "$.data.referrer",
        "$.data.reftag",
        "$.data.sdk_version",
        "$.data.store",
        "$.data.time_spent",
        "$.data.timezone",
        "$.data.tracker",
        "$.data.tracker_name",
        "$.data.tracking_enabled",
        "$.data.tweet_id",
        "$.data.twitter_line_item_id",
        "$.data.user_agent",
        "$.data.win_adid",
        "$.data.win_hwid",
        "$.data.win_naid",
        "$.data.win_udid"
    ]
}

You’re now all set for capturing reattribution events as well as installation events.

6. Tailoring the setup

6.1 Maximalist vs minimalist setup

The above setup is a maximalist setup: we’ve decided to grab as much data from Adjust as Adjust make available with their installation and reattribution events. Our corresponding callback, schema, SQL table definition and jsonpath file are all very long. (Because they contain all the different fields.)

You can choose to set this up for a more limited set of fields. In this case you’d simply remove the name value pairs from the callback, remove them from the schema, jsonpath and sql table definitions.

You might also want to setup Callbacks for other Adjust events, if you want to add more than the attribution data to your Snowplow data set.

6.2 Adding custom parameters

Adjust supports sending your own custom parameters with each callback.

To include these custom parameters in your Adjust-Snowplow integration, all you need to do is extend the schemas for each Adjust event types with the custom parameters you want to record into Snowplow. Note that you do not need to update the webhook itself. Whilst you do need to list each Adjust placeholder that you want passed into Snowplow on the webhook, Adjust will automatically add every custom parameter you record in it onto the callback. So as long as your event schema includes those custom parameters, they will automatically be fetched.

Once you’ve updated your schema to include the custom parameters, you’ll need to similarly make sure that your corresponding Redshift table definition and jsonpath files have the additional parameters in too.

7. Integrating callbacks from other third parties

Adjust is not the only company to provide event-level data via a Callback or Webhook. Zendesk is another example. The same approach / methodology should work for Zendesk and any similar provider that lets you specify the Callback URL and individual data points passed on the querystring.


Web traffic driven campaign tracking with Snowplow [tutorial]
#2

Thanks @yali, we are implementing this very soon. Is there a reason not to send everything but just a subset?

Was thinking of using the adjust install/re-attribution event just to get the install linked with the IDFA number (if on Android) and figure everything else out with the normal screen_views and events table. So is there a (Snowplow) reason to send more events through Adjust that happen in-app?


#3

Hi @Koen87 - great question.

From a Snowplow perspective you might as well track every parameter: the only cost is in some additional storage.

In terms of additional events some of our users also capture sessions. Some choose to capture all Adjust events into Snowplow. I personally don’t see the point of recording e.g. a screenview into Snowplow twice: once directly and once via Adjust. But recording sessions can make sense just to make the setup less fragile: if you’re only recording installation and reattribution events, and for some reason an installation event falls through the cracks, you might lose that attribution information for that particular user. In contrast if you’re also recording sessions you have more Adjust events in Snowplow so that if one of them is left out you have others to work with.

We’ve just worked with a client to test that sending custom parameters through Adjust into Snowplow works and it works a treat - so I’ve updated the instructions above accordingly. (Section 6.2.)


#4

Thanks @yali for this post! We implemented part 1 and have the installs flowing into our data warehouse. We also have the events table that tracks the online conversions (in structured events). However, we’re having trouble joining the 2 tables and getting to the attribution of conversions to the install channels. Please could you direct us on how to join the 2 tables? Thanks in advance!


#5

Hi @arbhat85 - you’ll need to user a common user identifier so you can match upstream install events to downstream conversion events. There are two options:

  1. Use one of the mobile identifiers that Adjust and Snowplow support out of the box (e.g. the IDFV). This is the simplest approach, I think.
  2. Track a “match” event into Adjust that includes a Snowplow user identifier

Does that make sense?