Campaign Attribution Enrichment - Null Data

I’ve got all data loading into Snowflake. All enabled enrichments work except campaign attribution: its data is not showing up in the transformed file that gets loaded into atomic.events. I’ve decoded the enrichments JSON sent over to the EMR job and included it below. Maybe someone can point me in the right direction.

{
  "schema": "iglu:com.snowplowanalytics.snowplow/enrichments/jsonschema/1-0-0",
  "data": [
    {
      "schema": "iglu:com.snowplowanalytics.snowplow/event_fingerprint_config/jsonschema/1-0-0",
      "data": {
        "name": "event_fingerprint_config",
        "vendor": "com.snowplowanalytics.snowplow",
        "enabled": true,
        "parameters": {
          "excludeParameters": [
            "eid",
            "stm"
          ],
          "hashAlgorithm": "MD5"
        }
      }
    },
    {
      "schema": "iglu:com.snowplowanalytics.snowplow/ip_lookups/jsonschema/2-0-0",
      "data": {
        "name": "ip_lookups",
        "vendor": "com.snowplowanalytics.snowplow",
        "enabled": true,
        "parameters": {
          "geo": {
            "database": "GeoLite2-City.mmdb",
            "uri": "http://snowplow-hosted-assets.s3.amazonaws.com/third-party/maxmind"
          }
        }
      }
    },
    {
      "schema": "iglu:com.snowplowanalytics.snowplow/referer_parser/jsonschema/1-0-0",
      "data": {
        "name": "referer_parser",
        "vendor": "com.snowplowanalytics.snowplow",
        "enabled": true,
        "parameters": {
          "internalDomains": [
            "aaacarolinas.com"
          ]
        }
      }
    },
    {
      "schema": "iglu:com.snowplowanalytics.snowplow/campaign_attribution/jsonschema/1-0-1",
      "data": {
        "name": "campaign_attribution",
        "vendor": "com.snowplowanalytics.snowplow",
        "enabled": true,
        "parameters": {
          "mapping": "static",
          "fields": {
            "mktMedium": [
              "utm_medium"
            ],
            "mktSource": [
              "utm_source"
            ],
            "mktTerm": [
              "utm_term"
            ],
            "mktContent": [
              "utm_content"
            ],
            "mktCampaign": [
              "utm_campaign"
            ]
          }
        }
      }
    },
    {
      "schema": "iglu:com.snowplowanalytics.snowplow/ua_parser_config/jsonschema/1-0-1",
      "data": {
        "vendor": "com.snowplowanalytics.snowplow",
        "name": "ua_parser_config",
        "enabled": true,
        "parameters": {
          "database": "regexes-latest.yaml",
          "uri": "s3://snowplow-hosted-assets/third-party/ua-parser/"
        }
      }
    },
    {
      "schema": "iglu:com.snowplowanalytics.snowplow/javascript_script_config/jsonschema/1-0-0",
      "data": {
        "vendor": "com.snowplowanalytics.snowplow",
        "name": "javascript_script_config",
        "enabled": false,
        "parameters": {
          "script": "ZnVuY3Rpb24gcHJvY2VzcyhldmVudCkgew0KDQogIHZhciBwbGF0Zm9ybSA9IGV2ZW50LmdldFBsYXRmb3JtKCksDQogICAgICBhcHBJZCAgICA9IGV2ZW50LmdldEFwcF9pZCgpOw0KDQogIGlmIChwbGF0Zm9ybSA9PSAic2VydmVyIiAmJiBhcHBJZCAhPSAic2VjcmV0Iikgew0KICAgIHRocm93ICJTZXJ2ZXItc2lkZSBldmVudCBoYXMgaW52YWxpZCBhcHBfaWQ6ICIgKyBhcHBJZDsNCiAgfQ0KICANCiAgaWYgKGFwcElkID09IG51bGwpIHsNCiAgICByZXR1cm4gW107DQogIH0NCg0KICAvLyBVc2UgbmV3IFN0cmluZygpIGJlY2F1c2UgaHR0cDovL25lbHNvbndlbGxzLm5ldC8yMDEyLzAyL2pzb24tc3RyaW5naWZ5LXdpdGgtbWFwcGVkLXZhcmlhYmxlcy8NCiAgdmFyIGFwcElkVXBwZXIgPSBuZXcgU3RyaW5nKGFwcElkLnRvVXBwZXJDYXNlKCkpOw0KDQogIHJldHVybiBbIHsgc2NoZW1hOiAiaWdsdTpjb20uYWNtZS9mb28vanNvbnNjaGVtYS8xLTAtMCIsDQogICAgICAgICAgICAgICBkYXRhOiB7IGFwcElkVXBwZXI6IGFwcElkVXBwZXIgfQ0KICAgICAgICAgICB9IF07DQp9"
        }
      }
    }
  ]
}

@sonnypolaris, the enrichment configuration looks OK. Maybe there’s a misunderstanding about what data should be present, and when?

The enrichment parses the querystring of the page URI and looks for the fields named in the configuration. If no querystring parameter matches those properties, no data will populate the mkt_ fields of atomic.events.
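To make that lookup concrete, here is a minimal Python sketch of a static mapping applied to a page URL. It is only an illustration, not the enrichment's actual implementation: the `FIELDS` dict copies the "fields" block from the config posted above, and the `attribute` helper is a hypothetical name made up for this example.

```python
from urllib.parse import urlsplit, parse_qs

# Copied from the "fields" block of the campaign_attribution config above.
FIELDS = {
    "mktMedium": ["utm_medium"],
    "mktSource": ["utm_source"],
    "mktTerm": ["utm_term"],
    "mktContent": ["utm_content"],
    "mktCampaign": ["utm_campaign"],
}


def attribute(page_url):
    """Hypothetical helper: first matching querystring value per field, else None."""
    # Only the query component is inspected; the fragment is ignored.
    query = parse_qs(urlsplit(page_url).query)
    result = {}
    for field, params in FIELDS.items():
        result[field] = next((query[p][0] for p in params if p in query), None)
    return result


tagged = attribute(
    "http://sonny.aaacarolinas.com/index.html"
    "?utm_source=take5-july&utm_medium=email&utm_campaign=cn0201"
)
untagged = attribute("http://sonny.aaacarolinas.com/index.html")

print(tagged)
print(untagged)
```

With the tagged URL, the source, medium and campaign fields are filled and the term and content fields stay empty. With no querystring at all, every field stays empty, which is the null mkt_ behaviour described above.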

Could you check the page_urlquery field and see if it contains the Google-specific (utm_) properties (in your case)? Have you configured the properties right?

They all appear to be null. Here is a quick data sample:

PAGE_URLQUERY	PAGE_URLHOST	PAGE_URLPATH	PAGE_URLFRAGMENT

    	sonny.aaacarolinas.com	/index.html	?utm_source=take5-july&utm_medium=email&utm_campaign=cn0201
    	sonny.aaacarolinas.com	/index.html	?utm_source=take5-july&utm_medium=email&utm_campaign=cn0201
    	sonny.aaacarolinas.com	/index.html	?utm_source=take5-july&utm_medium=email&utm_campaign=cn0201
    	sonny.aaacarolinas.com	/index.html	?utm_source=take5-july&utm_medium=email&utm_campaign=cn0201
    	sonny.aaacarolinas.com	/index.html	?utm_source=take5-july&utm_medium=email&utm_campaign=cn0201

This is rather odd. I would expect the values you show for PAGE_URLFRAGMENT to be the values for PAGE_URLQUERY while no values (null) for PAGE_URLFRAGMENT. Are you sure you haven’t confused the two?

Here is the SQL from Snowflake. I wonder if the file format is off or shifted. Would it help to provide the raw files from the CF collector and the output from the Snowflake transformer?

Can you show the contents of page_urlquery?

It should be showing up in there, assuming you’re putting the campaign tracking into the URL query and not the URL hash.

Have you got an example of a tagged link that users click through to?

I’ve got a test website set up on my local box. I’m using this to get the whole Snowplow environment configured (tracker, collector, enrich, Snowflake transform, Snowflake load). I uploaded the SQL and results previously. Is this what you wanted?

Here is the test URL that I’m using:
http://sonny.aaacarolinas.com/index.html?utm_source=take5-july&utm_medium=email&utm_campaign=cn0201

It looks oddly like the page_urlfragment and the page_urlquery are being switched around. I’d try adding a fragment to your test URLs to test and it might also be worth posting the Snowflake CREATE TABLE DDL that you are running.

Yes, sorry - missed that.

That’s really odd… never seen anything like that. Mike’s suggestion seems like a sensible way to check this out.

I’ll test this out. But I have a failing EMR ETL job right now. Not sure why the enrich step is failing; the same config was working yesterday. I’ll post another topic.

Resolved. Thanks guys for pointing me in the right direction. I believe that I had some malformed URLs with URL fragments that caused the issue. I’ve got it all working now. I’m going to keep probing the issue. Below is an example that doesn’t work (all my URLs were like this):

http://sonny.aaacarolinas.com/index.html#?utm_source=some-source

The parser must have just stopped when it encountered the ‘#’ followed by ? instead of a &. I’m guessing here.

Thanks

@sonnypolaris, that makes total sense now. Indeed, anything after # is treated as a fragment rather than a querystring. It is a malformed URI.
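For anyone who wants to see the split for themselves, here is a small sketch using Python's standard `urllib.parse`. It only illustrates generic URI splitting and is not the parser Snowplow itself uses; the first URL is the failing example from above with the `#` removed, and the second is the failing example as posted.

```python
from urllib.parse import urlsplit

# Same parameters, with and without the stray '#' before the '?'.
good = urlsplit("http://sonny.aaacarolinas.com/index.html?utm_source=some-source")
bad = urlsplit("http://sonny.aaacarolinas.com/index.html#?utm_source=some-source")

print("good:", repr(good.query), repr(good.fragment))
print("bad: ", repr(bad.query), repr(bad.fragment))
```

In the second URL everything after the `#`, including the `?`, ends up in the fragment and the query is empty, which matches the PAGE_URLQUERY / PAGE_URLFRAGMENT sample posted earlier in the thread.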
