GDPR - PII configuration in the batch pipeline


#1

Hey everyone!

I’m trying to set up pseudonymization on a batch pipeline. I want to obfuscate the idfa/idfv identifiers from mobile phones.

I added this config to my enrichments folder:

{
  "schema": "iglu:com.snowplowanalytics.snowplow.enrichments/pii_enrichment_config/jsonschema/2-0-0",
  "data": {
    "vendor": "com.snowplowanalytics.snowplow.enrichments",
    "name": "pii_enrichment_config",
    "emitEvent": false,
    "enabled": true,
    "parameters": {
      "pii": [
        {
          "json": {
            "field": "contexts",
            "schemaCriterion": "iglu:com.snowplowanalytics.snowplow/mobile_context/jsonschema/1-0-*",
            "jsonPath": "$.data.['openIdfa', 'appleIdfa', 'appleIdfv', 'androidIdfa']"
          }
        }
      ],
      "strategy": {
        "pseudonymize": {
          "hashFunction": "SHA-1",
          "salt": "randomsalt"
        }
      }
    }
  }
}

The situation is that I don’t get any errors (the ones I had are now fixed, so I know the enrichment file is
being picked up), but the pseudonymization just doesn’t work: the values don’t change to a SHA-1 hash.

Couple of questions here:
1 - Is this supposed to work on the batch pipeline?
2 - Does anyone see anything wrong in the config that I’m missing?

Thanks a lot for any help you’ll be able to provide!
Cheers!


#2

Hi @Timmycarbone,

  1. Yes, it should work.
  2. Could you try replacing "$.data.['openIdfa', ... with "$.['openIdfa', ..."?
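
Applied to the config in #1, that change would look roughly like the sketch below. Only "jsonPath" differs; the field list is assumed to be the same one from #1, since the suggestion above abbreviates it with "...".

```json
{
  "json": {
    "field": "contexts",
    "schemaCriterion": "iglu:com.snowplowanalytics.snowplow/mobile_context/jsonschema/1-0-*",
    "jsonPath": "$.['openIdfa', 'appleIdfa', 'appleIdfv', 'androidIdfa']"
  }
}
```

This entry would replace the single object inside the "pii" array; the rest of the enrichment config stays as it was.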

Regards,


#3

Worked.

I guess I wrongly assumed that I needed to follow the path described in the jsonpath.

Thanks a lot!


#4

Cool!

This depends on the schema: whether it has a data node or not. You can check the following schemas as an example: