Dataflow_runner encryption

Hi

I can find some documentation on how to use encryption with EmrEtlRunner, but I don't use it (e.g. https://snowplowanalytics.com/blog/2018/07/24/snowplow-r108-val-camonica-with-batch-pipeline-encryption-released/#encryption).

My pipeline is based on Dataflow Runner and I can't see how to configure it to work with encrypted S3 buckets.

Hi Eko,

With Dataflow Runner you define all of the job steps yourself, so encryption is configured by passing the appropriate options to each job step's arguments. For example, with S3DistCp you would add the following option:

--s3ServerSideEncryption
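
As a rough illustration only (not taken from your playbook), here is a small Python sketch that appends that option to an S3DistCp step's arguments. The step fields (type, name, actionOnFailure, jar, arguments) and the jar path are assumptions based on how Dataflow Runner playbooks are usually laid out, so please check them against your own playbook. The bucket paths, the step name and the helper with_sse are hypothetical.

```python
import json

# S3DistCp option mentioned above
SSE_FLAG = "--s3ServerSideEncryption"


def with_sse(step):
    """Return a copy of a job step with the encryption flag appended once."""
    args = list(step.get("arguments", []))
    if SSE_FLAG not in args:
        args.append(SSE_FLAG)
    return {**step, "arguments": args}


# Hypothetical S3DistCp step; field names, jar path and bucket paths are
# assumptions and placeholders, not values from a real playbook.
s3distcp_step = {
    "type": "CUSTOM_JAR",
    "name": "S3DistCp enriched events",
    "actionOnFailure": "CANCEL_AND_WAIT",
    "jar": "/usr/share/aws/emr/s3-dist-cp/lib/s3-dist-cp.jar",
    "arguments": [
        "--src", "s3://my-enriched-bucket/good/",
        "--dest", "s3://my-staging-bucket/enriched/",
    ],
}

encrypted_step = with_sse(s3distcp_step)

# Print the step as JSON so it can be compared with the playbook entry
print(json.dumps(encrypted_step, indent=2))
```

The helper leaves the original step untouched and does not add the flag twice, so it should be safe to run over a step that already has it.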

If you let us know what job steps you have configured in Dataflow Runner we can likely point out if any other steps need a special addition to use encrypted S3 buckets.

Hi Josh,

I have 3 steps:

  1. s3-dist-cp
  2. command-runner
    This step runs “snowplow-snowflake-transformer”. It takes the enriched events from one bucket, transforms them, then stages them in another bucket.
  3. snowplow-snowflake-loader

Thanks!