Storage target credentials


#1

With the new EmrEtlRunner r90, I understand that storage targets are now loaded inside the EMR job and that the storage target configurations are passed to EmrEtlRunner when it runs.

The Redshift configuration requires username and password.

Is there a way to provide the password from an environment variable or something rather than storing it directly in the JSON?

I could create a script to insert the password from an env variable when the Docker container starts, for instance, but I’d still be writing it to disk on the orchestration server, which I’d rather avoid if possible.
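
For reference, the workaround described above could be sketched roughly as follows. This is only an illustration: the REDSHIFT_PASSWORD variable name, the redshift.json file name and the position of the "password" key in the target JSON are assumptions, and the rendered file still ends up on disk (with owner-only permissions), which is exactly the drawback mentioned.

```python
import json
import os


def render_target(template: dict, env_var: str = "REDSHIFT_PASSWORD") -> dict:
    """Return a copy of the target config with the password read from the environment.

    The env var name and the top-level "password" key are assumptions for this sketch.
    """
    password = os.environ.get(env_var)
    if not password:
        raise RuntimeError(f"environment variable {env_var} is not set")
    rendered = dict(template)  # shallow copy, the template itself is left untouched
    rendered["password"] = password
    return rendered


def write_private(config: dict, directory: str, name: str = "redshift.json") -> str:
    """Write the rendered config with owner-only permissions and return its path."""
    path = os.path.join(directory, name)
    # O_EXCL refuses to overwrite an existing file; 0o600 keeps other users out.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        json.dump(config, handle)
    return path
```

A container entrypoint could call render_target on the parsed template and write_private into a freshly created temporary directory, then remove that directory once EmrEtlRunner has finished.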


#2

Hi @bryce - no, there isn’t currently an option to use environment variables to resolve secrets in either the enrichment or the storage target JSONs - they are just JSONs.

We have some ideas around potentially moving all of this configuration into a Control Plane which could allow better secrets resolution, but nothing concrete planned here yet (although Snowplow Mini will start iterating on its own Control Plane soon).


#3

Thanks @alex!


#4

I’m also interested in a solution. Before r88 we had env vars in the config.yaml file.


#6

The latest version of RDB Loader (R28) supports storing the password and/or private SSH key in EC2 Parameter Store, so you can avoid putting them in environment variables.

An example configuration might look like the one below, where snowplow.rdbloader.redshift.password is the name of your parameter.

{
  ...
  "password": {
    "ec2ParameterStore": {
      "parameterName": "snowplow.rdbloader.redshift.password"
    }
  },
  ...
}
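
If it helps, here is a rough Python sketch of how the parameter could be created and how an existing target could be switched over to the reference form. The helper names are made up for illustration, the top-level position of the "password" key is an assumption, and storing the value as a SecureString is also an assumption on my part. The store_password helper needs boto3 and AWS credentials with permission to write to Parameter Store.

```python
import copy

PARAMETER_NAME = "snowplow.rdbloader.redshift.password"


def password_reference(parameter_name: str) -> dict:
    """Build the object that takes the place of the plaintext password."""
    return {"ec2ParameterStore": {"parameterName": parameter_name}}


def use_parameter_store(target: dict, parameter_name: str = PARAMETER_NAME) -> dict:
    """Return a copy of the target config whose password points at Parameter Store."""
    updated = copy.deepcopy(target)
    updated["password"] = password_reference(parameter_name)
    return updated


def store_password(password: str, parameter_name: str = PARAMETER_NAME) -> None:
    """Create or overwrite the parameter (assumed here to be a SecureString)."""
    import boto3  # third-party; imported lazily so the helpers above work without it

    ssm = boto3.client("ssm")
    ssm.put_parameter(
        Name=parameter_name,
        Value=password,
        Type="SecureString",
        Overwrite=True,
    )
```

With this approach the JSON on the orchestration server only ever contains the parameter name, not the secret itself.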

#7

Thanks @mike!

A detailed setup guide is also available in our wiki.