Snowplow Quickstart - Now with Snowflake Loading!

We’re extremely excited to announce that we’ve now introduced Snowflake loading to the AWS Quickstart pipeline!

When creating a Quickstart pipeline on AWS, you now have the option to select Postgres or Snowflake. Snowflake loading uses the latest RDB Loader v3 release, which introduced Snowflake as a destination.

Introduction

Once you have cloned the quickstart-examples repository, navigate to the pipeline directory and update the input variables in either postgres.terraform.tfvars or snowflake.terraform.tfvars, according to your chosen destination.

Your chosen database needs to be specified with the new pipeline_db variable; the allowed values are postgres and snowflake. Fill in the corresponding terraform.tfvars file for that database. Only the database-specific variables differ between the two tfvars files; everything else should be consistent between them.
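For example, the top of a snowflake.terraform.tfvars might look like this sketch (the value shown is the only one taken from this post; the comment placement is illustrative):

```hcl
# Select the pipeline destination: "postgres" or "snowflake"
pipeline_db = "snowflake"

# ...the remaining database-specific variables follow below
```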

Migration from earlier Quickstart Releases

If you already have a Quickstart pipeline and wish to upgrade, you’ll be using Postgres, so you should simply need to copy your existing values into postgres.terraform.tfvars and set the pipeline_db variable to postgres.

Additional Snowflake setup

If you’re using Snowflake, you also need to run the new Snowflake Setup module.

Prerequisites

Authentication for the service user is required for the Snowflake Terraform provider – follow this tutorial to obtain Snowflake connection details:

| Parameter | Description |
| --- | --- |
| account | The account name. |
| username | A Snowflake user to perform resource creation. |
| region | Region for the Snowflake deployment. |
| role | Needs to be ACCOUNTADMIN or similar. |
| private_key_path | Path to the private key. |
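Filled in, the connection section of terraform.tfvars might look like the following sketch (all values are placeholders — substitute your own account details):

```hcl
# Snowflake connection details for the Terraform provider
# (placeholder values, not real settings)
account          = "xy12345"
username         = "TERRAFORM_SVC_USER"
region           = "us-east-1"
role             = "ACCOUNTADMIN"
private_key_path = "/path/to/snowflake_user.p8"
```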

Usage

  1. Fill in the variables in terraform.tfvars within the aws/snowflake folder. The Snowflake connection details from the Prerequisites section need to be assigned to the respective variables in terraform.tfvars.
  2. Run terraform init
  3. Run terraform apply

Output

The Snowflake Terraform module will output the names of the created resources. The full list can be found here.

These output values need to be passed to the aws/pipeline module as variables when Snowflake is selected as the pipeline’s destination.
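As an illustration, the mapping might look like the sketch below. The variable names here are hypothetical; check the module’s output list for the exact names the aws/pipeline module expects:

```hcl
# Hypothetical example: copy the Snowflake setup module's outputs into
# snowflake.terraform.tfvars (variable names and values are illustrative only)
snowflake_database    = "SNOWPLOW"         # from module output
snowflake_schema      = "ATOMIC"           # from module output
snowflake_loader_user = "SNOWPLOW_LOADER"  # from module output
snowflake_warehouse   = "SNOWPLOW_WH"      # from module output
```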

Helpful Links

Github: GitHub - snowplow/quickstart-examples: Examples of how to automate creating a Snowplow Open Source pipeline

Documentation: Quick Start Installation Guide on AWS - Snowplow Docs
