Storage load: Enrich to Redshift

I’ve set up a Snowplow collector and enrich with a Kafka sink using the Docker images. Now I want to load the enriched stream directly into AWS Redshift.

I’d like to explore either a real-time continuous load or a batch load.

Could you please guide me through it?

There’s no continuous load into Redshift (Redshift does not support streaming), but you can do batch or micro-batch loads.

If your data is in an enriched Kafka topic you’ll next want to:

  • Sink the data to S3 in small batches. Snowplow doesn’t currently have an S3 loader for Kafka, but you can use the Confluent S3 sink connector to do this.
  • Use the RDB Loader to shred and load the data from S3 into Redshift.
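
As a rough illustration of the first step, here is a minimal Python sketch that builds a configuration for the Confluent S3 sink connector and prepares the request that would register it through the Kafka Connect REST API. The topic name, bucket, region, batch sizes and Connect URL are all placeholder assumptions, and the byte-array pass-through format is only one plausible choice for tab-separated enriched events. Check the RDB Loader documentation for the input layout it expects before relying on this.

```python
import json
import urllib.request


def build_s3_sink_config(topic, bucket, region, flush_size=10000, rotate_ms=300000):
    """Return Kafka Connect properties for a Confluent S3 sink.

    All values are illustrative assumptions, not settings from this thread.
    """
    return {
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "storage.class": "io.confluent.connect.s3.storage.S3Storage",
        "topics": topic,
        "s3.bucket.name": bucket,
        "s3.region": region,
        # Enriched events are tab-separated lines, so pass the bytes through as-is.
        "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
        "format.class": "io.confluent.connect.s3.format.bytearray.ByteArrayFormat",
        # Write a file once this many records have accumulated (the micro-batch size).
        "flush.size": str(flush_size),
        # Also close files on a wall-clock schedule so quiet topics still flush.
        "rotate.schedule.interval.ms": str(rotate_ms),
        "timezone": "UTC",
    }


def build_connector_request(connect_url, name, config):
    """Build (but do not send) the POST request that registers the connector."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        f"{connect_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To register the connector, you would build the request with something like build_connector_request("http://localhost:8083", "enriched-s3-sink", build_s3_sink_config("enriched", "my-bucket", "eu-west-1")) and pass it to urllib.request.urlopen. Smaller flush_size and rotate_ms values give smaller, more frequent batches on S3, at the cost of more files for the loader to process.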