Snowplow Realtime pipeline with Docker


#1

Hi guys,

I’ve been looking for an easy way to test the Snowplow realtime pipeline on a local machine, mainly to get a better understanding of the components, but also because I wanted to write a custom enrichment with the JavaScript script enrichment (https://github.com/snowplow/snowplow/wiki/JavaScript-script-enrichment).

I saw that people were struggling to debug its behaviour in the thread "Best way to test/debug javascript enrichment scripts?", and I was not satisfied with either bringing up a Scala REPL or running the script in node.js. So I set up the realtime pipeline locally in Docker, extending the example given in the snowplow-docker repo.

This makes it much easier to test custom enrichments: you only need to restart the scala-stream-enrich container, and the enriched data is instantly visible in Kibana.
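
For context, a JavaScript enrichment script defines a `process(event)` function that reads fields from the enriched event through getters such as `getApp_id()` and returns an array of self-describing JSON contexts. Here is a minimal sketch of what such a script can look like. It is not taken from the repo: the schema URI is a made-up placeholder, and `fakeEvent` and `runEnrichment` are hypothetical helpers that only exist so the snippet runs on its own for a quick sanity check before you load the script into the real pipeline.

```javascript
// The enrichment script itself, kept as plain text.
// The schema URI is a hypothetical placeholder, not a real Iglu schema.
var enrichmentScript = `
function process(event) {
  var appId = event.getApp_id();
  if (appId == null) {
    return [];  // nothing to attach to this event
  }
  return [{
    schema: "iglu:com.acme/app_id_upper/jsonschema/1-0-0",
    data: { appIdUpper: String(appId).toUpperCase() }
  }];
}
`;

// Hypothetical local harness. Node already has a global named "process",
// so the script is evaluated inside its own function scope.
var runEnrichment = new Function(
  "event",
  enrichmentScript + "\nreturn process(event);"
);

// Hypothetical stand-in for the enriched event, exposing a single getter.
function fakeEvent(appId) {
  return { getApp_id: function () { return appId; } };
}

console.log(JSON.stringify(runEnrichment(fakeEvent("shop"))));
```

A stub like this only catches syntax and logic slips. How the script behaves against real enriched events is what the Docker setup is for.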

I used NSQ and set the buffer to hold only 1 record, so each event is passed on straight away and you get instant feedback while testing.

I hope this helps others in a similar position and can stand in as a fast, easily extensible alternative to snowplow-mini.

You can find the project here: https://github.com/kazysgurskas/snowplow-realtime-docker


#2

Great work, @kazgurs1, thanks a ton!


#3

Yes indeed - many thanks for sharing @kazgurs1! A great effort :fireworks: