Tutorial ideas for the upcoming SQL Query Enrichment?

We’re hard at work on the SQL Query Enrichment, which will let Snowplow users dimension-widen their events with the results of arbitrary SQL queries.

After the response to the Clearbit tutorial for the API Request Enrichment, we’d love to do a similar tutorial for the SQL Query Enrichment. Although a lot of use cases for the SQL Query Enrichment will involve internal data (e.g. customer records / product databases), there are no doubt some interesting public datasets which could be fun to join into a Snowplow event stream.

We’d love to get your suggestions for our tutorial here! We are looking for an interesting public dataset which:

  1. Is already available in MySQL/Postgres format, or is easily convertible to the same (i.e. is published as CSV or maybe JSON)
  2. Has some key which is easily joined onto a fairly standard Snowplow event stream

Suggestions in this thread please! I’ll get the ball rolling with an idea we had internally.

An idea we had internally: use the MaxMind city field to join to the UN city data available here:

https://raw.githubusercontent.com/datasets/population-city/master/data/unsd-citypopulation-year-both.csv

This has the advantage that most Snowplow events will have the MaxMind city field set. The disadvantages are:

  • The dataset is not very exciting
  • We would need some involved SQL to return a single meaningful row per city
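
To give a feel for the "involved SQL" point, here is a rough sketch of one way to collapse several rows per city into a single row: take the most recent year for the city, then the largest value reported for that year. Everything in it is an assumption for illustration only: the `city_population` table, its simplified columns (`city`, `year`, `city_type`, `value`), the `lookup_city` helper and the sample rows are all made up (the figures are not real UN data), and it uses an in-memory SQLite database rather than MySQL/Postgres so it runs anywhere. The enrichment's own configuration is not shown.

```python
import sqlite3

# Hypothetical, simplified table; the real CSV has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE city_population "
    "(city TEXT, year INTEGER, city_type TEXT, value INTEGER)"
)

# Made-up rows for illustration only (not real population figures).
conn.executemany(
    "INSERT INTO city_population VALUES (?, ?, ?, ?)",
    [
        ("Exampleville", 2010, "City proper", 100000),
        ("Exampleville", 2012, "City proper", 110000),
        ("Exampleville", 2012, "Urban agglomeration", 150000),
        ("Sampleton", 2011, "City proper", 50000),
    ],
)

# One row per city: restrict to the latest year on record for that city,
# then keep the largest value reported for that year.
LOOKUP_SQL = """
SELECT p.city, p.year, MAX(p.value) AS population
FROM city_population p
JOIN (
    SELECT city, MAX(year) AS year
    FROM city_population
    GROUP BY city
) latest
  ON latest.city = p.city AND latest.year = p.year
WHERE p.city = ?
GROUP BY p.city, p.year
"""


def lookup_city(conn, geo_city):
    """Return (city, year, population) for a city name, or None if unknown."""
    return conn.execute(LOOKUP_SQL, (geo_city,)).fetchone()


print(lookup_city(conn, "Exampleville"))
print(lookup_city(conn, "Nowhere"))
```

The `?` placeholder stands in for the city value taken from the event. An exact string match like this would miss cities whose spelling differs between the two sources, so a real join would probably need some normalisation on top.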

I bet the community can come up with something more exciting - suggestions please!

+1 for Magento & WordPress.

When do we start? :slight_smile:


Thanks guys, @ihor has put together a first tutorial for the SQL Query Enrichment:

http://discourse.snowplow.io/t/how-to-enrich-events-with-mysql-data-using-the-sql-query-enrichment-tutorial/385

R83 should be out shortly… Excited to see what use cases the community puts the new enrichment to!
