Snowplow on Postgres - encoded fields, constraints, DISTKEYs and other doubts

Hello!

While working on a new JSON Schema for our structured events, we tried to build a new table to store them. Based on link_click_1.sql, we added our fields and ran the resulting DDL against our Postgres database.

The first problem we had was with the column compression encodings that are exclusive to Redshift: raw, runlength, text32k and text255.

Since we could not find any information on how to adapt these to Postgres, we removed them from the column declarations, but to no avail: Postgres then complained about the DISTKEY clause.
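
To illustrate that first adaptation: those keywords are Redshift column compression encodings (set with ENCODE), and we simply dropped the clause. A minimal sketch (the column is illustrative, not the full link_click_1.sql definition):

```sql
-- Redshift: ENCODE assigns a per-column compression; Postgres rejects the keyword.
-- CREATE TABLE link_click_test (target_url varchar(4096) ENCODE text32k NOT NULL);

-- Postgres: drop the ENCODE clause and keep the base type.
CREATE TABLE link_click_test (target_url varchar(4096) NOT NULL);
```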

We understand that DISTKEY is used by Redshift to distribute data across its nodes and that it is not a Postgres parameter, so we removed it as well.
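
Likewise, DISTSTYLE/DISTKEY (and SORTKEY) are table attributes that only Redshift accepts. A sketch of the shape we ended up with, using an abridged, illustrative column list rather than the exact link_click_1.sql fields:

```sql
CREATE SCHEMA IF NOT EXISTS atomic;

CREATE TABLE atomic.com_snowplowanalytics_snowplow_link_click_1 (
    schema_vendor varchar(128)  NOT NULL,
    schema_name   varchar(128)  NOT NULL,
    root_id       char(36)      NOT NULL,
    root_tstamp   timestamp     NOT NULL,
    target_url    varchar(4096) NOT NULL
    -- ... remaining link_click fields ...
);
-- Dropped from the Redshift original:
--   DISTSTYLE KEY
--   DISTKEY (root_id)
--   SORTKEY (root_tstamp)
```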

After removing the DISTKEY definition, Postgres complained that the FK constraint could not be created because there is no unique constraint on the referenced field of the referenced table. Since the events table was created using the SQL script provided by Snowplow, we believed this requirement would be satisfied.
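
As far as we can tell, Postgres only accepts a foreign key when the referenced column carries a PRIMARY KEY or UNIQUE constraint, while Redshift does not enforce constraints and so its DDL does not need to declare one. A minimal reproduction (table names hypothetical):

```sql
-- events_demo stands in for atomic.events; note event_id has no unique constraint.
CREATE TABLE events_demo (event_id char(36));

CREATE TABLE link_clicks_demo (
    root_id char(36) REFERENCES events_demo (event_id)
);
-- ERROR:  there is no unique constraint matching given keys
--         for referenced table "events_demo"
```

Adding a UNIQUE constraint on event_id makes the statement succeed, but we are unsure whether that is safe for Snowplow's events table.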

Our questions are:
Is there any DDL for creating com_snowplowanalytics_snowplow_link_click_1 on Postgres? And how can we satisfy the FK constraint?

If there is no DDL script to create this table on Postgres, how can we adapt the Redshift one to our needs?

And lastly, how do we create a dedicated table for our custom structured events?

Thank you in advance,
Ricardo

============

@rcpp, self-describing events and contexts are currently not supported with Postgres. You would need to use one of the following as your data store to have the “non-atomic” data loaded:

  • Redshift
  • Elasticsearch (real-time)
  • Snowflake
  • S3
  • BigQuery (GCP)

Hi! Thank you for your quick reply!

I see. We will see what we can do. Right now, we already have an infrastructure with Redshift, and we were playing around with Postgres to validate some ideas.

Cheers!