I have Snowplow completely set up on GCP, using a Compute Engine instance for the stream collector and two Dataflow pipelines for the other steps:
- Scala Stream Collector with PubSub (1.0.0)
- Beam Enrich (1.1.0)
- BigQuery Mutator with BigQuery Loader (0.4.0)
Now, from reading previous posts I understand that the 0.4.0 Mutator does not create a time-partitioned table, and from what I know of BigQuery you can't convert an existing table to a time-partitioned one afterwards.
I’ve also read the discussion (Google Cloud Platform data pipeline optimization) where Anton mentions: “Unfortunately Mutator cannot create partitioned tables yet - we’ll add this in next version. But right now you create partitioned table manually via BigQuery Console: Create table -> Schema edit as text -> Paste example atomic schema. Partitioning dropdown menu will automatically propose you to choose any datetime columns as partitioning key.”
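For reference, the programmatic equivalent of that Console step would be something like the sketch below, using the Python BigQuery client (project, dataset, and column list are placeholders; the real table uses the full example atomic schema):

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project id

# Abbreviated schema: the real table follows the full example atomic schema.
schema = [
    bigquery.SchemaField("event_id", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("collector_tstamp", "TIMESTAMP", mode="REQUIRED"),
    bigquery.SchemaField("derived_tstamp", "TIMESTAMP", mode="NULLABLE"),
]

table = bigquery.Table("my-project.my_dataset.events", schema=schema)

# Day-partition on a timestamp column, as the Console partitioning dropdown offers.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="collector_tstamp",
)

client.create_table(table)
```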
However, when I create the partitioned table manually this way, every event ends up in my failed-streaming-inserts.
Can anybody help here?