I’ve got a Snowplow tracker pushing custom structured events to a collector on GCP, which publishes those events to a Pub/Sub topic/subscription. Where can I find basic documentation or examples on how to parse the data in those Pub/Sub messages in a Dataflow job?
Specifically, I’m hoping to do this without setting up separate jobs/deployments for the “validate” and “enrich” steps, because I don’t need to do any enrichment — I just need to parse the event metadata and use it to kick off some other workflows. It seems like I should be able to use some of the code from scala-common-enrich to deserialize and parse my events myself, but there’s no documentation on how to use any of the classes in that library.
In general, am I even thinking about this the right way? Or is setting up a pre-built “Enrich” step (separate from my actual data processing) really the only supported way to use Snowplow? If that’s the case, where can I find docs on the data types/structures that come out of an enrich step? It feels like a lot of basic usage direction is either hidden or missing.
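For context on what I mean by “parse the metadata”: my current understanding (which may be wrong — this is exactly what I’m asking about) is that the *enriched* output is a tab-separated line following Snowplow’s canonical event model, so in principle a Dataflow step could pull out a few fields with plain string splitting. The field positions below are my assumption from reading about the canonical event model, not something I’ve verified, and `BasicEvent`/`EnrichedTsv` are names I made up for illustration:

```scala
// Hypothetical sketch: pull a few fields out of a Snowplow *enriched* event line.
// ASSUMPTION: enriched events are tab-separated, with app_id, platform,
// etl_tstamp, collector_tstamp, dvce_created_tstamp, event, event_id as the
// first seven fields. Verify these positions against the canonical event
// model docs for your pipeline version before relying on them.
final case class BasicEvent(
  appId: String,
  platform: String,
  collectorTstamp: String,
  eventType: String, // e.g. "struct" for custom structured events
  eventId: String
)

object EnrichedTsv {
  // Assumed canonical field positions (zero-based).
  private val AppId           = 0
  private val Platform        = 1
  private val CollectorTstamp = 3
  private val Event           = 5
  private val EventId         = 6

  def parse(line: String): Either[String, BasicEvent] = {
    // -1 keeps trailing empty fields, since many canonical columns can be blank.
    val fields = line.split("\t", -1)
    if (fields.length <= EventId)
      Left(s"expected at least ${EventId + 1} fields, got ${fields.length}")
    else
      Right(BasicEvent(
        appId           = fields(AppId),
        platform        = fields(Platform),
        collectorTstamp = fields(CollectorTstamp),
        eventType       = fields(Event),
        eventId         = fields(EventId)
      ))
  }
}
```

If something like that is reasonable for enriched output, my remaining question is whether the *raw* collector payloads on the Pub/Sub topic can be handled similarly, or whether they’re in a serialized format that really does require the library/enrich step to decode.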