Beam enrich - GTM bot traffic

I have Snowplow fully set up on GCP, using a Compute Engine (CE) instance for the stream collector and two Dataflow pipelines for the other steps:

  1. Scala stream collector with PubSub (1.0.0)
  2. Beam Enrich (1.1.0)
  3. BigQuery mutator with BigQuery Loader (0.4.0)

My tracker is installed using GTM and it generates quite a lot of bot traffic in my results. I assumed this would be filtered out by Beam Enrich? Can somebody share their insights here?

Snowplow doesn’t automatically remove bot traffic, but it is configurable so you can remove it yourself.

There are a couple of options for this:

  1. Write a JS Enrichment that identifies them using your own logic and sends them to the bad stream.
  2. Pay for the IAB list and use the IAB Enrichment.
  3. Identify them in a data modelling step and remove them from aggregated data sets.
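As a rough sketch of option 3, here is what a simple user agent check could look like in a modelling step written in Python. The pattern list, the function names, and the choice to treat a missing user agent as a bot are all assumptions for illustration, not an official bot list or anything built into Snowplow; the only thing taken from Snowplow is the `useragent` field on the event. The same kind of check could be ported into a JS Enrichment for option 1.

```python
import re

# Illustrative patterns only (an assumption); not an exhaustive or official bot list.
BOT_PATTERN = re.compile(r"bot|crawl|spider|headless|monitor", re.IGNORECASE)


def is_probable_bot(useragent):
    """Return True when the user agent is missing or matches a bot-like pattern."""
    if not useragent:
        # Assumption: events without a user agent are treated as bot traffic.
        return True
    return BOT_PATTERN.search(useragent) is not None


def drop_bot_events(events):
    """Keep only events whose 'useragent' value does not look like a bot."""
    return [e for e in events if not is_probable_bot(e.get("useragent"))]


# Hypothetical sample rows, shaped like a small subset of enriched event fields.
events = [
    {"event_id": "1", "useragent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/90.0"},
    {"event_id": "2", "useragent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"},
    {"event_id": "3", "useragent": None},
]

human_events = drop_bot_events(events)
```

A substring check like this only catches bots that identify themselves, so it is a starting point rather than a replacement for the IAB list.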

Okay thanks!!