Shredded/bad-rows output directory already exists

Thanks for having a look. It’s definitely my fault that it’s happening; hopefully it will make Snowplow a bit more robust. :slight_smile:

I have experimented with the advice in the Discourse post on Spark optimisation but didn’t see any change in the step time, so I just went with the defaults (i.e. no Spark config changes). I didn’t do any log file analysis though. I might post over on the Spark learnings topic to see if someone has further optimisation advice. We only get two output files from the enriched step, and I thought that might be limiting the shredding step’s parallelism.
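For anyone following along, here’s a rough sketch of what I mean by the file count limiting parallelism. It’s not the actual shred job, and the bucket path and partition count are just placeholders, but it shows the idea: with only two input files Spark starts with only two partitions (so at most two tasks in parallel) unless something repartitions the data.

```scala
import org.apache.spark.sql.SparkSession

object PartitionCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("enriched-partition-check")
      .getOrCreate()

    // Hypothetical path to the enriched good output; two input files
    // typically mean only two partitions to start with.
    val enriched = spark.read.text("s3://my-bucket/enriched/good/")

    println(s"Partitions before: ${enriched.rdd.getNumPartitions}")

    // Repartitioning spreads the work across more tasks/executors;
    // 32 here is an arbitrary illustrative number.
    val repartitioned = enriched.repartition(32)
    println(s"Partitions after: ${repartitioned.rdd.getNumPartitions}")

    spark.stop()
  }
}
```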