Shredded/bad-rows output directory already exists

Thanks for having a look. It’s definitely my fault that it’s happening; hopefully it will make Snowplow a bit more robust. :slight_smile:

I have experimented with the advice in the Discourse post on Spark optimisation but didn’t see any change in the step time, so I just went with the defaults (i.e. no Spark config changes). I didn’t do any log file analysis though. I might post over on the Spark learnings topic to see if someone has further optimisation advice. We only get two output files from the enriched step, and I thought that might be limiting the shredding step’s parallelism.
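For anyone following along, here’s a rough sketch of what I mean by the file count limiting parallelism. It’s not the actual shred job, and the bucket path and partition count are just placeholders, but it shows the idea: with only two input files Spark starts with only two partitions (so at most two tasks in parallel) unless something repartitions the data.

```scala
import org.apache.spark.sql.SparkSession

object PartitionCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("enriched-partition-check")
      .getOrCreate()

    // Hypothetical path to the enriched good output; two input files
    // typically mean only two partitions to start with.
    val enriched = spark.read.text("s3://my-bucket/enriched/good/")

    println(s"Partitions before: ${enriched.rdd.getNumPartitions}")

    // Repartitioning spreads the work across more tasks/executors;
    // 32 here is an arbitrary illustrative number.
    val repartitioned = enriched.repartition(32)
    println(s"Partitions after: ${repartitioned.rdd.getNumPartitions}")

    spark.stop()
  }
}
```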