Shredding EMR spark config (IOException: All datanodes ... are bad)

No problem! And thanks a lot for taking the time to answer the questions.

Unfortunately, I still do not understand why the EMR job failed in the first place. Initially, we started with an empty Redshift cluster and an empty shredded archive. If I am not mistaken, the shredder should detect that it has to shred the entire enriched archive, and indeed it tried to do so. The issue is that the shredder tried to process the same enriched runs twice (or possibly more times) within the same EMR run.
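For context, my mental model of the incremental discovery is roughly the following. This is a hypothetical Python sketch, not the actual shredder code; the folder names and the diffing logic are my own assumptions about how enriched runs without a shredded counterpart get selected:

```python
# Hypothetical sketch: the shredder compares run folders in the enriched
# archive against those already present in the shredded archive and
# processes only the difference. Names and paths are illustrative.

def unprocessed_runs(enriched_runs, shredded_runs):
    """Return enriched run folders that have no shredded counterpart."""
    return sorted(set(enriched_runs) - set(shredded_runs))

# With an empty shredded archive, every enriched run is selected:
enriched = ["run=2021-01-01-00-00-00", "run=2021-01-02-00-00-00"]
shredded = []
print(unprocessed_runs(enriched, shredded))
# → ['run=2021-01-01-00-00-00', 'run=2021-01-02-00-00-00']
```

Under this model, starting from empty archives should simply select every run once, which is why the repeated processing surprises me.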

The new RDB Loader creates the manifest table automatically, and that is presumably how it caught those errors when it received the same SQS message twice. Still, the question remains: why is the shredder processing the same files more than once at all?
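The way I imagine the manifest catching duplicates is something like the sketch below. This is a minimal Python/SQLite illustration of the general idea (a uniqueness constraint on the folder rejecting a second load attempt); the table and column names are made up and are not the real RDB Loader schema:

```python
# Hypothetical sketch of manifest-based deduplication: before loading a
# folder, record it in a manifest table with a unique constraint, so a
# second SQS message for the same folder is rejected.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE manifest (base_folder TEXT PRIMARY KEY)")

def try_load(folder):
    """Return True if the folder is new; False if it was already loaded."""
    try:
        conn.execute("INSERT INTO manifest (base_folder) VALUES (?)", (folder,))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate message: folder is already in the manifest

print(try_load("run=2021-01-01-00-00-00"))  # → True (first message loads)
print(try_load("run=2021-01-01-00-00-00"))  # → False (duplicate rejected)
```

So the loader side makes sense to me; it is the producer side (the shredder emitting the same run twice) that I cannot explain.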

Is that caused by a bad Spark configuration? Could the same batch be assigned to multiple executors for some reason?