In R108 we started leveraging the official AWS Ruby SDK in EmrEtlRunner and replaced our deprecated Sluice library.
However, the functions we wrote to run the different empty checks are recursive and can overflow the stack if you have a large number of empty EMR files on S3 (more than five thousand in our tests).
This issue prevents the EMR job from being launched.
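To illustrate the failure mode, here is a minimal sketch of why a recursive check is sensitive to the number of files while an iterative one is not. It is not the EmrEtlRunner code: the method names and the `sizes` array (standing in for the object sizes returned by an S3 listing) are hypothetical.

```ruby
# Hypothetical sketch: neither method is taken from EmrEtlRunner.
# `sizes` stands in for the byte sizes of the objects in an S3 listing.

# Recursive form: each object adds one stack frame, so a long enough
# listing can raise SystemStackError before the check completes.
def all_empty_recursive?(sizes, index = 0)
  return true if index >= sizes.length
  return false unless sizes[index].zero?

  all_empty_recursive?(sizes, index + 1)
end

# Iterative form: the stack depth stays constant however long the listing is.
def all_empty_iterative?(sizes)
  sizes.all?(&:zero?)
end
```

The exact depth at which the recursive form fails depends on the Ruby VM's stack size, so the threshold will vary between environments.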
Who is affected
You are affected if both of the following apply:
- you are running the version of EmrEtlRunner released with R108
- you have a large number of empty EMR files on S3
How to avoid this issue
You can either remove the offending empty files or downgrade to the version of EmrEtlRunner that was released with R106.