[IMPORTANT ALERT] R108 bug may prevent EmrEtlRunner from launching

Summary

In R108 we started leveraging the official AWS Ruby SDK in EmrEtlRunner and replaced our deprecated Sluice library.

However, the functions we wrote to run the various empty checks are recursive and can overflow the stack if you have a large number of empty EMR files on S3 (more than five thousand in our tests).

This issue prevents the EMR job from being launched.
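To illustrate the failure mode, here is a minimal sketch in Ruby. It is not the actual EmrEtlRunner code: `Page`, `build_pages`, `empty_recursive?` and `empty_iterative?` are hypothetical names, and the linked list of pages only stands in for a paginated S3 listing. The recursive check adds one stack frame for every page of empty files, while the iterative one keeps the stack depth constant.

```ruby
# Hypothetical stand-in for one page of an S3 listing: the sizes of the
# objects on the page, plus a link to the next page (nil on the last page).
Page = Struct.new(:sizes, :next_page)

# Builds a chain of `count` pages, each holding a single zero-byte object.
def build_pages(count)
  head = nil
  count.times { head = Page.new([0], head) }
  head
end

# Recursive check: every page that contains only empty files adds a stack
# frame, so a long enough chain raises SystemStackError.
def empty_recursive?(page)
  return true if page.nil?
  return false if page.sizes.any? { |size| size > 0 }
  empty_recursive?(page.next_page)
end

# Iterative check: same result, but the stack depth does not grow with the
# number of pages.
def empty_iterative?(page)
  until page.nil?
    return false if page.sizes.any? { |size| size > 0 }
    page = page.next_page
  end
  true
end
```

Both functions agree on short chains. On a very long chain of empty pages, only the iterative version is expected to finish; the depth at which the recursive version fails depends on the Ruby build and its stack settings.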

Who is affected

You are affected if:

  • you’re running the version of EmrEtlRunner released with R108
  • you have a large number of empty EMR files on S3

How to avoid this issue

You can remove the offending empty files or downgrade to the version of EmrEtlRunner released with R106.
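If you want to see which files are empty before removing anything, the following sketch may help. It assumes a client that responds to `list_objects_v2` the way `Aws::S3::Client` from the `aws-sdk-s3` gem does; the helper name `empty_object_keys` and the bucket and prefix in the usage note are placeholders, not values taken from EmrEtlRunner.

```ruby
# Returns the keys of all zero-byte objects under `prefix` in `bucket`.
# `client` is assumed to behave like Aws::S3::Client (aws-sdk-s3 gem):
# list_objects_v2 returns a response exposing contents, is_truncated and
# next_continuation_token.
def empty_object_keys(client, bucket, prefix)
  keys = []
  token = nil
  loop do
    params = { bucket: bucket, prefix: prefix }
    params[:continuation_token] = token if token
    resp = client.list_objects_v2(params)
    # Keep only the objects whose size is exactly zero bytes.
    resp.contents.each { |obj| keys << obj.key if obj.size.zero? }
    break unless resp.is_truncated
    token = resp.next_continuation_token
  end
  keys
end
```

With the real SDK loaded (`require 'aws-sdk-s3'`), this could be called as `empty_object_keys(Aws::S3::Client.new, 'my-bucket', 'my/prefix/')`. Review the returned keys before deleting anything.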

When will a fix be rolled out

A fix will be rolled out with the upcoming release 109. You can check out the specific issue to learn more.
