Error in Raw S3 -> Raw HDFS Step



Hello, I am facing an error in Step 2 of my EMR process. The StdErr log is as follows:

Error: java.lang.RuntimeException: Reducer task failed to copy 347 files: s3://<production-processing-bucket>/processing/EWBPWVW3GFOLK.2018-06-27-20.df525fd1.gz etc
    at com.amazon.elasticmapreduce.s3distcp.CopyFilesReducer.cleanup(CopyFilesReducer.java:67)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:635)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

The EMR job has failed at this step on every run unless I manually stage the files and then start the job with --skip staging.
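
In case it helps with triage, here is a minimal sketch for pulling the failed-file count and the sample S3 key out of a StdErr line like the one above, so that runs can be compared. The function name parse_reducer_failure and the regular expression are my own illustration, not part of S3DistCp, and the pattern assumes the message keeps the wording shown in this log.

```python
import re

# Assumed pattern: matches the wording of the reducer failure line in the log above.
FAILURE_RE = re.compile(
    r"Reducer task failed to copy (?P<count>\d+) files?: (?P<sample>\S+)"
)


def parse_reducer_failure(line):
    """Return (failed_file_count, sample_uri), or None if the line does not match."""
    match = FAILURE_RE.search(line)
    if match is None:
        return None
    return int(match.group("count")), match.group("sample")


if __name__ == "__main__":
    # The StdErr line from this post, copied verbatim (bucket name is still redacted).
    stderr_line = (
        "Error: java.lang.RuntimeException: Reducer task failed to copy 347 files: "
        "s3://<production-processing-bucket>/processing/"
        "EWBPWVW3GFOLK.2018-06-27-20.df525fd1.gz etc"
    )
    print(parse_reducer_failure(stderr_line))
```

The message names only one example file followed by "etc", so this recovers the count and a single key, not the full list of 347 files.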