I’m trying to figure out why the EMR cluster won’t start in my VPC in eu-central-1. I’ve tried ami version 4.5.0, 4.8.2, 4.1.0, 5.0.3 and a few others. Tried hadoop_enrich: 1.8.0, 1.7.0, 1.9.0 to no avail. Every time it fails on the same step ‘Elasticity Setup Hadoop Debugging’ while taking 0 seconds. stdout,stderr and syslog are not logged to s3, controller log shows only this:
2016-10-28T11:16:26.161Z INFO Ensure step 1 jar file s3://elasticmapreduce/libs/script-runner/script-runner.jar
the node logs show nothing of interest it seems. Any ideas?
Okay, I found out this is an issue with eu-central-1. Documentation is incredibly vague on this…
I got past the first step when trying to run EMR in Ireland, but my s3 bucket region is eu-central-1, so I guess it can’t communicate with it since I got an exception on step2 (Elasticity Scalding Step: Enrich Raw Events):
Exception in thread "main" com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: 77075F391C52BF29), S3 Extended Request ID:
Yes - --debug refers to enabling in-EMR debugging behavior. Debug lines from EmrEtlRunner are controlled by the monitoring.logging.level in config.yml.