My EMR job failed this morning (it had been running successfully since last week).
This is what I found in the controller log:
2017-03-29T19:48:32.155Z WARN Step failed with exitCode -1 and took 2586 seconds
The stderr and stdout logs are empty, and syslog has nothing interesting in it either.
Could you please help me understand the meaning of exitCode -1?
OK, after I moved all the files back into the “in” folder, the EMR job processed them successfully on the next scheduled run.
But it would still be good to know what exitCode -1 means. Maybe there is a way to prevent these issues in the future.
@tyomo4ka I’m curious why that worked… Did you do anything differently on the second run? E.g. bump your instances or upgrade EmrEtlRunner between jobs?
I’ve been re-running enrichment from the Shred step but no cigar. I’m currently re-running from the Enrich step to see if that works.
It worked!
Kind of like you said, @tyomo4ka, all I had to do was:
- Remove the files in enriched from the failed run
- Re-run enrichment from step “Enrich”
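In case it helps, here is a rough sketch of those two steps as commands. It only builds and prints them, nothing is executed. The bucket path and file names are hypothetical placeholders, and the `run ... --resume-from enrich` form is an assumption about the EmrEtlRunner version in use, so check it against your own release before running anything.

```python
import shlex


def recovery_commands(enriched_uri, config_path, resolver_path):
    """Build the two recovery commands as argument lists (nothing is run)."""
    # Step 1: clear out whatever the failed run left in the enriched location.
    cleanup = ["aws", "s3", "rm", enriched_uri, "--recursive"]
    # Step 2: restart the pipeline at the Enrich step.
    # Assumption: this EmrEtlRunner release supports `--resume-from`.
    rerun = [
        "snowplow-emr-etl-runner", "run",
        "--config", config_path,
        "--resolver", resolver_path,
        "--resume-from", "enrich",
    ]
    return [cleanup, rerun]


if __name__ == "__main__":
    # Hypothetical values; substitute the paths from your own config.yml.
    for cmd in recovery_commands(
        "s3://my-bucket/enriched/good/", "config.yml", "resolver.json"
    ):
        print(shlex.join(cmd))
```

Printing first and running by hand seemed safer to me than deleting from S3 inside a script.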
No changes to the pipeline or anything. Figured I’d share back here in case others have issues like this showing up in stderr:
18/07/16 03:42:17 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 172.31.19.185
ApplicationMaster RPC port: 0
queue: default
start time: 1531708082200
final status: FAILED
tracking URL: http://ip-172-31-27-154.ap-southeast-2.compute.internal:20888/proxy/application_1531707705580_0002/
user: hadoop
Exception in thread "main" org.apache.spark.SparkException: Application application_1531707705580_0002 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1104)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1150)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
18/07/16 03:42:17 INFO ShutdownHookManager: Shutdown hook called
18/07/16 03:42:17 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-6393393a-b249-4816-82a1-56e842e65821
Command exiting with ret '1'
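Note that the stderr above is only the YARN client report: it says `final status: FAILED` with `diagnostics: N/A`, so the real cause is not in it. A minimal sketch for pulling the application ID out of that output and building the `yarn logs` command to fetch the container logs follows. The function names are my own, and it assumes you can run `yarn` on the master node with log aggregation enabled.

```python
import re


def summarize_yarn_failure(stderr_text):
    """Extract the final status and YARN application ID from step stderr."""
    status = re.search(r"final status:\s*(\w+)", stderr_text)
    app = re.search(r"application_\d+_\d+", stderr_text)
    return {
        "final_status": status.group(1) if status else None,
        "application_id": app.group(0) if app else None,
    }


def yarn_logs_command(application_id):
    """Argument list for fetching the aggregated container logs."""
    return ["yarn", "logs", "-applicationId", application_id]


if __name__ == "__main__":
    # Two lines copied from the stderr in this post.
    sample = (
        "final status: FAILED\n"
        'Exception in thread "main" org.apache.spark.SparkException: '
        "Application application_1531707705580_0002 finished with failed status\n"
    )
    info = summarize_yarn_failure(sample)
    print(info)
    if info["application_id"]:
        print(" ".join(yarn_logs_command(info["application_id"])))
```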