EmrEtlRunner stops with no error

I recently started having an issue where whenever I run EmrEtlRunner, it stops within a couple of seconds with no error. For example:

ubuntu@ip--------:~$ ./snowplow-emr-etl-runner -c config.yml -r iglu_resolver.json
D, [2017-06-08T15:23:57.149000 #23957] DEBUG -- : Staging raw logs...
ubuntu@ip---------:~$

I have verified that the input bucket is not empty. If I use the option --skip staging, the behavior is the same, but I get
D, [2017-06-08T15:26:33.452000 #23977] DEBUG -- : Initializing EMR jobflow
instead of
D, [2017-06-08T15:23:57.149000 #23957] DEBUG -- : Staging raw logs...

I’d be happy to provide more information if it’s useful.

Hi @benjjs,

I suspect your “processing” bucket is not empty, in which case the runner assumes the previous job hasn’t completed yet. It is also possible that the previous run failed and left the files in the bucket(s) unarchived.
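One way to check this is to list each bucket prefix with the AWS CLI and see whether anything comes back. The sketch below assumes the `aws` CLI is installed and configured; the helper names `prefix_is_empty` and `report_leftovers` are hypothetical, and the S3 paths in the usage line are placeholders for whatever your own config.yml points at.

```shell
# Succeeds (exit 0) when nothing is listed under the given S3 prefix.
# Assumption: the AWS CLI is on PATH and has read access to the bucket.
prefix_is_empty() {
  local listing
  # "aws s3 ls" exits non-zero when there are no objects, so ignore its status
  # and look only at whether it printed anything.
  listing="$(aws s3 ls "$1" --recursive || true)"
  [ -z "$listing" ]
}

# Prints one line per prefix saying whether leftover files were found.
report_leftovers() {
  local prefix
  for prefix in "$@"; do
    if prefix_is_empty "$prefix"; then
      echo "empty: $prefix"
    else
      echo "NOT empty: $prefix"
    fi
  done
}
```

For example, with placeholder paths: report_leftovers s3://my-bucket/processing/ s3://my-bucket/enriched/good/ s3://my-bucket/shredded/good/. Any prefix reported as "NOT empty" is a candidate for the leftover files described above.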

You can refer to the wiki https://github.com/snowplow/snowplow/wiki/Batch-pipeline-steps for guidance on the recovery steps to take and to understand the workflow.

Do note that release R88 introduced some changes to the EMR job, so step 10 (archive_raw) is now executed in EMR.

Hi @benjjs,

As ihor said, this is probably because the script considers your enriched bucket to be non-empty.
You can check the exit code of the script and check to which error it corresponds in

I think we could improve on the error reporting here :slight_smile:

Hope it helped.

Hi all, thanks for your replies. There were a couple of issues, some of which are now resolved. It was indeed the case that my enriched folder was not empty, and the EmrEtlRunner functioned after I cleared it manually.

I’m not sure where to view the exit code you’re referring to. When I run the script, it hangs after “GET request … finished with status code 200” and does NOT archive the results of the enrich or shred process. As a result, I’ve had to manually move files from shredded/good to shredded/archive to get the runner to work each time.

Hi,

To check the exit code of emr-etl-runner, execute the following command immediately after running it:

  • Linux / OSX : echo $? # This will give the exit code of the last command
  • Windows : echo %errorlevel%
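Because $? is overwritten by every subsequent command, it can be easier to capture it in a small wrapper. This is only a sketch for Linux / OSX shells; `run_and_report` is a hypothetical helper name, not part of EmrEtlRunner.

```shell
# Runs any command, prints its exit code on stderr, and passes the code through.
run_and_report() {
  local status
  "$@"
  status=$?   # must be read right away, before any other command runs
  echo "exit code: $status" >&2
  return "$status"
}
```

Usage would look like: run_and_report ./snowplow-emr-etl-runner -c config.yml -r iglu_resolver.json. The code is written to stderr so that it does not mix with the runner's normal output.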

Happy debugging :slight_smile:

Hm. The exit code is 0.