Failed to start EmrEtlRunner

Hi Guys,

I have been checking most of the posts online and in this forum, but still can’t get the snowplow-emr-etl-runner to work.

The command I used is:
./snowplow-emr-etl-runner --debug --config snwplw-config.yml --resolver iglu_resolver.json

Keep getting this error:

Value guarded in: Snowplow::EmrEtlRunner::Cli::load_config
With Contract: Maybe, String => Hash
At: /home/user/loader/snowplow-emr-etl-runner!/emr-etl-runner/lib/snowplow-emr-etl-runner/cli.rb:134 ):
/home/user/loader/snowplow-emr-etl-runner!/gems/contracts-0.7/lib/contracts.rb:69:in `Contract'
org/jruby/RubyProc.java:271:in `call'
/home/user/loader/snowplow-emr-etl-runner!/gems/contracts-0.7/lib/contracts.rb:147:in `failure_callback'
/home/user/loader/snowplow-emr-etl-runner!/gems/contracts-0.7/lib/contracts/decorators.rb:164:in `common_method_added'
/home/user/loader/snowplow-emr-etl-runner!/gems/contracts-0.7/lib/contracts/decorators.rb:159:in `common_method_added'
file:/home/user/loader/snowplow-emr-etl-runner!/emr-etl-runner/bin/snowplow-emr-etl-runner:37:in `(root)'
org/jruby/RubyKernel.java:1091:in `load'
file:/home/user/loader/snowplow-emr-etl-runner!/META-INF/main.rb:1:in `(root)'
org/jruby/RubyKernel.java:1072:in `require'
file:/home/user/loader/snowplow-emr-etl-runner!/META-INF/main.rb:1:in `(root)'
/tmp/jruby5396713056082654976extract/jruby-stdlib-1.7.20.1.jar!/META-INF/jruby.home/lib/ruby/shared/rubygems/core_ext/kernel_require.rb:1:in `(root)'

My config file is as below (masked credentials):

Could someone please help me get it working?

Thank you very much!

Hi @mythsam,

Could you remove the unnecessary quotes and try again? For example log: s3://st-snwplw-sl/logs instead of log: "s3://st-snwplw-sl/logs".

Check it against the example here.

Regards,
Ihor

Hi Ihor,

Thank you for your reply. I gave it a try and removed all the unnecessary quotes (since I didn’t use the software section, the whole YAML file now has no quotes at all), and updated the jobflow section to match the example.

I have updated the gist with the changes; however, I am still getting the same error.

Could you see if there is anything else wrong?

Thanks a lot!

Sam

Got the config working now. It failed because of a mistaken attribute name: I had jobflow_name where it should be jobflow_role. After fixing that, the config loads.

However, when I run the command, I get an error at the staging step: the files can’t be moved from the log bucket to the processing bucket.
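A misspelled key such as jobflow_name is easy to miss by eye. One way to catch it is to compare the key paths in your config against those in the example config. The following is only a minimal sketch and not part of EmrEtlRunner: it assumes both files have already been parsed into nested dicts (for instance with PyYAML's yaml.safe_load), key_paths and diff_keys are hypothetical helper names, and the two sample dicts are trimmed illustrations whose nesting is an assumption, not a full config.

```python
def key_paths(node, prefix=""):
    """Return the set of colon-joined key paths found in a nested dict."""
    paths = set()
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}:{key}" if prefix else str(key)
            paths.add(path)
            paths |= key_paths(value, path)
    return paths


def diff_keys(mine, reference):
    """Return (missing, unexpected) key paths of `mine` relative to `reference`."""
    mine_paths = key_paths(mine)
    ref_paths = key_paths(reference)
    return sorted(ref_paths - mine_paths), sorted(mine_paths - ref_paths)


# Trimmed, illustrative fragments only (assumed nesting); real configs have more keys.
reference = {"aws": {"emr": {"jobflow_role": "...", "region": "..."}}}
mine = {"aws": {"emr": {"jobflow_name": "...", "region": "..."}}}

missing, unexpected = diff_keys(mine, reference)
print("missing:", missing)        # keys the example has but mine lacks
print("unexpected:", unexpected)  # keys mine has but the example lacks
```

A key that shows up as unexpected next to a similarly named missing key is a likely typo.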

./snowplow-emr-etl-runner --debug --config config.yml --resolver config/iglu_resolver.json

D, [2016-05-06T20:41:21.334000 #20758] DEBUG -- : Staging raw logs…
moving files from s3://st-snwplw-logs/ to s3://st-snwplw-sl/processing/
F, [2016-05-06T20:41:26.278000 #20758] FATAL -- :

NoMethodError (undefined method `files' for nil:NilClass):
/home/user/loader/snowplow-emr-etl-runner!/gems/sluice-0.2.2/lib/sluice/storage/s3/s3.rb:469:in `process_files'
org/jruby/ext/thread/Mutex.java:149:in `synchronize'
/home/user/loader/snowplow-emr-etl-runner!/gems/sluice-0.2.2/lib/sluice/storage/s3/s3.rb:437:in `process_files'
org/jruby/RubyKernel.java:1511:in `loop'
/home/user/loader/snowplow-emr-etl-runner!/gems/sluice-0.2.2/lib/sluice/storage/s3/s3.rb:428:in `process_files'

Any suggestions on this issue?

Thanks a lot!

Sam

Hi @mythsam,

The most likely cause of this error is a problem accessing your raw:in bucket.

Could you please try the following:

  1. Use the AWS CLI to confirm that your AWS credentials have access to your in bucket.
  2. Check that ec2_key_name exists in the same region as the one you stated in s3:region.
  3. Use placement instead of ec2_subnet_id if you are not running in a VPC.
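
The first check can also be scripted. Here is a minimal sketch, assuming the AWS CLI is installed and configured with the same credentials as in your config. The helper names s3_ls_command and can_list are hypothetical, and the region argument is optional.

```python
import subprocess


def s3_ls_command(bucket_uri, region=None):
    """Build the `aws s3 ls` command line for a bucket URI."""
    command = ["aws", "s3", "ls", bucket_uri]
    if region:
        command += ["--region", region]
    return command


def can_list(bucket_uri, region=None):
    """Return True if `aws s3 ls` exits successfully for the bucket, else False."""
    try:
        result = subprocess.run(
            s3_ls_command(bucket_uri, region),
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # The aws executable was not found on PATH.
        return False
    if result.returncode != 0:
        # Show the CLI's own error message (e.g. access denied, no such bucket).
        print(result.stderr.strip())
    return result.returncode == 0
```

For example, can_list("s3://st-snwplw-logs/") uses the in bucket from the log output earlier in this thread. A False result points at the credentials or the bucket name. Note that when the URI includes a key prefix, a non-zero exit can also simply mean nothing matched that prefix.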

Regards,
Ihor

Hi Ihor,

I have checked the points you raised:

  1. Use the AWS CLI to confirm that your AWS credentials have access to your in bucket.
    Yes, it is accessible.

  2. Check that ec2_key_name exists in the same region as the one you stated in s3:region.
    Yes, ec2_key_name exists in Key Pairs under the same region as in the config.

  3. Use placement instead of ec2_subnet_id if you are not running in a VPC.
    It is running in a VPC, and I tried both placement and ec2_subnet_id; either way, I still get the same error.

D, [2016-05-09T14:55:00.438000 #26552] DEBUG -- : Staging raw logs…
moving files from s3://st-snwplw-logs/ to s3://st-snwplw-sl/processing/
F, [2016-05-09T14:55:05.744000 #26552] FATAL -- :

NoMethodError (undefined method `files' for nil:NilClass):
/home/user/loader/snowplow-emr-etl-runner!/gems/sluice-0.2.2/lib/sluice/storage/s3/s3.rb:469:in `process_files'
org/jruby/ext/thread/Mutex.java:149:in `synchronize'
/home/user/loader/snowplow-emr-etl-runner!/gems/sluice-0.2.2/lib/sluice/storage/s3/s3.rb:437:in `process_files'
org/jruby/RubyKernel.java:1511:in `loop'
/home/user/loader/snowplow-emr-etl-runner!/gems/sluice-0.2.2/lib/sluice/storage/s3/s3.rb:428:in `process_files'

Please disregard my previous message. I finally got it working. Thanks for your guidance and help. It was all due to a typo in the destination.

Many thanks!
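
For anyone hitting the same NoMethodError at the staging step: since a single mistyped location was the culprit here, it can help to list every S3 location the config refers to and check each one by eye or with the AWS CLI. This is a minimal sketch, assuming the config has already been parsed into nested dicts and lists (for example with PyYAML's yaml.safe_load). s3_locations is a hypothetical helper, and the sample dict is an illustrative fragment built from the two locations in the log output, with key names assumed for illustration.

```python
from urllib.parse import urlparse


def s3_locations(node, path=""):
    """Yield (key_path, bucket, prefix) for every s3:// string in a nested config."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from s3_locations(value, f"{path}:{key}" if path else str(key))
    elif isinstance(node, list):
        for item in node:
            yield from s3_locations(item, path)
    elif isinstance(node, str) and node.startswith("s3://"):
        parsed = urlparse(node)
        # netloc is the bucket name; the rest of the URI is the key prefix.
        yield path, parsed.netloc, parsed.path.lstrip("/")


# Illustrative fragment only (assumed key names); a real config has more entries.
config = {
    "raw": {
        "in": ["s3://st-snwplw-logs/"],
        "processing": "s3://st-snwplw-sl/processing/",
    }
}

for key_path, bucket, prefix in s3_locations(config):
    print(f"{key_path}: bucket={bucket} prefix={prefix!r}")
```

Seeing the bucket names side by side makes a one-character difference much easier to spot.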