Can't resume from rdb-load step


#1

i don’t understand why this won’t resume from rdb-load.

it failed to load because Redshift was in maintenance mode. but when i run EmrEtlRunner with --resume-from rdb_load, it shouldn’t care about the enriched folder, should it? EMR definitely failed on the load step.

ERROR: Data loading error Problems with establishing DB connection
Amazon Error setting/closing connection: Connection refused.
Following steps completed: [Discover]
INFO: Logs successfully dumped to S3 [s3://ga-snowplow-production/snowplow-log/rdb-loader/2018-03-10-07-15-28/37bc5a82-de8f-4c54-aa39-437f07177f2b]

[root@ip-X-X-X-X bin]# /var/app/current/bin/snowplow-emr-etl-runner run --config /var/app/current/etc/emr-config.yml --resolver /var/app/current/etc/resolver.conf --enrichments /var/app/current/enrichments --targets /var/app/current/etc/storage_targets --resume-from rdb_load
D, [2018-03-12T13:42:29.574000 #32360] DEBUG -- : Initializing EMR jobflow
E, [2018-03-12T13:42:37.775000 #32360] ERROR -- : No run folders in [s3://ga-snowplow-production/snowplow-enriched/good/] found
F, [2018-03-12T13:42:37.779000 #32360] FATAL -- :

Snowplow::EmrEtlRunner::UnexpectedStateError (No run folders in [s3://ga-snowplow-production/snowplow-enriched/good/] found):
uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:715:in `get_latest_run_id'
uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:480:in `initialize'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:100:in `run'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
uri:classloader:/emr-etl-runner/bin/snowplow-emr-etl-runner:41:in
org/jruby/RubyKernel.java:979:in `load'
uri:classloader:/META-INF/main.rb:1:in
org/jruby/RubyKernel.java:961:in `require'
uri:classloader:/META-INF/main.rb:1:in `(root)'
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1:in `'


#2

Hi @mjensen,

Which version of Snowplow/EmrEtlRunner are you using? I think this could be this issue, which was fixed in R95.


#4

@anton thanks, i believe we are on R92.

would upgrading help my stuck state though? i can fix it by clearing the enriched/shredded folders and letting it process from scratch to get things going again, which is fine, i’ve done it before. or would upgrading to R95/latest be best?

[root@ip-X bin]# ./snowplow-emr-etl-runner --version
snowplow-emr-etl-runner 0.28.0
[root@ip-X bin]# ./snowplow-storage-loader --version
snowplow-storage-loader 0.11.0


#5

@anton the problem now is that, because shredding completed, all those records are in DynamoDB but not in Redshift. so i would have to turn DynamoDB deduplication off in the config, let those records in twice :frowning: and then run the dedup SQL scripts :frowning:


#6

One thing about cross-batch deduplication: if you start from shred, i.e. process the same enriched data with the same etl_tstamp, it won’t do any harm. When cross-batch deduplication in the shred step encounters an event_id:event_fingerprint pair with the same etl_tstamp, it lets the event go through the shredding process (in other words, it does not de-duplicate it).
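To make that rule concrete, here is a minimal Python sketch. It is not the RDB Shredder implementation: the `DuplicateStore` class is a hypothetical in-memory stand-in for the DynamoDB table, and the event values are made up. It only illustrates the behaviour described above, plus the assumption that the same pair arriving with a different etl_tstamp is treated as a duplicate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: str
    event_fingerprint: str
    etl_tstamp: str


class DuplicateStore:
    """Hypothetical in-memory stand-in for the DynamoDB duplicates table."""

    def __init__(self):
        # Maps (event_id, event_fingerprint) -> etl_tstamp of the batch that first stored it.
        self._seen = {}

    def should_keep(self, event):
        key = (event.event_id, event.event_fingerprint)
        # setdefault stores the etl_tstamp only if the pair is new,
        # and always returns the etl_tstamp that is on record.
        recorded_tstamp = self._seen.setdefault(key, event.etl_tstamp)
        # Same etl_tstamp: the same batch is being re-processed, so keep the event.
        # Different etl_tstamp: an earlier batch already holds it, so drop it.
        return recorded_tstamp == event.etl_tstamp


store = DuplicateStore()
first_run = Event("e1", "fp1", "2018-03-10 07:15:28")    # original shred run
rerun = Event("e1", "fp1", "2018-03-10 07:15:28")        # resumed from shred, same batch
later_batch = Event("e1", "fp1", "2018-03-12 13:42:29")  # same event in a new batch

results = [store.should_keep(e) for e in (first_run, rerun, later_batch)]
print(results)  # [True, True, False]
```

Under these assumptions, re-running shred over the same enriched data keeps every event, which is why deduplication would not need to be switched off for a resume.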

It is a bit strange that you still use StorageLoader on R92. Since R90, RDB Loader has been the preferred way to load data into Redshift.


#7

@anton sorry, that storage loader was a mistake. i verified i’m loading from EMR, not from the binary.

storage:
  versions:
    rdb_loader: 0.13.0
    rdb_shredder: 0.12.0
    hadoop_elasticsearch: 0.1.0

and ok, let me look.


#8

@anton yeah, it won’t let me start from shred either. enriched should be good and it’s not. i cleaned out the shredded folders and left the enriched folders intact.

only option is to re-process from scratch, i would think, and turn off dedup in DynamoDB and only process those missing logs again.

/var/app/current/bin/snowplow-emr-etl-runner run --config /var/app/current/etc/emr-config.yml --resolver /var/app/current/etc/resolver.conf --enrichments /var/app/current/enrichments --targets /var/app/current/etc/storage_targets --resume-from shred
D, [2018-03-12T15:56:22.859000 #7598] DEBUG -- : Initializing EMR jobflow
E, [2018-03-12T15:56:32.953000 #7598] ERROR -- : No run folders in [s3://ga-snowplow-production/snowplow-enriched/good/] found
F, [2018-03-12T15:56:32.963000 #7598] FATAL -- :

Snowplow::EmrEtlRunner::UnexpectedStateError (No run folders in [s3://ga-snowplow-production/snowplow-enriched/good/] found):
uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:715:in `get_latest_run_id'
uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:480:in `initialize'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:100:in `run'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
uri:classloader:/emr-etl-runner/bin/snowplow-emr-etl-runner:41:in
org/jruby/RubyKernel.java:979:in `load'
uri:classloader:/META-INF/main.rb:1:in
org/jruby/RubyKernel.java:961:in `require'
uri:classloader:/META-INF/main.rb:1:in `(root)'
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1:in `'


#9

@mjensen,

The fact that you tried resuming from the “shred” step and got an error related to the “enrich” step

Snowplow::EmrEtlRunner::UnexpectedStateError (No run folders in [s3://ga-snowplow-production/snowplow-enriched/good/] found)

is likely an indication of a bug that has been resolved in later versions (R95+). You might also get the “No run folders” error because of a large number of empty files (with the _$folder$ suffix) that accumulate in the “enriched/good” and “enriched/shredded” buckets when S3DistCp is used to move files. You might wish to delete those files manually.
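If it helps with the cleanup, below is a rough Python sketch for telling marker files apart from real run folders in a key listing. It works on a plain list of keys, so how you obtain the listing is up to you. The naming patterns are assumptions on my part (run folders named run=YYYY-MM-DD-HH-MM-SS, markers ending in _$folder$), the sample keys are made up, and the helper names `split_keys` and `latest_run_id` are hypothetical, not part of EmrEtlRunner.

```python
import re

# Assumed naming conventions, not taken from the EmrEtlRunner source:
# run folders look like run=YYYY-MM-DD-HH-MM-SS/ and marker objects end in _$folder$.
RUN_ID = re.compile(r"run=(\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})/")
MARKER_SUFFIX = "_$folder$"


def split_keys(keys):
    """Separate empty marker objects from keys that point at real data."""
    markers = [k for k in keys if k.endswith(MARKER_SUFFIX)]
    data = [k for k in keys if not k.endswith(MARKER_SUFFIX)]
    return markers, data


def latest_run_id(data_keys):
    """Return the newest run id that holds at least one data file, or None."""
    run_ids = set()
    for key in data_keys:
        match = RUN_ID.search(key)
        if match:
            run_ids.add(match.group(1))
    # Zero-padded timestamps sort chronologically as plain strings.
    return max(run_ids) if run_ids else None


# Made-up listing for illustration only.
keys = [
    "snowplow-enriched/good/run=2018-03-09-07-15-28_$folder$",
    "snowplow-enriched/good/run=2018-03-10-07-15-28_$folder$",
    "snowplow-enriched/good/run=2018-03-10-07-15-28/part-00000",
]

markers, data = split_keys(keys)
print(markers)              # candidates for manual deletion
print(latest_run_id(data))  # 2018-03-10-07-15-28
```

Review the marker list before deleting anything; the sketch only classifies keys and never touches S3.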


#10

@ihor got it, thanks. i’ll have to clean up and rerun. will upgrade from R92 to R95/100 once i’m caught up on log files.