Can't load data back into Redshift


#1

I don’t get this. We had that AWS Redshift outage last night for an hour, so the EMR job failed at the loading step. Usually I can just run this command to reload the data stuck in the shredded folder, but this time it’s stuck on enriched, even though I can see the good folder inside the enriched folder that it’s complaining about. I also remember that an older release had a file on S3 listing which files had been put into enriched processing. If I could find that file again I could start over and re-process those files from scratch, but I can’t find it anymore.

./snowplow-emr-etl-runner --version
snowplow-emr-etl-runner 0.30.0

/var/app/current/bin/snowplow-emr-etl-runner run --config /var/app/current/etc/emr-config.yml --resolver /var/app/current/etc/resolver.conf --enrichments /var/app/current/enrichments --targets /var/app/current/etc/storage_targets --resume-from rdb_load
D, [2018-06-01T11:59:08.980000 #7229] DEBUG -- : Initializing EMR jobflow
E, [2018-06-01T11:59:16.549000 #7229] ERROR -- : No run folders in [s3://ga-snowplow-production/snowplow-enriched/good/] found
F, [2018-06-01T11:59:16.553000 #7229] FATAL -- :

Snowplow::EmrEtlRunner::UnexpectedStateError (No run folders in [s3://ga-snowplow-production/snowplow-enriched/good/] found):
uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:719:in `get_latest_run_id'
uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:488:in `initialize'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:102:in `run'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
uri:classloader:/emr-etl-runner/bin/snowplow-emr-etl-runner:41:in `<main>'
org/jruby/RubyKernel.java:979:in `load'
uri:classloader:/META-INF/main.rb:1:in `<main>'
org/jruby/RubyKernel.java:961:in `require'
uri:classloader:/META-INF/main.rb:1:in `(root)'
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1:in `<main>'


#2

@anton or @alex, any ideas? I was going to rerun from scratch if I knew which files enrichment had processed, so I could re-process them, but I can’t find the list. The other problem is that DynamoDB already has the shredded items on file from going through the shred step.


#3

Hey @mjensen,

Sorry, I’m not sure I entirely follow the problem, but what command are you using to run recovery? If your previous run failed on load then you should try this:

./snowplow-emr-etl-runner --resume-from load ...

This should load data from shredded good and then archive both enriched and shredded.


#4

@anton sorry, I forgot to paste that part:

/var/app/current/bin/snowplow-emr-etl-runner run --config /var/app/current/etc/emr-config.yml --resolver /var/app/current/etc/resolver.conf --enrichments /var/app/current/enrichments --targets /var/app/current/etc/storage_targets --resume-from rdb_load

I’ve tried resuming from every single step and none of them work.


#5

Found the folder that contains the list of files I need to re-process if I clean out the enriched and shredded folders:

2018-05-31 18:28:05 14574509 snowplow-raw/archive/run=2018-05-31-22-05-25/2018-05-31-49580171778532873512806680164459533124461988638724456562-49580171778532873512806680316801111357379993223926120562.lzo
2018-05-31 18:28:07 1672 snowplow-raw/archive/run=2018-05-31-22-05-25/2018-05-31-49580171778532873512806680164459533124461988638724456562-49580171778532873512806680316801111357379993223926120562.lzo.index
2018-05-31 18:28:08 11767428 snowplow-raw/archive/run=2018-05-31-22-05-25/2018-05-31-49580171778532873512806680316857930870901880863856787570-49580171778532873512806680447002422129875679103558877298.lzo
2018-05-31 18:28:06 1344 snowplow-raw/archive/run=2018-05-31-22-05-25/2018-05-31-49580171778532873512806680316857930870901880863856787570-49580171778532873512806680447002422129875679103558877298.lzo.index
2018-05-31 18:28:08 14185467 snowplow-raw/archive/run=2018-05-31-22-05-25/2018-05-31-49580171778555174258005210795026291226807694957553385602-49580171778555174258005210945608882392186414162277040258.lzo
2018-05-31 18:28:03 1640 snowplow-raw/archive/run=2018-05-31-22-05-25/2018-05-31-49580171778555174258005210795026291226807694957553385602-49580171778555174258005210945608882392186414162277040258.lzo.index
2018-05-31 18:28:06 11469797 snowplow-raw/archive/run=2018-05-31-22-05-25/2018-05-31-49580171778555174258005210945619762724562945824849395842-49580171778555174258005211075495872451582296456486191234.lzo
2018-05-31 18:28:07 1328 snowplow-raw/archive/run=2018-05-31-22-05-25/2018-05-31-49580171778555174258005210945619762724562945824849395842-49580171778555174258005211075495872451582296456486191234.lzo.index


#6

Just to close this out: I had no choice but to clean out the shredded and enriched folders, move the raw files from the archive directory back into processing, and then run with --skip staging. I had to disable DynamoDB for that batch, since those events had already been run through it.
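For anyone repeating this recovery, the "move the raw files back into processing" step can be planned from a listing like the one in #5. The sketch below only builds a list of (source, destination) key pairs and does not touch S3. The function name `plan_restore`, the `snowplow-raw/processing/` prefix, and the choice to drop the `run=...` folder are assumptions for illustration; check them against the raw bucket paths in your own emr-config.yml before copying anything.

```python
from typing import Iterable, List, Tuple


def plan_restore(
    listing_lines: Iterable[str],
    archive_prefix: str = "snowplow-raw/archive/",        # taken from the listing in #5
    processing_prefix: str = "snowplow-raw/processing/",  # assumed location, not confirmed
) -> List[Tuple[str, str]]:
    """Turn 'date time size key' listing lines into (archive_key, processing_key) pairs."""
    plan = []
    for line in listing_lines:
        parts = line.split(None, 3)  # date, time, size, key
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        key = parts[3].strip()
        if not key.startswith(archive_prefix):
            continue  # only archived raw files are of interest
        filename = key.rsplit("/", 1)[-1]
        if not filename:
            continue  # the key is a folder placeholder, not a file
        # Assumption: processing holds files directly, without the run=... folder.
        plan.append((key, processing_prefix + filename))
    return plan
```

Each pair can then be fed to whatever copy tool you use. The sketch keeps the .lzo.index files alongside the .lzo files, as in the listing above.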

BTW, our dev cluster had exactly the same problem as prod.


#7

Hi @mjensen,

E, [2018-06-01T11:59:16.549000 #7229] ERROR -- : No run folders in [s3://ga-snowplow-production/snowplow-enriched/good/] found

This error occurs when EmrEtlRunner (EER) can’t extract the latest run ID because of a large number of empty files (*_$folder$) in the snowplow-enriched and snowplow-shredded buckets. These files are left behind by the S3DistCp routine. There’s an open issue to add a maintenance step (#3439).

For now, you can either create a script that removes the files regularly, or remove them manually with the aws s3 rm command whenever the need arises. Once the files are removed, --resume-from rdb_load should work as expected.
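A minimal sketch of the selection half of such a cleanup script is below. It assumes the empty marker keys end in `_$folder$`, the suffix Hadoop-based tools such as S3DistCp typically leave behind, and it only prints the aws s3 rm commands instead of running them. The helper names and the example key are hypothetical.

```python
import shlex
from typing import Iterable, List

FOLDER_MARKER_SUFFIX = "_$folder$"  # assumed suffix of the empty marker files


def find_folder_markers(keys: Iterable[str]) -> List[str]:
    """Return only the keys that look like empty folder-marker files."""
    return [k for k in keys if k.endswith(FOLDER_MARKER_SUFFIX)]


def rm_commands(bucket: str, keys: Iterable[str]) -> List[str]:
    """Build one 'aws s3 rm' command per marker key, shell-quoted because of the '$'."""
    return [
        "aws s3 rm " + shlex.quote(f"s3://{bucket}/{key}")
        for key in find_folder_markers(keys)
    ]


if __name__ == "__main__":
    # Hypothetical keys; in practice they would come from a bucket listing.
    example_keys = [
        "snowplow-enriched/good/run=2018-05-31-22-05-25_$folder$",
        "snowplow-enriched/good/run=2018-05-31-22-05-25/part-00000",
    ]
    for cmd in rm_commands("ga-snowplow-production", example_keys):
        print(cmd)
```

Reviewing the printed commands before running them keeps the cleanup reversible up to the last moment, which matters here because the data files sit under the same prefixes as the markers.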

Hope this helps.


#9

@egor thanks,