Snowplow Failing On EMR Step


#1

For the past couple of days, my Snowplow script has been failing. It hangs for about 38 minutes and then terminates with an internal error. The script runs on EC2-Classic, and no changes have been made to the environment. By skipping all but one step at a time, I narrowed the problem down to the EMR step. Oddly, if I run only the EMR step and skip the rest, then instead of hanging for 38 minutes the command finishes after a couple of seconds with "INFO -- : Completed successfully." Has anyone experienced this?

Here is the script:

#!/bin/bash
clear
 
# use jruby environment
# https://rvm.io/integration/cron#loading-rvm-environment-files-in-shell-scripts
source /usr/local/rvm/environments/jruby-1.7.19
 
echo "enrichment kick-off"
cd /home/ec2-user/snowplow/3-enrich/emr-etl-runner
bundle exec bin/snowplow-emr-etl-runner --config config/sp.yml --resolver config/resolver.json --enrichments ../config/enrichments
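
For reference, the step-skipping experiment described above can be done with EmrEtlRunner's --skip flag. A minimal sketch, assuming the skippable step names staging and archive_raw (these vary by EmrEtlRunner release, so check the options for your version); the command is echoed rather than executed so the sketch stays self-contained:

```shell
# Sketch: run only the EMR step by skipping the others.
# SKIP_STEPS is an assumption -- verify the skippable step names
# against your EmrEtlRunner release before using this.
RUNNER="bin/snowplow-emr-etl-runner"
SKIP_STEPS="staging,archive_raw"
CMD="bundle exec $RUNNER --config config/sp.yml --resolver config/resolver.json --skip $SKIP_STEPS"
echo "$CMD"
```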

Here is the sp.yml file:

aws:
  access_key_id: Aaxc
  secret_access_key: asdf
  s3:
    region: us-east-1
    buckets:
      assets: s3://snowplow-hosted-assets
      log: s3://rrsnowplow-log/emr
      raw:
        in:
        - s3://elasticbeanstalk-us-east-1-69/resources/environments/logs/publish/e-p/
        - s3://elasticbeanstalk-us-east-1-69/resources/environments/logs/publish/e-i/
        - s3://elasticbeanstalk-us-east-1-69/resources/environments/logs/publish/e-9/
        processing: s3://rrsnowplow-etl/processing
        archive: s3://rrsnowplow-archive/raw
      enriched:
        good: s3://rrsnowplow-data/enriched/good
        bad: s3://rrsnowplow-data/enriched/bad
        errors: s3://rrsnowplow-data/enriched/errors
        archive: s3://rrsnowplow-storage-archive/enriched/good
      shredded:
        good: s3://rrsnowplow-data/shredded/good
        bad: s3://rrsnowplow-data/shredded/bad
        errors: s3://rrsnowplow-data/shredded/errors
        archive: s3://rrsnowplow-storage-archive/shredded/good
      jsonpath_assets: 
  emr:
    ami_version: 3.6.0
    region: us-east-1
    placement: us-east-1c
    ec2_subnet_id:
    jobflow_role: EMR_EC2_DefaultRole
    service_role: EMR_DefaultRole
    ec2_key_name: Key_Name
    software:
      hbase: # not used for ami_version 3.6.0
      lingual: # not used for ami_version 3.6.0
    jobflow:
      master_instance_type: m1.medium
      core_instance_count: 3
      core_instance_type: c3.xlarge
      task_instance_count: 0
      task_instance_type: m1.medium
      task_instance_bid: 0.015
    bootstrap_failure_tries: 3
collectors:
  format: clj-tomcat
enrich:
  job_name: Snowplow ETL
  versions:
    hadoop_enrich: 1.0.0
    hadoop_shred: 0.4.0
  continue_on_unexpected_error: false
  output_compression: NONE
storage:
  download:
    folder:
  targets:
  - name: RR Snowplow Events
    type: redshift
    host: snowplow.redshift.amazonaws.com
    database: db
    port: 5439
    table: atomic.table
    username: adm
    password: pw
    maxerror: 10
    comprows: 200000
monitoring:
  tags: {}
  logging:
    level: INFO
  snowplow:

#2

I’ll have to run the script at least one more time to be sure, but changing the AMI version from 3.6.0 to 3.9.0 seems to have fixed the issue.
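
For anyone following along, that fix is a one-line change in sp.yml (fragment only; everything else stays as posted above):

```yaml
aws:
  emr:
    ami_version: 3.9.0   # was 3.6.0
```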


#3

I am no longer getting the original error. However, the process now fails during the last step, “Shredded HDFS.” My colleague thinks this is due to an outdated hadoop_enrich or hadoop_shred version.


#4

Hi @wyip, what’s the error you’re getting now?


#5

This probably has to do with the hadoop_enrich and hadoop_shred versions. Here is the stderr output:

Exception in thread "main" java.lang.RuntimeException: Failed to get source file system
at com.amazon.elasticmapreduce.s3distcp.S3DistCp.run(S3DistCp.java:739)
at com.amazon.elasticmapreduce.s3distcp.S3DistCp.run(S3DistCp.java:720)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at com.amazon.elasticmapreduce.s3distcp.Main.main(Main.java:22)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs:/local/snowplow/shredded-events
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
at com.amazon.elasticmapreduce.s3distcp.S3DistCp.run(S3DistCp.java:736)
… 9 more


#6

Maybe try AMI 3.11.0; it still uses the same Hadoop version (2.4.0).

If you don’t have any dependency on those particular enrich and shred versions, I would indeed advise upgrading.
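
The two suggestions combined would look like this in sp.yml. The version numbers below are purely illustrative assumptions; hadoop_enrich and hadoop_shred must be upgraded as a matching pair, so take the exact pair from the release notes of the Snowplow release you move to:

```yaml
aws:
  emr:
    ami_version: 3.11.0    # same Hadoop line (2.4.0) as 3.9.0
enrich:
  versions:
    hadoop_enrich: x.y.z   # example placeholder; was 1.0.0
    hadoop_shred: x.y.z    # example placeholder; was 0.4.0
```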


#7

Sorry for the delayed response. My colleague has gone ahead and upgraded to one of the most recent versions. Thanks for the help!