Unexpected Error Conversion of Nil to String


I am starting to run the enrichment process, and when I get to the storage loader I get the following error:
./snowplow-storage-loader --config config/config.yml
Unexpected error: no implicit conversion of nil into String
org/jruby/RubyFileTest.java:95:in `directory?'
org/jruby/RubyFileTest.java:87:in `directory?'
uri:classloader:/storage-loader/lib/snowplow-storage-loader/config.rb:88:in `get_config'
uri:classloader:/storage-loader/bin/snowplow-storage-loader:31:in `'
org/jruby/RubyKernel.java:973:in `load'
uri:classloader:/META-INF/main.rb:1:in `'
org/jruby/RubyKernel.java:955:in `require'
uri:classloader:/META-INF/main.rb:1:in `(root)'
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1:in `'

Any help is greatly appreciated.


It looks like your config.yml isn't correct. Could you post it here? Please remove any credentials before you post it.
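
For context, "no implicit conversion of nil into String" is the message Ruby raises when File.directory? is handed nil instead of a path, which matches the top frames of the trace. Below is a minimal sketch of that failure and a guard against it, assuming the nil comes from a key that is missing or left blank in config.yml. The helper names `directory_check_error` and `usable_directory?` are hypothetical and are not the storage loader's actual code.

```ruby
# Returns the message File.directory? raises for a non-path argument,
# or nil when the call succeeds.
def directory_check_error(path)
  File.directory?(path)
  nil
rescue TypeError => e
  e.message
end

# Hypothetical guard: treat a missing value as "not a usable directory"
# instead of letting the TypeError escape.
def usable_directory?(path)
  !path.nil? && File.directory?(path)
end

folder = nil # stands in for a config value that was left blank

puts "Unexpected error: #{directory_check_error(folder)}"
puts usable_directory?(folder)  # no exception for nil
puts usable_directory?(Dir.pwd) # an existing directory
```

If this is the cause, the fix belongs in the config file rather than in code: every path-valued key the loader reads needs a non-empty value.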


OK, I got past the nil issue, but now when I run it there is no response:

./snowplow-emr-etl-runner --config config/config.yml --resolver config/iglu_resolver.json --enrichments enrichments
D, [2017-03-22T10:30:07.625000 #20718] DEBUG -- : Staging raw logs...

Here is my config file:

  # Credentials can be hardcoded or set in environment variables
  access_key_id: XXXXX
  secret_access_key: XXX
    region: us-west-2
      assets: s3://snowplow-hosted-assets # DO NOT CHANGE unless you are hosting the jarfiles etc yourself in your own bucket
      jsonpath_assets: # If you have defined your own JSON Schemas, add the s3:// path to your own JSON Path files in your own bucket here
      log: s3://elasticbeanstalk-us-west-2-XXXXX/resources/environments/emrlog
        in:                  # This is a YAML array of one or more in buckets - you MUST use hyphens before each entry in the array, as below
          - s3://elasticbeanstalk-us-west-2-XXXXX/resources/environments/logs/publish/e-cftpjpq6vh         # e.g. s3://my-old-collector-bucket
        processing: s3://XXXXX-etl/processing
        archive: s3://XXXXX-archive/raw    # e.g. s3://my-archive-bucket/raw
        good: s3://XXXXX-data/enriched/good       # e.g. s3://my-out-bucket/enriched/good
        bad: s3://XXXXX-data/enriched/bad        # e.g. s3://my-out-bucket/enriched/bad
        errors:      # Leave blank unless :continue_on_unexpected_error: set to true below
        archive: s3://XXXXX-data/enriched/archive    # Where to archive enriched events to, e.g. s3://my-archive-bucket/enriched
        good: s3://XXXXX-data/shredded/good       # e.g. s3://my-out-bucket/shredded/good
        bad: s3://XXXXX-data/shredded/bad        # e.g. s3://my-out-bucket/shredded/bad
        errors:      # Leave blank unless :continue_on_unexpected_error: set to true below
        archive: s3://XXXXX-data/shredded/archive    # Where to archive shredded events to, e.g. s3://my-archive-bucket/shredded
    ami_version: 4.5.0
    region: us-west-2        # Always set this
    jobflow_role: EMR_EC2_DefaultRole # Created using $ aws emr create-default-roles
    service_role: EMR_DefaultRole     # Created using $ aws emr create-default-roles
    placement: us-west-2b     # Set this if not running in VPC. Leave blank otherwise
    ec2_subnet_id: ADD HERE # Set this if running in VPC. Leave blank otherwise
    ec2_key_name: XXXXX
    bootstrap: []           # Set this to specify custom bootstrap actions. Leave empty otherwise
      hbase:                # Optional. To launch on cluster, provide version, "0.92.0", keep quotes. Leave empty otherwise.
      lingual:              # Optional. To launch on cluster, provide version, "1.1", keep quotes. Leave empty otherwise.
    # Adjust your Hadoop cluster below
      master_instance_type: m1.medium
      core_instance_count: 2
      core_instance_type: m1.medium
      core_instance_ebs:    # Optional. Attach an EBS volume to each core instance.
        volume_size: 100    # Gigabytes
        volume_type: "gp2"
        volume_iops: 400    # Optional. Will only be used if volume_type is "io1"
        ebs_optimized: false # Optional. Will default to true
      task_instance_count: 0 # Increase to use spot instances
      task_instance_type: m1.medium
      task_instance_bid: 0.015 # In USD. Adjust bid, or leave blank for non-spot-priced (i.e. on-demand) task instances
    bootstrap_failure_tries: 3 # Number of times to attempt the job in the event of bootstrap failures
    additional_info:        # Optional JSON string for selecting additional features
  format: clj-tomcat # For example: 'clj-tomcat' for the Clojure Collector, 'thrift' for Thrift records, 'tsv/com.amazon.aws.cloudfront/wd_access_log' for Cloudfront access logs or 'ndjson/urbanairship.connect/v1' for UrbanAirship Connect events
  job_name: XXXXX ETL # Give your job a name
    hadoop_enrich: 1.8.0 # Version of the Hadoop Enrichment process
    hadoop_shred: 0.10.0 # Version of the Hadoop Shredding process
    hadoop_elasticsearch: 0.1.0 # Version of the Hadoop to Elasticsearch copying process
  continue_on_unexpected_error: false # Set to 'true' (and set :out_errors: above) if you don't want any exceptions thrown from ETL
  output_compression: NONE # Compression only supported with Redshift, set to NONE if you have Postgres targets. Allowed formats: NONE, GZIP
    folder: downloads # Postgres-only config option. Where to store the downloaded files. Leave blank for Redshift
    - name: "XXXXX"
      type: postgres
      host: # Hostname of database server
      database: XXXXXetl # Name of database
      port: 5432 # Default Postgres port
      ssl_mode: disable # One of disable (default), require, verify-ca or verify-full
      table: atomic.events
      username: XXXXX
      password: XXXXX
      maxerror: # Not required for Postgres
      comprows: # Not required for Postgres
  tags: {} # Name-value pairs describing this job
    level: DEBUG # You can optionally switch to INFO for production
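
A note on the file as pasted: if the indentation on disk matches what appears here (it may just be a paste artifact), a YAML parser will reject it, because keys such as region are indented under a key that already has a scalar value. The sketch below checks a fragment for that problem, assuming plain YAML with no templating. The helper `yaml_syntax_error` is a hypothetical name, and the parent keys `aws` and `s3` in the second fragment are my assumption about the intended nesting, not something taken from this post.

```ruby
require 'yaml'

# Returns nil when the text parses as YAML, otherwise the parser's error message.
def yaml_syntax_error(text)
  YAML.safe_load(text)
  nil
rescue Psych::SyntaxError => e
  e.message
end

# A key nested under a key that already holds a scalar value does not parse.
broken = "secret_access_key: XXX\n  region: us-west-2\n"

# The same values under explicit parent keys (assumed names) parse cleanly.
nested = "aws:\n  secret_access_key: XXX\n  s3:\n    region: us-west-2\n"

puts yaml_syntax_error(broken).inspect
puts yaml_syntax_error(nested).inspect
```

To run the same check on the real file, pass it `File.read('config/config.yml')`.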