EMR Job - Failing with SSL handshake?


#1

Hey there, in the last few days I started receiving the error below when trying to run the EMR job:

java -version
openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)

openssl version
OpenSSL 1.0.2k-fips  26 Jan 2017

./snowplow-emr-etl-runner run -c config.yml -n ./enrichment/ -r ./iglu_resolver.json -t ./targets/ --debug 
D, [2018-01-29T16:45:34.207000 #6972] DEBUG -- : Initializing EMR jobflow
F, [2018-01-29T16:57:26.585000 #6972] FATAL -- : 

Excon::Error::Socket (Unsupported record version Unknown-0.0 (OpenSSL::SSL::SSLError)):
    org/jruby/ext/openssl/SSLSocket.java:222:in `connect_nonblock'
    uri:classloader:/gems/excon-0.52.0/lib/excon/ssl_socket.rb:121:in `initialize'
    uri:classloader:/gems/excon-0.52.0/lib/excon/connection.rb:403:in `socket'
    uri:classloader:/gems/excon-0.52.0/lib/excon/connection.rb:100:in `request_call'
    uri:classloader:/gems/excon-0.52.0/lib/excon/middlewares/mock.rb:48:in `request_call'
    uri:classloader:/gems/excon-0.52.0/lib/excon/middlewares/instrumentor.rb:26:in `request_call'
    uri:classloader:/gems/excon-0.52.0/lib/excon/middlewares/base.rb:16:in `request_call'
    uri:classloader:/gems/excon-0.52.0/lib/excon/middlewares/base.rb:16:in `request_call'
    uri:classloader:/gems/excon-0.52.0/lib/excon/middlewares/base.rb:16:in `request_call'
    uri:classloader:/gems/excon-0.52.0/lib/excon/connection.rb:249:in `request'
    uri:classloader:/gems/excon-0.52.0/lib/excon/middlewares/idempotent.rb:27:in `error_call'
    uri:classloader:/gems/excon-0.52.0/lib/excon/middlewares/base.rb:11:in `error_call'
    uri:classloader:/gems/excon-0.52.0/lib/excon/middlewares/base.rb:11:in `error_call'
    uri:classloader:/gems/excon-0.52.0/lib/excon/connection.rb:272:in `request'
    uri:classloader:/gems/excon-0.52.0/lib/excon/middlewares/idempotent.rb:27:in `error_call'
    uri:classloader:/gems/excon-0.52.0/lib/excon/middlewares/base.rb:11:in `error_call'
    uri:classloader:/gems/excon-0.52.0/lib/excon/middlewares/base.rb:11:in `error_call'
    uri:classloader:/gems/excon-0.52.0/lib/excon/connection.rb:272:in `request'
    uri:classloader:/gems/excon-0.52.0/lib/excon/middlewares/idempotent.rb:27:in `error_call'
    uri:classloader:/gems/excon-0.52.0/lib/excon/middlewares/base.rb:11:in `error_call'
    uri:classloader:/gems/excon-0.52.0/lib/excon/middlewares/base.rb:11:in `error_call'
    uri:classloader:/gems/excon-0.52.0/lib/excon/connection.rb:272:in `request'
    uri:classloader:/gems/fog-xml-0.1.2/lib/fog/xml/sax_parser_connection.rb:35:in `request'
    uri:classloader:/gems/fog-xml-0.1.2/lib/fog/xml/sax_parser_connection.rb:-1:in `request'
    uri:classloader:/gems/fog-xml-0.1.2/lib/fog/xml/connection.rb:7:in `request'
    uri:classloader:/gems/fog-aws-1.4.0/lib/fog/aws/storage.rb:612:in `_request'
    uri:classloader:/gems/fog-aws-1.4.0/lib/fog/aws/storage.rb:-1:in `_request'
    uri:classloader:/gems/fog-aws-1.4.0/lib/fog/aws/storage.rb:607:in `request'
    uri:classloader:/gems/fog-aws-1.4.0/lib/fog/aws/requests/storage/get_bucket.rb:43:in `get_bucket'
    uri:classloader:/gems/fog-aws-1.4.0/lib/fog/aws/models/storage/directories.rb:21:in `get'
    uri:classloader:/gems/fog-aws-1.4.0/lib/fog/aws/models/storage/files.rb:30:in `all'
    uri:classloader:/gems/fog-aws-1.4.0/lib/fog/aws/models/storage/files.rb:51:in `each'
    uri:classloader:/gems/sluice-0.4.0/lib/sluice/storage/s3/s3.rb:69:in `list_files'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
    uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:135:in `block in initialize'
    org/jruby/RubyArray.java:2564:in `select'
    uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:133:in `initialize'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
    uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:100:in `run'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
    uri:classloader:/emr-etl-runner/bin/snowplow-emr-etl-runner:41:in `<main>'
    org/jruby/RubyKernel.java:979:in `load'
    uri:classloader:/META-INF/main.rb:1:in `<main>'
    org/jruby/RubyKernel.java:961:in `require'
    uri:classloader:/META-INF/main.rb:1:in `(root)'
    uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1:in `<main>'
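
A quick way to see where in the runner this fails is to filter the trace down to the frames that are not inside Excon, fog or contracts. The sketch below does that for a handful of frames copied from the trace above (the `FRAMES` constant and the `app_frames` helper are illustrative names, not part of EmrEtlRunner). It suggests the error is raised while `emr_job.rb` lists files in S3 through Sluice, during `EmrJob` initialization.

```ruby
# A few frames copied verbatim from the trace above (not the full trace).
FRAMES = <<~'EOS'
  org/jruby/ext/openssl/SSLSocket.java:222:in `connect_nonblock'
  uri:classloader:/gems/excon-0.52.0/lib/excon/ssl_socket.rb:121:in `initialize'
  uri:classloader:/gems/fog-aws-1.4.0/lib/fog/aws/requests/storage/get_bucket.rb:43:in `get_bucket'
  uri:classloader:/gems/fog-aws-1.4.0/lib/fog/aws/models/storage/files.rb:51:in `each'
  uri:classloader:/gems/sluice-0.4.0/lib/sluice/storage/s3/s3.rb:69:in `list_files'
  uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
  uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:135:in `block in initialize'
  uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:100:in `run'
EOS

# Hypothetical helper: keep only frames from the runner itself or from Sluice.
def app_frames(trace)
  trace.lines.map(&:strip).select do |line|
    line.include?("/emr-etl-runner/") || line.include?("/gems/sluice-")
  end
end

puts app_frames(FRAMES)
```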

#2

Hey! I’m having the same issue.

Did you figure out a solution?


#3

Hey, it seems like this issue is related to the number of files in the S3 bucket. If I run with about 100k files, it’s fine. With more than that, it usually shows this SSL error message.
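
To put a rough number on that: the trace in #1 fails under `list_files` / `get_bucket`, and S3 returns at most 1,000 keys per list request, so the number of HTTPS requests needed just to list the bucket grows with the file count. A back-of-the-envelope sketch (the `list_requests` helper is only illustrative, and it assumes the default page size of 1,000 and one request per page):

```ruby
# S3 list calls return at most 1,000 keys per response.
S3_MAX_KEYS_PER_PAGE = 1000

# Illustrative helper: how many paginated list requests are needed
# to enumerate `file_count` objects, assuming one request per page.
def list_requests(file_count, page_size = S3_MAX_KEYS_PER_PAGE)
  return 1 if file_count.zero? # an empty listing still costs one request
  (file_count.to_f / page_size).ceil
end

[1_000, 100_000, 500_000].each do |n|
  puts "#{n} files -> #{list_requests(n)} list requests"
end
```

More requests mean more round trips over which an intermittent connection error could surface, though this does not explain why the handshake fails in the first place.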


#4

Interesting, do you mean the number of log files generated by the collector? I don’t have that many files to process :o
It’s weird because it happens before the S3 copy (the first step of the EMR job flow; I’m running R97).

I’ll try running the EMR job more often, just in case…

Does anyone have another theory?

Also, is there a way to retrieve more detailed logs about this initialization phase?
Thanks!


#5

Sorry, but I need to bump this!

This error keeps coming back… I tried a bunch of different EC2 instance types.
Sometimes it runs correctly a few times, but the error always comes back, and it is becoming very rare not to encounter it…

When I run it from my local machine, it works… so I guess it’s not a configuration problem.

What kind of EC2 instances do you usually run your EmrEtlRunner process from? Do they need a certain amount of processing power or memory?

Is there a way to log more details about what happens when the process tries to create the cluster?

Thanks to anybody who can help with this!