Socket Errors during archiving process


#1

During both the ‘Staging raw logs…’ and ‘Archiving Snowplow events…’ stages, Sluice continually has problems moving files, e.g.

Problem copying bucket/enriched/good/run=YYYY-MM-DD-h-m-s/part-00001. Retrying

Typically, this will result in the following error being thrown before the archiving process has completed:

Unexpected error: Socket closed (OpenSSL::SSL::SSLError)
org/jruby/ext/openssl/SSLSocket.java:220:in `connect_nonblock'
uri:classloader:/gems/excon-0.51.0/lib/excon/ssl_socket.rb:120:in `initialize'
uri:classloader:/gems/excon-0.51.0/lib/excon/connection.rb:404:in `socket'
uri:classloader:/gems/excon-0.51.0/lib/excon/connection.rb:106:in `request_call'
uri:classloader:/gems/excon-0.51.0/lib/excon/middlewares/mock.rb:47:in `request_call'
uri:classloader:/gems/excon-0.51.0/lib/excon/middlewares/instrumentor.rb:25:in `request_call'
uri:classloader:/gems/excon-0.51.0/lib/excon/middlewares/base.rb:15:in `request_call'
uri:classloader:/gems/excon-0.51.0/lib/excon/middlewares/base.rb:15:in `request_call'
uri:classloader:/gems/excon-0.51.0/lib/excon/middlewares/base.rb:15:in `request_call'
uri:classloader:/gems/excon-0.51.0/lib/excon/connection.rb:250:in `request'
uri:classloader:/gems/fog-xml-0.1.2/lib/fog/xml/sax_parser_connection.rb:35:in `request'
uri:classloader:/gems/fog-xml-0.1.2/lib/fog/xml/connection.rb:7:in `request'
uri:classloader:/gems/fog-1.25.0/lib/fog/aws/storage.rb:521:in `_request'
uri:classloader:/gems/fog-1.25.0/lib/fog/aws/storage.rb:516:in `request'
uri:classloader:/gems/fog-1.25.0/lib/fog/aws/requests/storage/copy_object.rb:32:in `copy_object'
uri:classloader:/gems/fog-1.25.0/lib/fog/aws/models/storage/file.rb:92:in `copy'
uri:classloader:/gems/sluice-0.3.4/lib/sluice/storage/s3/s3.rb:623:in `block in retry_x'
org/jruby/ext/timeout/Timeout.java:117:in `timeout'
uri:classloader:/gems/sluice-0.3.4/lib/sluice/storage/s3/s3.rb:622:in `retry_x'
uri:classloader:/gems/sluice-0.3.4/lib/sluice/storage/s3/s3.rb:548:in `block in process_files'
org/jruby/RubyKernel.java:1290:in `loop'
uri:classloader:/gems/sluice-0.3.4/lib/sluice/storage/s3/s3.rb:412:in `block in process_files'

Has anybody encountered a similar error, or had any success in solving it?


#2

Hi @samf89 - we are planning on sunsetting Sluice because file moves should be done from the EMR cluster itself. We see the failures you mention but they are relatively infrequent across our customer base - for them to be happening continually for a single pipeline is unusual.

Are you perhaps running EmrEtlRunner/StorageLoader somewhere other than EC2, or on an instance type with very low networking capability?


#3

Hi, thanks for the reply.

The EtlRunner/StorageLoader is currently being run on an m1.small EC2 instance.


#4

Right - I’d try it on a larger instance type and see if the problem goes away…
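
For anyone reading the trace in #1: the frames show a `copy` call running inside `Timeout.timeout`, which is in turn called from `retry_x`, so each file copy appears to be a timed attempt wrapped in a retry loop. The snippet below is only a rough sketch of that general pattern in plain Ruby, not Sluice's actual code. The `retry_with_backoff` helper, its parameters and the backoff delays are all made up for illustration.

```ruby
require 'timeout'

# Hypothetical helper (NOT Sluice's retry_x): runs the block under a
# per-attempt timeout and retries on any StandardError, doubling the
# sleep between attempts. The last failure is re-raised to the caller.
def retry_with_backoff(attempts: 3, timeout_secs: 10, base_delay: 0.5)
  tries = 0
  begin
    tries += 1
    # The attempt number is passed to the block so callers can log it.
    Timeout.timeout(timeout_secs) { yield tries }
  rescue StandardError => e
    raise if tries >= attempts
    warn "Problem on attempt #{tries}: #{e.message}. Retrying"
    sleep(base_delay * (2**(tries - 1)))
    retry
  end
end

# Simulated flaky copy: fails twice with a socket-style error, then succeeds.
outcome = retry_with_backoff(attempts: 3, timeout_secs: 1, base_delay: 0.01) do |attempt|
  raise IOError, 'Socket closed' if attempt < 3
  :copied
end
puts outcome
```

Once the attempts are used up, the last exception propagates to the caller, which would match the run in #1 aborting with the SSL error after a series of "Retrying" messages. Retries only paper over the symptom, though. If the instance's network is the bottleneck, moving to a larger instance type as suggested above is the first thing to try.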