S3 curl Error on rdb loader

Hi All,

I am setting up snowplow and I ran through this error on the rdb loader and I am not quite sure how to solve it.

Data loading error Amazon Invalid operation: S3CurlException: Connection timed out after 50000 milliseconds, CurlError 28, multiCurlError 0, CanRetry 1, UserError 0
Details:

error: S3CurlException: Connection timed out after 50000 milliseconds, CurlError 28, multiCurlError 0, CanRetry 1, UserError 0
code: 9002
context: Listing bucket=test-snowplow-data prefix=shredded/good/run=2019-07-25-15-48-51/atomic-events/
query: 61336
location: s3_utility.cpp:600
process: padbmaster [pid=32060]
-----------------------------------------------;

I checked all the security groups. I am running this from my local and the spun up machine running EMR has access to GET and LIST from S3. Not sure what I am doing wrong here. I want to thank in advance for taking a look at this.

Hi @grvregmi,

The issue might be caused by the redshift cluster being deployed in private subnets that don’t have nat gateways or s3 vpc endpoints.

A configuration like described above would make the redshift cluster isolated from the internet. Causing redshift not being able to connect to S3.

Regards