Private S3 Bucket for Snowplow Schema Registry

Hi,

we would like to host our custom Snowplow schemas in a private S3 bucket. Is that possible?

Turning the bucket into a public one and enabling static website hosting works, but for security reasons we would like to keep our registry private.

Is that possible?

Best,

Matthias

Hi @mgloel,

So it is possible, but with caveats. On S3 you can restrict access by IP address via the bucket's access policy, but this only works if you are not using VPC<>S3 Gateways. The servers that access the S3 bucket would also need to run behind a NAT Gateway/Server or use an Elastic IP address, so that they have stable addresses to pin against and you do not lose access.

Details here: https://aws.amazon.com/premiumsupport/knowledge-center/s3-aws-ip-addresses-access/
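As a rough illustration of the IP-pinning approach, here is a minimal Python sketch that builds such a bucket policy. The bucket name `example-iglu-schemas` and the address `203.0.113.10/32` are placeholders I made up, not values from your setup, and the linked AWS article remains the reference for the exact policy you want.

```python
import json


def build_ip_restricted_policy(bucket, allowed_cidrs):
    """Return a bucket policy that only allows object reads from the given CIDRs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowSchemaReadsFromKnownIPs",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                # Applies to every object (schema file) in the bucket.
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # aws:SourceIp is matched against the caller's public address,
                # which is why a stable NAT or Elastic IP is needed.
                "Condition": {"IpAddress": {"aws:SourceIp": list(allowed_cidrs)}},
            }
        ],
    }


# Hypothetical bucket name and NAT / Elastic IP address.
policy = build_ip_restricted_policy("example-iglu-schemas", ["203.0.113.10/32"])
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the bucket's policy editor or applied with whatever tooling you already use to manage the bucket.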

Your other option is to launch a webserver directly. You could use NGINX or some other simple webserver to serve the schemas as a static website, which you can then protect with normal AWS Security Group rules. This has different drawbacks in terms of how you get your schemas onto the server and how you deal with HA and redundancy, but it makes security easier to manage at a network level.


The most future-proof option, however, is to set up an Iglu Server instead. This is slightly more complicated, but it allows you to set API key based access to your schemas as well as apply all of the normal security group related networking limitations. More importantly, it is also required for a lot of our new loader technology to function (like Redshift auto-migrations).

Guide for this tech can be found here: https://github.com/snowplow/iglu/wiki/Setting-up-an-Iglu-Server
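To give a feel for what API key based access looks like from a client, here is a small Python sketch that builds (but does not send) a schema request. It assumes the server exposes schemas under `/api/schemas/<vendor>/<name>/jsonschema/<version>` and reads the key from an `apikey` header, so please check both against the guide for your server version. The host, vendor, schema name and key below are all made up.

```python
import urllib.request


def build_schema_request(base_url, vendor, name, version, api_key):
    """Build a GET request for one schema, authenticated with an API key header."""
    # Assumed endpoint layout; verify against your Iglu Server's documentation.
    url = f"{base_url.rstrip('/')}/api/schemas/{vendor}/{name}/jsonschema/{version}"
    # Assumed header name for the read-only API key.
    return urllib.request.Request(url, headers={"apikey": api_key})


# Hypothetical server, schema coordinates and key.
request = build_schema_request(
    "https://iglu.example.com",
    "com.acme",
    "checkout",
    "1-0-0",
    "00000000-0000-0000-0000-000000000000",
)
print(request.full_url)

# To actually fetch the schema against a real server:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Without a valid key, a request for a private schema should be rejected, which is what gives you the private registry you are after.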


Hope this helps!
Josh
