We are pleased to announce version 2.1.0 of the Snowplow S3 Loader.
This version fixes a bug in version 2.0.0 that could cause the loader to hang during Kinesis scaling events and stop processing events. We therefore strongly recommend upgrading to 2.1.0 if you are currently using 2.0.0.
It also adds a new configuration option that gives complete control over the partitioning of S3 directories, for example by date, by time, or by the schema type of self-describing JSONs. We expect this partitioning to be particularly helpful if you use Athena to query your Snowplow failed events in S3.
If you are already running version 2.0.0, you can switch to the 2.1.0 Docker image without any change to your configuration.
docker pull snowplow/snowplow-s3-loader:2.1.0
If you want to partition files by date or schema, set the output.s3.partitionFormat field in your configuration file. There is an example on GitHub, and more details in the configuration reference.
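To make the idea of a partition format concrete, here is a minimal Python sketch of how a template might expand into an S3 key prefix for one event. It is an illustration only, not the loader's implementation: the function name expand_partition and the placeholder names (vendor, schema, model, yy, mm, dd, hh) are assumptions made for this example, so check the configuration reference for the syntax the loader actually accepts.

```python
from datetime import datetime, timezone


def expand_partition(template: str, vendor: str, schema: str,
                     model: int, ts: datetime) -> str:
    """Fill a partition template with values taken from one event.

    The placeholder names are hypothetical and chosen for illustration;
    they are not confirmed to match the loader's own template syntax.
    """
    fields = {
        "vendor": vendor,
        "schema": schema,
        "model": model,
        "yy": f"{ts.year:04d}",
        "mm": f"{ts.month:02d}",
        "dd": f"{ts.day:02d}",
        "hh": f"{ts.hour:02d}",
    }
    return template.format(**fields)


# Hypothetical template combining schema type and date.
template = "{vendor}.{schema}/model={model}/date={yy}-{mm}-{dd}"
prefix = expand_partition(
    template,
    vendor="com.acme",
    schema="checkout",
    model=1,
    ts=datetime(2021, 7, 15, 9, 30, tzinfo=timezone.utc),
)
print(prefix)
```

Path segments written as key=value, such as date=2021-07-15, follow the Hive partitioning convention, which Athena can use to restrict a query to the matching S3 prefixes.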
Optimise fromEnriched function
Fix duplicate statsd metrics when loading lzo files
Fix dateFormat partitioning in output path
Fix premature shutdown of HTTP connection pool
Add Twitter Maven repository
Bump amazon-kinesis-client to 1.14.4