Link to http://dl.bintray.com/snowplow/snowplow-generic/kinesis_s3_0.5.0.zip is broken
It looks like the documentation for this download was updated a bit too eagerly. The latest release is actually version 0.4.1; 0.5.0 is still at the release-candidate stage.
The working link is: http://dl.bintray.com/snowplow/snowplow-generic/kinesis_s3_0.4.1.zip
The documentation does need a fair bit of work to be clearer. However, to answer your questions here:
Some Googling suggested you leave interface as-is then put in 80 for the normal use case.
If you are running the collector behind something like an AWS Load Balancer, you can put this component on any port you like; you will just need to configure the Listener to forward requests to the correct port. Port 80 is, however, recommended!
As the collector does not handle TLS termination itself, you will always need some form of load balancer or proxy in front of it, which can then route traffic on to the collector.
No idea what normal use case would be. Would these values change depending on the number of shards? Also, time-limit does not stipulate whether it’s mins, seconds or milliseconds in the sample config or setup guide.
From the sample configuration file here you can get a quick overview of what each of these settings does:
- collector.buffer.time-limit: This is measured in milliseconds
No, you would not change these numbers based on the number of shards, although the shard count can affect whether some of these settings work well. Kinesis has several per-shard limits that need to be taken into account when coming up with values for these buffers.
In the case of the Stream Collector the main thing to keep an eye on is that:
Each PutRecords request can support up to 500 records. Each record in the request can be as large as 1 MB, up to a limit of 5 MB for the entire request, including partition keys. Each shard can support writes up to 1,000 records per second, up to a maximum data write total of 1 MB per second.
Taken from: http://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecords.html
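To make those request limits concrete, here is a rough Python sketch that checks whether a batch would fit in a single PutRecords call. The function name and the `(partition_key, data)` tuple shape are made up for illustration, and treating 1 MB as 1,000,000 bytes is a conservative assumption on my part; this is not how the collector itself is implemented.

```python
# Limits quoted from the PutRecords documentation above.
# Assumption: "MB" is treated as 1,000,000 bytes to stay on the safe side.
MAX_RECORDS_PER_REQUEST = 500
MAX_RECORD_BYTES = 1_000_000
MAX_REQUEST_BYTES = 5_000_000


def fits_put_records(records):
    """Return True if `records` would fit in one PutRecords request.

    `records` is a list of (partition_key, data) tuples, where
    partition_key is a str and data is bytes (hypothetical shape).
    """
    if len(records) > MAX_RECORDS_PER_REQUEST:
        return False

    total_bytes = 0
    for partition_key, data in records:
        # Each record's data must stay within the per-record limit.
        if len(data) > MAX_RECORD_BYTES:
            return False
        # The request total includes the partition keys.
        total_bytes += len(partition_key.encode("utf-8")) + len(data)

    return total_bytes <= MAX_REQUEST_BYTES
```

For example, `fits_put_records([("k", b"x" * 1000)] * 500)` passes all three checks, while a batch of 501 records fails on the record count alone.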
So, to avoid issues when pushing data to the stream, your application must stay within these limits. Our default settings for this application are:
byte-limit: 4000000 # 4 MB
time-limit: 5000 # 5 seconds
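As a quick sanity check on those defaults, the arithmetic can be sketched in Python. The function name is hypothetical, and the write rate it computes assumes a full buffer is sent once per time-limit; under heavier traffic the byte-limit would be reached sooner, so treat the figure as an illustration rather than a guarantee.

```python
# Limits quoted from the PutRecords documentation above
# (assumption: 1 MB is treated as 1,000,000 bytes).
MAX_REQUEST_BYTES = 5_000_000    # per PutRecords request
SHARD_BYTES_PER_SEC = 1_000_000  # per-shard write throughput


def check_buffer_settings(byte_limit, time_limit_ms):
    """Return (fits_request, bytes_per_sec) for the given buffer settings.

    bytes_per_sec assumes one full buffer is flushed every time_limit_ms.
    """
    fits_request = byte_limit <= MAX_REQUEST_BYTES
    bytes_per_sec = byte_limit / (time_limit_ms / 1000)
    return fits_request, bytes_per_sec


# Default settings: byte-limit 4000000, time-limit 5000 (milliseconds).
fits, rate = check_buffer_settings(4_000_000, 5_000)
print(fits, rate, rate <= SHARD_BYTES_PER_SEC)
```

With the defaults, a full 4 MB buffer stays under the 5 MB request limit, and one full buffer every 5 seconds works out to 800,000 bytes per second, below the 1 MB per second that a single shard accepts.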
Hope this helps!