Here at GetNinjas we currently use Snowplow to track client events and some transactional events. For example, when a user pays an invoice, our app emits an event through Snowplow to help our BI team.
Today, a lot happens when a user pays an invoice: we give them credits on our platform, send them a receipt, and so on. All of this is triggered by an event that arrives via AWS SQS. In short, our app sends a message to SQS, and a daemon consumes those messages and carries out the steps described above.
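To make the current flow concrete, here is a minimal sketch of what the daemon does. It is only an illustration: `InMemoryQueue` is a hypothetical stand-in for SQS (not the real AWS API), and the event shape and handler names are assumptions. The point is that a message is deleted only after it has been processed, so a crash mid-processing makes the message visible again instead of losing it.

```python
from collections import deque


class InMemoryQueue:
    """Hypothetical stand-in for SQS: a received message stays in flight until deleted."""

    def __init__(self):
        self._pending = deque()
        self._in_flight = {}
        self._next_id = 0

    def send(self, body):
        self._pending.append(body)

    def receive(self):
        """Return (receipt, body), or None when no message is visible."""
        if not self._pending:
            return None
        body = self._pending.popleft()
        self._next_id += 1
        self._in_flight[self._next_id] = body
        return self._next_id, body

    def delete(self, receipt):
        """Acknowledge a message so it is never delivered again."""
        self._in_flight.pop(receipt, None)

    def requeue_in_flight(self):
        """Mimic a visibility timeout: unacknowledged messages become visible again."""
        self._pending.extend(self._in_flight.values())
        self._in_flight.clear()


def handle_invoice_paid(event, log):
    # Assumed side effects, recorded in a list instead of calling real services.
    log.append(("credit", event["user_id"]))
    log.append(("receipt", event["user_id"]))


def run_daemon_once(queue, log):
    """Drain the queue, deleting each message only after it was handled."""
    processed = 0
    while (msg := queue.receive()) is not None:
        receipt, event = msg
        if event["type"] == "invoice_paid":
            handle_invoice_paid(event, log)
        queue.delete(receipt)  # acknowledge only after successful processing
        processed += 1
    return processed
```

This acknowledge-after-processing behavior is the guarantee we would need to keep if Snowplow replaced SQS in this path.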
We use SQS because it is reliable: we trust that each event will reach the right destination, because AWS takes care of that for us.
In our view, the events we send through SQS could just as well be sent through Snowplow. In other words, we are planning to use Snowplow as our event hub for transactional events. This may not be the best decision, though, because of Snowplow characteristics such as the buffer.
Moreover, we plan to run Snowplow on Docker. Because of how containers behave (someone can shut one down and start another), we are afraid we may lose a few events if our clusters suffer some kind of outage.
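Here is a toy model of the failure mode we are worried about. It is not Snowplow's actual API: `BufferedEmitter` and `simulate` are hypothetical names, and the model only assumes that events are held in memory and delivered in batches once the buffer fills up. If the container is killed without a final flush, whatever is still in the buffer is gone.

```python
class BufferedEmitter:
    """Toy buffered emitter: events are delivered only in batches."""

    def __init__(self, buffer_size, delivered):
        self.buffer_size = buffer_size
        self.delivered = delivered  # list standing in for the downstream sink
        self._buffer = []

    def track(self, event):
        self._buffer.append(event)
        if len(self._buffer) >= self.buffer_size:
            self.flush()

    def flush(self):
        self.delivered.extend(self._buffer)
        self._buffer = []


def simulate(events, buffer_size, graceful_shutdown):
    """Return the events that reached the sink before the container stopped."""
    delivered = []
    emitter = BufferedEmitter(buffer_size, delivered)
    for event in events:
        emitter.track(event)
    if graceful_shutdown:
        emitter.flush()  # a clean stop drains the buffer first
    # On an abrupt kill, the in-memory buffer is simply discarded.
    return delivered


lost = 7 - len(simulate(list(range(7)), buffer_size=3, graceful_shutdown=False))
print(lost)  # 1
```

With seven events and a buffer of three, an abrupt stop delivers six events and loses one, while a graceful shutdown delivers all seven. For a paid invoice, losing even that one event is a problem.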
Is our worry reasonable? Can we safely use Snowplow as an event hub? Is there anything wrong with our architecture? Do you have any tips?