Server-side infrastructure

Hey all!

We’re trying to figure out a good infrastructure for sending server-side events to Snowplow. Our tech team is concerned about sending events directly, since the HTTP requests are blocking and could impact the application itself if the collector has latency or availability issues. As analytics/tracking is orthogonal to the application, it makes sense to avoid this extra point of failure.

This is the infrastructure we’re currently evaluating:

  • Application: Write events to a local events.log file
  • Fluentd: Ship the events out to an SQS queue
  • Logstash: Read messages from SQS and send them to the collector, through a plugin that uses the Snowplow gem (sketched below)
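
To make the last step concrete, here is a rough sketch of what such a plugin could look like, assuming the 0.x snowplow-tracker API and a Logstash 2.x-style `receive` interface; the plugin name, config option and event fields are all placeholders:

```ruby
# Rough sketch of a Logstash output plugin wrapping the Snowplow gem
require "logstash/outputs/base"
require "logstash/namespace"
require "snowplow-tracker"

class LogStash::Outputs::Snowplow < LogStash::Outputs::Base
  config_name "snowplow"

  # Hostname of your Snowplow collector
  config :collector_host, :validate => :string, :required => true

  def register
    # A synchronous emitter is fine here: SQS already decouples this
    # process from the application, so blocking only slows the drainer
    emitter  = SnowplowTracker::Emitter.new(@collector_host)
    @tracker = SnowplowTracker::Tracker.new(emitter)
  end

  def receive(event)
    # Assumes the application writes category/action fields to events.log;
    # adjust to whatever schema your log lines actually carry
    @tracker.track_struct_event(event["category"], event["action"])
  end
end
```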

It would be nice to hear your feedback about this :slight_smile: How have you structured your applications to send data to Snowplow?

Thanks!
Bernardo

Hey @bernardosrulzon - great question. What languages/frameworks are you looking to support server-side?

@alex Mostly Ruby applications (Rails and Daemon Kit)

Hi @bernardosrulzon - did you look at the Ruby Tracker’s AsyncEmitter?
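
For reference, a minimal sketch of what that looks like, assuming the 0.x-era snowplow-tracker API (the collector host is a placeholder):

```ruby
require "snowplow-tracker"

# The AsyncEmitter sends events from a pool of background threads, so
# track_* calls return immediately instead of blocking on the collector
emitter = SnowplowTracker::AsyncEmitter.new(
  "collector.example.com",  # placeholder collector endpoint
  {
    :buffer_size => 10,     # batch this many events per request
    :on_failure  => lambda { |sent, unsent| warn "#{unsent.size} events failed" }
  }
)
tracker = SnowplowTracker::Tracker.new(emitter)

tracker.track_struct_event("checkout", "order_created")
```

The trade-off versus a queue-based design: events are buffered in-process, so anything not yet sent is lost if the process dies, whereas events.log + SQS survives application crashes.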

Are your tech team dead-set on there being a process boundary and intermediate queue between the application and your emitting code?

Thanks, @alex!

We’re open-minded on the subject - the thing is that the event stream coming from the Ruby applications might be consumed by a variety of clients. The Fluentd layer would be responsible for enabling this “parallel processing” of events.

…but Snowplow itself could be the centralized log if we switch to the real-time pipeline, and other clients could consume the events coming out of the collector’s “good” stream.
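
For illustration, a rough sketch of what such a downstream client might look like, assuming it reads the enriched good stream in Kinesis via the aws-sdk gem (the raw collector stream is Thrift-encoded, so the enriched stream is usually the friendlier one to consume); stream name, region and shard handling are simplified placeholders:

```ruby
require "aws-sdk"

kinesis = Aws::Kinesis::Client.new(region: "us-east-1")  # placeholder region

# Start reading new records from one shard of the enriched good stream
iterator = kinesis.get_shard_iterator(
  stream_name: "snowplow-enriched-good",  # placeholder stream name
  shard_id: "shardId-000000000000",
  shard_iterator_type: "LATEST"
).shard_iterator

loop do
  resp = kinesis.get_records(shard_iterator: iterator, limit: 100)
  resp.records.each do |record|
    # Enriched events are tab-separated; the first fields include
    # app_id, platform and the ETL timestamp
    fields = record.data.split("\t")
    puts fields.first(3).inspect
  end
  iterator = resp.next_shard_iterator
  sleep 1  # stay under the per-shard read throughput limits
end
```

A production consumer would use the Kinesis Client Library (or similar) to handle multiple shards, checkpointing and resharding.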

Any thoughts/best practices on that?