My company wants a user event logging system to power real-time control systems for marketplaces (content recommendations, personalization, promotions, etc.). We're evaluating Snowplow and have some requirements that I want to validate.
Server-side contexts. For each event, I want to log server-side-generated contexts about the request and the items being interacted with. I want to keep this information on the server side for two reasons: there is a lot of it, and it's sensitive. The majority of the logged data will be of this type. What's a good practice for supporting this in Snowplow? I found a generic REST API enrichment hook, but that seems inefficient for this type of logging. Do people fork Snowplow and modify the pipeline directly?
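To make the ask concrete, here is a minimal sketch of what I have in mind, written in plain Python with no Snowplow library involved. It wraps the server-side data in self-describing JSON (a `schema` URI plus a `data` object) and attaches it to an event before it is sent on. The schema URIs, the `com.example` vendor, the field names, and the `attach_server_contexts` helper are all hypothetical placeholders, not anything from Snowplow itself.

```python
import json
from typing import Any

# Hypothetical Iglu-style schema URIs; vendor and names are placeholders.
REQUEST_SCHEMA = "iglu:com.example/request_context/jsonschema/1-0-0"
ITEM_SCHEMA = "iglu:com.example/item_context/jsonschema/1-0-0"


def self_describing(schema: str, data: dict[str, Any]) -> dict[str, Any]:
    """Wrap a payload as self-describing JSON: a schema URI plus the data."""
    return {"schema": schema, "data": data}


def attach_server_contexts(
    event: dict[str, Any],
    request_info: dict[str, Any],
    items: list[dict[str, Any]],
) -> dict[str, Any]:
    """Return a copy of the event with one request context and one
    context per item attached. The input event is not modified."""
    contexts = [self_describing(REQUEST_SCHEMA, request_info)]
    contexts += [self_describing(ITEM_SCHEMA, item) for item in items]
    enriched = dict(event)
    enriched["contexts"] = contexts
    return enriched


if __name__ == "__main__":
    # Illustrative values only.
    client_event = {"event_id": "e1", "event_name": "item_click"}
    server_event = attach_server_contexts(
        client_event,
        request_info={"request_id": "r1", "ranking_model": "model-a"},
        items=[{"item_id": "i1", "position": 0}, {"item_id": "i2", "position": 1}],
    )
    print(json.dumps(server_event, indent=2))
```

The idea is that the client only sends a small event, and the server joins the bulky or sensitive contexts onto it. What I'm unsure about is where in a Snowplow pipeline a step like this should live.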
End-to-end latency. Longer term, I want to reduce the end-to-end ingestion latency as much as possible so we can use the client signals as soon as possible. What's the end-to-end latency goal for Snowplow? I see some docs that say this can get down to a few seconds (Kinesis + Flink streaming). Does this latency figure also cover enrichments and data models?