Kafka-Elasticsearch sink


#1

Hello, I am looking into how to use ES with Kibana for a real-time events dashboard, without anything Amazon-related. We deploy on-premise.
The Kafka collector and enricher are a great addition, but how can we get the events from Kafka and store them in ES?

Thanks,
Milan


#2

Hey @magaton - it’s great to hear that you are getting some good usage out of the Snowplow real-time pipeline running on-premise with Kafka!

Unfortunately we haven’t built a component to do this (yet!). But it should be possible to build something, most likely by mashing up:

with:

Let us know how you get on if you do go down this route!
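To give a rough idea of the shape of such a mashup, here is a minimal sketch of a plain Kafka consumer that indexes each record into Elasticsearch. It is not a supported component: it assumes the Elasticsearch 7.x high-level REST client, assumes the record values have already been converted to JSON (e.g. with the Analytics SDK), and the topic name `enriched-good`, index name `snowplow`, group id and localhost addresses are all placeholders to adjust:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.http.HttpHost;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;

public class KafkaToElasticsearch {

    public static void main(String[] args) throws Exception {
        // Plain Kafka consumer subscribed to the (placeholder) enriched topic
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "es-sink");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
             RestHighLevelClient es = new RestHighLevelClient(
                     RestClient.builder(new HttpHost("localhost", 9200, "http")))) {

            consumer.subscribe(Collections.singletonList("enriched-good"));

            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Assumes the record value is already a JSON document;
                    // index it into the (placeholder) "snowplow" index
                    IndexRequest request = new IndexRequest("snowplow")
                            .source(record.value(), XContentType.JSON);
                    es.index(request, RequestOptions.DEFAULT);
                }
            }
        }
    }
}
```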


#3

I understand you are publishing Thrift messages into Kafka. Do you have an example (in Java) of how they can be consumed (deserialised)?
Thanks


#4

I’m using Kinesis and can successfully deserialize enriched Thrift messages by calling the Scala Analytics SDK from Java in our test environment. It’s not the cleanest solution, but it works.
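For anyone curious, a minimal sketch of what calling the SDK from Java can look like. This assumes a recent Scala Analytics SDK (1.x+) where `Event.parse` is available; older releases exposed an `EventTransformer` class instead, so the exact calls depend on the version you are on. `Event$.MODULE$` is simply how Java reaches the Scala companion object:

```java
import com.snowplowanalytics.snowplow.analytics.scalasdk.Event;
import com.snowplowanalytics.snowplow.analytics.scalasdk.Event$;
import scala.Option;

public class EnrichedEventJson {

    // Turns one enriched TSV line into a JSON string, or null if parsing fails.
    // Event$.MODULE$ exposes the companion object's parse method to Java.
    public static String toJson(String enrichedTsvLine) {
        Option<Event> maybeEvent = Event$.MODULE$.parse(enrichedTsvLine).toOption();
        if (maybeEvent.isDefined()) {
            // toJson(true) gives the flattened ("lossy") JSON representation
            return maybeEvent.get().toJson(true).noSpaces();
        }
        return null;
    }
}
```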


#5

I wanted to implement a Kafka consumer that deserialises Thrift messages.
I had some problems defining the correct Maven dependency at first, then ended up implementing
https://github.com/magaton/spring-kafka-snowplow

It’s a Spring Boot based app that starts a Kafka consumer that reads Thrift-serialised messages from the Snowplow topic.
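The core of it is just a listener that Thrift-decodes each record into a CollectorPayload. A trimmed-down sketch of that piece (assuming the `collector-payload-1` Thrift bindings and spring-kafka; the topic and group names are placeholders, and `spring.kafka.consumer.value-deserializer` needs to be set to `ByteArrayDeserializer` so the listener receives raw bytes):

```java
import org.apache.thrift.TDeserializer;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

import com.snowplowanalytics.snowplow.CollectorPayload.thrift.model1.CollectorPayload;

@Component
public class SnowplowRawListener {

    // Topic and group id are placeholders; adjust them to your setup
    @KafkaListener(topics = "snowplow-raw", groupId = "thrift-consumer")
    public void consume(byte[] message) throws Exception {
        // The Scala Stream Collector writes CollectorPayload records serialised
        // with Thrift's binary protocol, so TDeserializer can rebuild the object
        CollectorPayload payload = new CollectorPayload();
        new TDeserializer(new TBinaryProtocol.Factory()).deserialize(payload, message);

        System.out.println(payload.getPath() + " " + payload.getBody());
    }
}
```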