Stream Enrich in a Kubernetes cluster

#1

Hi,

I was wondering if anybody has had the chance to set up multiple Stream Enrich containers that work against the same Kinesis stream?
I want to see if I can improve the throughput of the pipeline that way.

I just want to understand how the shard assignment is managed (if at all), or if I need to handle it myself (and if so - how).

I couldn’t find anything in the documentation about that.

Thanks.

#2

This definitely works with Kafka, but I have never tried it with Kinesis.

#3

Thanks @evaldas!

Did you have to do anything special for that or just add more containers?

#4

If you run Stream Enrich as a separate pod, just use the scale command to increase the pod count. From what I understand, it uses a Kafka consumer group to coordinate message consumption across the containers (each one is an enrich Kafka consumer), which avoids duplication.
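
To make the consumer group idea concrete, here is a toy model in Python of how the partitions of a topic could be split among the consumers of one group. It is only a sketch of a round-robin style split, not Kafka's actual assignor code, and the pod names (`enrich-0`, `enrich-1`, ...) are made up for illustration.

```python
def assign_round_robin(partitions, consumers):
    """Toy model: deal partitions out to the members of one consumer group.

    Every partition ends up with exactly one owner, which is why
    adding pods does not cause messages to be processed twice.
    """
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for i, partition in enumerate(sorted(partitions)):
        owner = members[i % len(members)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    # Hypothetical topic with 6 partitions, scaled from 2 to 3 pods.
    print(assign_round_robin(range(6), ["enrich-0", "enrich-1"]))
    print(assign_round_robin(range(6), ["enrich-0", "enrich-1", "enrich-2"]))
```

One consequence worth keeping in mind: a group cannot usefully have more consumers than the topic has partitions. Any extra pods would sit idle, so the partition count is the upper bound on scaling.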

#5

Hi @moshesh,

I’m running the Snowplow pipeline on AWS ECS, and Stream Enrich runs in multiple containers.

As far as I know, Stream Enrich uses the KCL (Kinesis Client Library). This library handles the shard assignment (and the re-assignment when you scale) for you, so you don't need to manage it yourself. Here are some references:
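
As a rough illustration of what that shard assignment looks like: KCL workers that share the same application name coordinate through a lease table in DynamoDB, with one lease per shard, and each worker aims to hold roughly an equal share of the leases. The Python below is a simplified toy model of that balancing, not KCL's actual algorithm. The function name `rebalance` and the worker and shard names are hypothetical.

```python
import math


def rebalance(leases, workers):
    """Toy model of lease balancing between workers.

    leases:  dict mapping shard id -> current owner (or None if unowned)
    workers: names of the workers that are currently alive
    Returns a new dict in which every shard has exactly one live owner.
    """
    workers = sorted(workers)
    target = math.ceil(len(leases) / len(workers))

    # Leases held by workers that are gone count as unowned.
    new = {s: (o if o in workers else None) for s, o in leases.items()}

    # Each live worker keeps at most `target` leases and gives up the rest.
    counts = {w: 0 for w in workers}
    for shard in sorted(new):
        owner = new[shard]
        if owner is None:
            continue
        if counts[owner] < target:
            counts[owner] += 1
        else:
            new[shard] = None

    # Unowned leases go to whichever worker currently holds the fewest.
    for shard in sorted(new):
        if new[shard] is None:
            taker = min(workers, key=lambda w: (counts[w], w))
            new[shard] = taker
            counts[taker] += 1
    return new


if __name__ == "__main__":
    leases = {f"shard-{i}": "worker-a" for i in range(4)}
    # Scaling out: a second container joins and takes over half the shards.
    print(rebalance(leases, ["worker-a", "worker-b"]))
```

The practical takeaway is the same as in the toy: with more containers than shards, the extra containers have nothing to read, so the shard count of the stream limits how far adding containers helps.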
