Elasticsearch sink - Application processRecords() threw an exception when processing shard

Hey everyone,

I’m trying to run the Snowplow streaming architecture in a super-basic AWS environment. Collector and enrich are both working fine, but when the ES sink gets a data point I get the KCL error:

[pool-1-thread-1] ERROR com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask - ShardId shardId-000000000000: Application processRecords() threw an exception when processing shard 
[pool-1-thread-1] ERROR com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask - ShardId shardId-000000000000: Skipping over the following data records: []

Having dug into the source a bit, it seems to get stuck in SnowplowElasticsearchEmitter#splitBufferRec. I’m very happy to dig further but, before I do, has anyone seen this before?

Thanks!

This is now fixed - see #3019


How did you fix this?

It’s an awkward issue with the KCL, which catches exceptions in processRecords() but prevents them from bubbling up correctly. #3019 fixes one case of this in the buffering logic for Elasticsearch, but any other exception will produce the same symptoms. If you have a JVM debugger to hand, stepping through the KCL code in com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask will show you the exact exception; otherwise it’s trial and error. This often occurs when there is an issue connecting or authenticating to Elasticsearch, so it’s worth checking that first.
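
For example (purely illustrative; swap in your own jar and config file names), starting the sink with the standard JDWP agent options lets you attach a debugger and set a breakpoint inside ProcessTask. With suspend=y the JVM waits for the debugger to attach before it starts consuming records:

java -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005 -jar snowplow-elasticsearch-sink-<version> --config <your-sink.conf>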


I have the same issue using snowplow-elasticsearch-sink-0.8.0-2x, taken from Bintray today.

Any advice on how to debug? Not even sure where to start with this one.

Thanks in advance,

Graham

Invoked via:

sudo -u snowplow java -jar -Xdebug -Dorg.slf4j.simpleLogger.defaultLogLevel=debug ./snowplow-elasticsearch-sink-0.8.0-2x --config elasticsearch-sink.conf
# java -version
java version "1.7.0_141"
OpenJDK Runtime Environment (amzn-2.6.10.1.73.amzn1-x86_64 u141-b02)
OpenJDK 64-Bit Server VM (build 24.141-b02, mixed mode)
# cat /etc/os-release
NAME="Amazon Linux AMI"
VERSION="2017.03"
ID="amzn"
ID_LIKE="rhel fedora"
VERSION_ID="2017.03"
PRETTY_NAME="Amazon Linux AMI 2017.03"
ANSI_COLOR="0;33"
CPE_NAME="cpe:/o:amazon:linux:2017.03:ga"
HOME_URL="http://aws.amazon.com/amazon-linux-ami/"

Error is as follows:

[pool-1-thread-1] ERROR com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask - ShardId shardId-000000000000: Application processRecords() threw an exception when processing shard
[pool-1-thread-1] ERROR com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask - ShardId shardId-000000000000: Skipping over the following data records: [UserRecord [subSequenceNumber=0, explicitHashKey=null, aggregated=false, getSequenceNumber()=49574320770751529443060232975955559421246532889736118274, getData()=java.nio.HeapByteBuffer[pos=0 lim=418 cap=418], getPartitionKey()=617bdac6-d41c-4412-90fa-3b28c7a6914a], UserRecord [subSequenceNumber=0, explicitHashKey=null, aggregated=false, getSequenceNumber()=49574320770751529443060232975956768347066147518910824450, getData()=java.nio.HeapByteBuffer[pos=0 lim=405 cap=405], getPartitionKey()=5fd57cf5-d825-4646-9d0d-6e640c7e8761], UserRecord [subSequenceNumber=0, explicitHashKey=null, aggregated=false, getSequenceNumber()=49574320770751529443060232975957977272885762148085530626, getData()=java.nio.HeapByteBuffer[pos=0 lim=409 cap=409], getPartitionKey()=e284d992-91d0-49ad-81d2-6ddfc8eacc0b], UserRecord [subSequenceNumber=0, explicitHashKey=null, aggregated=false, 

My fix (#3020) hasn’t been merged yet and is scheduled for an R9x release. (If I have understood the GitHub roadmap correctly - @alex correct me if I’m wrong there.)

You’ll have to build from that PR or apply the patch manually to force the exception to bubble up properly. As I say, it’s worth quadruple-checking your ES connection and auth parameters, as that is more than likely the cause.
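
If you just want the real cause to show up in the logs in the meantime, a crude manual patch is to wrap the emit path in something that logs and rethrows. A minimal sketch, assuming slf4j is already on the classpath (the names below are illustrative, not the actual #3020 diff):

import org.slf4j.LoggerFactory

// Sketch only: surface the real exception before the KCL swallows it.
object EmitLogging {
  private val log = LoggerFactory.getLogger(getClass)

  // Wrap any call (e.g. the emitter's emit/split logic) so the underlying
  // exception is logged before being rethrown with the original behaviour.
  def logAndRethrow[A](label: String)(body: => A): A =
    try body
    catch {
      case e: Exception =>
        log.error(s"$label threw an exception", e)
        throw e
    }
}

Then wrap the call site, e.g. EmitLogging.logAndRethrow("emit")(emitter.emit(buffer)); again, emitter and buffer here are placeholders for whatever your version of the sink actually calls.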

Thanks for your reply @acgray. When you say ES or auth related, do you mean that I may have specified the wrong types, or perhaps not quoted conf items correctly?

I tcpdump'd whilst running snowplow-elasticsearch-sink-0.8.0-2x and saw no attempt to connect to ES.
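For reference, I was capturing with something along the lines of:

sudo tcpdump -i any -nn 'port 9200 or port 80'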

If it makes any difference at all, I’m going to be using AWS’s Elasticsearch service on port 80 (rather than 9200). Not very secure; a bit disappointing from them, to be honest.

Hi @Adam_Gray - that’s correct. We hope to start on that release within the next 10 days.

FYI @Graham-M, you can also use the Amazon ES service over HTTPS on port 443!

@acgray @Graham-M, the 0.9.0 release of the now-renamed Elasticsearch Loader will be happening pretty soon.

It will integrate @acgray’s fix for this post’s original bug, as well as support for HTTPS and signing AWS requests.

Sweet!