Not seeing events in Kibana


#1

I’ve deployed a snowplow-mini instance, AMI: snowplow-mini-0.2.2-1490013561-hvm-ebs-amd64 (ami-0c5f6f6a), and can connect on all the ports I should be able to. I then set up the good and bad Kibana indexes and using the snowplow-mini example-events webpage running on the instance sent a few test events. Nothing appeared in Kibana on either of the indexes which lead to some nosing around on the snowplow-mini instance. I can reproduce the problem described below with a single track page view event.

I believe that the stream enrich process is dying. After sending an event the snowplow_stream_enrich_0.7.0 service becomes stopped as does the snowplow_elasticsearch_sink_good_0.5.0 service although I suspect the latter is due to an EOF on the named pipe. I’ve found the /var/log/snowplow_elasticsearch_sink_bad_0.5.0.err log which I can send over but mostly it’s just repeating the lines:

SEVERE: Record failed with message: ElasticsearchParseException[failed to parse doc to extract routing/timestamp/id]; nested: ElasticsearchParseException[Failed to derive xcontent from (offset=0, length=44): [115, 117, 100, 111, 58, 32, 117, 110, 97, 98, 108, 101, 32, 116, 111, 32, 114, 101, 115, 111, 108, 118, 101, 32, 104, 111, 115, 116, 32, 105, 112, 45, 49, 48, 45, 49, 54, 51, 45, 51, 45, 49, 51, 52]];
Jul 10, 2017 9:34:54 AM com.snowplowanalytics.snowplow.storage.kinesis.elasticsearch.ElasticsearchSender attemptEmit$1
INFO: Emitted 0 records to Elasticsearch
Jul 10, 2017 9:34:54 AM com.snowplowanalytics.snowplow.storage.kinesis.elasticsearch.ElasticsearchSender printClusterStatus
WARNING: Cluster health is YELLOW.
Jul 10, 2017 9:34:54 AM com.snowplowanalytics.snowplow.storage.kinesis.elasticsearch.ElasticsearchSender attemptEmit$1
WARNING: Returning 1 records as failed

I can see the raw events arriving using cat /home/ubuntu/snowplow/pipes/raw-events-pipe. I believe this sets up a race condition on the named pipe so only ran for an single test. The base64 decoded event looked like the snowplow test event I fired. With only the snowplow_stream_enrich_0.7.0 service running, not the good sink, I saw nothing arrive when I set up my own cat /home/ubuntu/snowplow/pipes/enriched-events-pipe and the cat ended, presumably when the service did. The bad sink is still running and no events are showing up in Kibana. I haven’t been able to find any other useful logs.

Can anyone help here?

Thanks
Gareth


#2

One of my colleagues pointed out that [115, 117, 100, 111, … was likely a byte stream which when decoded reads: sudo: unable to resolve host ip-10-163-3-134 which was something I had to fix because my sudo commands were taking a long time to complete printing the previous message along with the correct result.

I saw the same error post hosts file edit and still get the same three messages e.g.

SEVERE: Record failed with message: ElasticsearchParseException[failed to parse doc to extract routing/timestamp/id]; nested: ElasticsearchParseException[Failed to derive xcontent from (offset=0, length=95): [9, 97, 116, 32, 99, 111, 109, 46, 115, 110, 111, 119, 112, 108, 111, 119, 97, 110, 97, 108, 121, 116, 105, 99, 115, 46, 115, 110, 111, 119, 112, 108, 111, 119, 46, 101, 110, 114, 105, 99, 104, 46, 107, 105, 110, 101, 115, 105, 115, 46, 75, 105, 110, 101, 115, 105, 115, 69, 110, 114, 105, 99, 104, 65, 112, 112, 46, 109, 97, 105, 110, 40, 75, 105, 110, 101, 115, 105, 115, 69, 110, 114, 105, 99, 104, 65, 112, 112, 46, 115, 99, 97, 108, 97, 41]];
Jul 10, 2017 11:47:16 AM com.snowplowanalytics.snowplow.storage.kinesis.elasticsearch.ElasticsearchSender attemptEmit$1
INFO: Emitted 0 records to Elasticsearch
Jul 10, 2017 11:47:16 AM com.snowplowanalytics.snowplow.storage.kinesis.elasticsearch.ElasticsearchSender printClusterStatus
WARNING: Cluster health is YELLOW.
Jul 10, 2017 11:47:16 AM com.snowplowanalytics.snowplow.storage.kinesis.elasticsearch.ElasticsearchSender attemptEmit$1
WARNING: Returning 1 records as failed

which reads: ** at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp.main(KinesisEnrichApp.scala)**. It looks like there’s an uncaught exception in the stream enrichment. The full stack trace is

Exception in thread "main"
at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$$anonfun$10.apply(KinesisEnrichApp.scala:150)
at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$$anonfun$10.apply(KinesisEnrichApp.scala:150)
at scalaz.Validation$class.fold(Validation.scala:64)
at scalaz.Failure.fold(Validation.scala:330)
at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$delayedInit$body.apply(KinesisEnrichApp.scala:149)
at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
at scala.App$class.main(App.scala:71)
at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$.main(KinesisEnrichApp.scala:76)
at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp.main(KinesisEnrichApp.scala)

which I read off the bad-1-pipe and that is what is being reported in my error log.


#3

Problem solved having looked up the failing line in the Snowplow source code. I have a private not public instance so I needed to add an egress route to the external world to allow things like the IP lookup.

I am now seeing both good and bad events in Kibana. The bad events are

{“line”:"",“errors”:[{“level”:“error”,“message”:“Event contained only 1 tab-separated fields”}],“failure_tstamp”:“2017-07-10T13:16:23.197Z”}

I’m not sure if that is expected.


#4

One final point to explain how I actually got into this state as the above is not quite right. I used terraform to create the security group forgetting to add the egress rules. I just re-read the docs and they do state

For OutBound you can leave the default to allow everything out.

so it’s got nothing to do with this being a private instance and everything to do with user error.