Stream Enricher Error


#1

Hi @ll

We are deployment a RT pipeline but are stuck in the Enrich step.

At some point we think is something related to the IGLU central repo, that only have versions 1-0-0 for each enrichments because seem to be unsync with the /schemas in the github repo. We setup an IGLU repo in http://iglu.spettro.io but nothing change the outcome.

The Stream Enrich App run on a docker container and the stream-raw-events are in a Kinesis Sink

Can somebody have this error? Exist another setup resources than the wiki setup pages for the enrich?? The wiki setup page seem to be outdated (dont have references to Kafka for example)

We are receiving always this error:

Exception in thread “main” com.fasterxml.jackson.core.JsonParseException: Unexpected character (’:’ (code 58)): Expected space separating root-level values
8/12/2017 11:57:32 AM at [Source: 404: Not Found
8/12/2017 11:57:32 AM; line: 1, column: 5]
8/12/2017 11:57:32 AM at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1487)
8/12/2017 11:57:32 AM at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:518)
8/12/2017 11:57:32 AM at com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:447)
8/12/2017 11:57:32 AM at com.fasterxml.jackson.core.base.ParserMinimalBase._reportMissingRootWS(ParserMinimalBase.java:463)
8/12/2017 11:57:32 AM at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._verifyRootSpace(ReaderBasedJsonParser.java:1236)
8/12/2017 11:57:32 AM at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._parsePosNumber(ReaderBasedJsonParser.java:886)
8/12/2017 11:57:32 AM at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:680)
8/12/2017 11:57:32 AM at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:3602)
8/12/2017 11:57:32 AM at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3547)
8/12/2017 11:57:32 AM at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2578)
8/12/2017 11:57:32 AM at org.json4s.jackson.JsonMethods$class.parse(JsonMethods.scala:20)
8/12/2017 11:57:32 AM at org.json4s.jackson.JsonMethods$.parse(JsonMethods.scala:50)
8/12/2017 11:57:32 AM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$$anonfun$16.apply(KinesisEnrichApp.scala:253)
8/12/2017 11:57:32 AM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$$anonfun$16.apply(KinesisEnrichApp.scala:253)
8/12/2017 11:57:32 AM at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
8/12/2017 11:57:32 AM at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
8/12/2017 11:57:32 AM at scala.collection.immutable.List.foreach(List.scala:318)
8/12/2017 11:57:32 AM at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
8/12/2017 11:57:32 AM at scala.collection.AbstractTraversable.map(Traversable.scala:105)
8/12/2017 11:57:32 AM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$.extractEnrichmentConfig(KinesisEnrichApp.scala:253)
8/12/2017 11:57:32 AM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$delayedInit$body.apply(KinesisEnrichApp.scala:131)
8/12/2017 11:57:32 AM at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
8/12/2017 11:57:32 AM at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
8/12/2017 11:57:32 AM at scala.App$$anonfun$main$1.apply(App.scala:71)
8/12/2017 11:57:32 AM at scala.App$$anonfun$main$1.apply(App.scala:71)
8/12/2017 11:57:32 AM at scala.collection.immutable.List.foreach(List.scala:318)
8/12/2017 11:57:32 AM at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
8/12/2017 11:57:32 AM at scala.App$class.main(App.scala:71)
8/12/2017 11:57:32 AM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$.main(KinesisEnrichApp.scala:71)
8/12/2017 11:57:32 AM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp.main(KinesisEnrichApp.scala)
8/12/2017 11:57:32 AM[INFO] [08/12/2017 15:57:32.747] [snowplow-scala-tracker-akka.actor.default-dispatcher-2] [akka://snowplow-scala-tracker/user/IO-HTTP/group-0] Message [spray.can.Http$Connect] from Actor[akka://snowplow-scala-tracker/user/IO-HTTP/host-connector-0/0#-243429783] to Actor[akka://snowplow-scala-tracker/user/IO-HTTP/group-0#-25616617] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings ‘akka.log-dead-letters’ and ‘akka.log-dead-letters-during-shutdown’.
8/12/2017 11:57:33 AMException in thread “Thread-1” akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://snowplow-scala-tracker/user/IO-HTTP#2139136183]] after [1000 ms]
8/12/2017 11:57:33 AM at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:334)
8/12/2017 11:57:33 AM at akka.actor.Scheduler$$anon$7.run(Scheduler.scala:117)
8/12/2017 11:57:33 AM at scala.concurrent.Future$InternalCallbackExecutor$.scala$concurrent$Future$InternalCallbackExecutor$$unbatchedExecute(Future.scala:694)
8/12/2017 11:57:33 AM at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:691)
8/12/2017 11:57:33 AM at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(Scheduler.scala:467)
8/12/2017 11:57:33 AM at akka.actor.LightArrayRevolverScheduler$$anon$8.executeBucket$1(Scheduler.scala:419)
8/12/2017 11:57:33 AM at akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:423)
8/12/2017 11:57:33 AM at akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)
8/12/2017 11:57:33 AM at java.lang.Thread.run(Thread.java:748)

thanks to all


#2

Checking the iglucentral.com all the schemas are sync, just the mainpage are showing the 1-0-0, these maybe is the cause that with a private iglu mirror the outcome is the same.

We validate all the hocon/json config files for the enricher and the enrichments

Any thoughs?


#3

Hi @spatialy,

From the stacktrace it looks like at least one of your enrichments is malformatted.

Can you:

  • Revert to a vanilla Iglu resolver (i.e. don’t attempt any mirroring of Iglu Central)
  • Reduce down to a single configured enrichment
  • Start Stream Enrich
  • Confirm that fails and share in this thread your Iglu resolver and your full unedited configuration file for the single active enrichment

Thanks!


#4

Hi @alex thanks for your help

What I do:

  1. Revert to iglucentral.com
  2. Leave only the ip_lookup enrichment in the subdirectory
  3. Run the enricher

Below you see the new error trace:

8/12/2017 11:51:45 PMException in thread “main” java.lang.RuntimeException: NonEmptyList(error: Resolver configuration failed JSON Schema validation
8/12/2017 11:51:45 PM level: "error"
8/12/2017 11:51:45 PM, error: Verifying schema as iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-1 failed: found iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-2
8/12/2017 11:51:45 PM level: "error"
8/12/2017 11:51:45 PM)
8/12/2017 11:51:45 PM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$$anonfun$7.apply(KinesisEnrichApp.scala:137)
8/12/2017 11:51:45 PM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$$anonfun$7.apply(KinesisEnrichApp.scala:137)
8/12/2017 11:51:45 PM at scalaz.Validation$class.fold(Validation.scala:64)
8/12/2017 11:51:45 PM at scalaz.Failure.fold(Validation.scala:330)
8/12/2017 11:51:45 PM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$delayedInit$body.apply(KinesisEnrichApp.scala:136)
8/12/2017 11:51:45 PM at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
8/12/2017 11:51:45 PM at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
8/12/2017 11:51:45 PM at scala.App$$anonfun$main$1.apply(App.scala:71)
8/12/2017 11:51:45 PM at scala.App$$anonfun$main$1.apply(App.scala:71)
8/12/2017 11:51:45 PM at scala.collection.immutable.List.foreach(List.scala:318)
8/12/2017 11:51:45 PM at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
8/12/2017 11:51:45 PM at scala.App$class.main(App.scala:71)
8/12/2017 11:51:45 PM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$.main(KinesisEnrichApp.scala:71)
8/12/2017 11:51:45 PM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp.main(KinesisEnrichApp.scala)
8/12/2017 11:51:45 PM[INFO] [08/13/2017 03:51:45.615] [snowplow-scala-tracker-akka.actor.default-dispatcher-2] [akka://snowplow-scala-tracker/user/IO-HTTP/group-0] Message [spray.can.Http$Connect] from Actor[akka://snowplow-scala-tracker/user/IO-HTTP/host-connector-0/0#-750833917] to Actor[akka://snowplow-scala-tracker/user/IO-HTTP/group-0#-491676612] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings ‘akka.log-dead-letters’ and ‘akka.log-dead-letters-during-shutdown’.
8/12/2017 11:51:46 PMException in thread “Thread-1” akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://snowplow-scala-tracker/user/IO-HTTP#1067390856]] after [1000 ms]
8/12/2017 11:51:46 PM at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:334)
8/12/2017 11:51:46 PM at akka.actor.Scheduler$$anon$7.run(Scheduler.scala:117)
8/12/2017 11:51:46 PM at scala.concurrent.Future$InternalCallbackExecutor$.scala$concurrent$Future$InternalCallbackExecutor$$unbatchedExecute(Future.scala:694)
8/12/2017 11:51:46 PM at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:691)
8/12/2017 11:51:46 PM at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(Scheduler.scala:467)
8/12/2017 11:51:46 PM at akka.actor.LightArrayRevolverScheduler$$anon$8.executeBucket$1(Scheduler.scala:419)
8/12/2017 11:51:46 PM at akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:423)
8/12/2017 11:51:46 PM at akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)
8/12/2017 11:51:46 PM at java.lang.Thread.run(Thread.java:748)
8/12/2017 11:52:14 PMException in thread “main” java.lang.RuntimeException: NonEmptyList(error: Resolver configuration failed JSON Schema validation
8/12/2017 11:52:14 PM level: "error"
8/12/2017 11:52:14 PM, error: Verifying schema as iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-1 failed: found iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-2
8/12/2017 11:52:14 PM level: "error"
8/12/2017 11:52:14 PM)
8/12/2017 11:52:14 PM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$$anonfun$7.apply(KinesisEnrichApp.scala:137)
8/12/2017 11:52:14 PM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$$anonfun$7.apply(KinesisEnrichApp.scala:137)
8/12/2017 11:52:14 PM at scalaz.Validation$class.fold(Validation.scala:64)
8/12/2017 11:52:14 PM at scalaz.Failure.fold(Validation.scala:330)
8/12/2017 11:52:14 PM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$delayedInit$body.apply(KinesisEnrichApp.scala:136)
8/12/2017 11:52:14 PM at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
8/12/2017 11:52:14 PM at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
8/12/2017 11:52:14 PM at scala.App$$anonfun$main$1.apply(App.scala:71)
8/12/2017 11:52:14 PM at scala.App$$anonfun$main$1.apply(App.scala:71)
8/12/2017 11:52:14 PM at scala.collection.immutable.List.foreach(List.scala:318)
8/12/2017 11:52:14 PM at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
8/12/2017 11:52:14 PM at scala.App$class.main(App.scala:71)
8/12/2017 11:52:14 PM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp$.main(KinesisEnrichApp.scala:71)
8/12/2017 11:52:14 PM at com.snowplowanalytics.snowplow.enrich.kinesis.KinesisEnrichApp.main(KinesisEnrichApp.scala)
8/12/2017 11:52:15 PMException in thread “Thread-1” akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://snowplow-scala-tracker/user/IO-HTTP#-1725409603]] after [1000 ms]
8/12/2017 11:52:15 PM at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:334)
8/12/2017 11:52:15 PM at akka.actor.Scheduler$$anon$7.run(Scheduler.scala:117)
8/12/2017 11:52:15 PM at scala.concurrent.Future$InternalCallbackExecutor$.scala$concurrent$Future$InternalCallbackExecutor$$unbatchedExecute(Future.scala:694)
8/12/2017 11:52:15 PM at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:691)
8/12/2017 11:52:15 PM at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(Scheduler.scala:467)
8/12/2017 11:52:15 PM at akka.actor.LightArrayRevolverScheduler$$anon$8.executeBucket$1(Scheduler.scala:419)
8/12/2017 11:52:15 PM at akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:423)
8/12/2017 11:52:15 PM at akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)
8/12/2017 11:52:15 PM at java.lang.Thread.run(Thread.java:748)

Here my iglu_resolver.json file:

{
“schema”: “iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-2”,
“data”: {
“cacheSize”: 500,
“repositories”: [
{
“name”: “Iglu Central”,
“priority”: 0,
“vendorPrefixes”: [ “com.snowplowanalytics” ],
“connection”: {
“http”: {
“uri”: “http://iglucentral.com
}
}
}
]
}
}

Here my config file for the enrichment environtment:

enrich {

source = “kinesis”

sink = “kinesis”

aws {
access-key: "AKIAIRX7JV3E7…"
secret-key: “N4K5dr/s2dBzik…”
}

kafka {
brokers: “{{enrichKafkaBrokers}}”
}

streams {

in: {
  raw: "ssc-stream-good-1"
  maxRecords: 10000

  buffer: {
    byte-limit: 4500000 # 4.5 MB
    record-limit: 400 # 400 records. Not supported by Kafka; will be ignored
    time-limit: 60000 # 1 minute
  }
}

out: {
  enriched: "sse-stream-enrich-good-1"
  bad: "sse-stream-enrich-bad-1"
  backoffPolicy: {
    minBackoff: 3000 # 3 seconds
    maxBackoff: 600000 # 10 minutes
  }
}

app-name: "SnowplowKinesisEnrich-$\\{enrich.streams.in.raw\\}"
initial-position = "TRIM_HORIZON"
region: "us-east-1"

}

monitoring {
snowplow {
collector-uri: "https://track.ubiquos.io"
collector-port: 443
app-id: "enricher-test"
method: “GET”
}
}
}


#5

Hi @alex, thanks for all your help

We encounter the problems and now the enrichments are working OK. I leave my comment here if anybody in the future can run in the same problem:

  • For testing/pre-deployment purposes allways run the Enrichment App with the extended verbose logger java -jar -Dorg.slf4j.simpleLogger.defaultLogLevel=debug
  • Hardcode the app-name parameter in the config, the automatic name suggested in the vanilla config file give us an error at DynamoDB name constrains
  • use the forums :slight_smile:
  • debug … debug … debug :sweat:

Thanks to @alex again

Best to all


#6

Glad you got it fixed!