Hey @alex,
Looking back over the logs, I found two cases where the schema lookup in Iglu Central seems to have been the problem. I'll paste the logs that contain the actual error message below. I can also send you other logs from the relevant runs if that would help; just tell me which ones would be useful.
Here are the contents of logs/<bucket>/steps/<step>/stderr.gz for an EMR cluster that was created at 2016-07-16 13:24 (UTC-4):
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.twitter.scalding.Job$.apply(Job.scala:47)
at com.twitter.scalding.Tool.getJob(Tool.scala:48)
at com.twitter.scalding.Tool.run(Tool.scala:68)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at com.snowplowanalytics.snowplow.enrich.hadoop.JobRunner$.main(JobRunner.scala:33)
at com.snowplowanalytics.snowplow.enrich.hadoop.JobRunner.main(JobRunner.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: com.snowplowanalytics.snowplow.enrich.common.FatalEtlError: NonEmptyList(error: NonEmptyList(error: Could not find schema with key iglu:com.snowplowanalytics.snowplow/currency_conversion_config/jsonschema/1-0-0 in any repository, tried:
level: "error"
repositories: ["Iglu Client Embedded [embedded]","Iglu Central [HTTP]"]
, error: Unexpected exception fetching iglu:com.snowplowanalytics.snowplow/currency_conversion_config/jsonschema/1-0-0 in HTTP Iglu repository Iglu Central: java.io.IOException: Server returned HTTP response code: 500 for URL: http://iglucentral.com/schemas/com.snowplowanalytics.snowplow/currency_conversion_config/jsonschema/1-0-0
level: "error"
)
level: "error"
)
at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$2.apply(EtlJob.scala:140)
at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$2.apply(EtlJob.scala:140)
at scalaz.Validation$class.fold(Validation.scala:64)
at scalaz.Failure.fold(Validation.scala:330)
at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob.<init>(EtlJob.scala:139)
... 16 more
And here are the contents of the same file for an EMR cluster created at 2016-07-13 13:23 (UTC-4):
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.twitter.scalding.Job$.apply(Job.scala:47)
at com.twitter.scalding.Tool.getJob(Tool.scala:48)
at com.twitter.scalding.Tool.run(Tool.scala:68)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at com.snowplowanalytics.snowplow.enrich.hadoop.JobRunner$.main(JobRunner.scala:33)
at com.snowplowanalytics.snowplow.enrich.hadoop.JobRunner.main(JobRunner.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: com.snowplowanalytics.snowplow.enrich.common.FatalEtlError: NonEmptyList(error: NonEmptyList(error: Could not find schema with key iglu:com.snowplowanalytics.snowplow/enrichments/jsonschema/1-0-0 in any repository, tried:
level: "error"
repositories: ["Iglu Client Embedded [embedded]","Iglu Central [HTTP]"]
, error: Unexpected exception fetching iglu:com.snowplowanalytics.snowplow/enrichments/jsonschema/1-0-0 in HTTP Iglu repository Iglu Central: java.io.IOException: Server returned HTTP response code: 500 for URL: http://iglucentral.com/schemas/com.snowplowanalytics.snowplow/enrichments/jsonschema/1-0-0
level: "error"
)
level: "error"
)
at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$2.apply(EtlJob.scala:140)
at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$2.apply(EtlJob.scala:140)
at scalaz.Validation$class.fold(Validation.scala:64)
at scalaz.Failure.fold(Validation.scala:330)
at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob.<init>(EtlJob.scala:139)
... 16 more
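In case it helps with reproducing this, here is a rough Python sketch that polls the two schema URLs from the traces above and records the HTTP status codes. The base URL and schema keys are copied from the logs. The function names (`schema_url`, `http_status`, `check_schemas`), the retry count, and the delay are my own placeholders; none of this is part of Snowplow or the Iglu client.

```python
import time
import urllib.error
import urllib.request

# Base URL and schema keys are copied from the two stack traces above.
IGLU_CENTRAL = "http://iglucentral.com/schemas"
SCHEMA_KEYS = [
    "iglu:com.snowplowanalytics.snowplow/currency_conversion_config/jsonschema/1-0-0",
    "iglu:com.snowplowanalytics.snowplow/enrichments/jsonschema/1-0-0",
]


def schema_url(key, base=IGLU_CENTRAL):
    """Turn an iglu: schema key into the URL form seen in the logs."""
    prefix = "iglu:"
    if not key.startswith(prefix):
        raise ValueError("not an Iglu schema key: " + key)
    path = key[len(prefix):]
    return base + "/" + path


def http_status(url, timeout=10):
    """Return the HTTP status code for a GET on url (hypothetical helper).

    Connection-level failures (urllib.error.URLError) are not caught here.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. the 500 reported in the traces


def check_schemas(keys, fetch=http_status, attempts=3, delay=5.0, sleep=time.sleep):
    """Poll each schema URL up to `attempts` times.

    Returns {key: [status, ...]}; polling for a key stops at the first 200.
    """
    results = {}
    for key in keys:
        statuses = []
        for attempt in range(attempts):
            status = fetch(schema_url(key))
            statuses.append(status)
            if status == 200:
                break
            if attempt < attempts - 1:
                sleep(delay)  # wait before the next try
        results[key] = statuses
    return results
```

Calling `check_schemas(SCHEMA_KEYS)` around the time a job starts would show whether the 500s are intermittent (a list such as `[500, 200]`) or persistent. The `fetch` and `sleep` parameters are only there so the loop can be exercised without touching the network.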
Let me know if there is any more information I can provide that would be useful!