Troubleshooting EmrEtlRunner


#1

Hi Team,

Today my job failed, and I am trying to find the cause by following the troubleshooting guide here: https://github.com/snowplow/snowplow/wiki/Troubleshooting-jobs-on-Elastic-MapReduce. I am working through the section "Checking the Hadoop logs for errors". Step 3 says to open the task_attempts sub-folder, but I do not have one in my bucket. I only have these folders:
containers, em, hadoop-mapreduce, node & steps.
Can you please help me find the cause of my job failure? I appreciate your help.

Regards!
Deepak Bhatt
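An aside for anyone else hitting the same mismatch: on EMR release 4.x the per-task logs sit gzipped under the containers/ prefix rather than task_attempts/. After syncing that prefix to a local directory (e.g. with aws s3 sync), a minimal Python sketch (paths and helper name are hypothetical, not part of any Snowplow tooling) to locate the failing container logs:

```python
import gzip
from pathlib import Path

def find_oom_logs(root: str) -> list[str]:
    """Return paths of gzipped EMR container logs that mention OutOfMemoryError.

    EMR 4.x writes per-container logs (syslog, stderr, ...) gzipped under
    containers/<application_id>/<container_id>/ in the log bucket.
    """
    hits = []
    for path in sorted(Path(root).rglob("*.gz")):
        with gzip.open(path, "rt", errors="replace") as f:
            if any("OutOfMemoryError" in line for line in f):
                hits.append(str(path))
    return hits
```

Run it against the locally synced containers/ directory; the matching syslog files contain the full stack trace.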


#2

Hi Team,

I have found the error in the EMR logs. The error says:

2016-12-05 07:13:12,380 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOfRange(Arrays.java:2694)
	at java.lang.String.<init>(String.java:203)
	at java.lang.StringBuilder.toString(StringBuilder.java:405)
	at com.fasterxml.jackson.core.util.TextBuffer.contentsAsString(TextBuffer.java:349)
	at com.fasterxml.jackson.core.io.SegmentedStringWriter.getAndClear(SegmentedStringWriter.java:83)
	at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2344)
	at org.json4s.jackson.JsonMethods$class.compact(JsonMethods.scala:34)
	at org.json4s.jackson.JsonMethods$.compact(JsonMethods.scala:50)
	at com.snowplowanalytics.snowplow.enrich.common.outputs.BadRow.toCompactJson(BadRow.scala:86)
	at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$13$$anonfun$apply$1.apply(EtlJob.scala:189)
	at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$13$$anonfun$apply$1.apply(EtlJob.scala:188)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
	at scala.collection.immutable.List.foreach(List.scala:318)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
	at scala.collection.AbstractTraversable.map(Traversable.scala:105)
	at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$13.apply(EtlJob.scala:188)
	at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$13.apply(EtlJob.scala:182)
	at com.twitter.scalding.FlatMapFunction.operate(Operations.scala:46)
	at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:99)
	at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:39)
	at cascading.flow.stream.FunctionEachStage$1.collect(FunctionEachStage.java:80)
	at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
	at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:133)
	at com.twitter.scalding.MapFunction.operate(Operations.scala:59)
	at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:99)
	at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:39)
	at cascading.flow.stream.SourceStage.map(SourceStage.java:102)
	at cascading.flow.stream.SourceStage.run(SourceStage.java:58)
	at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:130)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)

Can you please help me with how I can increase the Java heap memory?

Thanks!
Deepak


#3

Hi @deepak,

What size master/task/core nodes are you running? If you’re seeing Java OOM errors, it may simply be a case of moving the task nodes to instances with more RAM.


#4

Hi @mike

I am using two m3.xlarge core instances and an m1.medium master instance.

Thanks!
Deepak


#5

Hi Team

I have tried increasing the Java heap space using a bootstrap action, but it failed.
EMR AMI version: 4.3.0
hadoop_enrich: 1.6.0
hadoop_shred: 0.8.0

The script I am using:
#!/bin/sh
/usr/share/aws/emr/scripts/configure-hadoop -m mapred.child.java.opts=-Xmx6G

and the error says: “/mnt/var/lib/bootstrap-actions/1/emr-customize-bootsize_unix.sh: line 2: /usr/share/aws/emr/scripts/configure-hadoop: No such file or directory”.
So I think this version of the AMI (4.3.0) does not have the configure-hadoop script. Can you please help me with this?
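For reference, that diagnosis matches how EMR changed between AMI generations: release 4.x removed the configure-hadoop bootstrap scripts, and Hadoop settings are instead supplied as configuration classifications when the cluster is launched (e.g. via aws emr create-cluster --configurations). A sketch of what the equivalent heap setting might look like; the values below are illustrative placeholders, not tuned recommendations:

```json
[
  {
    "Classification": "mapred-site",
    "Properties": {
      "mapreduce.map.java.opts": "-Xmx2048m",
      "mapreduce.reduce.java.opts": "-Xmx2048m",
      "mapreduce.map.memory.mb": "2560",
      "mapreduce.reduce.memory.mb": "2560"
    }
  }
]
```

Note that on Hadoop 2 (which EMR 4.x runs) the old mapred.child.java.opts property is deprecated in favour of the separate mapreduce.map.java.opts / mapreduce.reduce.java.opts settings, with the corresponding YARN container sizes (memory.mb) set somewhat above the -Xmx value.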

Thanks!
Deepak Bhatt


#6

Could you post the graph/metrics for the “Memory Allocated MB” metric for the EMR cluster? You can find this under Monitoring => IO.


#7

Hi @mike

Please find below the “Memory Allocated MB” metric for the EMR cluster:


#8

Hi @alex,

I need help: the job is unable to process these logs and fails with a Java out-of-memory error:

2016-12-06 12:54:29,122 ERROR [main] cascading.flow.stream.TrapHandler: caught OutOfMemoryException, will not trap, rethrowing
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:2694)
at java.lang.String.&lt;init&gt;(String.java:203)
at java.lang.StringBuilder.toString(StringBuilder.java:405)
at com.fasterxml.jackson.core.util.TextBuffer.contentsAsString(TextBuffer.java:349)
at com.fasterxml.jackson.core.io.SegmentedStringWriter.getAndClear(SegmentedStringWriter.java:83)
at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2344)
at org.json4s.jackson.JsonMethods$class.compact(JsonMethods.scala:34)
at org.json4s.jackson.JsonMethods$.compact(JsonMethods.scala:50)
at com.snowplowanalytics.snowplow.enrich.common.outputs.BadRow.toCompactJson(BadRow.scala:86)
at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$13$$anonfun$apply$1.apply(EtlJob.scala:189)
at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$13$$anonfun$apply$1.apply(EtlJob.scala:188)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$13.apply(EtlJob.scala:188)
at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$13.apply(EtlJob.scala:182)
at com.twitter.scalding.FlatMapFunction.operate(Operations.scala:46)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:99)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:39)
at cascading.flow.stream.FunctionEachStage$1.collect(FunctionEachStage.java:80)
at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:133)
at com.twitter.scalding.MapFunction.operate(Operations.scala:59)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:99)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:39)
at cascading.flow.stream.SourceStage.map(SourceStage.java:102)
at cascading.flow.stream.SourceStage.run(SourceStage.java:58)
at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:130)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
2016-12-06 12:54:29,131 ERROR [main] cascading.flow.stream.TrapHandler: caught OutOfMemoryException, will not trap, rethrowing
java.lang.OutOfMemoryError: Java heap space
	[same stack trace as above, repeated verbatim]
2016-12-06 12:54:29,136 INFO [main] com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream: abort closed:false s3://udmd-a-storage/udmd-a-enriched/enrich/bad/run=2016-12-06-11-06-19/part-00005
2016-12-06 12:54:29,138 INFO [s3n-worker-2] com.amazonaws.latency: Exception=[com.amazonaws.AbortedException: ], ServiceName=[Amazon S3], ServiceEndpoint=[https://udmd-a-storage.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=1, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[15325.284], HttpRequestTime=[15323.544], RequestSigningTime=[0.617], CredentialsRequestTime=[0.006], HttpClientSendRequestTime=[15186.65],
2016-12-06 12:54:29,597 INFO [s3n-worker-2] com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream: uploadPart error com.amazonaws.AbortedException:
2016-12-06 12:54:29,695 INFO [main] com.amazonaws.latency: StatusCode=[204], ServiceName=[Amazon S3], AWSRequestID=[D711B721672C29BC], ServiceEndpoint=[https://udmd-a-storage.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=2, ClientExecuteTime=[557.278], HttpRequestTime=[549.627], HttpClientReceiveResponseTime=[26.882], RequestSigningTime=[6.532], CredentialsRequestTime=[0.005], ResponseProcessingTime=[0.015], HttpClientSendRequestTime=[0.34],
2016-12-06 12:54:29,695 WARN [main] org.apache.hadoop.hdfs.DFSClient: DFSInputStream has been closed already
2016-12-06 12:54:29,700 INFO [main] com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream: close closed:true s3://udmd-a-storage/udmd-a-enriched/enrich/bad/run=2016-12-06-11-06-19/part-00005
2016-12-06 12:54:29,701 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:2694)
at java.lang.String.&lt;init&gt;(String.java:203)
at java.lang.StringBuilder.toString(StringBuilder.java:405)
at com.fasterxml.jackson.core.util.TextBuffer.contentsAsString(TextBuffer.java:349)
at com.fasterxml.jackson.core.io.SegmentedStringWriter.getAndClear(SegmentedStringWriter.java:83)
at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2344)
at org.json4s.jackson.JsonMethods$class.compact(JsonMethods.scala:34)
at org.json4s.jackson.JsonMethods$.compact(JsonMethods.scala:50)
at com.snowplowanalytics.snowplow.enrich.common.outputs.BadRow.toCompactJson(BadRow.scala:86)
at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$13$$anonfun$apply$1.apply(EtlJob.scala:189)
at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$13$$anonfun$apply$1.apply(EtlJob.scala:188)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$13.apply(EtlJob.scala:188)
at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob$$anonfun$13.apply(EtlJob.scala:182)
at com.twitter.scalding.FlatMapFunction.operate(Operations.scala:46)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:99)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:39)
at cascading.flow.stream.FunctionEachStage$1.collect(FunctionEachStage.java:80)
at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:133)
at com.twitter.scalding.MapFunction.operate(Operations.scala:59)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:99)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:39)
at cascading.flow.stream.SourceStage.map(SourceStage.java:102)
at cascading.flow.stream.SourceStage.run(SourceStage.java:58)
at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:130)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)

Please tell me how we can increase the Java heap space in the cluster. There is a script in the Google group for increasing the heap space, but it does not work with AMI version 4.3.0. Need help.

Thanks!
DB