Snowflake transformer failing with Futures Timed Out message

Our Snowflake transformer has started failing with the following error in the cluster log:

```
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/mnt/yarn/usercache/hadoop/filecache/10/__spark_libs__6496353209541393289.zip/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
20/07/07 14:40:50 INFO SignalUtils: Registered signal handler for TERM
20/07/07 14:40:50 INFO SignalUtils: Registered signal handler for HUP
20/07/07 14:40:50 INFO SignalUtils: Registered signal handler for INT
20/07/07 14:40:51 INFO ApplicationMaster: Preparing Local resources
20/07/07 14:40:53 INFO ApplicationMaster: ApplicationAttemptId: appattempt_1594132618622_0001_000001
20/07/07 14:40:53 INFO SecurityManager: Changing view acls to: yarn,hadoop
20/07/07 14:40:53 INFO SecurityManager: Changing modify acls to: yarn,hadoop
20/07/07 14:40:53 INFO SecurityManager: Changing view acls groups to: 
20/07/07 14:40:53 INFO SecurityManager: Changing modify acls groups to: 
20/07/07 14:40:53 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(yarn, hadoop); groups with view permissions: Set(); users  with modify permissions: Set(yarn, hadoop); groups with modify permissions: Set()
20/07/07 14:40:53 INFO ApplicationMaster: Starting the user application in a separate Thread
20/07/07 14:40:53 INFO ApplicationMaster: Waiting for spark context initialization...
20/07/07 14:40:55 WARN JsonMetaSchema: Unknown keyword exclusiveMinimum - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
20/07/07 14:42:33 ERROR ApplicationMaster: Uncaught exception: 
java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds]
	at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
	at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
	at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:401)
	at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:254)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:764)
	at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
	at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
	at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
	at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:762)
	at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
20/07/07 14:42:33 INFO ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds])
20/07/07 14:42:33 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: Uncaught exception: java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds])
20/07/07 14:42:33 INFO ApplicationMaster: Deleting staging directory hdfs://ip-10-80-20-232.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1594132618622_0001
20/07/07 14:42:33 INFO ShutdownHookManager: Shutdown hook called
```

I’ve tried increasing the instance size to m2.xlarge (it was previously a medium) to no avail.

The only change we’ve made recently is altering the path name of the streaming enrich output folders from YYYY-MM-DD-HH to YYYY-MM-DD-HH-mm, to work around a problem where the loader missed data when it was loading incomplete buckets. However, the Snowflake loader has been running successfully for a couple of days since that change. It’s a fairly low volume of events overall (fewer than 5k per day).
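For clarity, the folder-naming change just adds minute-level granularity to the partition names. A minimal sketch of the two schemes (the variable names here are illustrative, not actual transformer config keys):

```python
from datetime import datetime, timezone

# Example event batch timestamp
ts = datetime(2020, 7, 7, 14, 40, tzinfo=timezone.utc)

# Previous scheme: one output folder per hour
hourly_folder = ts.strftime("%Y-%m-%d-%H")

# New scheme: one output folder per minute, so a folder is
# "complete" much sooner and the loader is less likely to
# pick up a partially written bucket
minutely_folder = ts.strftime("%Y-%m-%d-%H-%M")

print(hourly_folder)    # 2020-07-07-14
print(minutely_folder)  # 2020-07-07-14-40
```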

Advice appreciated! :slight_smile:

Hi @iain! What’s the version of the Loader? A slightly similar bug existed in very old versions, where the DynamoDB client would try to fetch the manifest for too long and block the ApplicationMaster from initialising.
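For context, the `Futures timed out after [100000 milliseconds]` in `ApplicationMaster.runDriver` matches Spark’s default `spark.yarn.am.waitTime` of 100s, which is how long the YARN ApplicationMaster waits for the SparkContext to initialise. A possible stopgap while debugging (not a fix for the underlying blocking), assuming the job is submitted via `spark-submit` on EMR/YARN, is to raise that timeout:

```shell
# Give the ApplicationMaster longer to wait for SparkContext
# initialisation before giving up (default is 100s).
# Jar path and main class below are placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.am.waitTime=300s \
  --class com.example.Main \
  /path/to/transformer.jar
```

If the driver is genuinely blocked (as in the manifest-fetching bug described above), raising the timeout only delays the failure rather than preventing it.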

Hi @anton, it’s version 0.6.0.

Can you try out Transformer 0.4.2 (the Loader can remain at 0.6.0)? It’s the version we use widely inside Snowplow unless we need bad rows.

Bad rows support is the only thing we’ve added to the Transformer since 0.4.2, but the Loader itself has had a lot of changes since then. I’ll check later whether I re-introduced the bug I mentioned above, because this looks suspiciously similar.


Version 0.4.2 of the transformer is working - thank you!

Do you want me to open an issue on GitHub for the 0.6.0 problem?

Hi @iain! Just a belated heads-up that we’ve fixed the bug in the latest release, 0.8.0.