Snowplow-event-recovery-spark error "Attempt to decode value on failed cursor"

Hi,

We are trying to recover some events that failed during the enrichment step because the schema wasn’t published in our Iglu schema repository. We are following the doc and using snowplow-event-recovery-spark-0.3.1.jar.

When the job finishes we get an (almost) empty recovery output on S3, and all the events end up in the failed output:

$ aws s3 ls s3://{THE-BUCKET}/recovery_output_2021-12-02-23/ --recursive --human-readable
2022-01-07 14:57:03    0 Bytes recovery_output_2021-12-02-23/_SUCCESS
2022-01-07 14:57:03   42 Bytes recovery_output_2021-12-02-23/part-r-00000.lzo

$ aws s3 ls s3://{THE-BUCKET}/recovery_failed_output_2021-12-02-23/ --recursive --human-readable
2022-01-07 15:09:59    0 Bytes recovery_failed_output_2021-12-02-23/com.snowplowanalytics.snowplow.badrows.recovery_error/_SUCCESS
2022-01-07 15:03:12    6.0 GiB recovery_failed_output_2021-12-02-23/com.snowplowanalytics.snowplow.badrows.recovery_error/part-00000-535cdfba-8591-4581-b91e-a2a10ffb8390-c000

All the failed events on the S3 path s3://{THE-BUCKET}/recovery_failed_output_2021-12-02-23/ contain the error:

{
  "schema": "iglu:com.snowplowanalytics.snowplow.badrows/collector_payload_format_violation/jsonschema/1-0-0",
  "data": {
    "processor": {
      "artifact": "snowplow-event-recovery",
      "version": "0.2.0"
    },
    "failure": {
      "timestamp": "2022-01-06T15:59:56.293Z",
      "loader": "",
      "message": {
        "error": "Attempt to decode value on failed cursor: DownField(error),DownField(failure),DownField(data)"
      }
    },
    "payload": "{\"schema\":\"iglu:com.snowplowanalytics.snowplow.badrows/schema_violations/jsonschema/2-0-0\",\"data\":{\"payload\":{\"enriched\":{\"mkt_network\":null,\"tr_total\":null,\"br_name\":null,\"doc_charset\":\"UTF-8\",\"br_features_director\":null,\"page_urlpath\":null,\"br_features_quicktime\":null,\"tr_total_base\":null,\"mkt_term\":null,\"mkt_source\":null,\"ti_price\":null,\"tr_tax\":null,\"br_renderengine\":null,\"refr_urlhost\":null,\"v_tracker\":\"js-3.1.6\",\"mkt_clickid\":null,\"page_urlscheme\":null,\"mkt_campaign\":null,\"doc_height\":5199,\"geo_timezone\":null,\"app_id\":\"consumer-web\",\"ip_domain\":null,\"mkt_medium\":null,\"geo_longitude\":null,\"br_features_java\":null,\"refr_urlscheme\":null,\"user_id\":null,\"geo_region_name\":null,\"page_referrer\":\"https://www.google.com/\",\"os_timezone\":null,\"refr_source\":null,\"geo_region\":null,\"dvce_ismobile\":null,\"page_urlquery\":null,\"br_cookies\":1,\"useragent\":\"Mozilla/5.0 (Linux; Android 11; SAMSUNG SM-N986U) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/16.0 Chrome/92.0.4515.16
....

ERROR Attempt to decode value on failed cursor: DownField(error),DownField(failure),DownField(data)

Spark Submit command executed inside the AWS EMR cluster:

spark-submit   \
--deploy-mode cluster   \
--master yarn   \
snowplow-event-recovery-spark-0.3.1.jar   \
--input s3://{THE-BUCKET}/enriched/bad/date_at=2021-12-02/hour=23/   \
--failedOutput s3://{THE-BUCKET}/recovery_failed_output_2021-12-02-23/   \
--unrecoverableOutput s3://{THE-BUCKET}/recovery_unrecoverable_output_2021-12-02-23/   \
--directoryOutput  s3://{THE-BUCKET}/recovery_output_2021-12-02-23/   \
--region eu-west-1   \
--resolver "eyJzY2hlbWEiOiJpZ2x1OmN..........................="   \
--config "eyAic2NoZW1hIjogImlnbHU6Y29tLnNub3dwbG93YW5hbHl0aWNzLnNub3dwbG93L3JlY292ZXJpZXMvanNvbnNjaGVtYS8zLTAtMCIsICJkYXRhIjogeyAiaWdsdTpjb20uc25vd3Bsb3dhbmFseXRpY3Muc25vd3Bsb3cuYmFkcm93cy9zY2hlbWFfdmlvbGF0aW9ucy9qc29uc2NoZW1hLzItMC0wIjogW3sibmFtZSI6ICJwYXNzdGhyb3VnaCIsICJjb25kaXRpb25zIjogW10sICJzdGVwcyI6IFtdfV19fQ=="

Configuration based on the doc:

{ "schema": "iglu:com.snowplowanalytics.snowplow/recoveries/jsonschema/3-0-0", "data": { "iglu:com.snowplowanalytics.snowplow.badrows/schema_violations/jsonschema/2-0-0": [{"name": "passthrough", "conditions": [], "steps": []}]}}

Bad Event Example:

{
    "schema": "iglu:com.snowplowanalytics.snowplow.badrows/schema_violations/jsonschema/2-0-0",
    "data": {
      "payload": {
        "enriched": {
          "mkt_network": null,
     .......
          "true_tstamp": null
        },
        "raw": {
          "headers": [
            "Timeout-Access: <function1>",
            "X-Forwarded-For: 2.234.152.192, 64.252.144.71",
            "X-Forwarded-Proto: https",
            "X-Forwarded-Port: 443",
   .....
            "application/json"
          ],
          "ipAddress": "fasf",
          "useragent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.55 Safari/537.36 Edg/96.0.1054.34",
          "encoding": "UTF-8",
          "version": "tp2",
          "userId": "1d5e2027-6d33-450e-90c1-011b099f8d2b",
          "refererUri": "https://onefootball.com/",
          "hostname": "thehost.onefootball.com",
          "loaderName": "ssc-2.3.1-kinesis",
          "vendor": "com.snowplowanalytics.snowplow",
          "parameters": ".....",
          "contentType": "application/json",
          "timestamp": "2021-12-01T20:21:03.743Z"
        }
      },
      "failure": {
        "messages": [
          {
            "error": {
              "lookupHistory": [
                {
                  "lastAttempt": "2021-12-01T20:14:00.159Z",
                  "repository": "Iglu Central",
                  "errors": [
                    {
                      "error": "NotFound"
                    }
                  ],
                  "attempts": 52
                },
                {
                  "lastAttempt": "2021-12-01T20:14:00.253Z",
                  "repository": "Iglu Central - GCP Mirror",
                  "errors": [
                    {
                      "error": "NotFound"
                    }
                  ],
                  "attempts": 52
                },
                {
                  "lastAttempt": "2021-12-01T09:00:35.386Z",
                  "repository": "Iglu Client Embedded",
                  "errors": [
                    {
                      "error": "NotFound"
                    }
                  ],
                  "attempts": 1
                },
                {
                  "lastAttempt": "2021-12-01T20:13:43.815Z",
                  "repository": "Production Repo",
                  "errors": [
                    {
                      "error": "NotFound"
                    }
                  ],
                  "attempts": 52
                }
              ],
              "error": "ResolutionError"
            },
            "schemaKey": "iglu:com.onefootball/consumer_web_stream_context/jsonschema/1-0-2"
          }
        ],
        "timestamp": "2021-12-01T20:21:06.066019Z"
      },
      "processor": {
        "artifact": "stream-enrich",
        "version": "1.4.2"
      }
    }
}

I already searched for this error in other threads but could not find anything related to Attempt to decode value on failed cursor: DownField(error),DownField(failure),DownField(data).

One important thing to mention: those bad events were retrieved from our Elasticsearch bad index, because we hadn’t created an s3_loader for the enrich_bad Kinesis stream at that time. Basically, we wrote a Python script to retrieve the bad events from the Elasticsearch bad index and save them to the S3 path enriched/bad.
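Roughly something like this (a simplified sketch, not the exact script; the host, index, bucket and key names are placeholders, and it assumes the elasticsearch and boto3 packages):

import json

import boto3
from elasticsearch import Elasticsearch, helpers

ES_HOST = "http://elasticsearch.internal:9200"  # placeholder
BAD_INDEX = "enrich-bad"                        # placeholder
BUCKET = "THE-BUCKET"                           # placeholder
KEY = "enriched/bad/date_at=2021-12-02/hour=23/bad-rows.json"

es = Elasticsearch(ES_HOST)

# Scroll through the whole bad index and keep each document as one JSON line.
lines = [json.dumps(hit["_source"])
         for hit in helpers.scan(es, index=BAD_INDEX, query={"query": {"match_all": {}}})]

# Write the bad rows as newline-delimited JSON to the recovery input path.
boto3.client("s3").put_object(Bucket=BUCKET, Key=KEY, Body="\n".join(lines).encode("utf-8"))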

Hi @rodrigodelmonte,

Welcome to the Snowplow community!

This error comes from circe, which is used under the hood by event recovery to decode the JSON (the bad rows). The error means that at some point the decoder was expecting to read data.failure.error (circe lists the cursor operations most-recent-first) but the field was not there.

That might be the reason: in the Elasticsearch loader we sometimes rename fields, to avoid having union types across bad rows (we’re working on updating the bad row schemas to fix that).

I think the first thing to do is to make sure that the bad rows you get from Elasticsearch are valid. Would you be able to use the code snippet here, with the Iglu resolver that holds your schemas, and check?
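If running the Scala snippet is not convenient, a rough Python equivalent would be to validate each exported row against its declared bad row schema. A minimal sketch, assuming the requests and jsonschema packages, that the schemas are publicly available on Iglu Central, and with bad-rows.json as a placeholder file name (one bad row per line):

import json

import requests
from jsonschema import Draft4Validator

IGLU_CENTRAL = "http://iglucentral.com/schemas"

def fetch_schema(iglu_uri):
    # "iglu:com.snowplowanalytics.snowplow.badrows/schema_violations/jsonschema/2-0-0"
    # -> "http://iglucentral.com/schemas/com.snowplowanalytics.snowplow.badrows/schema_violations/jsonschema/2-0-0"
    return requests.get(f"{IGLU_CENTRAL}/{iglu_uri[len('iglu:'):]}").json()

schemas = {}  # cache of already-fetched schemas

with open("bad-rows.json") as f:
    for n, line in enumerate(f, start=1):
        row = json.loads(line)
        schema = schemas.setdefault(row["schema"], fetch_schema(row["schema"]))
        for err in Draft4Validator(schema).iter_errors(row["data"]):
            print(f"line {n}: {list(err.path)}: {err.message}")

Rows that print validation errors are good candidates for what the recovery job is failing to decode.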


Hi @BenB,

Thanks for your response! It put me on the right track!

I created the s3_loader for the bad stream and forced the same error we had before. Then I compared a sample bad record from S3 with the bad record retrieved from the Elasticsearch enrich_bad index.

The problem was that Elasticsearch was returning the field ['data']['payload']['raw']['parameters'] as a string; checking my sample bad record, I realized it is supposed to be a list.
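The fix was essentially re-parsing that field before feeding the rows back to the recovery job, something like this minimal sketch (assuming the stringified value is itself valid JSON; the file names are placeholders):

import json

fixed = []
with open("bad-rows.json") as f:  # bad rows exported from Elasticsearch, one per line
    for line in f:
        row = json.loads(line)
        raw = row["data"]["payload"]["raw"]
        # Elasticsearch returned the parameters as a JSON string; turn it back into a list.
        if isinstance(raw.get("parameters"), str):
            raw["parameters"] = json.loads(raw["parameters"])
        fixed.append(json.dumps(row))

with open("bad-rows-fixed.json", "w") as f:
    f.write("\n".join(fixed))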

After I fixed the data type, I ran the Spark job again and it managed to save the LZO file on S3!

Spark submit command:

spark-submit \
  --deploy-mode cluster \
  --master yarn \
  snowplow-event-recovery-spark-0.3.1.jar \
  --input s3://${THE-BUCKET}/enriched/bad/date_at=2021-12-07/hour=16/enriched-bad-consumer-web-2021-12-07-16.gz \
  --failedOutput s3://${THE-BUCKET}/recovery-failed/ \
  --unrecoverableOutput s3://${THE-BUCKET}/recovery-unrecoverable/ \
  --directoryOutput s3://${THE-BUCKET}/recovery/ \
  --region eu-west-1 \
  --resolver "eyJzY2hlbWEiOiJpZ2x1OmNvbS5zbm93cGxvd2FuYWx5dGljcy5pZ2x1L3Jlc29sdmVyLWNvbmZpZy9qc29uc2NoZW1hLzEtMC0yIiwiZGF0YSI6eyJjYWNoZVNpemUiOjEwMDAsInJlcG9zaXRvcmllcyI6W3sibmFtZSI6IklnbHUgQ2VudHJhbCIsInByaW9yaXR5IjoxLCJ2ZW5kb3JQcmVmaXhlcyI6WyJjb20uc25vd3Bsb3dhbmFseXRpY3MiXSwiY29ubmVjdGlvbiI6eyJodHRwIjp7InVyaSI6Imh0dHA6Ly9pZ2x1Y2VudHJhbC5jb20ifX19LHsibmFtZSI6IklnbHUgQ2VudHJhbCAtIEdDUCBNaXJyb3IiLCJwcmlvcml0eSI6MiwidmVuZG9yUHJlZml4ZXMiOlsiY29t......=" \
  --config "eyAic2NoZW1hIjogImlnbHU6Y29tLnNub3dwbG93YW5hbHl0aWNzLnNub3dwbG93L3JlY292ZXJpZXMvanNvbnNjaGVtYS8zLTAtMCIsICJkYXRhIjogeyAiaWdsdTpjb20uc25vd3Bsb3dhbmFseXRpY3Muc25vd3Bsb3cuYmFkcm93cy9zY2hlbWFfdmlvbGF0aW9ucy9...."

Now I’m struggling to send the recovered events to the Kinesis stream using the --output argument.

Spark submit command:

spark-submit \
  --deploy-mode cluster \
  --master yarn \
  snowplow-event-recovery-spark-0.3.1.jar \
  --input s3://${THE-BUCKET}/enriched/bad/date_at=2021-12-07/hour=16/enriched-bad-consumer-web-2021-12-07-16.gz \
  --failedOutput s3://${THE-BUCKET}/recovery-failed/ \
  --unrecoverableOutput s3://${THE-BUCKET}/recovery-unrecoverable/ \
  --output collector_good_production \
  --directoryOutput s3://${THE-BUCKET}/recovery/ \
  --region eu-west-1 \
  --resolver "eyJzY2hlbWEiOiJpZ2x1OmNvbS5zbm93cGxvd2FuYWx5dGljcy5pZ2x1L3Jlc29sdmVyLWNvbmZpZy9qc29uc2NoZW1hLzEtMC0yIiwiZGF0YSI6eyJjYWNoZVNpemUiOjEwMDAsInJlcG9zaXRvcmllcyI6W3sibmFtZSI6IklnbHUgQ2VudHJhbCIsInByaW9yaXR5IjoxLCJ2ZW5kb3JQcmVmaXhlcyI6WyJjb20uc25vd3Bsb3dhbmFseXRpY3MiXSwiY29ubmVjdGlvbiI6eyJodHRwIjp7InVyaSI6Imh0dHA6Ly9pZ2x1Y2VudHJhbC5jb20ifX19LHsibmFtZSI6IklnbHUgQ2VudHJhbCAtIEdDUCBNaXJyb3IiLCJwcmlvcml0eSI6MiwidmVuZG9yUHJlZml4ZXMiOlsiY29tLnNub3dwbG93YW5hbHl0aWNzIl0sImNvbm5lY3Rpb24iOnsiaHR0cCI6eyJ1cmkiOiJodHRwOi8vbWlycm9yMDEuaWdsdWNlbnRyYWwuY29tIn19fSx7Im5hbWUiOiJQcm9kdWN0aW9uIFJlcG8iLCJwcmlvcml0eSI6MC,,,,0=" \
  --config "eyAic2NoZW1hIjogImlnbHU6Y29tLnNub3dwbG93YW5hbHl0aWNzLnNub3dwbG93L3JlY292ZXJpZXMvanNvbnNjaGVtYS8zLTAtMCIsICJkYXRhIjogeyAiaWds....=="

The Spark job is failing, and the error messages are not helping me much:

22/01/14 10:53:56 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/01/14 10:53:57 INFO RMProxy: Connecting to ResourceManager at ip-10-11-1-206.eu-west-1.compute.internal/10.11.1.206:8032
22/01/14 10:53:57 INFO Client: Requesting a new application from cluster with 2 NodeManagers
22/01/14 10:53:57 INFO Configuration: resource-types.xml not found
22/01/14 10:53:57 INFO ResourceUtils: Unable to find 'resource-types.xml'.
22/01/14 10:53:57 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (57344 MB per container)
22/01/14 10:53:57 INFO Client: Will allocate AM container, with 2432 MB memory including 384 MB overhead
22/01/14 10:53:57 INFO Client: Setting up container launch context for our AM
22/01/14 10:53:57 INFO Client: Setting up the launch environment for our AM container
22/01/14 10:53:57 INFO Client: Preparing resources for our AM container
22/01/14 10:53:57 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
22/01/14 10:54:00 INFO Client: Uploading resource file:/mnt/tmp/spark-b0633764-692a-4ae8-99c5-b1f9f165e629/__spark_libs__115653855768236052.zip -> hdfs://ip-10-11-1-206.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1642153681052_0005/__spark_libs__115653855768236052.zip
22/01/14 10:54:00 INFO Client: Uploading resource file:/home/hadoop/snowplow-event-recovery-spark-0.3.1.jar -> hdfs://ip-10-11-1-206.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1642153681052_0005/snowplow-event-recovery-spark-0.3.1.jar
22/01/14 10:54:01 INFO Client: Uploading resource file:/mnt/tmp/spark-b0633764-692a-4ae8-99c5-b1f9f165e629/__spark_conf__6222248894298862503.zip -> hdfs://ip-10-11-1-206.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1642153681052_0005/__spark_conf__.zip
22/01/14 10:54:01 INFO SecurityManager: Changing view acls to: hadoop
22/01/14 10:54:01 INFO SecurityManager: Changing modify acls to: hadoop
22/01/14 10:54:01 INFO SecurityManager: Changing view acls groups to:
22/01/14 10:54:01 INFO SecurityManager: Changing modify acls groups to:
22/01/14 10:54:01 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(hadoop); groups with view permissions: Set(); users  with modify permissions: Set(hadoop); groups with modify permissions: Set()
22/01/14 10:54:02 INFO Client: Submitting application application_1642153681052_0005 to ResourceManager
22/01/14 10:54:02 INFO YarnClientImpl: Submitted application application_1642153681052_0005
22/01/14 10:54:03 INFO Client: Application report for application_1642153681052_0005 (state: ACCEPTED)
22/01/14 10:54:03 INFO Client:
	 client token: N/A
	 diagnostics: AM container is launched, waiting for AM container to Register with RM
	 ApplicationMaster host: N/A
	 ApplicationMaster RPC port: -1
	 queue: default
	 start time: 1642157642336
	 final status: UNDEFINED
	 tracking URL: http://ip-10-11-1-206.eu-west-1.compute.internal:20888/proxy/application_1642153681052_0005/
	 user: hadoop
22/01/14 10:54:04 INFO Client: Application report for application_1642153681052_0005 (state: ACCEPTED)
22/01/14 10:54:05 INFO Client: Application report for application_1642153681052_0005 (state: ACCEPTED)
22/01/14 10:54:06 INFO Client: Application report for application_1642153681052_0005 (state: ACCEPTED)
22/01/14 10:54:07 INFO Client: Application report for application_1642153681052_0005 (state: ACCEPTED)
22/01/14 10:54:08 INFO Client: Application report for application_1642153681052_0005 (state: ACCEPTED)
22/01/14 10:54:09 INFO Client: Application report for application_1642153681052_0005 (state: ACCEPTED)
22/01/14 10:54:10 INFO Client: Application report for application_1642153681052_0005 (state: ACCEPTED)
22/01/14 10:54:11 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:11 INFO Client:
	 client token: N/A
	 diagnostics: N/A
	 ApplicationMaster host: ip-10-11-1-132.eu-west-1.compute.internal
	 ApplicationMaster RPC port: 36553
	 queue: default
	 start time: 1642157642336
	 final status: UNDEFINED
	 tracking URL: http://ip-10-11-1-206.eu-west-1.compute.internal:20888/proxy/application_1642153681052_0005/
	 user: hadoop
22/01/14 10:54:12 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:13 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:14 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:15 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:16 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:17 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:18 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:19 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:20 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:21 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:22 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:23 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:24 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:25 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:26 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:27 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:28 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:29 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:30 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:31 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:32 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:33 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:34 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:35 INFO Client: Application report for application_1642153681052_0005 (state: ACCEPTED)
22/01/14 10:54:35 INFO Client:
	 client token: N/A
	 diagnostics: AM container is launched, waiting for AM container to Register with RM
	 ApplicationMaster host: N/A
	 ApplicationMaster RPC port: -1
	 queue: default
	 start time: 1642157642336
	 final status: UNDEFINED
	 tracking URL: http://ip-10-11-1-206.eu-west-1.compute.internal:20888/proxy/application_1642153681052_0005/
	 user: hadoop
22/01/14 10:54:36 INFO Client: Application report for application_1642153681052_0005 (state: ACCEPTED)
22/01/14 10:54:37 INFO Client: Application report for application_1642153681052_0005 (state: ACCEPTED)
22/01/14 10:54:38 INFO Client: Application report for application_1642153681052_0005 (state: ACCEPTED)
22/01/14 10:54:39 INFO Client: Application report for application_1642153681052_0005 (state: ACCEPTED)
22/01/14 10:54:40 INFO Client: Application report for application_1642153681052_0005 (state: ACCEPTED)
22/01/14 10:54:41 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:41 INFO Client:
	 client token: N/A
	 diagnostics: N/A
	 ApplicationMaster host: ip-10-11-1-24.eu-west-1.compute.internal
	 ApplicationMaster RPC port: 46473
	 queue: default
	 start time: 1642157642336
	 final status: UNDEFINED
	 tracking URL: http://ip-10-11-1-206.eu-west-1.compute.internal:20888/proxy/application_1642153681052_0005/
	 user: hadoop
22/01/14 10:54:42 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:43 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:44 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:45 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:46 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:47 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:48 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:49 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:50 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:51 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:52 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:53 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:54 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:55 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:56 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:57 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:58 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:54:59 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:55:00 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:55:01 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:55:02 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:55:03 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:55:04 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:55:05 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:55:06 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:55:07 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:55:08 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:55:09 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:55:10 INFO Client: Application report for application_1642153681052_0005 (state: RUNNING)
22/01/14 10:55:11 INFO Client: Application report for application_1642153681052_0005 (state: FINISHED)
22/01/14 10:55:11 INFO Client:
	 client token: N/A
	 diagnostics: Shutdown hook called before final status was reported.
	 ApplicationMaster host: ip-10-11-1-24.eu-west-1.compute.internal
	 ApplicationMaster RPC port: 46473
	 queue: default
	 start time: 1642157642336
	 final status: FAILED
	 tracking URL: http://ip-10-11-1-206.eu-west-1.compute.internal:20888/proxy/application_1642153681052_0005/
	 user: hadoop
22/01/14 10:55:11 ERROR Client: Application diagnostics message: Shutdown hook called before final status was reported.
Exception in thread "main" org.apache.spark.SparkException: Application application_1642153681052_0005 finished with failed status
	at org.apache.spark.deploy.yarn.Client.run(Client.scala:1149)
	at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1526)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:853)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:928)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:937)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
22/01/14 10:55:11 INFO ShutdownHookManager: Shutdown hook called
22/01/14 10:55:11 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-b4632574-ff4e-4d76-930f-816f7c32edfa
22/01/14 10:55:11 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-b0633764-692a-4ae8-99c5-b1f9f165e629

The EMR cluster has IAM permissions to publish messages to the Kinesis stream.
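A quick way to double-check that from the master node is a one-off put with boto3 (a minimal sketch, assuming the stream name and region from the command above; the dummy record would just surface as a bad row downstream):

import boto3

kinesis = boto3.client("kinesis", region_name="eu-west-1")

# If this succeeds, the instance role can write to the stream and the
# stream name/region are correct.
response = kinesis.put_record(
    StreamName="collector_good_production",
    Data=b"recovery-permission-check",
    PartitionKey="recovery-check",
)
print(response["SequenceNumber"])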

Do you have any hints or alternatives here?

Thanks!

Hi @rodrigodelmonte,

Spark errors often don’t appear in these YARN client logs. Can you go to the EMR UI and check stderr and stdout for the driver and the workers? See the screenshots below:

Also, FYI, we released blob2stream (in beta) specifically to read data from S3 and insert it into Kinesis.