No logs to process: No Snowplow enriched stream logs to process since last run

We are running Snowplow R116 Madara Rider and we are happy with Snowplow. We had rarely faced issues until today :slight_smile:

Issue
We followed the recovery steps for enrich mode, starting from step 1.
However, looking at the step logs in Fargate, we are not able to figure out why we can't recover it.

Hence no EMR job is triggered.

We have both enriched/good/ and shredded/good empty.

What could we do better in order to resolve the problem?
Thank you!

@Sebastian_Villarroel, it is not clear to me how Fargate is related to the EmrEtlRunner job.

Before following the recovery steps for the batch job you need to investigate the reason for the failure. The EmrEtlRunner (EER) has its own logs that should clarify at which step the failure took place. Based on the failure and the current batch pipeline status you would pick the appropriate recovery step. You would run the recovery from step 1 if the EER job failed at the staging step and no files have been moved to the enriched:good bucket (all the output buckets are empty).

Thus, the questions are

  1. At what step did the EER fail initially?
  2. What logs did the EER produce?
  3. What is the status of the buckets involved in the batch job?
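For the third question, a quick object count per prefix can be easier to read than clicking through the S3 console. The following is only a sketch: it assumes a boto3-style S3 client, and the function names and the bucket name in the usage note are hypothetical. The prefixes are the ones mentioned earlier in this thread.

```python
from typing import Dict, Iterable


def count_objects(s3_client, bucket: str, prefix: str) -> int:
    """Count objects under a prefix, following pagination.

    Keys ending in "/" are skipped, since those are usually empty
    "folder" placeholders rather than data files.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    total = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if not obj["Key"].endswith("/"):
                total += 1
    return total


def bucket_status(s3_client, bucket: str, prefixes: Iterable[str]) -> Dict[str, int]:
    """Map each prefix to the number of data objects found under it."""
    return {prefix: count_objects(s3_client, bucket, prefix) for prefix in prefixes}


def outputs_are_empty(status: Dict[str, int], output_prefixes: Iterable[str]) -> bool:
    """True when every listed output prefix holds no data objects."""
    return all(status[prefix] == 0 for prefix in output_prefixes)
```

With boto3 installed, this could be called as `bucket_status(boto3.client("s3"), "my-snowplow-bucket", ["enriched/stream/", "enriched/good/", "shredded/good/"])`, where `my-snowplow-bucket` is a placeholder. Empty output prefixes alone do not tell you which recovery step applies; the EER logs are still needed for that.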

The logs you show in the screenshot indicate that there is no enriched stream data. It sounds like you might be approaching this issue from the wrong side. Do you mean to say that the EER job is not running because there is no data in the enriched:stream bucket? In that case, you need to look for the reason upstream, in your real-time pipeline, not in batch.
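One way to confirm that the problem sits upstream is to check when the newest object under the enriched stream prefix was written. This is a minimal sketch under assumptions: a boto3-style S3 client is passed in, and the function names and the age threshold are made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def newest_object_time(s3_client, bucket: str, prefix: str) -> Optional[datetime]:
    """Return the most recent LastModified under a prefix, or None if empty."""
    paginator = s3_client.get_paginator("list_objects_v2")
    newest = None
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if newest is None or obj["LastModified"] > newest:
                newest = obj["LastModified"]
    return newest


def is_stale(newest: Optional[datetime], now: datetime, max_age: timedelta) -> bool:
    """True when nothing was ever written, or the last write is too old."""
    return newest is None or now - newest > max_age
```

boto3 reports `LastModified` as a timezone-aware datetime, so `now` should be aware as well, for example `datetime.now(timezone.utc)`. A stale prefix on a pipeline that is still receiving traffic suggests the component writing to S3 has stopped, rather than the batch job.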


Hello! Thanks for the hint. We actually figured out that the container which writes data from the Kinesis stream to S3 had stopped writing data.

Our staging environment looks like this, and there we do see data in enriched:stream:

[main] INFO com.snowplowanalytics.s3.loader.SinkApp$ - Initializing sink with KinesisConnectorConfiguration: {regionName=eu-central-1, s3Endpoint=https://s3-eu-central-1.amazonaws.com, kinesisInputStream=snowplow-enrich-stage-good, maxRecords=100, connectorDestination=s3, bufferMillisecondsLimit=60000, bufferRecordCountLimit=500, s3Bucket=coya-snowplow-stage/enriched/stream, kinesisEndpoint=https://kinesis.eu-central-1.amazonaws.com, appName=snowplow-kinesis-s3-stage, bufferByteSizeLimit=4194304

Collectors are working: prod-snowplow-collector service is writing data to the Kinesis stream snowplow-collector-prod-good correctly.
Nothing is reading from the stream though: prod-snowplow-enrich, the service that should read from the stream, is not working - we suspect there’s a file or folder missing on S3 that would enable that.
We see only this error:

NonEmptyList(The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: 23F164C6F69E2D8F; S3 Extended Request ID: kBZ5m77pVxhMqi2PQVXFcyM6q9FEAX+DlkSeNcXE0fVkrhlCrajXagXTp20Bhkyu+43sA+5ME38=), The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: 6295DB55A5BC33D8; S3 Extended Request ID: 2bcWBSz8oPVSVSwFpi5IvmZovRtsufhR6kvQuEjfjGCDYBF7bv3fBUoyEvACx/pqwwAWqTcvaxY=), The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: FF80538F1433D8AC; S3 Extended Request ID: nPFvaDkmHZCFyFVDgIyB8XkLzGuCgoRhA+s2+2wu/VidNAETO2PaIjOTT+DwZoHCtQ7XLDz2T4Y=), The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: A4428BABA6930F4B; S3 Extended Request ID: YAJS2m9/RsgORx9aXdX5JV6vi8KDJXePewLjOTlA6vbyhW80Q7eSLCPCVkPATeDANaWQJe8bGuM=))
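The error does not say which keys are missing, so one way to test the "missing file or folder" suspicion is to compare a list of keys the service is expected to read against what is actually in the bucket. The sketch below assumes a boto3-style S3 client; the function names are hypothetical, and the list of expected keys has to come from your own enrich configuration.

```python
from typing import Iterable, List


def key_exists(s3_client, bucket: str, key: str) -> bool:
    """Check for an exact key by listing with the key itself as the prefix.

    S3 returns keys in lexicographic order, so if the exact key exists
    it is the first result for that prefix.
    """
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=key, MaxKeys=1)
    contents = response.get("Contents", [])
    return bool(contents) and contents[0]["Key"] == key


def missing_keys(s3_client, bucket: str, keys: Iterable[str]) -> List[str]:
    """Return the expected keys that are not present in the bucket."""
    return [key for key in keys if not key_exists(s3_client, bucket, key)]
```

Listing is used here instead of fetching each object so that a missing key shows up as an empty result rather than an exception. The four `NoSuchKey` entries in the error above would correspond to four entries in the returned list if the expected keys are chosen correctly.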


UPDATE:
We ended up resolving the issue. It was indeed the case that the logs were not being stored in S3 because of a failing container.