Trying to set up Stream Enrich with Docker image - Caught exception when initializing LeaseCoordinator

Hello there!

I’m trying to run the stream enricher on a t2.small Linux AMI EC2 instance, just for a POC. I installed JRE 8 and I’m trying to use the Docker images available here: https://github.com/snowplow/snowplow-docker.

At this point, I think I already have a valid Scala Stream Collector running: curl host/health returns OK.

For the enrichment, I set up the config file (which I named config.hocon) like this:

enrich {

  streams {

    in {
      raw = "raw-stream"
    }

    out {
      enriched = "enrich-stream"
      bad = "enrich-stream"
      pii = "enrich-stream"
      partitionKey = "domain_userid"
    }

    sourceSink {
      enabled =  "kinesis"
      region = "region"

      aws {
        accessKey = "accessKey"
        secretKey = "secretKey"
      }

      maxRecords = 10000
      initialPosition = "TRIM_HORIZON"

      backoffPolicy {
        minBackoff = 3000
        maxBackoff = 600000
      }
    }

    buffer {
      byteLimit = 4000000
      recordLimit = 500 # Not supported by Kafka; will be ignored
      timeLimit = 5000
    }

    appName = "snowplow-enrich-staging"
  }
}

I also renamed the iglu_resolver.json to resolver.json, which I got from here: https://github.com/snowplow/snowplow/blob/master/3-enrich/config/iglu_resolver.json. Also, the enrich stream is created with 2 active shards.
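
For reference, the resolver file I’m using looks roughly like the default from that link, pointing at Iglu Central (the exact schema version and cacheSize may differ between releases):

{
  "schema": "iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-1",
  "data": {
    "cacheSize": 500,
    "repositories": [
      {
        "name": "Iglu Central",
        "priority": 0,
        "vendorPrefixes": [ "com.snowplowanalytics" ],
        "connection": {
          "http": {
            "uri": "http://iglucentral.com"
          }
        }
      }
    ]
  }
}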

I pulled the docker image using:

sudo docker pull snowplow-docker-registry.bintray.io/snowplow/stream-enrich-kinesis:0.19.1

And ran docker using:

sudo docker run -v $PWD/stream-enrich-config/snowplow/config:/snowplow/config -p 8080:8080 snowplow-docker-registry.bintray.io/snowplow/stream-enrich-kinesis:0.19.1 --config /snowplow/config/config.hocon --resolver file:/snowplow/config/resolver.json --force-cached-files-download

The config files I’m editing are inside $PWD/stream-enrich-config/snowplow/config. I already ran the container with -ti --entrypoint and verified that my config files are being loaded inside the container.

Here’s my log:

[main] INFO com.snowplowanalytics.snowplow.enrich.stream.sources.KinesisSource - Using workerId: f477fc798f2b:cba0cbaf-dfcc-4628-8bf6-3b07e73ae311
[main] INFO com.snowplowanalytics.snowplow.enrich.stream.sources.KinesisSource - Running: snowplow-enrich-staging.
[main] INFO com.snowplowanalytics.snowplow.enrich.stream.sources.KinesisSource - Processing raw input stream: snowplow-events
[main] INFO com.amazonaws.services.kinesis.leases.impl.LeaseCoordinator - With failover time 10000 ms and epsilon 25 ms, LeaseCoordinator will renew leases every 3308 ms, takeleases every 20050 ms, process maximum of 2147483647 leases and steal 1 lease(s) at a time.
[main] WARN com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Received configuration for both region name as us-east-1, and Amazon Kinesis endpoint as https://kinesis.us-east-1.amazonaws.com. Amazon Kinesis endpoint will overwrite region name.
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Initialization attempt 1
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Initializing LeaseCoordinator
[main] ERROR com.amazonaws.services.kinesis.leases.impl.LeaseManager - Failed to get table status for snowplow-enrich-staging
[main] ERROR com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Caught exception when initializing LeaseCoordinator
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Initialization attempt 2
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Initializing LeaseCoordinator
[main] ERROR com.amazonaws.services.kinesis.leases.impl.LeaseManager - Failed to get table status for snowplow-enrich-staging
[main] ERROR com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Caught exception when initializing LeaseCoordinator 

(Spoiler: keeps trying to initialize without success)

Please, how can I make this work? Thanks for your attention.

Are you running in us-east-1? Otherwise you’ll need to change this to the AWS region you want to run in.

These 3 streams should have different names.

If you’re just looking for a POC (without data loading) it may be worth spinning up Snowplow Mini.

@Coqueiro would you mind sharing the IAM policy you are using for the Stream Enrich server? It could well be a permissions issue as well with the application not having sufficient access to Kinesis, DynamoDB or both.

This looks particularly like a DynamoDB table issue. Your policy should allow you to describe, read, and write the table, and the table should exist.
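
For illustration, a minimal sketch of such a DynamoDB statement (ACCOUNT_ID is a placeholder, and the exact action list the KCL needs may vary slightly by version):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:CreateTable",
        "dynamodb:DescribeTable",
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:UpdateItem",
        "dynamodb:DeleteItem",
        "dynamodb:Scan"
      ],
      "Resource": "arn:aws:dynamodb:us-east-1:ACCOUNT_ID:table/snowplow-enrich-staging"
    }
  ]
}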

Thanks for the quick responses!

Sorry, I don’t know why I decided to mask this, but it’s set to us-east-1.

I removed pii and created both streams, for enriched and bad objects.

Yeah, maybe “POC” doesn’t describe our current stage well. We want to start playing around with the architecture pieces, do some load tests and start preparing for the eventual prod release, which will probably consist of Ruby & JS Trackers / Scala Stream Collector / Stream Enrich / S3 Loader into our data lake.

@josh, @grzegorzewald, the initial IAM Role only had the default AmazonKinesisFullAccess policy (at first I thought that, by choosing a file-based resolver, I wouldn’t need DynamoDB); I have now added the default AmazonDynamoDBFullAccess policy. Right now, those are the only two policies in the server’s IAM Role.
Also, I created a DynamoDB table snowplow-enrich-staging with a primary partition key id.

After those changes, I’m running the same command and getting the same error log. Any other suggestions?

Hey @Coqueiro - so Stream Enrich creates a DynamoDB table to keep track of where it is in terms of processing the stream. This is needed for scaling so that multiple instances of the application do not run into each other and can process shards independently.

You should not create this DynamoDB table yourself. The application will make it for you, so to get this working I imagine you would need to delete the “snowplow-enrich-staging” table you have made so that Stream Enrich can make its own.
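
For example, assuming the AWS CLI is available on the instance and the role allows it, removing the table could look like:

aws dynamodb delete-table --table-name snowplow-enrich-staging --region us-east-1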

Thanks for the information, @josh!

Just tried this suggestion, same error. What else could be causing it?

I figured out what I was doing wrong at this point: accessKey and secretKey were not OK; I should just set both fields to the value iam. I did that, and also added the CloudWatchFullAccess policy to the role.
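
For reference, the credentials block inside sourceSink now reads as follows (setting both fields to iam makes the app pick up credentials from the instance’s IAM role):

aws {
  accessKey = "iam"
  secretKey = "iam"
}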

Debugging goes on; normal day. Now I’m getting the following error log (I’ll post the whole thing, just in case, but I guess only the last lines are relevant for debugging):

[main] INFO com.snowplowanalytics.snowplow.enrich.stream.sources.KinesisSource - Using workerId: 15e9e888a733:0f7b9a88-74fb-4b63-af5f-5fc1dfd7d2f5
[main] INFO com.snowplowanalytics.snowplow.enrich.stream.sources.KinesisSource - Running: snowplow-enrich-staging.
[main] INFO com.snowplowanalytics.snowplow.enrich.stream.sources.KinesisSource - Processing raw input stream: snowplow-events
[main] INFO com.amazonaws.services.kinesis.leases.impl.LeaseCoordinator - With failover time 10000 ms and epsilon 25 ms, LeaseCoordinator will renew leases every 3308 ms, takeleases every 20050 ms, process maximum of 2147483647 leases and steal 1 lease(s) at a time.
[main] WARN com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Received configuration for both region name as us-east-1, and Amazon Kinesis endpoint as https://kinesis.us-east-1.amazonaws.com. Amazon Kinesis endpoint will overwrite region name.
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Initialization attempt 1
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Initializing LeaseCoordinator
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisClientLibLeaseCoordinator - Created new lease table for coordinator with initial read capacity of 10 and write capacity of 10.
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Syncing Kinesis shard info
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Starting LeaseCoordinator
[LeaseCoordinator-0000] INFO com.amazonaws.services.kinesis.leases.impl.LeaseTaker - Worker 15e9e888a733:0f7b9a88-74fb-4b63-af5f-5fc1dfd7d2f5 saw 2 total leases, 2 available leases, 1 workers. Target is 2 leases, I have 0 leases, I will take 2 leases
[LeaseCoordinator-0000] INFO com.amazonaws.services.kinesis.leases.impl.LeaseTaker - Worker 15e9e888a733:0f7b9a88-74fb-4b63-af5f-5fc1dfd7d2f5 successfully took 2 leases: shardId-000000000001, shardId-000000000000
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Initialization complete. Starting worker loop.
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Created new shardConsumer for : ShardInfo [shardId=shardId-000000000001, concurrencyToken=544560f3-8b10-43fa-808a-c76ea393797f, parentShardIds=[], checkpoint={SequenceNumber: TRIM_HORIZON,SubsequenceNumber: 0}]
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Created new shardConsumer for : ShardInfo [shardId=shardId-000000000000, concurrencyToken=b087ea0f-5c3b-4e4e-b091-a4d92eee5e99, parentShardIds=[], checkpoint={SequenceNumber: TRIM_HORIZON,SubsequenceNumber: 0}]
[RecordProcessor-0000] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.BlockOnParentShardTask - No need to block on parents [] of shard shardId-000000000001
[RecordProcessor-0001] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.BlockOnParentShardTask - No need to block on parents [] of shard shardId-000000000000
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 20 datums to CloudWatch
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 3 datums to CloudWatch
[RecordProcessor-0000] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisDataFetcher - Initializing shard shardId-000000000000 with TRIM_HORIZON
[RecordProcessor-0001] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisDataFetcher - Initializing shard shardId-000000000001 with TRIM_HORIZON
[RecordProcessor-0000] INFO com.snowplowanalytics.snowplow.enrich.stream.sources.KinesisSource - Initializing record processor for shard: shardId-000000000000
[RecordProcessor-0001] INFO com.snowplowanalytics.snowplow.enrich.stream.sources.KinesisSource - Initializing record processor for shard: shardId-000000000001
[RecordProcessor-0000] INFO com.snowplowanalytics.snowplow.enrich.stream.sources.KinesisSource - Processing 139 records from shardId-000000000000
[RecordProcessor-0001] INFO com.snowplowanalytics.snowplow.enrich.stream.sources.KinesisSource - Processing 152 records from shardId-000000000001
[RecordProcessor-0001] INFO com.snowplowanalytics.snowplow.enrich.stream.sinks.KinesisSink - Writing 6 records to Kinesis stream snowplow-enrich-good
[RecordProcessor-0000] INFO com.snowplowanalytics.snowplow.enrich.stream.sinks.KinesisSink - Writing 3 records to Kinesis stream snowplow-enrich-good
[RecordProcessor-0001] ERROR com.snowplowanalytics.snowplow.enrich.stream.sinks.KinesisSink - Writing failed.
com.amazonaws.services.kinesis.model.AmazonKinesisException: 6 validation errors detected: Value null at 'records.1.member.partitionKey' failed to satisfy constraint: Member must not be null; Value null at 'records.2.member.partitionKey' failed to satisfy constraint: Member must not be null; Value null at 'records.3.member.partitionKey' failed to satisfy constraint: Member must not be null; Value null at 'records.4.member.partitionKey' failed to satisfy constraint: Member must not be null; Value null at 'records.5.member.partitionKey' failed to satisfy constraint: Member must not be null; Value null at 'records.6.member.partitionKey' failed to satisfy constraint: Member must not be null (Service: AmazonKinesis; Status Code: 400; Error Code: ValidationException; Request ID: cfabdbe4-6f13-cc7a-9b8f-95ecaa7cc3f2)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1630)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1302)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
        at com.amazonaws.services.kinesis.AmazonKinesisClient.doInvoke(AmazonKinesisClient.java:2388)
        at com.amazonaws.services.kinesis.AmazonKinesisClient.invoke(AmazonKinesisClient.java:2364)
        at com.amazonaws.services.kinesis.AmazonKinesisClient.executePutRecords(AmazonKinesisClient.java:1859)
        at com.amazonaws.services.kinesis.AmazonKinesisClient.putRecords(AmazonKinesisClient.java:1834)
        at com.snowplowanalytics.snowplow.enrich.stream.sinks.KinesisSink$$anonfun$multiPut$1.apply(KinesisSink.scala:289)
        at com.snowplowanalytics.snowplow.enrich.stream.sinks.KinesisSink$$anonfun$multiPut$1.apply(KinesisSink.scala:275)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
        at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
[RecordProcessor-0001] ERROR com.snowplowanalytics.snowplow.enrich.stream.sinks.KinesisSink -   + Retrying in 8541 milliseconds...

(Spoiler: keeps rapidly retrying without success)

I guess the events from my SSC (Scala Stream Collector, thinking about future generations) don’t have a valid format, from what I gather from this log. I don’t know if this has something to do with the error, but my SSC config file has the property collector.streams.useIpAddressAsPartitionKey set to false, which is the default in the config.hocon.sample file.

Any suggestions? Do I have to set this property to true? For us, grouping user events solely by IP is not that interesting, which is why I left it as false, but I guess I don’t fully understand its purpose.

Thanks for the help.

Just to give a quick update:
I changed the Stream Enrich config property enrich.streams.out.partitionKey to "user_ipaddress" and the server is sending objects to both streams now.
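
A minimal sketch of the change in my out block (the bad stream name here is illustrative; the point is that user_ipaddress is always populated in my events, unlike domain_userid, whose null values caused the validation errors above):

out {
  enriched = "snowplow-enrich-good"
  bad = "snowplow-enrich-bad"
  partitionKey = "user_ipaddress"
}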

Just one more doubt, though: I thought domain_userid was automatically set by tracker + collector, using 1st-party cookies. Is that not the case?


EDIT: I just noticed our tracking in staging is all going into the “bad” stream. I thought this might be caused by the fact that I didn’t pass --enrichments file:/snowplow/config/enrichments/. I configured the enrichments, and also created the stream for enrich.streams.out.pii (which, by the way, I don’t understand the purpose of), and events are still being streamed out to the “bad” stream.
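
For reference, the run command with enrichments enabled looks like this (assuming the enrichment JSONs live under enrichments/ inside the mounted config directory):

sudo docker run -v $PWD/stream-enrich-config/snowplow/config:/snowplow/config -p 8080:8080 snowplow-docker-registry.bintray.io/snowplow/stream-enrich-kinesis:0.19.1 --config /snowplow/config/config.hocon --resolver file:/snowplow/config/resolver.json --enrichments file:/snowplow/config/enrichments/ --force-cached-files-download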

I downstreamed data from the “bad” stream using firehose to try to understand what’s happening and I got the following JSON from one of the outputs:

{
    "line": "CwBkAAAADjEwMC4xMjIuMTczLjY0CgDIAAABZhHpJ9ILANIAAAAFVVRGLTgLANwAAAASc3NjLTAuMTIuMC1raW5lc2lzCwEsAAAABFJ1YnkLAUAAAAACL2kLAUoAAAQ9ZT1wdiZ1cmw9JTJGbmVnb2Npb3MmY3g9ZXlKelkyaGxiV0VpT2lKcFoyeDFPbU52YlM1emJtOTNjR3h2ZDJGdVlXeDVkR2xqY3k1emJtOTNjR3h2ZHk5amIyNTBaWGgwY3k5cWMyOXVjMk5vWlcxaEx6RXRNQzB4SWl3aVpHRjBZU0k2VzNzaWMyTm9aVzFoSWpvaWFXZHNkVHBqYjIwdVozbHRjR0Z6Y3k1M2QzY3ZhV2RzZFM5elkyaGxiV0Z6TDJOdmJTNW5lVzF3WVhOekwzQmhaMlZmZG1sbGQzTXZhbk52Ym5OamFHVnRZUzh4TFRBdE1DSXNJbVJoZEdFaU9uc2lkbWxsZDJWeVgybGtJam96TnpBc0luSmxabVZ5Y21Gc1gybGtJanB1ZFd4c0xDSnpaVzUwWDJWdFlXbHNYMmxrSWpwdWRXeHNMQ0pzYjJOaGRHbHZibDlwWkNJNk1UTXlNaXdpY21WalgybGtJanB1ZFd4c0xDSnlaV05mY0hKdlpIVmpkRjlwWkNJNmJuVnNiQ3dpWkdWMmFXTmxYMmxrSWpveU1EQXNJbkJoWjJWZmRIbHdaU0k2TWpBd01EQXNJbkJoWjJWZmRtRnNhV1FpT25SeWRXVXNJbmhvY2lJNlptRnNjMlVzSW1weklqcG1ZV3h6WlN3aWFuTnZiaUk2Wm1Gc2MyVXNJbTF2WW1sc1pTSTZabUZzYzJVc0ltRndjQ0k2Wm1Gc2MyVXNJbUZ3Y0Y5MlpYSnphVzl1SWpvd0xDSnphWFJsWDJsa0lqb3hMQ0poWkcxcGJsOXBaQ0k2Ym5Wc2JDd2lZMjkxYm5SeWVWOXBaQ0k2TnpZc0lteHZZMkZzWlY5cFpDSTZNU3dpWTNWeWNtVnVZM2xmYVdRaU9qazROaXdpYkdGMGFYUjFaR1VpT201MWJHd3NJbXh2Ym1kcGRIVmtaU0k2Ym5Wc2JIMTlYWDAlM0QmZHRtPTE1Mzc4OTg3ODQ2OTImcD1wYyZ1aWQ9MCZ0ej1BbWVyaWNhJTJGU2FvX1BhdWxvJmxhbmc9cHQmaXA9MjAxLjQ2LjIxLjM3JnVhPU1vemlsbGElMkY1LjArJTI4V2luZG93cytOVCsxMC4wJTNCK1dpbjY0JTNCK3g2NCUyOStBcHBsZVdlYktpdCUyRjUzNy4zNislMjhLSFRNTCUyQytsaWtlK0dlY2tvJTI5K0Nocm9tZSUyRjY5LjAuMzQ5Ny4xMDArU2FmYXJpJTJGNTM3LjM2JmR1aWQ9ZGd1UWw2MzlfakcwOTZPS0xmN0xkWXJ1U05rNmh0cjRZRXZCdE1sQUtDbyZ0bmE9MS1zdGFnaW5nJnR2PXJiLTAuNi4xJmVpZD03NjhhYjY1ZS0yYjU1LTQ5NjgtYWVhMi0zZWQwMDA4NjZjZWQmc3RtPTE1Mzc4OTg3ODQ3MDAPAV4LAAAABgAAADRBY2NlcHQtRW5jb2Rpbmc6IGd6aXAsIGRlZmxhdGU7cT0wLjYsIGlkZW50aXR5O3E9MC4zAAAAC0FjY2VwdDogKi8qAAAAEFVzZXItQWdlbnQ6IFJ1YnkAAAARQ29ubmVjdGlvbjogY2xvc2UAAAAwSG9zdDogc25vd3Bsb3ctY29sbGVjdG9yLms4cy5neW1wYXNzLXN0YWdpbmcuY29tAAAAG1RpbWVvdXQtQWNjZXNzOiA8ZnVuY3Rpb24xPgsBkAAAACpzbm93cGxvdy1jb2xsZWN0b3IuazhzLmd5bXBhc3Mtc3RhZ2luZy5jb20LAZoAAAAkOWQwMDNiZTMtYjllNC00ZDZmLWIwY2UtNGY0YWJlMmNjYzdiC3ppAAAAQWlnbHU6Y29tLnNub3dwbG93YW5hbHl0aWNzLnNub3dwbG93L0NvbGxlY3RvclBheWxvYWQvdGhyaWZ0LzEtMC0wAA==",
    "errors": [
        {
            "level": "error",
            "message": "error: ECMA 262 regex \"^iglu:[a-zA-Z0-9-_.]+/[a-zA-Z0-9-_]+/[a-zA-Z0-9-_]+/[0-9]+-[0-9]+-[0-9]+$\" does not match input string \"iglu:com.company_domain.www/iglu/schemas/com.company_domain/page_views/jsonschema/1-0-0\"\n    level: \"error\"\n    schema: {\"loadingURI\":\"#\",\"pointer\":\"/items/properties/schema\"}\n    instance: {\"pointer\":\"/0/schema\"}\n    domain: \"validation\"\n    keyword: \"pattern\"\n    regex: \"^iglu:[a-zA-Z0-9-_.]+/[a-zA-Z0-9-_]+/[a-zA-Z0-9-_]+/[0-9]+-[0-9]+-[0-9]+$\"\n    string: \"iglu:com.company_domain.www/iglu/schemas/com.company_domain/page_views/jsonschema/1-0-0\"\n"
        }
    ],
    "failure_tstamp": "2018-09-25T18:06:33.122Z"
}

What could be causing this?

(sorry for the constant editing; I’m actively trying to fix the issue from my side and want to keep this post updated…)

The Iglu URI that you are passing in the events isn’t in the format that Iglu expects, which is of the form
iglu:com.vendor/event/jsonschema/1-0-0

i.e., iglu:com.company_domain.www/iglu/schemas/com.company_domain/page_views/jsonschema/1-0-0 should be iglu:com.company_domain/page_views/jsonschema/1-0-0.

So, if my URL has a subdomain, like subdomain.company_domain.com, should I use iglu:com.company_domain.subdomain/page_views/jsonschema/1-0-0? I saw in the documentation that the company domain should be reversed, but this leaves me with some doubts about subdomains. Once I get to work tomorrow, I’ll try those settings.

@Coqueiro, you can call it whatever you want. The JSON schema reference in Iglu follows the convention vendor/event/format/version. The vendor name, as well as the event name, is arbitrary, and it’s up to you how you decide to address the custom event you want to track. It is in no way bound to the website URI the event is being tracked on. It’s meant to describe the event, and the vendor name is just an indicator of the owner of that event - who developed the event (built the JSON schema for it) or who generates the events (like a 3rd-party SaaS).

More importantly, the Iglu URI has to match the self section of the JSON schema. For example, if your JSON schema has

	"self": {
		"vendor": "com.acme",
		"name": "custom_event",
		"format": "jsonschema",
		"version": "1-0-0"
	},
       . . .

it has to be referenced as iglu:com.acme/custom_event/jsonschema/1-0-0 and uploaded to your Iglu repository (bucket/location) as a file named 1-0-0 under the path schemas/com.acme/custom_event/jsonschema/.
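
Concretely, a static repository mirroring Iglu Central’s layout would contain:

schemas/
    com.acme/
        custom_event/
            jsonschema/
                1-0-0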

It depends. When you create the tables yourself and pass their names to the Snowplow apps, you have control over what is going on. Moreover, you can limit the machine/container policy to the table if you know it a priori. I have found this extremely useful when using CloudFormation to set up the whole Snowplow real-time pipeline at once.
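
As an illustration, a sketch of pre-creating the KCL lease table in a CloudFormation template (the lease table keys on leaseKey; the table name must match your appName, and the capacities here are examples):

{
  "Resources": {
    "EnrichLeaseTable": {
      "Type": "AWS::DynamoDB::Table",
      "Properties": {
        "TableName": "snowplow-enrich-staging",
        "AttributeDefinitions": [
          { "AttributeName": "leaseKey", "AttributeType": "S" }
        ],
        "KeySchema": [
          { "AttributeName": "leaseKey", "KeyType": "HASH" }
        ],
        "ProvisionedThroughput": {
          "ReadCapacityUnits": 10,
          "WriteCapacityUnits": 10
        }
      }
    }
  }
}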

Guys, I think everything is working fine now. @ihor, I followed your advice on the resolver.json and also fixed our Iglu static repository; it was not following the right pattern, and to fix it I just followed Iglu Central’s static repository structure. To be honest, once I finally understood the relationships between the pieces, everything fell right into place. Now let’s head to the S3 Loader :stuck_out_tongue:

Thanks, @mike, @josh, @grzegorzewald, @ihor !

This is an easy one :stuck_out_tongue:

Hi,

Can anyone help me fix this issue?

[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Syncing Kinesis shard info
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Starting LeaseCoordinator
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Initialization complete. Starting worker loop.
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 16 datums to CloudWatch
[LeaseCoordinator-0000] INFO com.amazonaws.services.kinesis.leases.impl.LeaseTaker - Worker ip-10-3-13-151:d2dc637c-4fd6-48a9-a5e8-736ef738d5bd saw 1 total leases, 1 available leases, 1 workers. Target is 1 leases, I have 0 leases, I will take 1 leases
[LeaseCoordinator-0000] INFO com.amazonaws.services.kinesis.leases.impl.LeaseTaker - Worker ip-10-3-13-151:d2dc637c-4fd6-48a9-a5e8-736ef738d5bd successfully took 1 leases: shardId-000000000000
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 16 datums to CloudWatch
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Created new shardConsumer for : ShardInfo [shardId=shardId-000000000000, concurrencyToken=165afe9a-da24-42b2-a9d5-d7cf0f5c0cbb, parentShardIds=[], checkpoint={SequenceNumber: TRIM_HORIZON,SubsequenceNumber: 0}]
[RecordProcessor-0000] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.BlockOnParentShardTask - No need to block on parents [] of shard shardId-000000000000
[RecordProcessor-0000] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisDataFetcher - Initializing shard shardId-000000000000 with TRIM_HORIZON
[RecordProcessor-0000] INFO com.snowplowanalytics.snowplow.enrich.stream.sources.KinesisSource - Initializing record processor for shard: shardId-000000000000
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 20 datums to CloudWatch
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 1 datums to CloudWatch
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 19 datums to CloudWatch
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 14 datums to CloudWatch
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Current stream shard assignments: shardId-000000000000
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Sleeping ...
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 19 datums to CloudWatch
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 14 datums to CloudWatch
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 19 datums to CloudWatch
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 14 datums to CloudWatch
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 19 datums to CloudWatch
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 14 datums to CloudWatch
[cw-metrics-publisher] WARN com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable - Could not publish 19 datums to CloudWatch
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Current stream shard assignments: shardId-000000000000
[main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Sleeping ...

Thanks

Hi @sp_user, which issue are you referring to? You have only provided log output here. To help us help you, please provide details on what you actually need help with.

Hi, I’m running the below command for Stream Enrich Kinesis:

java -jar '…/bin/snowplow-stream-enrich-kinesis-0.20.0.jar' --config config.hocon --resolver file:iglu_resolver.json --enrichments file:enrichments_config

@sp_user, you still haven’t told us what you need help with. The logs do not indicate problems with Stream Enrich. Are you referring to “Could not publish XX datums to CloudWatch”?

Datums simply refer to metrics about the KCL app’s performance, which are published to CloudWatch. You can read more about it here.
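
If you want the warnings to go away, the role needs permission to publish metrics. A minimal policy statement sketch (cloudwatch:PutMetricData does not support resource-level restrictions, hence the wildcard):

{
  "Effect": "Allow",
  "Action": "cloudwatch:PutMetricData",
  "Resource": "*"
}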

Hi ihor,

I’m new to Snowplow. I have successfully done the collector setup.
For Stream Enrich Kinesis, when I run the jar file I’m getting the above logs, like “could not publish datums to CloudWatch”. I just wanted to verify whether my Stream Enrich Kinesis setup for Snowplow is successful or not.