Snowplow Stream Enrichment


#1

Hi All,

I am setting up enrichment via Stream Enrich by downloading the jar file as instructed at the link below.

To configure it, we need its own configuration file along with a second one: the JSON configuration for the Iglu resolver, which is used to look up JSON schemas (see the link below).

A sample JSON file is also given at the path below:

https://raw.githubusercontent.com/snowplow/snowplow/master/3-enrich/config/iglu_resolver.json
I am stuck on this sample JSON file. It configures many different parameters, and I don’t know what each of them means or what values I should set to make it run.

Can anyone please help?


#3

@miteshu,

Those configuration files serve different purposes; they are not alternatives.

  • config.hocon: configures Stream Enrich itself. It’s specific to your pipeline implementation; make changes to reflect your architecture.
  • Iglu resolver configuration: provides the reference to the Iglu server, a repository of the JSON schemas used to validate the events you track. If you do not use custom events, there is no need to amend it.

Both configuration files need to be passed as parameters, as described in Run Stream Enrich.


#4

Hello @ihor ,

Thanks a lot for your heads up.

Yes, I know these two configuration files serve different purposes and are not alternatives. I am using both files and passing them as parameters.

My only question is how to configure the Iglu resolver, i.e. what values we should set inside it.

I am not able to relate the parameters in the Iglu resolver config file to their exact purpose, so I don’t know what values to use for them.

Do you have a sample Iglu resolver config file, so that I can see how it is configured?


#5

@miteshu, as I said, if you do not use a custom event there is no need to make any amendments to the resolver configuration file. In other words, the content remains the same:

{
  "schema": "iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-1",
  "data": {
    "cacheSize": 500,
    "repositories": [
      {
        "name": "Iglu Central",
        "priority": 0,
        "vendorPrefixes": [ "com.snowplowanalytics" ],
        "connection": {
          "http": {
            "uri": "http://iglucentral.com"
          }
        }
      },
      {
        "name": "Iglu Central - GCP Mirror",
        "priority": 1,
        "vendorPrefixes": [ "com.snowplowanalytics" ],
        "connection": {
          "http": {
            "uri": "http://mirror01.iglucentral.com"
          }
        }
      }
    ]
  }
}

Those are Snowplow public repositories. You are welcome to use them.

More info on Iglu repositories is here: https://github.com/snowplow/snowplow/wiki/Iglu-registry
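
To relate the fields to their purpose, here is a minimal Python sketch (not part of Snowplow; the function name `describe_resolver` is made up for illustration) that parses the configuration above and prints each repository in priority order. The comments reflect my understanding of the Iglu client documentation, in particular that a lower `priority` number is tried first, so treat them as a reading aid rather than a specification.

```python
import json

# The resolver configuration from this post, embedded so the sketch is self-contained.
RESOLVER_JSON = """
{
  "schema": "iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-1",
  "data": {
    "cacheSize": 500,
    "repositories": [
      {
        "name": "Iglu Central",
        "priority": 0,
        "vendorPrefixes": [ "com.snowplowanalytics" ],
        "connection": { "http": { "uri": "http://iglucentral.com" } }
      },
      {
        "name": "Iglu Central - GCP Mirror",
        "priority": 1,
        "vendorPrefixes": [ "com.snowplowanalytics" ],
        "connection": { "http": { "uri": "http://mirror01.iglucentral.com" } }
      }
    ]
  }
}
"""


def describe_resolver(config_text):
    """Return human-readable lines summarising a resolver configuration."""
    config = json.loads(config_text)
    data = config["data"]
    # cacheSize: how many schemas the resolver keeps in its in-memory cache.
    lines = [f"cacheSize={data['cacheSize']}"]
    # priority: assumed here that a lower number means the repository is tried earlier.
    for repo in sorted(data["repositories"], key=lambda r: r["priority"]):
        # vendorPrefixes: schema vendors this repository is expected to host.
        prefixes = ", ".join(repo["vendorPrefixes"])
        # connection.http.uri: base URL of the repository.
        uri = repo["connection"]["http"]["uri"]
        lines.append(f"{repo['priority']}: {repo['name']} -> {uri} (prefixes: {prefixes})")
    return lines


for line in describe_resolver(RESOLVER_JSON):
    print(line)
```

If you later host your own schemas, you would add another entry to `repositories` with your own vendor prefix and URI; until then the two public entries are all you need.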


#6

@ihor

Thanks again.

Just to update you: I have done exactly that. As you mentioned, I kept the file as it is, without any changes, and I am executing the command below:

$ java -jar snowplow-stream-enrich-0.12.0.jar --config Stream-enrich.conf --resolver file:resolver.json

After running the above command I get the error below.

Could you please suggest what I am doing wrong? The error shows the content of the Iglu resolver file itself.


#7

@miteshu, the error states “invalid JSON”. Your editor might have introduced one or more hidden (invisible) characters. Could you try linting the content of the configuration file? There are a few online linting tools out there. One of them: http://zaa.ch/jsonlint/.
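
If you would rather check locally, here is a rough Python sketch of the same idea. The helper names `find_suspicious_chars` and `check_file` are my own, and the set of characters it flags (control and format characters such as a byte-order mark or zero-width space, non-standard spaces, and curly quotes) is only an assumption about what an editor or a copy-paste from a web page typically introduces.

```python
import json
import unicodedata


def find_suspicious_chars(text):
    """Return (line, column, code point) for characters that are easy to miss in an editor."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            category = unicodedata.category(ch)
            # Cc = control characters, Cf = format characters (BOM, zero-width space, ...).
            is_hidden = category in ("Cc", "Cf") and ch != "\t"
            # Zs = space separators; anything other than a plain space is suspicious.
            is_odd_space = category == "Zs" and ch != " "
            # Curly quotes look like JSON quotes but are not valid string delimiters.
            is_curly_quote = ch in "\u201c\u201d\u2018\u2019"
            if is_hidden or is_odd_space or is_curly_quote:
                findings.append((line_no, col_no, f"U+{ord(ch):04X}"))
    return findings


def check_file(path):
    """Report suspicious characters in a file, then try to parse it as JSON."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    for line_no, col_no, code_point in find_suspicious_chars(text):
        print(f"line {line_no}, column {col_no}: {code_point}")
    try:
        json.loads(text)
        print("Parsed as valid JSON")
    except json.JSONDecodeError as error:
        print(f"Invalid JSON: {error}")
```

Calling `check_file("resolver.json")` from the directory that holds your resolver file would list any such characters with their positions before attempting to parse the file.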


#8

@ihor

Thanks.

Initially I thought the same, so I had already validated the JSON on
https://jsonformatter.curiousconcept.com/ and it seems to be fine.

Below is the resolver JSON I am using, which has been validated.

I have checked it a number of times, but do you see any other areas that we might have missed?