Enricher Converting integer to float


#1

I have a field in our unstruct_event, and I am using the JavaScript tracker to send "hitID": 99.

This is my Iglu definition:
"hitID": {
"description": "Identifies the unique Hit ID within the current session the User",
"type": "integer"
},

I am passing in the integer correctly, but after the stream-enrichment step the value gets transformed to "hitID": 99.0.

Can anyone help me understand what may be causing this?
Also, is there support for bigints etc. in Iglu schemas?


#2

@kaushikbhanu, JSON schema is used to validate data only, not to transform. What enrichment are you applying to that data (if any)?


#3

@ihor … I am not applying any transformation and I don't want any transformation.
I am applying the ua and ip_lookup enrichments.
But this is not an enrichment field; it is part of our custom unstruct_event. The collector receives this field as an integer, but once it passes through the ETL and schema validation at the enricher it gets transformed to a float (99 -> 99.0). I am not sure what's causing this.
In my schema this is what I have:

"hitID": {
"description": "Identifies the unique Hit ID within the current session the User",
"type": "integer"
},


#4

@kaushikbhanu, there are many different enrichments that could be applied to any field, even the field from your custom unstruct event. This is why I asked the question about enrichments you use.

I take it from your answer that no enrichment enabled on your pipeline could modify hitID. That is puzzling to me as well, since the enrichment process does not transform a field unless one of the enabled enrichments is set up to do so.

Do you see the float value in the enriched data or in the database (e.g. Redshift)? If it's Redshift, how is the property defined there? Does your JSONPaths file match the order of the columns in the table?

If all of the above looks good, which pipeline are you running (batch or real-time), and what version is it? Which version of the enrichment component are you using?


#5

@ihor
This is our setup:

tracker -> scala-collector -> stream-enrich (custom) -> kinesis-stream -> kinesis-analytics -> redshift
In stream-enrich we are not modifying anything, but we are dumping everything as JSON instead of TSV. I am surprised too at what could cause this.
Two questions come up:

  1. What is causing this?
  2. When the Iglu schema specifies it as an integer, how is it getting passed into the good stream?

#6

If it’s passing validation through the enrichment it’s an integer.

Given you are running a custom pipeline I would investigate what custom code you are running downstream of stream enrich to try and identify the problem.
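
One way to narrow this down, as a rough sketch: capture the raw JSON text before and after each custom step and compare how the number is literally written. The `NumberLiteralProbe` class and `rawNumberLiteral` helper below are made-up names for illustration, and the regex assumes the key appears unescaped in the payload (not inside an escaped JSON string).

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberLiteralProbe {
    // Hypothetical helper: returns the number literal exactly as it is
    // written after "key": in the raw text, without parsing it as a number.
    static Optional<String> rawNumberLiteral(String rawJson, String key) {
        Pattern p = Pattern.compile(
            "\"" + Pattern.quote(key) + "\"\\s*:\\s*(-?\\d+(?:\\.\\d+)?(?:[eE][+-]?\\d+)?)");
        Matcher m = p.matcher(rawJson);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Sample payloads standing in for what each stage emits.
        String beforeCustomStep = "{\"hitID\": 99}";
        String afterCustomStep = "{\"hitID\": 99.0}";
        System.out.println(rawNumberLiteral(beforeCustomStep, "hitID").orElse("missing"));
        System.out.println(rawNumberLiteral(afterCustomStep, "hitID").orElse("missing"));
    }
}
```

The first stage whose output shows `99.0` instead of `99` is the one rewriting the number.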


#7

@mike @ihor Thanks for your feedback. It turns out it was an issue with our custom pipeline logic. We are using Gson for JSON serialization/deserialization, and Gson defaults to doubles, which was causing the issue. I really appreciate you guys getting back to me and helping.
Here is some info on the Gson issue if anyone else runs into it.
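
For context: when Gson deserializes into an untyped target such as `Object` or `Map<String, Object>`, every JSON number comes back as a `Double`, so 99 is written out again as 99.0. Below is a minimal, library-free sketch of one possible workaround: walk the parsed structure and turn whole-valued doubles back into longs before re-serializing. The `WholeNumberNormalizer` class and `normalize` method are hypothetical names, not part of Gson or Snowplow. It assumes none of your fields are meant to stay floats when they happen to hold a whole value, since a genuine 99.0 would also become 99.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class WholeNumberNormalizer {
    // 2^53: up to this magnitude every integer is exactly representable as a double.
    private static final double MAX_EXACT = 9007199254740992.0;

    // Hypothetical helper: recursively replaces whole-valued Doubles with Longs.
    static Object normalize(Object value) {
        if (value instanceof Double) {
            double d = (Double) value;
            // NaN fails the equality check; infinities fail the range check.
            if (d == Math.rint(d) && Math.abs(d) <= MAX_EXACT) {
                return Long.valueOf((long) d);
            }
            return value;
        }
        if (value instanceof Map) {
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                out.put(String.valueOf(e.getKey()), normalize(e.getValue()));
            }
            return out;
        }
        if (value instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object item : (List<?>) value) {
                out.add(normalize(item));
            }
            return out;
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("hitID", 99.0); // what an untyped parse hands back for 99
        event.put("ratio", 0.5);  // genuine fraction, left untouched
        System.out.println(normalize(event)); // prints {hitID=99, ratio=0.5}
    }
}
```

This only repairs values that survived the trip through a double; integers larger than 2^53 have already lost precision by that point. If I recall correctly, Gson 2.8.9 and later also let you avoid the problem at parse time with `new GsonBuilder().setObjectToNumberStrategy(ToNumberPolicy.LONG_OR_DOUBLE).create()`, and deserializing into a typed class with a `long` field sidesteps it as well.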