Getting incorrect user_ipaddress in input data header of Scala Stream Collector

Hi Team,

In the Scala Stream Collector output headers we are getting a subnet IP (usually the Docker inet IP of the machine on which the collector is hosted). Because of this, our ip_enrichment is not working as expected, since we are not getting the real user_ipaddress.

Below is the collector config we are using:

collector {
  interface = "0.0.0.0"
  port = 8080

  p3p {
    policyRef = "/w3c/p3p.xml"
    CP = "NOI DSP COR NID PSA OUR IND COM NAV STA"
  }

  crossDomain {
    enabled = false
    domains = ["*"]
    secure = true
  }

  cookie {
    enabled = true
    expiration = "365 days"
    name = collectorCookieName
    domain = cookieDomain
  }

  cookieBounce {
    enabled = true
    name = n3pc
    fallbackNetworkUserId = "00000000-0000-4000-A000-000000000000"
    forwardedProtocolHeader = "X-Forwarded-Proto"
  }

  doNotTrackCookie {
    enabled = false
    name = dnt
    value = 1
  }

  rootResponse {
    enabled = false
    statusCode = 200
    body = ok
  }

  redirectMacro {
    enabled = false
    placeholder = "[TOKEN]"
  }

  cors {
    accessControlMaxAge = 10 seconds
  }

  prometheusMetrics.enabled = false

  streams {
    good = <good>
    bad = <bad>

    useIpAddressAsPartitionKey = false

    sink {
      enabled = kinesis
      port = 4150
      region = <region>

      threadPoolSize = 10

      aws {
            accessKey = <key>
            secretKey = <secret>
      }

      backoffPolicy {
            minBackoff = 1000
            maxBackoff = 600000
      }

    }

    buffer {
      byteLimit = 4500000
      recordLimit = 500
      timeLimit = 600000
    }
  }
}

paths {
  "/com.acme/track"    = "/com.snowplowanalytics.snowplow/tp2"
  "/com.acme/redirect" = "/r/tp2"
  "/com.acme/iglu"     = "/com.snowplowanalytics.iglu/v1"
}
akka {
  loglevel = ERROR
  loggers = ["akka.event.slf4j.Slf4jLogger"]

  http.server {
    remote-address-header = on
    raw-request-uri-header = on
    parsing {
      max-uri-length = 32768
      uri-parsing-mode = relaxed
    }
  }
}

Kindly help with this, as one of our major use cases depends on this functionality.

Let me know if any other information is required from my end.

Hi @BenB,

Hope you are doing well!

Can anyone from the team help here?

Regards
Karan

Can someone help here?

Hi @karan,

Are you saying that whenever a tracker sends events to the collector, user_ipaddress is always that of the machine where the collector runs, and never the address of the device sending the events?

Your configuration looks correct, so the problem might come from the networking where your collector runs. How is your collector launched? Do you have load balancers?

@BenB Thanks for the reply

Yes, we are getting the IP of the collector machine as user_ipaddress.

We are launching the collector using the official Docker images, and we have tested it both with and without a load balancer. In both cases we get an incorrect user_ipaddress.
With the load balancer we get the load balancer's subnet IPs; without it we get the collector machine's inet IP.

Also below is the event we are getting from the tracker. I cannot see an IP address in the tracker's request. Does it look correct to you?

(Connection,close)
(User-Agent,okhttp/3.14.7)
(Host,127.0.0.1:8080)
(Accept-Encoding,gzip)
(Content-Length,4866)
(Content-Type,application/json; charset=UTF-8)
{"schema":"iglu:com.snowplowanalytics.snowplow\/payload_data\/jsonschema\/1-0-4","data":[{"eid":"cc-a65a-0ddc7becdfb8","tv":"andr-1.3.0","e":"ue","tna":"<ourAppName>","tz":"Asia\/Kolkata","stm":"1606568002029","p":"mob","uid":"<userId>","cx":"<ourData>","ue_px":"<ourData>","dtm":"1606567999138","lang":"English","aid":"student"},{"eid":"35-b86a-028548179f","tv":"andr-1.3.0","e":"se","tna":"<>","tz":"Asia\/Kolkata","se_ca":"<ourData>","se_ac":"<ourData>","stm":"1606568002029","p":"mob","uid":"<ourData>","cx":"<ourData>","dtm":"1606567998687","lang":"English","aid":"student"},{"eid":"a0af-a1e9-d58a13b","tv":"andr-1.3.0","e":"se","tna":"<ourAppName>","tz":"Asia\/Kolkata","se_ca":"<ourData>","se_ac":"<ourData>","stm":"1606568002029","p":"mob","uid":"<ourData>","cx":"ourData","se_va":"1.0","dtm":"1606567998602","lang":"English","aid":"student"}]}

> Also below is the event we are getting from the tracker.

Where and how did you get this data?

I’m a bit surprised to see (Host,127.0.0.1:8080). Where is your tracker? Is it sending events from the same machine where the collector runs?
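
When reading dumps like the one above, it can help to label each address quickly. The following is only a minimal sketch, assuming the input is an IP literal rather than a hostname; `IpKind` and `classify` are hypothetical names used for illustration, not part of the collector.

```scala
import java.net.InetAddress

// Hypothetical helper for reading debug output; not part of the collector.
object IpKind {
  /** Classifies an IP literal as "loopback", "private" (RFC 1918), or "other". */
  def classify(ip: String): String = {
    // For an IP literal, getByName only parses the text (no DNS lookup).
    val addr = InetAddress.getByName(ip)
    if (addr.isLoopbackAddress) "loopback"
    else if (addr.isSiteLocalAddress) "private"
    else "other"
  }
}
```

For example, `IpKind.classify("127.0.0.1")` gives `"loopback"`, and a typical Docker bridge address such as `172.17.0.1` gives `"private"`. Seeing either of these where a device address is expected suggests the original client IP was lost somewhere between the tracker and the collector.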

@BenB,

To get this data, we developed a simple web service and intercepted the request sent by the tracker. Below are the attributes we are reading from the tracker request.

  post("/com.snowplowanalytics.snowplow/tp2"){
    //println("------> Cookies <----------")
    //request.getCookies.foreach(println)
    println("------> RemoteUser <----------")
    println(request.getRemoteUser)
    println("------> X-Forwarded-For <----------")
    println(request.getHeader("X-Forwarded-For"))
    println("------> All headers <----------")
    request.headers.foreach(println)
    println("------> body <----------")
    println(request.body)
    println("------> RemoteAddress <----------")
    println(request.getRemoteAddr)
    println("------> Proxy-Client-IP <----------")
    println(request.getHeader("Proxy-Client-IP"))
    println("------> WL-Proxy-Client-IP <----------")
    println(request.getHeader("WL-Proxy-Client-IP"))
    println("------> HTTP_CLIENT_IP <----------")
    println(request.getHeader("HTTP_CLIENT_IP"))
    println("------> HTTP_X_FORWARDED_FOR <----------")
    println(request.getHeader("HTTP_X_FORWARDED_FOR"))
  }

The tracker runs on the Android side and is hosted on an EC2 machine. Also, the tracker and the collector run on different machines.
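
The X-Forwarded-For and RemoteAddress values printed by the handler above can be combined into a single "best guess" client IP. This is only a sketch of the general idea, not the collector's actual logic: `ClientIp` and `resolve` are hypothetical names, and it assumes that any proxy or load balancer in front of the service sets X-Forwarded-For with the original client first.

```scala
// Hypothetical helper; not part of the collector or of the web framework.
object ClientIp {
  /** Returns the left-most non-empty X-Forwarded-For entry,
    * falling back to the socket's remote address when the header is absent or blank. */
  def resolve(xForwardedFor: Option[String], remoteAddr: String): String =
    xForwardedFor.toList
      .flatMap(_.split(",").toList) // header may hold "client, proxy1, proxy2"
      .map(_.trim)
      .find(_.nonEmpty)
      .getOrElse(remoteAddr)
}
```

In the handler it could be called as `ClientIp.resolve(Option(request.getHeader("X-Forwarded-For")), request.getRemoteAddr)`, since `Option(null)` is `None` when the header is missing. With no header, the result is whatever the socket reports, which behind Docker NAT or a load balancer is the bridge or balancer address rather than the device's. Note that the left-most entry is client-supplied and can be spoofed unless a trusted proxy overwrites it.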