Cannot get scala collector link redirection to work

I am trying to follow this blog post: https://snowplowanalytics.com/blog/2016/03/07/ad-impression-and-click-tracking-with-snowplow/ It is pretty old, but hopefully still relevant?

Version of the collector

snowplow_scala_stream_collector_google_pubsub
1.0.0
❯ curl -sv http://localhost:9001/r/tp2?u=http%3A%2F%2Fexample.com
*   Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 9001 (#0)
> GET /r/tp2?u=http%3A%2F%2Fexample.com HTTP/1.1
> Host: localhost:9001
> User-Agent: curl/7.64.1
> Accept: */*
>
< HTTP/1.1 400 Bad Request
< Set-Cookie: sp_dp=bc7d8167-164c-4e68-9304-0ea00f27705d; Expires=Fri, 30 Jul 2021 00:00:29 GMT; Path=/; HttpOnly; SameSite=Strict
< Cache-Control: no-cache, no-store, must-revalidate
< P3P: policyref="/w3c/p3p.xml", CP="NOI DSP COR NID PSA OUR IND COM NAV STA"
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Credentials: true
< Server: akka-http/10.1.10
< Date: Thu, 30 Jul 2020 00:00:29 GMT
< Content-Length: 0
<
* Connection #0 to host localhost left intact
* Closing connection 0

Config

collector {
  interface = "0.0.0.0"
  port = 9001
  port = ${?HTTP_PORT} # set http port, or fall back to 9001

  paths {}

  p3p {
    policyRef = "/w3c/p3p.xml"
    CP = "NOI DSP COR NID PSA OUR IND COM NAV STA"
  }

  doNotTrackCookie {
    enabled = true
    name = "DP_DO_NOT_TRACK"
    value = "TRUE"
  }

  cookie {
    enabled = true
    expiration = "365 days"
    name = "sp_dp"
    httpOnly = true
    sameSite = "Strict"

    secure = false
    secure = ${?COOKIE_SECURE} # set to true for secure cookies
  }

  # cookie bounce does not work well with python tracker
  cookieBounce {
    enabled = false
    name = "n3pc"
    fallbackNetworkUserId = "00000000-0000-4000-A000-000000000000"
  }

  redirectMacro {
    enabled = false
  }

  rootResponse {
    enabled = false
    statusCode = 200
  }

  enableDefaultRedirect = true

  crossDomain {
    enabled = false
    domains = [ "*" ]
    secure = true
  }

  cors {
    accessControlMaxAge = "7d"
  }

  prometheusMetrics {
    enabled = true
  }

  streams {
    good = "dp-<REDACT>"
    bad = "dp-<REDACT>"

    useIpAddressAsPartitionKey = true

    sink {
      enabled = "google-pub-sub"
      googleProjectId = ${GOOGLE_CLOUD_PROJECT}
      backoffPolicy {
        minBackoff = 100
        maxBackoff = 1000
        totalBackoff = 10000
        multiplier = 1.3
      }
    }

    buffer {
      byteLimit = 100000
      recordLimit = 1000
      timeLimit = 360
    }
  }
}

akka {
  loglevel = DEBUG
  loggers = ["akka.event.slf4j.Slf4jLogger"]
}

I am able to track events as long as I POST them to the collector, but I cannot get any of the GET endpoints to work. So I cannot get the redirect to work, and I cannot get the webhooks to work either.

Could you try changing this (redirectMacro.enabled) to true?

I’m not 100% sure but I think the 400 is only meant to be returned if the u parameter isn’t present, but you do have it present in your example.
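To rule out a malformed query string, here is a quick sketch that checks whether the `u` parameter from the curl request above is present and decodes cleanly. It only mimics the presence check described here and is not the collector's actual code; `redirect_target` is a made-up helper name.

```python
from urllib.parse import urlsplit, parse_qs


def redirect_target(request_uri):
    """Return the decoded `u` query parameter, or None if it is absent.

    Hypothetical helper: it illustrates the "400 if `u` is missing" rule,
    not how the collector itself parses the request.
    """
    query = urlsplit(request_uri).query
    values = parse_qs(query).get("u")
    return values[0] if values else None


# The request from the curl call above: `u` is present and percent-encoded.
print(redirect_target("/r/tp2?u=http%3A%2F%2Fexample.com"))

# The same path without `u`: this is the case where a 400 would be expected.
print(redirect_target("/r/tp2"))
```

If the first call returns the decoded URL, the request itself looks fine and the 400 is more likely coming from how the server reads the URI.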

That did not fix it; I still get the same result. I thought enableDefaultRedirect was the configuration param for that, and I do have it set to true.

The Dockerfile that I am using to run this looks like this:

FROM openjdk:8-jdk

ENV SNOWPLOW_SCALA_COLLECTOR_VERSION=1.0.0

RUN mkdir -p /opt/snowplow
RUN useradd -m -u 1000 sp\
 && chown sp:sp /opt/snowplow

USER sp
WORKDIR /opt/snowplow
RUN set -x\
 && wget https://dl.bintray.com/snowplow/snowplow-generic/snowplow_scala_stream_collector_google_pubsub_${SNOWPLOW_SCALA_COLLECTOR_VERSION}.zip\
 && unzip snowplow_scala_stream_collector_google_pubsub_${SNOWPLOW_SCALA_COLLECTOR_VERSION}.zip\
 && mv *.jar snowplow_scala_stream_collector_google_pubsub.jar\
 && rm *.zip

ADD config.hacon /opt/snowplow/config/

ENTRYPOINT [\
  "java","-jar","snowplow_scala_stream_collector_google_pubsub.jar"\
  ,"--config","/opt/snowplow/config/config.hacon"\
]

I thought that the macro was just for token replacement:

  # When enabled, the redirect url passed via the `u` query parameter is scanned for a placeholder
  # token. All instances of that token are replaced with the network ID. If the placeholder isn't
  # specified, the default value is `${SP_NUID}`.
  redirectMacro {
    enabled = false
    enabled = ${?COLLECTOR_REDIRECT_MACRO_ENABLED}
    # Optional custom placeholder token (defaults to the literal `${SP_NUID}`)
    placeholder = "[TOKEN]"
    placeholder = ${?COLLECTOR_REDIRECT_REDIRECT_MACRO_PLACEHOLDER}
  }
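As I read that comment, the macro amounts to a plain string substitution on the redirect URL. Here is a minimal sketch of that reading, assuming the URL has already been decoded; `apply_redirect_macro` and the example network ID are made up for illustration and are not taken from the collector.

```python
# Default token named in the config comment above.
DEFAULT_PLACEHOLDER = "${SP_NUID}"


def apply_redirect_macro(url, network_user_id, placeholder=DEFAULT_PLACEHOLDER):
    """Replace every occurrence of the placeholder token with the network ID.

    Hypothetical helper: a sketch of the behaviour the comment describes,
    not the collector's implementation.
    """
    return url.replace(placeholder, network_user_id)


# Illustrative network ID only.
nuid = "00000000-0000-4000-A000-000000000000"

# Custom placeholder, as in `placeholder = "[TOKEN]"`.
print(apply_redirect_macro("http://example.com/?uid=[TOKEN]", nuid, "[TOKEN]"))

# Default placeholder when none is configured.
print(apply_redirect_macro("http://example.com/?uid=${SP_NUID}", nuid))
```

A URL without the token passes through unchanged, which is why I would not expect this setting to affect whether the redirect itself works.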

@camerondavison, you are right. This feature is described here and is used for cookie sharing with third parties.

I’m not aware of any special feature to be turned on for GET to work.

So any idea what I can turn on to debug why I am getting a 400?

I was playing around, and it seems that if I set

akka {
  http.server {
    raw-request-uri-header = on
  }
}

then it stops giving the 400. Does that make sense?
