Redirect Location doesn't match &u={{uri}} argument

Hi there,

We’re facing some really weird behaviour.
It doesn’t occur all the time, but every once in a while the parameter &u= isn’t respected and the redirect goes to another URL.

I’ve checked the source code for the Clojure Collector. Is it possible that it has some sort of cache in there?

(defn- send-redirect
  "If our params map contains `u`, 302 redirect to that URI,
   else return a 400 for Bad Request (malformed syntax)"
  [cookies headers params]
  (let [{url "u"} params]
    (if (nil? url)
      {:status  400
       :headers headers
       :cookies cookies}
      {:status  302
       :headers (merge headers {"Location" url})
       :cookies cookies
       :body    ""})))

Please check two of those hits.
The identified U parameter and the Location don’t match!

The Location URL is valid and could have been used by another event that was active at that moment and sent to the same collector endpoint.
Also, the exact same U parameter points to two different redirect locations.
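
For anyone who wants to check their own hits for the same mismatch, here is a minimal sketch in Python. It assumes you can export each hit as a pair of request URL and returned Location header; the host, the `/r/tp2` path and the target URLs below are made-up placeholders, and the function names are only illustrative.

```python
from urllib.parse import urlsplit, parse_qs


def expected_location(request_url):
    """Return the decoded value of the `u` query parameter, or None if absent."""
    query = urlsplit(request_url).query
    values = parse_qs(query).get("u")
    return values[0] if values else None


def find_mismatches(hits):
    """Return the (request_url, location) pairs where Location differs from `u`."""
    return [(req, loc) for req, loc in hits if expected_location(req) != loc]


# Hypothetical hits: same `u` value, two different redirect locations.
hits = [
    ("https://collector.example.com/r/tp2?u=https%3A%2F%2Fshop.example.com%2Fau",
     "https://shop.example.com/au"),
    ("https://collector.example.com/r/tp2?u=https%3A%2F%2Fshop.example.com%2Fau",
     "https://shop.example.com/dk"),
]

mismatches = find_mismatches(hits)
print(len(mismatches), "of", len(hits), "hits mismatched")
```

With the two placeholder hits above, the second one is reported as a mismatch, which is the pattern described in this post.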

Can anyone help?

Do you think that the ELB might have something to do with the weird redirect? Or could the uri_redirect schema contain any cache?

In this particular case, an AU event got redirected to a DK campaign. It doesn’t make sense.

We used to have only 1 EC2 (m4.large) instance running, but some spikes in CPU and latency a few days ago triggered additional instances, and now we have 3 instances running until things stabilize.

I cloned the Beanstalk environment with the latest platform (so a new ELB, a new instance with a fresh Clojure Collector install, a new volume) and swapped URLs, but the exact same problem persists.
And I think it’s getting worse: we used to get 1 wrong redirect in about 20 requests, and now 1 or 2 in every 10 requests point to the wrong URL.

Any clue, guys? @josh, @alex

Hey @T_P - sorry to hear about this!

It’s not something we’ve heard about from any of our users of this functionality.

I agree with your hypothesis that something strange is happening with the request routing. Has anybody else seen this?

Weird - any chance you can put together a list of steps to recreate this, @T_P?

Hi @alex and @mike,

First of all thank you for your time.

Mike, even if I wanted to reproduce this condition, I could not recreate it on demand. It’s odd behaviour.

We have around 850k daily events being tracked. Not that big of a deal, I think.

The thing now is, we’ve got two Beanstalk environments running, the old env and the new env (each a single m4.large Clojure Collector sitting behind an ELB), and I’ve swapped the old Beanstalk URL over to the new env. Yesterday the problem was happening in the new env but could not be seen in the old one.

Because of complaints, we had to turn off tracking for almost all future events, so the traffic volume in the collector was reduced by a lot. I’ve tested today and can’t reproduce the behaviour in either of the envs, which leads me to believe that it may be related to the volume of requests and the handling of some possible cache…

If you want, I can PM you both Beanstalk URLs so that you can see it for yourself.

So, we’ve set up a second collector and tried to split the traffic between them. Both collectors are m4.large instances.
I think a pattern can be found. Right now we’re seeing around 10 reqs/sec on one of the collectors and the redirects are getting scrambled.

Take a look. In 39 requests only 28 went to the correct location.

As I said, we are trying to balance the number of requests between both collectors, but this is unsustainable.

Try this:

  1. Delete the second instance
  2. Make a second instance again
  3. Repeat the test (should work correctly)

A fresh instance was probably the first thing tried, but it did not improve anything.

It looks like it must have something to do with concurrent requests on the collector.

You need to try with only 1 instance: delete the second one and run the test. The error was gone 100%.

@emm

Current scenario is:

A: Beanstalk (clojure collector) > ELB > 1 instance running (150K requests last 24 hours)
B: Beanstalk (clojure collector) > ELB > 2 instances running (385K requests last 24 hours)

The test posted above shows a situation that happens in both cases when the number of requests increases. With the current volume of requests things work without problems, but we had to reduce the number of tracked events by around 50%.

There’s a lot of magic involved in the Clojure Collector (Elastic Beanstalk, ELB, Tomcat etc).

If this is critical to the operation of your business (rather than just annoying), then I would move to the Scala Stream Collector.

This does not depend on the number of requests. The error appears on some instances. You need to re-create the affected instance, or all of them. Run the tests and, if everything is fine, leave it; if not, recreate the instance one more time.

Hello

Has anyone seen this error in the last month?

On my old instance, where I detected this, it no longer shows up. Requests are over 2M daily.

Have any changes been made to the software/hardware? Thanks

Hi @emm,

Our workaround was to significantly reduce the number of tracked events. We are currently at around 500k daily and have not noticed the issue.
We would love to get back to 1M daily but can’t take any more risks with messing up user navigation.

Hi @T_P

We were very uncomfortable with this error; switching to other systems seemed too complicated.

Three weeks ago we found a Clojure programmer and switched traffic to the old instance, but we could not reproduce the error. It no longer occurs; everything works well, as it did until February.

Now we are testing every day and there are no errors.

Try to reproduce the situation that you had, so we can understand whether the problem still exists today.

Perhaps you can do this at a later time, or on weekends, or using tools to simulate the load on the server.
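
As one possible way to simulate the load, here is a rough Python sketch that fires concurrent redirect requests, each with a unique `u` value, and reports every response whose Location does not match. The host name, the `/r/tp2` path, the HTTPS scheme and the landing URL are assumptions for illustration, not values taken from this thread; adjust them to your own collector.

```python
import http.client
import uuid
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode

COLLECTOR_HOST = "collector.example.com"  # hypothetical; use your collector host
REDIRECT_PATH = "/r/tp2"                  # assumed redirect endpoint path


def make_target():
    """Build a unique target URL so each request has a distinct expected Location."""
    return "https://example.com/landing?probe=" + uuid.uuid4().hex


def fetch_location(target, host=COLLECTOR_HOST, path=REDIRECT_PATH):
    """Send one GET without following the redirect; return (status, Location)."""
    conn = http.client.HTTPSConnection(host, timeout=10)
    try:
        conn.request("GET", path + "?" + urlencode({"u": target}))
        resp = conn.getresponse()
        resp.read()  # drain the (empty) body before closing
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def run_probe(total=200, workers=20, fetch=fetch_location):
    """Fire `total` requests from `workers` threads; return the bad responses."""
    targets = [make_target() for _ in range(total)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fetch, targets))  # results keep the input order
    return [(target, status, location)
            for target, (status, location) in zip(targets, results)
            if status != 302 or location != target]


# Example call (performs real network requests against COLLECTOR_HOST):
# bad = run_probe(total=200, workers=20)
# print(len(bad), "mismatched or non-302 responses")
```

Because every request carries its own random `probe` value, any Location that belongs to a different request shows up directly in the returned list. Raising `workers` is a simple way to see whether the mismatch rate depends on concurrency.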

Thank you.