JS Tracker not working with Scala stream collector

Hi,

I have set up the Scala Stream Collector, and it is working fine. I checked it with the command
$ curl http://localhost:8080/health
and it returns status OK.

My JS tracker snippet looks like:

<script type="text/javascript">
;(function(p,l,o,w,i,n,g){if(!p[i]){p.GlobalSnowplowNamespace=p.GlobalSnowplowNamespace||[];
p.GlobalSnowplowNamespace.push(i);p[i]=function(){(p[i].q=p[i].q||[]).push(arguments)
};p[i].q=p[i].q||[];n=l.createElement(o);g=l.getElementsByTagName(o)[0];n.async=1;
n.src=w;g.parentNode.insertBefore(n,g)}}(window,document,"script","//d1fc8wv8zag5ca.cloudfront.net/2.6.2/sp.js","snowplow"));

window.snowplow('newTracker', 'stream-collector', 'XXXX.compute.amazonaws.com', { // Initialise a tracker
  appId: 'CDP_TEA',
  cookieDomain: null,
  cookieName: 'sp'
});
</script>

When I check the browser console, I can see the call going to this collector, but it fails.

To rule out a firewall problem, I have updated the EC2 instance to accept all traffic, but it is still not working.

Could you please help me work out what I'm doing wrong?

Hi @PuneetBabbar,

Could you ensure the port you are sending your request to is the same one you configured your collector with?

Since your health check works on port 8080, that must be the port set in the collector configuration file. But it looks like you are using the default port 80 in your tracker.

You could either change the port in the configuration file to 80 or add the relevant port to the collector endpoint in your tracker initialisation.
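
As a rough sketch of the second option, assuming the collector really is listening on 8080 (as the health check suggests), the endpoint passed to 'newTracker' can carry the port after the host. Note that collectorEndpoint below is a hypothetical helper written for illustration, not part of the tracker API, and the host is the placeholder from the snippet above.

```javascript
// Hypothetical helper (not part of the Snowplow tracker API): builds the
// endpoint string passed as the third argument of 'newTracker'.
function collectorEndpoint(host, port) {
  // Assumes the page is served over plain HTTP, where 80 is the default
  // port and can be left off.
  return (port === undefined || port === 80) ? host : host + ':' + port;
}

// Placeholder host from the original snippet; 8080 is assumed from the health check.
var endpoint = collectorEndpoint('XXXX.compute.amazonaws.com', 8080);

// In the page, the tracker would then be initialised with:
// window.snowplow('newTracker', 'stream-collector', endpoint, { appId: 'CDP_TEA' });
```

If the collector is later moved to port 80, the same call with port 80 simply yields the bare host name.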

Regards,
Ihor

Hi @ihor,

Thanks for your help.

Sorry for the confusion. I was actually checking the health endpoint on the default port with
"curl http://XXXXXX.compute.amazonaws.com/health", which returns OK.

I have already configured the port in the config file as 80. Below is the relevant section of the file:

# The collector runs as a web service specified on the following
# interface and port.
interface = "0.0.0.0"
port = 80

# Production mode disables additional services helpful for configuring and
# initializing the collector, such as a path '/dump' to view all
# records stored in the current stream.
production = false

Also, inbound and outbound traffic is currently open on all ports while I test this, but the GET call from the browser to the collector still fails with an ERR_PROXY_CONNECTION_FAILED error.

Could this be a firewall issue? If so, how can I validate it?

Thanks
Puneet

It worked with the public IP but not with the public DNS name of the EC2 instance. I'm not sure why that was a problem, but the issue is resolved.

Thanks

Hey @PuneetBabbar - just to note that we strongly recommend putting your Scala Stream Collectors behind an Elastic Load Balancer in an Auto Scaling Group. This is a much more robust approach than exposing a single collector to the world.

Great suggestion, thanks @alex

What was the solution to your problem? Did you use PublicIP:port as the endpoint, and did that work?
So basically:
window.snowplow("newTracker", "scalaCollectore", "12.345.67.89:8080", {
  appId: "anyID",
  cookieDomain: "aDomain"
});

See the response from @alex above - you should be putting your Scala Stream collectors behind a load balancer (which will easily allow you to proxy traffic from port 80 to 8080 if required).

Thanks Mike. I have put my collector behind a load balancer, but I am still confused about what to put in for the collector endpoint in my JavaScript tracker initialisation. Would it be the public DNS name of the load balancer? Apologies if these questions are too naive, I am new to AWS and Snowplow :)
window.sp("newTracker", "ssc", "what goes here", {
  appId: "snowplowPOC",
  cookieDomain: "cookiedomain.com"
});

Thanks for all your help.

The public DNS name can be anything you’d like it to be. Typically people create a subdomain of the main site (e.g. if my site was domain.com I might use collector.domain.com) and then point the DNS entry for this subdomain at the AWS Elastic Load Balancer.

I’d then specify collector.domain.com as my Snowplow collector endpoint.
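
To make that concrete, here is a minimal sketch of the tracker call with that endpoint. It assumes collector.domain.com already has a DNS record pointing at the load balancer. Note that initTracker is a hypothetical wrapper used only for illustration, and the namespace, appId and cookieDomain values are copied from the question above.

```javascript
// Assumed endpoint: the example subdomain, with a DNS record (e.g. a CNAME)
// pointing at the Elastic Load Balancer.
var COLLECTOR_ENDPOINT = 'collector.domain.com';

// Hypothetical wrapper: takes the tracker function so it can be called with
// window.sp (or window.snowplow) once the loader snippet has run.
function initTracker(sp) {
  sp('newTracker', 'ssc', COLLECTOR_ENDPOINT, {
    appId: 'snowplowPOC',
    cookieDomain: 'cookiedomain.com'
  });
}

// In the page: initTracker(window.sp);
```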

Thanks Mike.
So as an example, my page is at snowplowpoc-env.wvq2iagxas.us-east-2.elasticbeanstalk.com/
and my ELB DNS name is sp-poc-collector-lb-93272599.us-east-2.elb.amazonaws.com
When I put both the collector instance and my Elastic Beanstalk instance behind the ELB, the image request to the collector seems to go through. I don't think the response is right, though: pasting the request URL into the browser shows the landing page of my website, which is served through Elastic Beanstalk, whereas ideally it should just be a 1x1 pixel, right?
But when I remove my Elastic Beanstalk instance from the ELB, the image request times out.
I have updated the security groups for my EC2 instance to allow inbound traffic from the ELB, but I am not sure what I am doing wrong here.
Thanks again in advance for your help.