Problem with load testing using avalanche

Hi,

We decided to run a load-testing exercise against our collector using Avalanche, and used the following configuration for the simulation:

export SP_SIM_TIME=60
export SP_BASELINE_USERS=3500
export SP_PEAK_USERS=0

We ran the LinearPeak simulation, but as soon as the simulation started, requests began to drop. We got the following error:

REQUEST Baseline 682 PageView Event 1479297981539 1479297981543 KO status.find.in(200,304,201,202,203,204,205,206,207,208,209), but actually found 503

After the simulation ended, we saw that the total number of requests was around 9 million, and around 5,000 of them failed with the error above.

However, after running EMR and loading the data into Redshift, we saw that only 0.13 million events made it into the table.

Our environment had autoscaling enabled, and we were using m4.large EC2 instances.

What could be causing this problem?

Hi @rahul,

First off, the idea behind the baseline and peak users is to provide a ramping mechanism so that your collector does not go from nothing to 3500 events per second - this instantaneous loading is not a real-world way of seeing how your autoscaling group and instances ramp up to deal with the traffic.

In the future I would recommend something more along the lines of:

export SP_SIM_TIME=60
export SP_BASELINE_USERS=1000
export SP_PEAK_USERS=2500

This will then provide a smooth ramp up and down to simulate your peak activity - which should be what triggers your autoscaling.
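
For example, the full set of environment variables for the run could look something like the below. This is just a sketch: the collector endpoint variable name shown here is illustrative and how you launch the simulation depends on your Avalanche version, so check the Avalanche README for the exact details.

export SP_SIM_TIME=60          # total simulation time, as in your original run
export SP_BASELINE_USERS=1000  # constant baseline load
export SP_PEAK_USERS=2500      # extra users ramped up and back down over the run
export SP_COLLECTOR_URL=http://your-collector.example.com  # illustrative variable name - confirm against the README

# Then launch the LinearPeak simulation the same way you did for the first run.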


On to your actual question! A 503 response code indicates that the server is overloaded and unable to accept any more requests.
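
If your collector sits behind a load balancer (the usual setup with an autoscaling group), it is also worth checking whether the 503s are coming from the collector instances themselves or from the load balancer spilling over because no healthy instance could take the request. A rough way to check with the AWS CLI is below - the load balancer name and time window are placeholders, and this assumes a Classic ELB:

# Placeholders: substitute your own ELB name and the window of your load test
aws cloudwatch get-metric-statistics \
  --namespace AWS/ELB \
  --metric-name SpilloverCount \
  --dimensions Name=LoadBalancerName,Value=my-collector-elb \
  --start-time 2016-11-16T10:00:00Z --end-time 2016-11-16T11:00:00Z \
  --period 300 --statistics Sum

A non-zero SpilloverCount (or a consistently high SurgeQueueLength) points at the instances not keeping up with the incoming requests.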

  • What was the state of the cluster during the load test?
    • Can you provide any metrics around CPU usage, network usage, etc.? (One way to pull these is sketched below.)
  • What metrics are you currently using to scale up?
  • What is the size of the backing EBS volume of the instance?
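
For the CPU and network numbers, something along these lines should work - the instance ID and time window are placeholders, and you would repeat the call with NetworkIn / NetworkOut for the network side:

# Placeholders: use your actual instance IDs and the window of your load test
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --start-time 2016-11-16T10:00:00Z --end-time 2016-11-16T11:00:00Z \
  --period 300 --statistics Average Maximum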

Once we have a bit more context we should hopefully be able to figure out what exactly is going wrong!

Cheers,
Josh