Description of noisy neighbor problem #6 lacks some depth. AWS noisy neighbors p...

miles932 · on Nov 5, 2013

"You are guaranteed to avoid noisy neighbors CPU problem by using AWS dedicated instances" This is incorrect. You simply ensure that the only ec2 instances that can be a noisy neighbor are other instances of yours. Since most users investing in dedicated instances tend to put their instances to work, this sometimes has the net effect of reducing performance.

AYBABTME · on Nov 6, 2013

There was recently [a paper][1] published in ACM Transactions on Computer Systems where the argument was that if your instance is scheduled out by the hypervisor during TCP traffic, the latency in ACKing packets would reduce the network throughput of instances, because of the slow start mechanism.

Given that distributed network rely heavily on network latency and capacity, I found it very interesting to see the effect of busy CPU propagate to slower network IO.

[1]: http://friends.cs.purdue.edu/pubs/SC10.pdf

JulianWasTaken · on Nov 5, 2013

Why does this happen even when the instance isn't using all of its CPU?

vacri · on Nov 5, 2013

If you're sharing a CPU with a neighbour and you're using 3% and the neighbour is at 100%, the hypervisor needs to somehow squash that 103% into the 100% it has available - which means you lose a little bit of the scheduled time slots that otherwise would have been yours. If your neighbour drops down to 50%, there's now enough CPU time to handle both of you, so there's no 'steal'.

ideaoverload · on Nov 5, 2013

>Why does this happen even when the instance isn't using all of its CPU?

Probably hypervisor does not assign CPU every time it is requested but it still manages to assign as much as needed because in the end there is some idle time left.

JulianWasTaken · on Nov 6, 2013

We've seen for example, if we have 5 instances running the same app, one instance potentially uses 20% more CPU with tons of CPU steal time, but none are using 100%, the "normal" ones use ~40% and the one with tons of steal time spikes between 60% - 70%.

But rebuilding the instance brings them all in line.