Making Ansible Network Security 2-3x Faster

At Flatiron Health we use Ansible to configure AWS security groups (see blog). Over time I noticed more and more timeouts while asserting that the network security state was what we expected it to be. Digging into the code, I found this confusing block:

[Screenshot of the code block, with the line that timed out highlighted]

The timeout happened on the highlighted line. Why was checking a group fetching all EC2 instances? The result isn’t even used unless the target description doesn’t match the existing description. At the very least, the instances could be fetched lazily, only when that mismatch occurs.

Digging deeper, the error condition on L326 seems intended to leave room for updating the description when the group is not in use. Presumably that update would be done by deleting the security group and recreating it, since security group descriptions are immutable. This update never happens in the module, so clearly this is a relic. (Side note: the public-ssh group assumption on L322 is another funny relic.)

My recent PR/commit against this code cleaned this up a fair bit: it is now simply an error if the target description does not equal the existing description, with no check for whether any EC2 instances are using the existing group.
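To make the shape of the change concrete, here is a minimal sketch of the before and after logic as I have described it. This is not the module's actual code: the function names, the exception class, the dictionary layout, and the list_instances callable are all hypothetical stand-ins for illustration.

```python
class GroupDescriptionMismatch(Exception):
    """Hypothetical error: requested description differs from the existing one."""


def check_group_eager(group, target_description, list_instances):
    """Old shape: fetch every instance up front, even though it is rarely used."""
    instances = list_instances()  # expensive call, made on every run
    if group["description"] != target_description:
        # The instance list only matters on this (rare) mismatch path.
        in_use = any(group["id"] in inst["group_ids"] for inst in instances)
        if in_use:
            raise GroupDescriptionMismatch(group["id"])
    return group


def check_group_simple(group, target_description):
    """New shape: a mismatch is always an error, so no instance lookup is needed."""
    if group["description"] != target_description:
        raise GroupDescriptionMismatch(group["id"])
    return group
```

In the common case where the descriptions match, the second version never touches the instance list at all, which is where the time savings would come from.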

Impact

How expensive is getting all EC2 instances? Well, it depends on how large your AWS account is. For us the response was in the ballpark of 1MB (tested via aws ec2 describe-instances).

Before

# ansible==2.1.3.0
time ansible-playbook sg-update.yml --check
...
real 8m23.103s
user 2m3.358s
sys 0m46.586s

After (running Ansible at commit of change)

time ansible-playbook sg-update.yml --check
...
real 3m5.069s
user 1m0.551s
sys 0m38.873s

Going from 503 seconds to 185 seconds is an appreciable speedup (2.7x faster). This speedup should apply any time the security group is already present, whether in --check mode or not.
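For anyone who wants to check the arithmetic, here is a small sketch that converts the two "real" values reported by time into seconds and computes the ratio. The helper name parse_real_time is made up for this example, and it only handles the minutes-and-seconds format shown above.

```python
import re


def parse_real_time(text):
    """Convert a `time` value such as '8m23.103s' into seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", text)
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


before = parse_real_time("8m23.103s")  # wall-clock time on ansible==2.1.3.0
after = parse_real_time("3m5.069s")    # wall-clock time at the commit of the change
speedup = before / after

print(f"{before:.0f}s -> {after:.0f}s, {speedup:.1f}x faster")
# prints: 503s -> 185s, 2.7x faster
```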

I’m looking forward to the next release of Ansible, when we can realize these savings. (I’m not sure whether this will be in 2.3, which was just cut, or whether we’ll have to wait for 2.4.)

Thanks to the reviewers/maintainers of Ansible for the review and getting this merged.