All IPsec traffic being decrypted is processed on a single CPU, despite having multiple IPsec tunnels (SAs). How can I get the load shared across multiple CPUs?
I'm running strongSwan IPsec on Ubuntu 22.04 on an AWS c5n.2xlarge instance with aesni_intel loaded, with 8 SAs. The instance acts as an IP router: it forwards all traffic, either encapsulating or decapsulating it. (IPsec is configured in "tunnel" mode, with NAT traversal using UDP port 4500.)
Using top -1 I see that for traffic being encrypted, CPU load is distributed evenly across the 8 CPUs, regardless of the number of ip xfrm state entries (child SAs) or how ECMP is configured. This is good. (My understanding is that the aesni_intel kernel module provides this behavior.)
However, for traffic being decrypted, all traffic is assigned to the same CPU, in all ECMP configurations.
The ECMP configurations I've tried:
- /proc/sys/net/ipv4/fib_multipath_hash_policy=0 (L3 ECMP)
- /proc/sys/net/ipv4/fib_multipath_hash_policy=1 (L4 ECMP)
- /proc/sys/net/ipv4/fib_multipath_hash_policy=4 (Custom ECMP), /proc/sys/net/ipv4/fib_multipath_hash_fields=255 (all fields)
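To make the custom mask easier to read, here is a minimal sketch that decodes a fib_multipath_hash_fields value into field names. The bit assignments are as I read them in the kernel's ip-sysctl documentation, and I'm assuming they match the kernel on this instance; the names HASH_FIELDS and decode_hash_fields are my own.

```python
# Bit assignments for fib_multipath_hash_fields, as I read them in the
# kernel's ip-sysctl documentation (assumption: unchanged on my kernel).
HASH_FIELDS = {
    0x0001: "source IP",
    0x0002: "destination IP",
    0x0004: "IP protocol",
    0x0008: "flow label",
    0x0010: "source port",
    0x0020: "destination port",
    0x0040: "inner source IP",
    0x0080: "inner destination IP",
    0x0100: "inner IP protocol",
    0x0200: "inner flow label",
    0x0400: "inner source port",
    0x0800: "inner destination port",
}


def decode_hash_fields(mask):
    """Return the names of the fields enabled in the given bitmask."""
    return [name for bit, name in HASH_FIELDS.items() if mask & bit]


# Decode the value I tested with.
print(decode_hash_fields(255))
```

If those bit assignments are right, 255 selects the first eight fields in the table, none of which is the ESP SPI.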
I will be using other clouds and instance types, but this is what I tested, and hopefully what I learn can be applied to other cases.
My understanding is that normally, decryption for a given SA happens on a single CPU, but the SAs should be distributed across the CPUs. This isn't happening.
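One way to cross-check what top shows is to compare the per-CPU NET_RX softirq counters before and after a decrypt-only test run. This is a minimal sketch, assuming the usual /proc/softirqs layout (a header row of CPU names followed by one row per softirq type) and assuming decapsulation is done in the receive softirq of the CPU that took the packet; the function names are my own.

```python
def parse_net_rx(text):
    """Return {cpu_name: count} for the NET_RX row of /proc/softirqs-style text."""
    lines = text.strip().splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        if name.strip() == "NET_RX":
            counts = [int(value) for value in rest.split()]
            return dict(zip(cpus, counts))
    raise ValueError("no NET_RX row found")


def net_rx_delta(before, after):
    """Per-CPU increase in NET_RX softirqs between two parsed snapshots."""
    return {cpu: after[cpu] - before[cpu] for cpu in after}


def read_softirqs(path="/proc/softirqs"):
    """Read the raw counter table from the kernel."""
    with open(path) as f:
        return f.read()
```

Taking parse_net_rx(read_softirqs()) before and after the test and passing both snapshots to net_rx_delta gives the per-CPU increase. If decryption really is pinned to one CPU, nearly all of the increase should land on that CPU.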
What can I do to get the decryption load shared across multiple CPUs?
Example ip xfrm state entries:
[root@33a14363e632 /]# ip xfrm state
src 1.1.0.10 dst 1.2.0.10
    proto esp spi 0xcc688295 reqid 1 mode tunnel
    replay-window 0 flag af-unspec
    aead rfc4106(gcm(aes)) 0x78b083dcfd8bad769a5f63fb3e565a88b7431453 128
    encap type espinudp sport 4500 dport 4500 addr 0.0.0.0
    anti-replay context: seq 0x0, oseq 0x3a, bitmap 0x00000000
    if_id 0x1
src 1.2.0.10 dst 1.1.0.10
    proto esp spi 0xcb32714b reqid 1 mode tunnel
    replay-window 32 flag af-unspec
    aead rfc4106(gcm(aes)) 0x12061cd6fe8a7102c03990ddc8c8f0d5314785a1 128
    encap type espinudp sport 4500 dport 4500 addr 0.0.0.0
    anti-replay context: seq 0x3c, oseq 0x0, bitmap 0xffffffff
    if_id 0x1
The IP addresses are the same for all SAs, and because of NAT traversal so are the outer UDP ports (4500). My guess is that the problem lies in how the NIC queues packets: I think it picks the receive queue from an L3 or L4 hash of the outer headers, irrespective of the kernel ECMP configuration, and ignores the ESP SPI.
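To illustrate that guess, here is a toy model of receive-queue selection. It is only a sketch: zlib.crc32 stands in for whatever hash the NIC really uses, the queue count of 8 is an assumption, and all SPIs except the first are made up for the example.

```python
import zlib

NUM_QUEUES = 8  # assumption: one receive queue per vCPU


def pick_queue(fields, num_queues=NUM_QUEUES):
    """Toy stand-in for the NIC receive hash: hash the given header fields
    and map the result onto a queue index."""
    key = "|".join(str(field) for field in fields).encode()
    return zlib.crc32(key) % num_queues


# Outer headers of inbound ESP-in-UDP packets: identical for every SA.
outer = ("1.2.0.10", "1.1.0.10", 4500, 4500)

# Hypothetical SPIs for 8 inbound SAs; only the first comes from my output.
spis = [0xCB32714B + i for i in range(8)]

# Hashing only the outer addresses and ports: every SA gets the same queue.
queues_without_spi = {pick_queue(outer) for _ in spis}

# Hashing the SPI as well: the SAs can spread over several queues.
queues_with_spi = {pick_queue(outer + (spi,)) for spi in spis}

print(len(queues_without_spi), len(queues_with_spi))
```

In this model the only thing that distinguishes the SAs is the SPI, so a hash that does not cover it cannot separate them, no matter how many SAs exist.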