
We have two VPN servers at the same hosting provider. The servers are virtual and run different Linux distros. The VPN clients establish HTTPS connections to the same Amazon EC2 server. The TCP packets from EC2 always have the "Don't fragment" (DF) flag set.

Although the MTU of both the physical and the tun interfaces on both VPN servers is 1500, the servers usually receive larger packets from EC2. I'm not sure how that's possible, but maybe it has something to do with Virtio.

Anyway, when the TCP traffic is forwarded to the tun interfaces the servers behave differently:

  • On "server 1" the large packets are dropped as expected and the ICMP "Fragmentation needed" is sent back to EC2.
  • On "server 2" the TCP traffic is re-segmented. This is not IP fragmentation; it looks like a completely new TCP stream, as if an app with two sockets were running on the VPN server. The DF flag is retained.

So I assume there's some sysctl setting which enables this behavior on "server 2". Am I right? Where is this setting?

wireshark screenshot

I configured the forwarding on server 2 purely with firewall-cmd. Here's the firewall config:

external (active)
  target: default
  icmp-block-inversion: no
  interfaces: eth0
  sources: 10.8.0.0/24 10.8.1.0/24
  services: dhcpv6-client http https irc ircs openvpn smtp ssh
  ports: 1398/tcp 1194/tcp 1401/tcp 1402/tcp 65213/tcp 500/udp 501/udp
  protocols:
  forward: yes
  masquerade: yes
  forward-ports:
        port=1500:proto=tcp:toport=1500:toaddr=10.8.1.32
        port=1501:proto=tcp:toport=1501:toaddr=10.8.1.32
  source-ports:
  icmp-blocks:
  rich rules:
        rule family="ipv4" source address="10.8.1.0/24" port port="3128" protocol="tcp" accept

ethtool output

localhost:~ # ethtool -k eth0
Features for eth0:
rx-checksumming: on [fixed]
tx-checksumming: on
        tx-checksum-ipv4: off [fixed]
        tx-checksum-ip-generic: on
        tx-checksum-ipv6: off [fixed]
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: on
        tx-tcp-mangleid-segmentation: off
        tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: on [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-tunnel-remcsum-segmentation: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
tx-gso-list: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: on
tls-hw-record: off [fixed]
rx-gro-list: off
macsec-hw-offload: off [fixed]
rx-udp-gro-forwarding: off
hsr-tag-ins-offload: off [fixed]
hsr-tag-rm-offload: off [fixed]
hsr-fwd-offload: off [fixed]
hsr-dup-offload: off [fixed]
localhost:~ #
basin

1 Answer


I'm pretty certain this is related to TCP segmentation offload (TSO). It allows the kernel to put packets much larger than the MTU onto the ring buffers of the Ethernet driver; the driver then gets the device to write out the IP headers and split the packets into MTU-sized segments before they leave the device.

Hence, what actually leaves the device honours the MTU, but what the host hands to the device won't make sense in a packet sniffer: as you mentioned, the packet sizes won't appear to honour the MTU you expected to see.

If you capture these packets on the receiver, the sizes should all fit within the MTU, and the DF flag should be set.

You can turn off TSO with ethtool. Generally it's not a good idea to do this, but there are times when turning it off is better: I've had problems in the past with TLS connections not properly calculating hashes and being rejected on the receiver's side because of it.
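
As a rough sketch of how to check and toggle this, assuming the interface is eth0 as in the question: the helper below reads `ethtool -k` output on stdin and prints whether a given feature is on or off. The name `offload_state` is made up for illustration and is not part of ethtool. The `ethtool -K` lines are left commented out because they need root and change the live interface.

```shell
# Print the state ("on" or "off") of one feature from `ethtool -k` output
# read on stdin. offload_state is a hypothetical helper, not an ethtool command.
offload_state() {
    awk -v feature="$1" -F': ' '
        # Sub-features are indented in the output, so strip leading whitespace.
        { name = $1; sub(/^[ \t]+/, "", name) }
        # Print only the first word of the value, dropping a trailing "[fixed]".
        name == feature { split($2, parts, " "); print parts[1]; exit }
    '
}

# Inspect the current setting:
#   ethtool -k eth0 | offload_state tcp-segmentation-offload
#
# Turn TSO off for a test, then back on (needs root, lost on reboot):
#   ethtool -K eth0 tso off
#   ethtool -K eth0 tso on
```

Features marked [fixed] in the listing cannot be changed this way; in the output above, tcp-segmentation-offload is not marked fixed, so it should be possible to toggle it on that interface.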

Matthew Ife