Full disclosure: I tried asking this question elsewhere and it was closed as off topic, but I am hoping it will be on topic here.
I have gone really deep on this issue and still have nothing to show for it, so I am hoping there are some k8s/docker/networking gurus out there who will be interested in digging deep and helping me get to the bottom of this.
Setup
- Host is Ubuntu 22.04.5 LTS
- This host is a single node K8s cluster, and also a docker host.
- For K8s, it is RKE2 v1.29.2+rke2r1. Cilium handles the networking; there is no kube-proxy in play (a quick way to verify this is sketched after this list).
- For Docker, 24.0.7 (both server and client)
- Kasm (the thing creating the docker containers in this case) 1.16.0
- Host IP (and what the K8s node is attached to) is 192.168.H.H
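For what it's worth, the kube-proxy-free claim can be sanity-checked like this (the `kube-system` namespace and daemonset name are RKE2 defaults; treat this as a sketch):

```
# confirm Cilium is running in kube-proxy replacement mode
kubectl -n kube-system exec ds/cilium -- cilium-dbg status | grep KubeProxyReplacement
```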
Problem: Docker containers (kasm workspaces) running in a bridge network cannot access NodePort services on the host; to be specific, let's say 192.168.H.H:32000.
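A minimal reproduction looks like this (the bridge network name and curl image are placeholders, not my actual Kasm setup):

```
# run a throwaway container on the docker bridge network and hit the NodePort
docker run --rm --network my-bridge-net curlimages/curl -sv --max-time 5 http://192.168.H.H:32000/
# curl: (7) Failed to connect ... Connection refused   <- the RST,ACK shown below
```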
Sample tcpdump, captured on the host, of an attempted connection:
```
veth123 P ifindex 123 ma:ca:dd:re:ss:01 ethertype IPv4 172.22.0.14.60388 > 192.168.H.H.32000: Flags [S]
br123 In ifindex 456 ma:ca:dd:re:ss:01 ethertype IPv4 172.22.0.14.60388 > 192.168.H.H.32000: Flags [S]
br123 Out ifindex 456 ma:ca:dd:re:ss:02 ethertype IPv4 192.168.H.H.32000 > 172.22.0.14.60388: Flags [R.]
veth123 Out ifindex 123 ma:ca:dd:re:ss:02 ethertype IPv4 192.168.H.H.32000 > 172.22.0.14.60388: Flags [R.]
```
- `ma:ca:dd:re:ss:01` is the container's internal MAC (visible in `docker network inspect` of the bridge network)
- `ma:ca:dd:re:ss:02` is the MAC of the bridge device associated with the docker bridge network
- You can see there is an active rejection: an RST,ACK (`Flags [R.]`) coming back from ... something ...
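The capture above came from something along these lines (flags reconstructed, not verbatim; `-i any` together with `-e` on a recent tcpdump is what prints the interface, direction, MAC, and ethertype columns):

```
# capture on all interfaces, with link-level headers, no name resolution
tcpdump -i any -e -nn 'tcp port 32000'
```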
Other info:
- Docker containers running in a bridge network can access other things served on the host. For example, if I spin up a `kubectl port-forward` to the pod behind `192.168.H.H:32000`, I can access the host port associated with the port forward from a docker container in the bridge network. Or if I just spin up `python3 -m http.server`, I can access `192.168.H.H:8000` without issue (concrete commands are sketched after this list).
- Other things on the host (like curl or a browser), or even a docker container running in the host network, can access the NodePort `192.168.H.H:32000` without issue.
- Firewall is off: `systemctl stop ufw` followed by `systemctl disable ufw`.
- When running tcpdump inside the target container (aka the pod behind NodePort `192.168.H.H:32000`), it does not receive the failed connections, so it is not the one responding with the RST,ACK (it does, of course, receive successful connections, and these can be seen in the tcpdump output).
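Concretely, the working/failing pair from the list above looks like this (network and image names are placeholders):

```
# on the host: a plain listener that bridge containers CAN reach
python3 -m http.server 8000

# from a container on the docker bridge network: works
docker run --rm --network my-bridge-net curlimages/curl -sv http://192.168.H.H:8000/

# same container, same path, NodePort instead: connection refused (the RST,ACK)
docker run --rm --network my-bridge-net curlimages/curl -sv http://192.168.H.H:32000/
```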
Attempted to debug iptables
- Of course I know the problem smells like iptables, so I did the following:
```
iptables -t raw -I PREROUTING 1 -p tcp --dport 32000 -j TRACE
iptables -t raw -I PREROUTING 1 -p tcp --sport 32000 -j TRACE
iptables -t raw -I OUTPUT 1 -p tcp --dport 32000 -j TRACE
iptables -t raw -I OUTPUT 1 -p tcp --sport 32000 -j TRACE
xtables-monitor --trace
```
- The conclusion of each packet trace was "ACCEPT", and there were no instances of REJECT or DROP in the output (I was especially looking for a reject, given the RST,ACK).
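(For completeness, the TRACE rules come out again with the matching delete form:)

```
iptables -t raw -D PREROUTING -p tcp --dport 32000 -j TRACE
iptables -t raw -D PREROUTING -p tcp --sport 32000 -j TRACE
iptables -t raw -D OUTPUT -p tcp --dport 32000 -j TRACE
iptables -t raw -D OUTPUT -p tcp --sport 32000 -j TRACE
```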
Attempted to debug cilium
- Of course I know the problem also smells like cilium, so I tried digging on this side
- got a shell in the cilium pod
- Neither `cilium-dbg monitor` nor `hubble observe -f` (and friends) seemed to show anything connected to my failed attempts; but again, they would of course show things for the successful ones (the ones not coming from my docker container in the bridge network). Commands are sketched below.
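For reference, the digging looked roughly like this (the `kube-system` namespace and daemonset name are RKE2 defaults, and the exact flags are reconstructed, not verbatim):

```
# get a shell in the cilium agent pod
kubectl -n kube-system exec -it ds/cilium -- bash

# inside the pod: watch flows to the NodePort, and watch drops specifically
hubble observe -f --to-port 32000
cilium-dbg monitor --type drop
```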
It is not ideal to have container management solutions mixed as I do here, but I am not sure there is any way around it for my situation.
At this point I am totally at a loss. I will be quick to answer clarifying questions if anyone has any...