Question Description:
I have a Harvester HCI cluster (RKE2) in which pods do not resolve the correct IP addresses for internet domains.
kubectl run debug --image=busybox -i --tty --rm -- sh
/ # ping serverfault.com
PING serverfault.com (<redacted IP address>): 56 data bytes
64 bytes from <redacted IP address>: seq=0 ttl=63 time=0.362 ms
64 bytes from <redacted IP address>: seq=1 ttl=63 time=0.312 ms
64 bytes from <redacted IP address>: seq=2 ttl=63 time=0.319 ms
64 bytes from <redacted IP address>: seq=3 ttl=63 time=0.449 ms
64 bytes from <redacted IP address>: seq=4 ttl=63 time=0.317 ms
64 bytes from <redacted IP address>: seq=5 ttl=63 time=0.363 ms
64 bytes from <redacted IP address>: seq=6 ttl=63 time=0.296 ms
64 bytes from <redacted IP address>: seq=7 ttl=63 time=0.361 ms
^C
--- serverfault.com ping statistics ---
8 packets transmitted, 8 packets received, 0% packet loss
round-trip min/avg/max = 0.296/0.347/0.449 ms
<redacted IP address> in this case happens to be the public IP address of the network in which the cluster resides (and not one of serverfault.com's IP addresses).
However, within the same container, nslookup does list the correct IP addresses:
/ # nslookup serverfault.com
Server: 10.53.0.10
Address: 10.53.0.10:53
Non-authoritative answer:
Name: serverfault.com
Address: 104.18.23.101
Name: serverfault.com
Address: 104.18.22.101
Non-authoritative answer:
This is not reproducible on the host node:
# ping serverfault.com
PING serverfault.com (104.18.23.101) 56(84) bytes of data.
64 bytes from 104.18.23.101 (104.18.23.101): icmp_seq=1 ttl=57 time=1.27 ms
64 bytes from 104.18.23.101 (104.18.23.101): icmp_seq=2 ttl=57 time=1.30 ms
64 bytes from 104.18.23.101 (104.18.23.101): icmp_seq=3 ttl=57 time=1.33 ms
64 bytes from 104.18.23.101 (104.18.23.101): icmp_seq=4 ttl=57 time=1.29 ms
64 bytes from 104.18.23.101 (104.18.23.101): icmp_seq=5 ttl=57 time=1.23 ms
64 bytes from 104.18.23.101 (104.18.23.101): icmp_seq=6 ttl=57 time=1.28 ms
^C
--- serverfault.com ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5006ms
rtt min/avg/max/mdev = 1.231/1.284/1.333/0.030 ms
The cluster itself is a fresh installation of Harvester HCI v1.2.0 with no additional configuration changes post-installation.
I am looking for further tips on how to troubleshoot this issue and find out why ping in the pod is resolving the wrong IP address.
Context:
/etc/resolv.conf on host:
### /etc/resolv.conf is a symlink to /var/run/netconfig/resolv.conf
### autogenerated by netconfig!
search harvester.<redacted domain> 1
nameserver 10.10.0.1
/etc/resolv.conf on pod container:
search default.svc.cluster.local svc.cluster.local cluster.local harvester.<redacted domain>
nameserver 10.53.0.10
options ndots:5
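For reference, here is a minimal sketch of how I understand a stub resolver usually applies the search list and ndots setting from a file like the one above. This is an assumption based on the conventional resolv.conf behaviour, not something I have confirmed for the resolver inside the busybox image. The function name candidates and the domain harvester.example.internal (standing in for the redacted search domain) are made up for illustration.

```shell
#!/bin/sh
# candidates NAME NDOTS DOMAIN...
# Prints, one per line, the names a typical stub resolver would try, in order.
# Hypothetical helper for illustration only; it performs no DNS queries.
candidates() {
    name=$1
    ndots=$2
    shift 2

    # A trailing dot marks the name as fully qualified: the search list is skipped.
    case $name in
        *.) printf '%s\n' "$name"; return ;;
    esac

    # Count the dots in the name by deleting every other character.
    dots_only=$(printf '%s' "$name" | tr -cd '.')
    dots=${#dots_only}

    # With at least ndots dots, the name is tried as-is before the search list.
    if [ "$dots" -ge "$ndots" ]; then
        printf '%s.\n' "$name"
    fi

    # Each search domain is appended in the order it is listed.
    for domain in "$@"; do
        printf '%s.%s.\n' "$name" "$domain"
    done

    # With fewer than ndots dots, the name as-is is only tried last.
    if [ "$dots" -lt "$ndots" ]; then
        printf '%s.\n' "$name"
    fi
}

# harvester.example.internal is a placeholder for the redacted search domain.
candidates serverfault.com 5 \
    default.svc.cluster.local svc.cluster.local cluster.local \
    harvester.example.internal
```

If this model is right, serverfault.com (one dot, fewer than ndots:5) would be tried against every search domain, including the harvester one, before it is tried on its own. A lookup with a trailing dot (serverfault.com.) would skip the search list, which might be a way to tell whether the search domain is involved.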
/etc/nsswitch.conf on host:
#
# /etc/nsswitch.conf
#
passwd: compat
group: compat
shadow: compat
# Allow initgroups to default to the setting for group.
initgroups: compat
hosts: files mdns_minimal [NOTFOUND=return] dns
networks: files dns
aliases: files usrfiles
ethers: files usrfiles
gshadow: files usrfiles
netgroup: files nis
protocols: files usrfiles
publickey: files
rpc: files usrfiles
services: files usrfiles
automount: files nis
bootparams: files
netmasks: files
/etc/nsswitch.conf on pod container:
# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc "Name Service Switch"' for information about this file.
passwd: files
group: files
shadow: files
gshadow: files
hosts: files dns
networks: files
protocols: db files
services: db files
ethers: db files
rpc: db files
netgroup: nis
/etc/hosts contains no additional or suspicious entries in either case.