
Problem

All four production servers suddenly lost internet access, while four other servers on the same VLAN with the same eth0 settings can still reach it.


Figure 1: System 1 represents the four systems which are able to access the internet, while System 2 represents the four which suddenly cannot since this afternoon.

Analysis

  • System 1 can access System 2 and vice versa
  • The default gateway (10.10.10.1) can be pinged from both System 1 and System 2
  • System 1 can access the internet
  • System 2 cannot access the internet
  • The eth0 configuration shown by ifconfig is identical on all production servers
  • The internal DNS server is the same one used by the systems which can access the internet
  • The IPs and names listed in /etc/resolv.conf can be reached
  • The internet can be accessed from the Switch
  • Configuration of all 8 Switchports on Cisco IOS is identical
  • Tracepath from System 2 to 8.8.8.8 (Google DNS), a Google IP or google.com hangs at the default gateway
  • The systems which cannot access the internet seem to have an em1 adapter instead of eth0
  • sudo arping -I eth0 ping.tweakers.net works on all 8 systems
  • One of the systems which cannot access the internet shows output when sudo iptables-save is executed
  • The output of route -n is identical on all systems
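
Since most of the checks above compare a working host against a failing one, a read-only snapshot that can be diffed may help. The following is only a sketch: the function name `snapshot` is made up for illustration, and it assumes the iproute2 `ip` tool is installed alongside ifconfig/route. It only reads state and does not touch the network configuration.

```shell
#!/bin/sh
# Read-only snapshot of local network state; nothing here changes configuration.
# "snapshot" is a hypothetical helper name, not an existing tool.
snapshot() {
    for cmd in "ip -o addr show" "ip route show" "ip neigh show" "cat /etc/resolv.conf"; do
        printf '== %s ==\n' "$cmd"
        tool=${cmd%% *}                      # first word of the command line
        if command -v "$tool" >/dev/null 2>&1; then
            # Intentionally unquoted so the command string is split into words.
            $cmd 2>&1 || printf '(exit status %s)\n' "$?"
        else
            printf '(%s not installed)\n' "$tool"
        fi
    done
}
```

Running `snapshot > "$(hostname).txt"` on one System 1 host and one System 2 host and then comparing the two files with `diff` should show whether anything besides addresses and interface names differs.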

Tracepath

[username@hostname ~]$ tracepath google.com
 1:  10.10.10.10 (10.10.10.10)                                  0.222ms pmtu 1500
 1:  10.10.10.1 (10.10.10.1)                                    0.662ms
 1:  10.10.10.1 (10.10.10.1)                                    0.601ms
 2:  no reply

ARP

System1: ? (10.10.10.1) at AA:BB:CC:DD:EE:FF [ether] on em1

System2: ? (10.10.10.1) at AA:BB:CC:DD:EE:FF [ether] on eth0
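
To compare the gateway entry across all eight hosts without reading each ARP table by eye, something like the following could be used. It is a sketch that assumes the `arp -a` line format shown above; `arp_entry` is a hypothetical helper name.

```shell
#!/bin/sh
# Print "MAC interface" for the given IP, reading `arp -a` style lines on stdin.
# Assumes the layout: ? (IP) at MAC [ether] on IFACE
arp_entry() {
    awk -v ip="($1)" '$2 == ip { print $4, $NF }'
}
```

For example, `arp -a | arp_entry 10.10.10.1` on each host prints the MAC address and interface that host currently associates with the default gateway, so a host that resolved the gateway to a different MAC would stand out.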

Output of iptables-save on one of the systems which cannot access the internet

# Generated by iptables-save vX on Fri Aug  1 10:00:01 2014
*filter
:INPUT ACCEPT [X:Y]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [X:Y]
COMMIT
# Completed on Fri Aug  1 10:00:01 2014
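
The dump above contains only chain policies (all ACCEPT) and no rule lines. To confirm the same on the other hosts, the appended rules can be counted. This is a sketch; `count_rules` is a hypothetical helper name, and it only counts lines starting with `-A` in whatever iptables-save prints.

```shell
#!/bin/sh
# Count appended rules ("-A ..." lines) in iptables-save output read from stdin.
count_rules() {
    awk '/^-A/ { n++ } END { print n + 0 }'
}
```

`sudo iptables-save | count_rules` printing 0 would mean no rules are present in the tables that were dumped.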

route -n

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.10.10.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0
X.Y.0.0         0.0.0.0         255.255.0.0     U     Z      0        0 eth0
0.0.0.0         10.10.10.1      0.0.0.0         UG    0      0        0 eth0
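
To check the default route on every host in the same way, the gateway and interface can be pulled out of the table. A small sketch, assuming the eight-column `route -n` layout shown above; `default_route` is a hypothetical helper name.

```shell
#!/bin/sh
# Print "gateway interface" for the default route from `route -n` output on stdin.
# Columns: Destination Gateway Genmask Flags Metric Ref Use Iface
default_route() {
    awk '$1 == "0.0.0.0" && $4 ~ /G/ { print $2, $8 }'
}
```

With the table above, `route -n | default_route` would print `10.10.10.1 eth0`. A host that prints a different gateway or interface, or more than one line, would be worth a closer look.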

It is unclear why the four production servers can no longer access the internet. As they are running in production, restarting the network should be avoided. Which further tests could be done to investigate the issue?
