
I'm currently testing a VMware NSX network environment and have run into some trouble.

My environment is:

  • Management Cluster with 3 Hosts and NSX components on 2 dedicated Hosts
  • Compute Cluster with 2 Hosts
  • Single 1Gbps Switch
  • vSphere version 6.0 and NSX version 6.2
  • One dedicated UTP link on every Host for Management and iSCSI (VLAN tagged)
  • One dedicated UTP link on every Host for the Transit Network (for VM traffic)
  • One dedicated UTP link on each Management Host for the External Network

When VM V on Host H sends data to VM W on Host I over the NSX network, heavy retransmission occurs. I tested the cases below:

Cases with the problem:

  1. V sends about 20MB to W in a single session: retransmission at around 19MB
  2. V sends about 50MB to W in a single session: retransmission at 19MB only
  3. V sends about 2MB to W in 30 concurrent sessions: retransmission at random positions.
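
For anyone who wants to reproduce these cases, here is a minimal sketch of a traffic generator. It is not the tool I used for the tests above; the function names and the 64KB chunk size are made up for illustration. The sink would run on W and the sender on V.

```python
import socket
import threading

CHUNK = 64 * 1024  # bytes written per send call (arbitrary choice)


def start_sink(host="127.0.0.1", port=0):
    """Start a TCP sink that counts the bytes received on each connection.

    Returns (server_socket, totals); totals collects one byte count per
    finished connection.
    """
    server = socket.create_server((host, port))
    totals = []
    lock = threading.Lock()

    def handle(conn):
        received = 0
        with conn:
            while True:
                data = conn.recv(CHUNK)
                if not data:  # peer closed the session
                    break
                received += len(data)
        with lock:
            totals.append(received)

    def accept_loop():
        while True:
            try:
                conn, _ = server.accept()
            except OSError:  # server socket was closed
                return
            threading.Thread(target=handle, args=(conn,), daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return server, totals


def send_bytes(host, port, total_bytes):
    """Send total_bytes of zero padding over one TCP session."""
    payload = bytes(CHUNK)
    remaining = total_bytes
    with socket.create_connection((host, port)) as sock:
        while remaining > 0:
            n = min(remaining, CHUNK)
            sock.sendall(payload[:n])
            remaining -= n


def run_sessions(host, port, sessions, bytes_per_session):
    """Run the given number of concurrent sending sessions and wait for them."""
    threads = [
        threading.Thread(target=send_bytes, args=(host, port, bytes_per_session))
        for _ in range(sessions)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Case 1 would then be roughly `run_sessions(w_address, port, 1, 20 * 1024 * 1024)`, and case 3 `run_sessions(w_address, port, 30, 2 * 1024 * 1024)` if the 2MB is read as a per-session size, where `w_address` and `port` stand for wherever the sink is listening on W.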

In this condition, a packet dump from H's vmnic (uplink) shows some out-of-order packets (possibly the cause of the retransmissions). The delayed packets are unique there: they do not appear earlier in that dump. But in dumps taken at the vDS downlink to VM V, or at V's sfw, they appear twice (the original packets and the retransmitted ones). So I think some packets are lost in the sender-side stack, specifically between VM V and Host H's physical NIC.
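
The way I read the dumps can be sketched roughly as below. This is a simplified illustration, not a real dump parser: it assumes each captured TCP segment has already been reduced to a `(seq, length)` pair in capture order, and it ignores sequence number wraparound and partial overlaps. The function name is hypothetical.

```python
def classify_segments(segments):
    """Label each (seq, length) TCP segment in capture order.

    'retransmission' : the exact same segment was seen earlier in this dump
    'out_of_order'   : new data that starts below the highest byte seen so far
    'in_order'       : everything else
    """
    seen = set()        # (seq, length) pairs already observed
    highest_end = None  # one past the highest sequence byte seen so far
    labels = []
    for seq, length in segments:
        end = seq + length
        if (seq, length) in seen:
            labels.append("retransmission")
        elif highest_end is not None and seq < highest_end:
            labels.append("out_of_order")
        else:
            labels.append("in_order")
        seen.add((seq, length))
        if highest_end is None or end > highest_end:
            highest_end = end
    return labels
```

With this labelling, the delayed packets in the uplink dump come out as `out_of_order` (each seen only once), while the same packets in the vDS downlink dump show up once as `in_order` and again as `retransmission`.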

To split the data path/stack into two segments and check them independently, I ran the same cases with another destination VM X on the same Host H. The dump was clean, and there was no retransmission problem between VMs on the same host. (So I think the fault is not in the vDS itself or anything above it.)

Next, I ran the cases below to check whether the problem is related to heavy data traffic, or to heavy filtering and/or encapsulation:

  1. same test with Network I/O Control enabled: same problem
  2. same test without Network I/O Control: same problem, with some differences
  3. same test with throughput slowed down by a Network I/O Control limit: same problem
  4. same test with TSO disabled on V's vNIC (e1000 driver): same problem
  5. same test with vDS MTU 9000: same problem, plus one more question

The differences are:

When Network I/O Control is enabled, RTT first increases just before the retransmission, and after the retransmission completes, the RTT values stay in a stable range.

But when Network I/O Control is disabled, RTT increases again after the retransmission, just as it did at the start.

One more strange thing: although I set the MTU to 9000, the UDP packets that carry the VXLAN traffic are still under 1600 bytes, so the MTU 9000 setting seems to have no effect.
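
As a rough check on that size, here is the standard VXLAN overhead arithmetic as a small sketch. It assumes an untagged outer Ethernet header, an outer IPv4 header without options, and a guest vNIC MTU still at the default 1500 inside V, which I have not confirmed. The function name is only for illustration.

```python
OUTER_ETHERNET = 14  # outer Ethernet header, assuming no 802.1Q tag
OUTER_IPV4 = 20      # outer IPv4 header, assuming no options
OUTER_UDP = 8        # outer UDP header
VXLAN_HEADER = 8     # VXLAN header
INNER_ETHERNET = 14  # the guest's own Ethernet header travels inside the tunnel


def encapsulated_frame_size(guest_mtu):
    """Outer frame size in bytes (without FCS) for a full-size guest packet."""
    return (OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER
            + INNER_ETHERNET + guest_mtu)


print(encapsulated_frame_size(1500))  # 1564
```

If that assumption holds, a full-size guest packet becomes a 1564-byte outer frame, which would match the "under 1600" sizes in the dump regardless of the vDS MTU.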

I'm stuck. Can I get some help? Thanks.


EDIT ---

If the VMs are on a normal vDS with NSX disabled, everything is fine.


EDIT* Are there any similar issues on Open vSwitch?

sio4