Why is G-Wan performance so low on Opteron?

Question

I am testing G-Wan 4.3.14 on CentOS 6 with the 2.6.32 Linux kernel using an Opteron 6234 6 module / 12 core processor.

Running a simple weighttp test I get:

weighttp -k -n 1000000 -t 6 -c 1000 localhost:8080

finished in 7 sec, 250 millisec and 896 microsec, 137913 req/s, 1044186 kbyte/s
requests: 1000000 total, 1000000 started, 1000000 done, 1000000 succeeded, 0 failed, 0 errored
status codes: 1000000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 7753000286 bytes total, 256000286 bytes http, 7497000000 bytes data

This seems abnormally low. Does anyone have any experience/advice for tuning G-Wan or other HTTP servers on Opteron?

Gil · Answer 1 · 2013-05-12T08:19:48.320

using an [AMD] Opteron 6234 6 module / 12 core processor

The 137,913 req/s you measured on the 6-Core AMD Opteron @ 2.4GHz[1] falls well short of the 850,000 req/s we get on a 6-Core Intel Xeon W3680 @ 3.33GHz[2] (with a 100-byte static file).

Besides the performance differences between the two architectures*, the problem for G-WAN on AMD CPUs comes from the fact that we did not have access to any of those CPUs (all our machines are equipped with Intel CPUs).

Thanks to recent AMD user reports, we have found that the number of detected CPU Cores for AMD CPUs is twice the actual number. This is due to the fact that AMD has its own set of CPUID codes and return values - which differ from Intel's.

This AMD CPU Core mis-detection leads to obvious CPU cache conflicts, which is precisely the problem G-WAN is supposed to avoid.

For now, you can force any given multicore setting with the -w option (for example ./gwan -w 6), bypassing G-WAN's automatic detection when needed.

In your case, you should be using 6 physical CPU Cores rather than the 12 wrongly used by G-WAN. This is what you can do right now (and you will most probably get much higher results in your benchmarks by just doing that).
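To choose a value for -w, it can help to compare what the kernel reports as logical processors with the number of distinct physical cores. The following is only a minimal sketch, assuming a Linux /proc/cpuinfo that exposes the "processor", "physical id" and "core id" fields (not every kernel or CPU does); the function name count_cpus is hypothetical and this is not how G-WAN itself detects cores.

```python
import os


def count_cpus(cpuinfo_text):
    """Return (logical_cpus, physical_cores) parsed from /proc/cpuinfo text.

    Assumption: each logical CPU is a blank-line-separated block that
    contains 'processor', 'physical id' and 'core id' entries.
    """
    logical = 0
    cores = set()
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        # A physical core is identified by its (socket, core) pair.
        if "physical id" in fields and "core id" in fields:
            cores.add((fields["physical id"], fields["core id"]))
    return logical, len(cores)


if __name__ == "__main__" and os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as f:
        logical, physical = count_cpus(f.read())
    print("logical CPUs:", logical)
    print("physical cores (0 means the fields were missing):", physical)
```

If the two numbers differ, the smaller one is a reasonable first candidate to pass to -w, to be confirmed by re-running the same weighttp test.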

We will issue an AMD workaround in the next release to make sure that no more manual tweaking is needed.

[*] References:

[1] http://www.cpubenchmark.net/cpu.php?cpu=AMD+Opteron+6234

[2] http://www.cpubenchmark.net/cpu.php?cpu=Intel+Xeon+W3680+%40+3.33GHz

guipy · Answer 2 · 2013-05-09T03:42:33.530

It's just a guess, so I may be completely wrong, but Opteron is a NUMA architecture.

Sometimes programs are optimized for non-NUMA (very common) architectures, and then the performance is low in NUMA environments.

To test this, you can run exactly the same version of G-Wan with the same data (or nearly the same) on a Phenom or an i7 that is comparable to your Opteron.
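Before comparing machines, it may be worth checking whether the test box actually exposes more than one NUMA node. This is a rough sketch, assuming the Linux sysfs layout /sys/devices/system/node/nodeN/cpulist; the helper names parse_cpulist and numa_nodes are made up for illustration.

```python
import glob
import os


def parse_cpulist(text):
    """Expand a sysfs CPU list such as '0-5,8' into a list of CPU numbers."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_nodes(base="/sys/devices/system/node"):
    """Map each NUMA node directory name to the CPUs attached to it.

    Returns an empty dict when the directory does not exist
    (non-Linux systems or kernels without NUMA support).
    """
    nodes = {}
    for path in sorted(glob.glob(os.path.join(base, "node[0-9]*"))):
        with open(os.path.join(path, "cpulist")) as f:
            nodes[os.path.basename(path)] = parse_cpulist(f.read())
    return nodes


if __name__ == "__main__":
    for name, cpus in numa_nodes().items():
        print(name, cpus)
```

More than one node in the output means memory access cost depends on which core a thread runs on, which is the situation this guess is about.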

Great... I'm trying to help and I get -2 votes. Amazing!
