This is a standard Apache web server on the AWS Linux AMI with EBS. We are noticing a high load average (8+), and iotop -a shows:

Total DISK READ: 0.00 B/s | Total DISK WRITE: 2.37 M/s

  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND             
 3730 be/4 root          0.00 B      0.00 B  0.00 % 91.98 % [kworker/u8:1]
  774 be/3 root          0.00 B   1636.00 K  0.00 % 15.77 % [jbd2/xvda1-8]
 3215 be/4 apache        0.00 B     40.39 M  0.00 %  0.88 % httpd
 3270 be/4 apache        0.00 B     38.20 M  0.00 %  0.93 % httpd
 2770 be/4 apache        0.00 B     46.86 M  0.00 %  0.71 % httpd

When Apache is down, the kworker and jbd2 activity is also gone.

The server is not swapping, as we have plenty of RAM available. I've seen this issue reported for database servers, but never isolated to Apache.

Any idea on how to diagnose this further and prevent it?

UPDATE 1: perf report (perf record -g -a sleep 10)

Samples: 114K of event 'cpu-clock', Event count (approx.): 28728500000
-  83.58%          swapper  [kernel.kallsyms]         [k] xen_hypercall_sched_op
   + xen_hypercall_sched_op
   + default_idle
   + arch_cpu_idle
   - cpu_startup_entry
        70.16% cpu_bringup_and_idle
      - 29.84% rest_init
           start_kernel
           x86_64_start_reservations
           xen_start_kernel
+   1.73%            httpd  [kernel.kallsyms]         [k] __d_lookup_rcu
+   1.08%            httpd  [kernel.kallsyms]         [k] xen_hypercall_xen_version
+   0.38%            httpd  [vdso]                    [.] 0x0000000000000d7c
+   0.36%            httpd  libphp5.so                [.] zend_hash_find
+   0.33%            httpd  libphp5.so                [.] _zend_hash_add_or_update
+   0.25%            httpd  libc-2.17.so              [.] __memcpy_ssse3
+   0.24%            httpd  libphp5.so                [.] _zval_ptr_dtor
+   0.24%            httpd  [kernel.kallsyms]         [k] __audit_syscall_entry
+   0.22%            httpd  [kernel.kallsyms]         [k] pvclock_clocksource_read

2 Answers

A thread showing 100% in the IO> column isn't using all of your I/O capacity. It means the thread is doing nothing but waiting on I/O. A high IO% with low or zero disk bandwidth can therefore be normal.

man iotop:

[...] It also displays the percentage of time the thread/process spent while swapping in and while waiting on I/O.
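To make the distinction concrete, here is a minimal sketch of the two independent quantities. It assumes, as far as I know, that iotop derives the IO> column from the kernel's per-task block I/O delay counter (nanoseconds spent blocked) rather than from bytes transferred. The function names and the sample numbers are hypothetical, not taken from iotop itself.

```python
def io_wait_percent(blkio_delay_ns_delta: int, interval_s: float) -> float:
    """Share of the sampling interval the thread spent blocked on I/O."""
    return 100.0 * blkio_delay_ns_delta / (interval_s * 1e9)


def write_rate_kb_s(bytes_written_delta: int, interval_s: float) -> float:
    """Bandwidth attributed to the thread, independent of time spent waiting."""
    return bytes_written_delta / 1024 / interval_s


# Hypothetical one-second sample: the thread was blocked for 920 ms
# but issued no writes of its own.
wait = io_wait_percent(920_000_000, 1.0)
rate = write_rate_kb_s(0, 1.0)
print(f"IO wait: {wait:.2f} %, write rate: {rate:.2f} KB/s")
```

A thread can score high on the first number while the second stays at zero, which is the pattern kworker shows in the question.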

It may be a different issue if your kworker is waiting on IO forever, but I don't know. Maybe it's supposed to be waiting on a pipe or something. I see kworker doing the same on my server sometimes, and it doesn't seem to be a problem. (I also panicked the first time I saw it.)
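One way to check whether a thread really is stuck waiting would be to sample its scheduler state a few times. The sketch below assumes a Linux /proc filesystem, where the state letter follows the parenthesised command name in /proc/PID/stat and "D" means uninterruptible sleep (usually I/O). The helper names and the sample stat line are made up for illustration.

```python
def task_state(stat_line: str) -> str:
    """Extract the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' rather than on whitespace.
    return stat_line.rsplit(")", 1)[1].split()[0]


def read_task_state(pid: int) -> str:
    """Read the current state of a live task (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return task_state(f.read())


# Hypothetical stat line for the kworker thread from the question.
example = "3730 (kworker/u8:1) D 2 0 0 0 -1 69238880 0 0 0 0"
print(task_state(example))
```

If repeated calls to read_task_state for the same PID alternate between "D" and "S" or "I", the thread is making progress; a thread that stays in "D" across many samples deserves a closer look.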

sudo

I had the problem of the drive being written to every 5 seconds. Using pidstat (below), I found that x2goserver was running every 5 seconds and triggering kworker, so I removed it. Note that Google Chrome will also write to the drive if it's open.

sudo apt remove x2goserver

sudo pidstat -dvl 5

04:52:35 PM   UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s iodelay  Command
04:52:40 PM     0      2318    539.42      0.01      0.00       0  /usr/bin/perl /usr/sbin/x2gocleansessions
04:52:40 PM     0   1920632      0.00    251.20      0.00       0  kworker/u64:3-events_unbound

04:52:40 PM   UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s iodelay  Command
04:52:45 PM     0      2318    809.12      0.01      0.00       0  /usr/bin/perl /usr/sbin/x2gocleansessions

04:52:45 PM   UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s iodelay  Command
04:52:50 PM     0      2318    539.42      0.01      0.00       0  /usr/bin/perl /usr/sbin/x2gocleansessions
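Spotting the repeat offender by eye works for three intervals; for a longer capture, something like the following sketch could tally how often each process shows up. It assumes the 12-hour timestamp layout seen above (time, AM/PM, UID, PID, three rate columns, iodelay, command), which depends on locale, and the function name summarize is my own. The sample data is copied from the output above.

```python
from collections import defaultdict

SAMPLE = """\
04:52:35 PM UID PID kB_rd/s kB_wr/s kB_ccwr/s iodelay Command
04:52:40 PM 0 2318 539.42 0.01 0.00 0 /usr/bin/perl /usr/sbin/x2gocleansessions
04:52:40 PM 0 1920632 0.00 251.20 0.00 0 kworker/u64:3-events_unbound
04:52:45 PM 0 2318 809.12 0.01 0.00 0 /usr/bin/perl /usr/sbin/x2gocleansessions
04:52:50 PM 0 2318 539.42 0.01 0.00 0 /usr/bin/perl /usr/sbin/x2gocleansessions
"""


def summarize(text):
    """Count intervals and sum read/write rates per (PID, command)."""
    totals = defaultdict(lambda: {"samples": 0, "rd": 0.0, "wr": 0.0})
    for line in text.splitlines():
        # Split into at most 9 fields so the command keeps its spaces.
        fields = line.split(None, 8)
        if len(fields) < 9 or fields[2] == "UID":
            continue  # blank line or repeated header
        entry = totals[(int(fields[3]), fields[8])]
        entry["samples"] += 1
        entry["rd"] += float(fields[4])
        entry["wr"] += float(fields[5])
    return dict(totals)


# Processes that appear in the most intervals come first.
for (pid, cmd), t in sorted(summarize(SAMPLE).items(),
                            key=lambda kv: -kv[1]["samples"]):
    print(pid, t["samples"], round(t["rd"], 2), round(t["wr"], 2), cmd)
```

A process that appears in every interval, as x2gocleansessions does here, is the likely periodic trigger.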