13

I'm trying to estimate the IOPS requirements of my application, which runs on 32-bit CentOS 6.2. I started taking some measurements on a machine with SATA disks and I'm quite confused by the difference between IOPS and the tps value measured by sar.

According to Wikipedia, a SATA disk should perform 75-100 IOPS. The ioping utility seems to confirm this for a random access test:

# ./ioping -R /dev/sda
--- /dev/sda (device 931.0 Gb) ioping statistics ---
279 requests completed in 3.0 s, 92 iops, 371.3 kb/s
min/avg/max/mdev = 2.7 ms / 10.8 ms / 130.8 ms / 7.9 ms

But the tps values produced by sar (and by iostat, shown here) are much higher for /dev/sda:

# iostat 1
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
       0.17    0.00    2.02   14.86    0.00   82.96

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             559.00         0.00    142600.00          0     142600
dm-0          18433.00         0.00    147464.00          0     147464
dm-1              0.00         0.00         0.00          0          0
dm-2              0.00         0.00         0.00          0          0

It does not seem to matter whether the load is sequential (dd with various block sizes) or random access (ioping); the value stays roughly the same. I thought tps actually was IOPS, so I would expect it to go down as larger chunks are transferred.
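For reference, a sketch of the runs I mean (the file path and sizes are arbitrary; oflag=direct bypasses the page cache so the device actually sees each request):

# dd if=/dev/zero of=/tmp/ddtest bs=4k count=100000 oflag=direct
# dd if=/dev/zero of=/tmp/ddtest bs=1M count=400 oflag=direct
# ./ioping -R /dev/sda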

So what exactly does the tps value mean? And how does it relate to IOPS?

pystole
  • 405

3 Answers

6

Please also be aware that the tps value combines reads and writes; you can use the -x switch for the extended view, where reads and writes are separated (r/s = read IOPS, w/s = write IOPS):

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
vda               0.07    24.65    0.30   18.95    30.65   330.22    18.74     0.07    3.61   0.98   1.89
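
To reproduce this view, an invocation along these lines works (the device argument and interval are up to you; omit the device to see all of them):

# iostat -x sda 1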
HTF
  • 3,278
6

Transactions are single I/O commands (read block / write block) that are issued against the raw device (in your example dm-0). The Linux kernel tries to reorder those commands into a better sequence, or to merge them into more efficient ones (e.g. read two adjacent blocks with one command instead of reading one block and then the block right after it). These merged transactions are what goes out to the disk controller (the tps for sda).
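
You can actually watch this merging in the extended iostat view: rrqm/s and wrqm/s are the read and write requests merged per second before being issued to the device. The I/O scheduler doing the reordering can also be inspected per device (a sketch; the list of available schedulers depends on your kernel):

# iostat -x 1
# cat /sys/block/sda/queue/scheduler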

Good controllers might have logic of their own that reduces the real number of transactions further.

A transaction might be the SCSI command "write 2 GB to controller 1, target 2, LUN 3, starting from sector 22". As you can see, this cannot be put into a direct correlation with throughput numbers.
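
If you want to see how your own disks are addressed in those host/channel/target/LUN terms, the kernel exposes the tuple for every attached device:

# cat /proc/scsi/scsi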

What you are after is the sustained write rate. You have a couple of limiting factors here:

  • Client connection: if the network is Gigabit Ethernet, you will never get more than about 100 MB/s of input
  • Disk controller: if this is a 3 Gb/s controller, you will never see more than about 300 MB/s of throughput
  • Disk: look up the manufacturer's figure for sustained write performance
  • Filesystem: there is some overhead, since the OS needs to process the data; test that on a RAM disk (see the sketch after this list)
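
A minimal sketch of that RAM-disk test (the mount point and size are arbitrary; tmpfs lives in RAM, so this measures the OS and filesystem path without the physical disk):

# mkdir -p /mnt/ramdisk
# mount -t tmpfs -o size=512m tmpfs /mnt/ramdisk
# dd if=/dev/zero of=/mnt/ramdisk/testfile bs=1M count=256
# umount /mnt/ramdisk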

My suggestion for your system: get a good hardware RAID controller that is capable of RAID 10 or RAID 5, and at least six fast (15k RPM) disks.

For professional use, use SAS instead of SATA.

Nils
  • 7,815
-1

iostat (part of sysstat) is a very powerful tool. It is often best to consult the associated man page, lest you end up waiting 6 years and 10 months for a truly correct answer. The answer to your question is taken straight from the man page:

The first report generated by the iostat command provides statistics concerning the time since the system was booted. Each subsequent report covers the time since the previous report.

If you run something like iostat 1 2, the second reporting block will contain I/O statistics for one second, while the first contains the cumulative data since boot. It is often helpful to run sar from cron so you can collect meaningful stats in a lightweight yet consistent fashion.
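For example (the second block is the one you want; on CentOS the sysstat package also ships a cron job that runs sa1 periodically to feed sar, though the exact schedule depends on the install):

# iostat 1 2
# cat /etc/cron.d/sysstat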

In your example you are seeing cumulative previous stats, not IOPS captured during some stress test.