I have seen many Linux performance engineers look at the “IOWait” portion of CPU usage as an indicator of whether the system is I/O-bound. In this blog post, I will explain why this approach is unreliable and what better indicators you can use instead.
Let’s start by running a little experiment – generating heavy I/O usage on the system:
```shell
sysbench --threads=8 --time=0 --max-requests=0 fileio --file-num=1 --file-total-size=10G --file-io-mode=sync --file-extra-flags=direct --file-test-mode=rndrd run
```
CPU Usage in Percona Monitoring and Management (PMM):
```
root@iotest:~# vmstat 10
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd    free  buff  cache   si   so     bi    bo    in    cs us sy id wa st
 3  6      0 7137152 26452 762972    0    0  40500  1714  2519  4693  1  6 55 35  3
 2  8      0 7138100 26476 762964    0    0 344971    17 20059 37865  3 13  7 73  5
 0  8      0 7139160 26500 763016    0    0 347448    37 20599 37935  4 17  5 72  3
 2  7      0 7139736 26524 762968    0    0 334730    14 19190 36256  3 15  4 71  6
 4  4      0 7139484 26536 762900    0    0 253995     6 15230 27934  2 11  6 77  4
 0  7      0 7139484 26536 762900    0    0 350854     6 20777 38345  2 13  3 77  5
```
So far, so good: we can see that the I/O-intensive workload clearly corresponds to high IOWait (the “wa” column in the vmstat output).
Let’s continue running our I/O-bound workload and add a heavy CPU-bound load:
```shell
sysbench --threads=8 --time=0 cpu run
```
```
root@iotest:~# vmstat 10
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd    free  buff  cache   si   so     bi    bo    in    cs us sy id wa st
12  4      0 7121640 26832 763476    0    0  48034  1460  2895  5443  6  7 47 37  3
13  3      0 7120416 26856 763464    0    0 256464    14 12404 25937 69 15  0  0 16
 8  8      0 7121020 26880 763496    0    0 325789    16 15788 33383 85 15  0  0  0
10  6      0 7121464 26904 763460    0    0 322954    33 16025 33461 83 15  0  0  1
 9  7      0 7123592 26928 763524    0    0 336794    14 16772 34907 85 15  0  0  1
13  3      0 7124132 26940 763556    0    0 386384    10 17704 38679 84 16  0  0  0
 9  7      0 7128252 26964 763604    0    0 356198    13 16303 35275 84 15  0  0  0
 9  7      0 7128052 26988 763584    0    0 324723    14 13905 30898 80 15  0  0  5
10  6      0 7122020 27012 763584    0    0 380429    16 16770 37079 81 18  0  0  1
```
What happened? IOWait is completely gone, and now this system does not look I/O-bound at all!
In reality, of course, nothing changed for our first workload: it is still just as I/O-bound; it simply became invisible when we look at “IOWait”!
To understand what is happening, we really need to understand what “IOWait” is and how it is computed.
There is a good article that goes into more detail on the subject, but in short, “IOWait” is a kind of idle CPU time. If a CPU core goes idle because there is no work to do, the time is accounted as “idle.” If, however, it goes idle because a process is waiting on disk I/O, the time is counted towards “IOWait.”
However, if a process is waiting on disk I/O while other processes on the system can use the CPU, the time will be counted towards their CPU usage as user/system time instead.
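This accounting can be made concrete by looking at /proc/stat, where the kernel exposes per-CPU time buckets (field order per the proc(5) man page); iowait is simply one of those buckets. Below is a minimal parsing sketch; the sample line is made up for illustration, not taken from a real system:

```python
# Sketch: decode the CPU time buckets the kernel exposes in /proc/stat.
# Field order (per proc(5)): user nice system idle iowait irq softirq steal ...
# "iowait" is idle time during which at least one task was blocked on disk I/O;
# if another runnable task uses the CPU instead, that time lands in user/system.

FIELDS = ["user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal"]

def cpu_times(stat_line: str) -> dict:
    """Parse one 'cpu ...' line from /proc/stat into named jiffy counters."""
    parts = stat_line.split()
    values = [int(v) for v in parts[1:len(FIELDS) + 1]]
    return dict(zip(FIELDS, values))

def iowait_percent(times: dict) -> float:
    """IOWait as a share of total CPU time, as tools like vmstat report 'wa'."""
    total = sum(times.values())
    return 100.0 * times["iowait"] / total

# Illustrative sample line (hypothetical values):
sample = "cpu  100 0 60 550 350 10 5 25"
print(f"iowait: {iowait_percent(cpu_times(sample)):.0f}%")  # 350 of 1100 jiffies
```

Real monitoring tools read this file twice and report the delta between samples, since the counters are cumulative since boot.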
Because of this accounting, other interesting behaviors are possible. Instead of running eight I/O-bound threads, let’s now run a single I/O-bound process on a four-core VM:
```shell
sysbench --threads=1 --time=0 --max-requests=0 fileio --file-num=1 --file-total-size=10G --file-io-mode=sync --file-extra-flags=direct --file-test-mode=rndrd run
```
```
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd    free  buff  cache   si   so     bi    bo    in    cs us sy id wa st
 3  1      0 7130308 27704 763592    0    0  62000    12  4503  8577  3  5 69 20  3
 2  1      0 7127144 27728 763592    0    0  67098    14  4810  9253  2  5 70 20  2
 2  1      0 7128448 27752 763592    0    0  72760    15  5179  9946  2  5 72 20  1
 4  0      0 7133068 27776 763588    0    0  69566    29  4953  9562  2  5 72 21  1
 2  1      0 7131328 27800 763576    0    0  67501    15  4793  9276  2  5 72 20  1
 2  0      0 7128136 27824 763592    0    0  59461    15  4316  8272  2  5 71 20  3
 3  1      0 7129712 27848 763592    0    0  64139    13  4628  8854  2  5 70 20  3
 2  0      0 7128984 27872 763592    0    0  71027    18  5068  9718  2  6 71 20  1
 1  0      0 7128232 27884 763592    0    0  69779    12  4967  9549  2  5 71 20  1
 5  0      0 7128504 27908 763592    0    0  66419    18  4767  9139  2  5 71 20  1
```
Even though this process is completely I/O-bound, IOWait (“wa”) stays below 25%. That is the ceiling: a single thread that is always blocked on disk I/O can account for at most one of the four cores, or 1/4 = 25% of total CPU time. On larger systems with 32, 64, or more cores, such completely I/O-bottlenecked processes will be all but invisible, generating single-digit IOWait percentages.
As such, high IOWait shows that many processes in the system are waiting on disk I/O, but even with low IOWait, disk I/O may be the bottleneck for some processes on the system.
If IOWait is unreliable, what can you use instead to give you better visibility?
First, look at application-specific observability. The application, if it is well instrumented, tends to know best whether it is bound by the disk and which particular tasks are I/O-bound.
If you only have access to Linux metrics, look at the “b” column in vmstat, which counts processes blocked on disk I/O. It will show such processes even when a concurrent CPU-intensive load masks IOWait.
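What vmstat’s “b” column counts is tasks in uninterruptible sleep, reported as state “D” in /proc/<pid>/stat (field positions per the proc(5) man page). A minimal sketch that finds such processes directly; this is an illustrative helper, not part of any tool mentioned above:

```python
import glob

def proc_state(stat_text: str) -> str:
    """Extract the state field from a /proc/<pid>/stat line.

    The comm field is wrapped in parentheses and may itself contain spaces
    or ')', so split on the LAST ')' rather than naively on whitespace.
    """
    after_comm = stat_text.rsplit(")", 1)[1]
    return after_comm.split()[0]  # 'R' running, 'S' sleeping, 'D' blocked, ...

def blocked_pids() -> list:
    """PIDs currently in uninterruptible sleep ('D'), i.e. blocked on I/O."""
    pids = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                if proc_state(f.read()) == "D":
                    pids.append(int(path.split("/")[2]))
        except OSError:
            continue  # process exited while we were scanning
    return pids

if __name__ == "__main__":
    print("blocked on disk I/O:", blocked_pids())
```

Note that “D” state is a point-in-time snapshot, so like vmstat you would sample it repeatedly to see which processes are persistently blocked.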
Finally, you can look at per-process statistics to see which processes are waiting for disk I/O. For Percona Monitoring and Management, you can install a plugin as described in the blog post Understanding Processes Running on Linux Host with Percona Monitoring and Management.
With this extension, we can clearly see which processes are runnable (running or waiting on CPU availability) and which are waiting on disk I/O!
Percona Monitoring and Management is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.
Download Percona Monitoring and Management Today