linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alan Jenkins <alan.christopher.jenkins@gmail.com>
To: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org
Cc: Doug Smythies <dsmythies@telus.net>, linux-pm@vger.kernel.org
Subject: Re: iowait v.s. idle accounting is "inconsistent" - iowait is too low
Date: Fri, 5 Jul 2019 12:25:46 +0100	[thread overview]
Message-ID: <e82b9d7c-81e5-dd80-b9c0-f5f065344e2f@gmail.com> (raw)
In-Reply-To: <000001d531a8$8931b2a0$9b9517e0$@net>

Hi, scheduler experts!

My cpu "iowait" time appears to be reported incorrectly.  Do you know 
why this could happen?

Doug helped me - it was he who noticed different behaviour on v4.15 vs 
v4.16 vs v4.17+.  So I have some confirmation of this.  I don't think 
Doug mentioned what hardware he ran the kernels on.  lscpu says my 
hardware is "Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz".

I tested using "dd" :-

dd if=bigfile bs=1M iflag=direct of=/dev/null

  (1) E.g. I get the expected result if I pin "dd" to the "right" cpu.  
Which cpu varies; it has often been cpu2.  At the moment I have booted 
5.2-rc5-ish.  Here I run "dd" with "taskset -c 0", and "iowait" worked 
as expected:

top - 11:01:47 up 15:10,  2 users,  load average: 1.07, 0.86, 0.86
Tasks: 288 total,   1 running, 287 sleeping,   0 stopped,   0 zombie
%Cpu0  :  0.3 us,  3.3 sy,  0.0 ni,  0.0 id, 94.3 wa,  0.7 hi,  1.3 si,  0.0 st
%Cpu1  :  0.3 us,  0.7 sy,  0.0 ni, 98.7 id,  0.0 wa,  0.3 hi,  0.0 si,  0.0 st
%Cpu2  :  1.0 us,  0.7 sy,  0.0 ni, 98.0 id,  0.0 wa,  0.3 hi,  0.0 si,  0.0 st
%Cpu3  :  0.7 us,  0.3 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   7854.0 total,    432.2 free,   4616.4 used,   2805.4 buff/cache
MiB Swap:   2048.0 total,   1978.2 free,     69.8 used.   2498.0 avail Mem

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
31849 alan-sy+  20   0  216052   2836   1800 D   3.0   0.0   0:00.58 dd
24220 alan-sy+  20   0 3339828 232160 126720 S   0.7   2.9   1:53.14 gnome-shell
...

(I have also used "atop" and "vmstat 3". "atop" shows both total and 
per-cpu iowait, idle, etc. "vmstat 3" just shows a total, but all the 
old values stay on-screen).

  (2) But compare running "dd" with "taskset -c 1":

%Cpu0  :  0.3 us,  0.3 sy,  0.0 ni, 97.3 id,  0.0 wa,  0.7 hi,  1.3 si,  0.0 st
%Cpu1  :  0.3 us,  3.0 sy,  0.0 ni, 83.7 id, 12.6 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu2  :  1.0 us,  0.3 sy,  0.0 ni, 98.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  1.3 us,  0.7 sy,  0.0 ni, 98.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

  (3) If I don't use "taskset", "dd" generally doesn't sit on the 
"right" cpu, and so I get don't see the right "iowait".  Here's "top -d 
30".  Over this longer interval, "dd" appears to spend a quarter of its 
time on the "right" cpu:

%Cpu0  :  0.5 us,  1.2 sy,  0.0 ni, 74.4 id, 22.2 wa,  0.5 hi,  1.3 si,  0.0 st
%Cpu1  :  0.7 us,  1.3 sy,  0.0 ni, 92.3 id,  5.7 wa,  0.1 hi,  0.0 si,  0.0 st
%Cpu2  :  0.5 us,  0.9 sy,  0.0 ni, 95.1 id,  3.3 wa,  0.1 hi,  0.0 si,  0.0 st
%Cpu3  :  0.7 us,  0.9 sy,  0.0 ni, 94.6 id,  3.6 wa,  0.1 hi,  0.0 si,  0.0 st

This point (3) does not apply to 4.15.  On 4.15, it seems "dd" naturally 
sits on the "right" cpu, so I get the "right" iowait". But if I pin "dd" 
to a different cpu, I get the "wrong" iowait again.

I bisected 4.15-4.16.  The first "bad" commit was 806486c377e3 
"sched/fair: Do not migrate if the prev_cpu is idle"

  (4) I can get the "right" iowait regardless of which cpu, if I boot 
with "nohz=off", or if I suppress nohz by dynamically disabling all 
cpuidle states except for state0 (POLL).

This point (4) does not apply to 4.16. On 4.15 and 4.16, suppressing 
nohz does not help.  (So far, I did not test 4.15).

I bisected 4.16-4.17.    The first "new" commit was 554c8aa8ecad "sched: 
idle: Select idle state before stopping the tick"

  (5) I seem to get the "right" iowait regardless of which cpu, if I run 
inside a virtual machine.  I tested stock Fedora v5.1.? inside a KVM 
(virt-manager) virtual machine, which also had 4 cpus.  Whereas 
un-virtualized Fedora v5.1.? on my laptop, behaves as per points 1-4.

I read the documented limitations for "iowait" time. As far as I 
understand them, they don't explain such inconsistent values.

> - iowait: In a word, iowait stands for waiting for I/O to complete. But there
> are several problems:
>
> 1. Cpu will not wait for I/O to complete, iowait is the time that a task is
>    waiting for I/O to complete. When cpu goes into idle state for
>    outstanding task io, another task will be scheduled on this CPU.
>
> 2. In a multi-core CPU, the task waiting for I/O to complete is not running
>    on any CPU, so the iowait of each CPU is difficult to calculate.
>
> 3. The value of iowait field in /proc/stat will decrease in certain
>    conditions


Thanks for all the kernels
Alan

  parent reply	other threads:[~2019-07-05 11:25 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-01 15:33 NO_HZ_IDLE causes consistently low cpu "iowait" time (and higher cpu "idle" time) Alan Jenkins
2019-07-03 14:06 ` Doug Smythies
2019-07-03 16:09   ` Alan Jenkins
2019-07-05 11:25   ` Alan Jenkins [this message]
2019-07-05 11:38     ` iowait v.s. idle accounting is "inconsistent" - iowait is too low Peter Zijlstra
2019-07-05 13:37       ` Alan Jenkins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e82b9d7c-81e5-dd80-b9c0-f5f065344e2f@gmail.com \
    --to=alan.christopher.jenkins@gmail.com \
    --cc=dsmythies@telus.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).