From: Alan Jenkins <alan.christopher.jenkins@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>,
linux-kernel@vger.kernel.org, Doug Smythies <dsmythies@telus.net>,
linux-pm@vger.kernel.org
Subject: Re: iowait v.s. idle accounting is "inconsistent" - iowait is too low
Date: Fri, 5 Jul 2019 14:37:44 +0100
Message-ID: <26e7faef-7223-3ef8-d09c-e382223ce4fa@gmail.com>
In-Reply-To: <20190705113806.GP3402@hirez.programming.kicks-ass.net>
On 05/07/2019 12:38, Peter Zijlstra wrote:
> On Fri, Jul 05, 2019 at 12:25:46PM +0100, Alan Jenkins wrote:
>> Hi, scheduler experts!
>>
>> My cpu "iowait" time appears to be reported incorrectly. Do you know why
>> this could happen?
> Because iowait is a magic random number that has no sane meaning.
> Personally I'd prefer to just delete the whole thing, except ABI :/
>
> Also see the comment near nr_iowait():
>
> /*
> * IO-wait accounting, and how its mostly bollocks (on SMP).
> *
> * The idea behind IO-wait account is to account the idle time that we could
> * have spend running if it were not for IO. That is, if we were to improve the
> * storage performance, we'd have a proportional reduction in IO-wait time.
> *
> * This all works nicely on UP, where, when a task blocks on IO, we account
> * idle time as IO-wait, because if the storage were faster, it could've been
> * running and we'd not be idle.
> *
> * This has been extended to SMP, by doing the same for each CPU. This however
> * is broken.
> *
> * Imagine for instance the case where two tasks block on one CPU, only the one
> * CPU will have IO-wait accounted, while the other has regular idle. Even
> * though, if the storage were faster, both could've ran at the same time,
> * utilising both CPUs.
> *
> * This means, that when looking globally, the current IO-wait accounting on
> * SMP is a lower bound, by reason of under accounting.
> *
> * Worse, since the numbers are provided per CPU, they are sometimes
> * interpreted per CPU, and that is nonsensical. A blocked task isn't strictly
> * associated with any one particular CPU, it can wake to another CPU than it
> * blocked on. This means the per CPU IO-wait number is meaningless.
> *
> * Task CPU affinities can make all that even more 'interesting'.
> */
Thanks. I take those to be separate problems, but I understand you to
mean there is not much demand for (or point in) "fixing" my issue.
> (2) Compare running "dd" with "taskset -c 1":
>
> %Cpu1 : 0.3 us, 3.0 sy, 0.0 ni, 83.7 id, 12.6 wa, 0.0 hi, 0.3 si, 0.0 st
^ non-zero idle time for Cpu1, despite the pinned IO hog.
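For anyone wanting to check figures like the "id" and "wa" columns above without top, here is a minimal sketch of how they can be derived from two samples of /proc/stat. It assumes the field order documented in proc(5) (user, nice, system, idle, iowait, irq, softirq, steal) and a kernel new enough to report all eight; the function names are my own, and top's exact rounding may differ.

```python
import time

# Field order of a "cpuN" line in /proc/stat, per proc(5).
# Assumption: the kernel reports at least these eight columns.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Return (cpu_name, {field: ticks}) for one 'cpu' line of /proc/stat."""
    parts = line.split()
    ticks = [int(x) for x in parts[1:1 + len(FIELDS)]]
    return parts[0], dict(zip(FIELDS, ticks))


def percentages(before, after):
    """Share of elapsed ticks spent in each state between two samples."""
    delta = {k: after[k] - before[k] for k in FIELDS}
    total = sum(delta.values())
    if total == 0:
        return {k: 0.0 for k in FIELDS}
    return {k: 100.0 * v / total for k, v in delta.items()}


def read_cpu(name, path="/proc/stat"):
    """Read the current tick counters for one CPU, e.g. name='cpu1'."""
    with open(path) as f:
        for line in f:
            if line.startswith(name + " "):
                return parse_cpu_line(line)[1]
    raise KeyError(name)


def sample_cpu(name, interval=1.0):
    """Sample one CPU twice, `interval` seconds apart, and return percentages."""
    first = read_cpu(name)
    time.sleep(interval)
    return percentages(first, read_cpu(name))
```

Calling sample_cpu("cpu1") while the pinned "dd" runs should give numbers comparable to the %Cpu1 line quoted above, since both are computed from the same counters.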
The block layer developers recently decided they could break "disk busy%"
reporting for slow devices (mechanical HDDs), in order to reduce overhead
for fast devices. This means the summary view in "atop" now lacks any
reliable indicator.
I suppose I need to look in "iotop".
The new /proc/pressure/io seems to have caveats related to the iowait
issues... it seems even more complex to interpret for this case, and it
does not seem to work the way I expected.[1]
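As a reference point for reading that file, here is a minimal sketch of a parser for it. It assumes the layout described in the kernel's PSI documentation: a "some" line and a "full" line, each with avg10, avg60 and avg300 (percentages) and total (cumulative stall time in microseconds). The function name is my own, and the sample text in the usage note is made up, not measured output.

```python
def parse_pressure(text):
    """Parse the contents of a /proc/pressure/* file into nested dicts.

    Returns e.g. {"some": {"avg10": 1.5, ..., "total": 12345}, "full": {...}}.
    """
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        kind, pairs = parts[0], parts[1:]
        values = {}
        for pair in pairs:
            key, _, value = pair.partition("=")
            # "total" is an integer microsecond counter; the avgN
            # fields are percentages with two decimals.
            values[key] = int(value) if key == "total" else float(value)
        result[kind] = values
    return result


def read_io_pressure(path="/proc/pressure/io"):
    """Read and parse the IO pressure file (requires a PSI-enabled kernel)."""
    with open(path) as f:
        return parse_pressure(f.read())
```

For example, parse_pressure("some avg10=1.50 avg60=0.75 avg300=0.10 total=12345") returns a dict whose "some" entry holds those four values. Comparing the avg10 value against the "wa" figure from top over the same window is one way to see how the two measures diverge.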
Regards
Alan
[1]
https://unix.stackexchange.com/questions/527342/why-does-the-new-linux-pressure-stall-information-for-io-not-show-as-100/