public inbox for linux-kernel@vger.kernel.org
From: Peter Zijlstra <peterz@infradead.org>
To: Alan Jenkins <alan.christopher.jenkins@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	linux-kernel@vger.kernel.org, Doug Smythies <dsmythies@telus.net>,
	linux-pm@vger.kernel.org
Subject: Re: iowait v.s. idle accounting is "inconsistent" - iowait is too low
Date: Fri, 5 Jul 2019 13:38:06 +0200
Message-ID: <20190705113806.GP3402@hirez.programming.kicks-ass.net>
In-Reply-To: <e82b9d7c-81e5-dd80-b9c0-f5f065344e2f@gmail.com>

On Fri, Jul 05, 2019 at 12:25:46PM +0100, Alan Jenkins wrote:
> Hi, scheduler experts!
> 
> My cpu "iowait" time appears to be reported incorrectly.  Do you know why
> this could happen?

Because iowait is a magic random number that has no sane meaning.
Personally I'd prefer to just delete the whole thing, except ABI :/

Also see the comment near nr_iowait():

/*
 * IO-wait accounting, and how it's mostly bollocks (on SMP).
 *
 * The idea behind IO-wait accounting is to account the idle time that we could
 * have spent running if it were not for IO. That is, if we were to improve the
 * storage performance, we'd have a proportional reduction in IO-wait time.
 *
 * This all works nicely on UP, where, when a task blocks on IO, we account
 * idle time as IO-wait, because if the storage were faster, it could've been
 * running and we'd not be idle.
 *
 * This has been extended to SMP, by doing the same for each CPU. This however
 * is broken.
 *
 * Imagine for instance the case where two tasks block on one CPU: only the one
 * CPU will have IO-wait accounted, while the other has regular idle. Even
 * though, if the storage were faster, both could've run at the same time,
 * utilising both CPUs.
 *
 * This means, that when looking globally, the current IO-wait accounting on
 * SMP is a lower bound, by reason of under accounting.
 *
 * Worse, since the numbers are provided per CPU, they are sometimes
 * interpreted per CPU, and that is nonsensical. A blocked task isn't strictly
 * associated with any one particular CPU, it can wake to another CPU than it
 * blocked on. This means the per CPU IO-wait number is meaningless.
 *
 * Task CPU affinities can make all that even more 'interesting'.
 */



