public inbox for linux-kernel@vger.kernel.org
From: Robert Elliott <relliott@beardog.cce.hp.com>
To: axboe@kernel.dk, elliott@hp.com, hch@lst.de,
	linux-kernel@vger.kernel.org
Subject: [PATCH 1/2] block: default to rq_affinity=2 for blk-mq
Date: Tue, 09 Sep 2014 19:18:01 -0500
Message-ID: <20140910001801.9294.79720.stgit@beardog.cce.hp.com>
In-Reply-To: <20140910001417.9294.40414.stgit@beardog.cce.hp.com>

From: Robert Elliott <elliott@hp.com>

One change introduced by blk-mq is that it does all
the completion work in hard irq context rather than
soft irq context.

On a 6-core system, if all interrupts are routed to
one CPU, then you can easily run into this:
* 5 CPUs submitting IOs
* 1 CPU spending 100% of its time in hard irq context
  processing IO completions, not able to submit anything
  itself

Example with CPU5 receiving all interrupts:
   CPU usage:   CPU0   CPU1   CPU2   CPU3   CPU4   CPU5
        %usr:   0.00   3.03   1.01   2.02   2.00   0.00
        %sys:  14.58  75.76  14.14   4.04  78.00   0.00
        %irq:   0.00   0.00   0.00   1.01   0.00 100.00
       %soft:   0.00   0.00   0.00   0.00   0.00   0.00
%iowait idle:  85.42  21.21  84.85  92.93  20.00   0.00
       %idle:   0.00   0.00   0.00   0.00   0.00   0.00

When the submitting CPUs are forced to process their own
completion interrupts, this steals time from new
submissions and self-throttles them.

Without that, there is no direct feedback to the
submitters to slow down.  The only feedback is:
* reaching max queue depth
* lots of timeouts, resulting in aborts, resets, soft
  lockups and self-detected stalls on CPU5, bogus
  clocksource tsc unstable reports, network
  drop-offs, etc.

The SCSI LLD can set affinity_hint for each of its
interrupts to request that a program like irqbalance
route the interrupts back to the submitting CPU.
The latest version of irqbalance ignores those hints,
though, instead offering an option to run a policy
script that could honor them. Otherwise, it balances
them based on its own algorithms. So, we cannot rely
on this.
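
To check by hand whether a given interrupt's hint actually names the
submitting CPU, the hex mask exposed in /proc/irq/<irq>/affinity_hint
can be decoded. The following is a minimal userspace sketch, not part
of this patch; the helper names hex_digit_value() and mask_has_cpu()
are hypothetical, and the code assumes the usual format of hex digits
in comma-separated groups with the lowest-numbered CPUs in the
rightmost digits.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if not a hex digit. */
static int hex_digit_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = (char)tolower((unsigned char)c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/*
 * Hypothetical helper: returns 1 if 'cpu' is set in the hex mask string,
 * 0 if it is clear, and -1 if the string is empty or malformed.
 * Commas between groups are skipped; trailing whitespace is ignored.
 */
static int mask_has_cpu(const char *mask, unsigned int cpu)
{
	size_t len = strlen(mask);
	unsigned int nibble = 0;	/* index of the hex digit, counted from the right */
	int result = 0;
	size_t i;

	while (len > 0 && (mask[len - 1] == '\n' || mask[len - 1] == ' '))
		len--;
	if (len == 0)
		return -1;

	for (i = len; i > 0; i--) {
		char c = mask[i - 1];
		int v;

		if (c == ',')
			continue;
		v = hex_digit_value(c);
		if (v < 0)
			return -1;
		/* Each hex digit covers 4 CPUs. */
		if (nibble == cpu / 4 && ((v >> (cpu % 4)) & 1))
			result = 1;
		nibble++;
	}
	return result;
}
```

For the CPU5 example above, a hint of "20" selects CPU5 only, so
mask_has_cpu("20", 5) reports the bit as set.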

Hardware might perform interrupt coalescing to help,
but it cannot help 1 CPU keep up with the work
generated by many other CPUs.

rq_affinity=2 helps by pushing most of the block layer
and SCSI midlayer completion work back to the submitting
CPU (via an IPI).
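
Independent of any default, the setting can be changed per device
through the rq_affinity attribute in the queue's sysfs directory. The
following is a minimal userspace sketch, not part of this patch; the
helper names rq_affinity_path() and write_rq_affinity() are
hypothetical, and the disk name passed in is whatever the caller
chooses.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build the sysfs path of the rq_affinity
 * attribute for a disk name such as "sda".
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int rq_affinity_path(char *buf, size_t len, const char *disk)
{
	int n = snprintf(buf, len, "/sys/block/%s/queue/rq_affinity", disk);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: write 0, 1 or 2 to the given attribute path.
 * Returns 0 on success, -1 on an out-of-range value or I/O error.
 * Writing to sysfs normally requires root.
 */
static int write_rq_affinity(const char *path, int value)
{
	FILE *f;
	int ok;

	if (value < 0 || value > 2)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A caller would build the path with rq_affinity_path() and then pass it
to write_rq_affinity() with the value 2 to request completion on the
submitting CPU.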

Change the default to rq_affinity=2 under blk-mq
so there's at least some feedback to slow down the
submitters.
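
For readers mapping the rq_affinity value onto the two queue flags
touched by the diff, here is a small standalone model. It is a sketch
based on the sysfs attribute reporting 0, 1 or 2 from the SAME_COMP and
SAME_FORCE bits, not kernel code: the bit positions, the
OLD_MQ_DEFAULT/NEW_MQ_DEFAULT names and both helper functions are
assumptions made for illustration.

```c
#include <stdbool.h>

/* Bit positions are hypothetical; the kernel defines its own values. */
enum {
	QUEUE_FLAG_IO_STAT    = 0,
	QUEUE_FLAG_SAME_COMP  = 1,
	QUEUE_FLAG_SAME_FORCE = 2,
};

/* Model of QUEUE_FLAG_MQ_DEFAULT before and after this patch. */
#define OLD_MQ_DEFAULT	((1UL << QUEUE_FLAG_IO_STAT) |	\
			 (1UL << QUEUE_FLAG_SAME_COMP))
#define NEW_MQ_DEFAULT	(OLD_MQ_DEFAULT |		\
			 (1UL << QUEUE_FLAG_SAME_FORCE))

/* Hypothetical helper: rq_affinity value implied by a flag word. */
static int rq_affinity_from_flags(unsigned long flags)
{
	bool same  = flags & (1UL << QUEUE_FLAG_SAME_COMP);
	bool force = flags & (1UL << QUEUE_FLAG_SAME_FORCE);

	if (!same)
		return 0;	/* SAME_FORCE alone has no effect */
	return force ? 2 : 1;
}

/*
 * Hypothetical helper: flag word after storing an rq_affinity value.
 * Values other than 0, 1 and 2 leave the flags unchanged.
 */
static unsigned long flags_from_rq_affinity(unsigned long flags, int val)
{
	const unsigned long both = (1UL << QUEUE_FLAG_SAME_COMP) |
				   (1UL << QUEUE_FLAG_SAME_FORCE);

	if (val < 0 || val > 2)
		return flags;
	flags &= ~both;
	if (val >= 1)
		flags |= 1UL << QUEUE_FLAG_SAME_COMP;
	if (val == 2)
		flags |= 1UL << QUEUE_FLAG_SAME_FORCE;
	return flags;
}
```

In this model the old default reads back as rq_affinity=1 and the new
default as rq_affinity=2, which is the only behavioral change the diff
makes.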

Signed-off-by: Robert Elliott <elliott@hp.com>
---
 include/linux/blkdev.h |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 518b465..9f41a02 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -522,7 +522,8 @@ struct request_queue {
 				 (1 << QUEUE_FLAG_ADD_RANDOM))
 
 #define QUEUE_FLAG_MQ_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
-				 (1 << QUEUE_FLAG_SAME_COMP))
+				 (1 << QUEUE_FLAG_SAME_COMP)	|	\
+				 (1 << QUEUE_FLAG_SAME_FORCE))
 
 static inline void queue_lockdep_assert_held(struct request_queue *q)
 {



Thread overview: 7+ messages
2014-09-10  0:17 [PATCH 0/2] block: rq_affinity default and reserved tag limits Robert Elliott
2014-09-10  0:18 ` Robert Elliott [this message]
2014-09-10 18:14   ` [PATCH 1/2] block: default to rq_affinity=2 for blk-mq Jens Axboe
2014-09-10 19:35     ` Elliott, Robert (Server Storage)
2014-09-10 19:51       ` Jens Axboe
2014-09-10  0:18 ` [PATCH 2/2] block: return error if too many reserved tags are requested Robert Elliott
2014-09-10 18:17   ` Jens Axboe