* [PATCH 0/6] Six blk-mq patches
From: Bart Van Assche @ 2014-12-09 15:57 UTC
  To: Jens Axboe
  Cc: Christoph Hellwig, Robert Elliott, Ming Lei, Alexander Gordeev,
	linux-kernel

Hello,

While testing the SRP initiator multichannel patches that are present
in Linus' tree and that will be included in kernel 3.19, I came up
with several blk-mq patches. Some of these patches fix bugs I
encountered while running SRP tests; the other patches are the result
of source reading. As usual, feedback is welcome.

The patches included in this series are:

0001-blk-mq-Fix-a-use-after-free.patch
0002-blk-mq-Avoid-that-__bt_get_word-wraps-multiple-times.patch
0003-blk-mq-Fix-a-race-between-bt_clear_tag-and-bt_get.patch
0004-blk-mq-Avoid-that-I-O-hangs-in-bt_get.patch
0005-blk-mq-Use-all-available-hardware-queues.patch
0006-blk-mq-Micro-optimize-bt_get.patch

Bart.


* [PATCH 1/6] blk-mq: Fix a use-after-free
From: Bart Van Assche @ 2014-12-09 15:57 UTC
  To: Jens Axboe
  Cc: Christoph Hellwig, Robert Elliott, Ming Lei, Alexander Gordeev,
	linux-kernel

blk-mq users are allowed to free the memory request_queue.tag_set
points at after blk_cleanup_queue() has finished but before
blk_release_queue() has started. This can happen, e.g., in the SCSI
core, which embeds the tag_set structure in a SCSI host structure.
The SCSI host structure is freed by scsi_host_dev_release(), a
function that is called after blk_cleanup_queue() has finished but
possibly before blk_release_queue() runs.

This means that it is not safe to access request_queue.tag_set from
inside blk_release_queue(). Hence remove the blk_sync_queue() call
from blk_release_queue(). That call is not necessary: outstanding
requests must have finished before blk_release_queue() is called.
Additionally, move the blk_mq_free_queue() call from
blk_release_queue() to blk_cleanup_queue() so that
request_queue.tag_set is no longer accessed after it has been freed.
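
The problematic ordering, condensed into a sketch (this illustrates
the call sequence only; it is not the actual kernel code):

        blk_cleanup_queue(q);           /* queue drained; q->tag_set
                                           is still valid here */
        scsi_host_dev_release(dev);     /* frees the SCSI host and hence
                                           the embedded tag_set */
        /* ... later, from the final kobject_put() on the queue: */
        blk_release_queue(kobj);        /* used to call blk_mq_free_queue(),
                                           which dereferenced the freed
                                           tag_set */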

This patch prevents the following kernel oops, which could be
triggered when deleting a SCSI host for which scsi-mq was enabled:

Call Trace:
 [<ffffffff8109a7c4>] lock_acquire+0xc4/0x270
 [<ffffffff814ce111>] mutex_lock_nested+0x61/0x380
 [<ffffffff812575f0>] blk_mq_free_queue+0x30/0x180
 [<ffffffff8124d654>] blk_release_queue+0x84/0xd0
 [<ffffffff8126c29b>] kobject_cleanup+0x7b/0x1a0
 [<ffffffff8126c140>] kobject_put+0x30/0x70
 [<ffffffff81245895>] blk_put_queue+0x15/0x20
 [<ffffffff8125c409>] disk_release+0x99/0xd0
 [<ffffffff8133d056>] device_release+0x36/0xb0
 [<ffffffff8126c29b>] kobject_cleanup+0x7b/0x1a0
 [<ffffffff8126c140>] kobject_put+0x30/0x70
 [<ffffffff8125a78a>] put_disk+0x1a/0x20
 [<ffffffff811d4cb5>] __blkdev_put+0x135/0x1b0
 [<ffffffff811d56a0>] blkdev_put+0x50/0x160
 [<ffffffff81199eb4>] kill_block_super+0x44/0x70
 [<ffffffff8119a2a4>] deactivate_locked_super+0x44/0x60
 [<ffffffff8119a87e>] deactivate_super+0x4e/0x70
 [<ffffffff811b9833>] cleanup_mnt+0x43/0x90
 [<ffffffff811b98d2>] __cleanup_mnt+0x12/0x20
 [<ffffffff8107252c>] task_work_run+0xac/0xe0
 [<ffffffff81002c01>] do_notify_resume+0x61/0xa0
 [<ffffffff814d2c58>] int_signal+0x12/0x17

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robert Elliott <elliott@hp.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Alexander Gordeev <agordeev@redhat.com>
Cc: <stable@vger.kernel.org> # v3.13+
---
 block/blk-core.c  |  3 +++
 block/blk-sysfs.c | 12 ++++--------
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 2e7424b..a681827 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -525,6 +525,9 @@ void blk_cleanup_queue(struct request_queue *q)
 	del_timer_sync(&q->backing_dev_info.laptop_mode_wb_timer);
 	blk_sync_queue(q);
 
+	if (q->mq_ops)
+		blk_mq_free_queue(q);
+
 	spin_lock_irq(lock);
 	if (q->queue_lock != &q->__queue_lock)
 		q->queue_lock = &q->__queue_lock;
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 1fac434..935ea2a 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -492,17 +492,15 @@ static void blk_free_queue_rcu(struct rcu_head *rcu_head)
  *     Currently, its primary task it to free all the &struct request
  *     structures that were allocated to the queue and the queue itself.
  *
- * Caveat:
- *     Hopefully the low level driver will have finished any
- *     outstanding requests first...
+ * Note:
+ *     The low level driver must have finished any outstanding requests first
+ *     via blk_cleanup_queue().
  **/
 static void blk_release_queue(struct kobject *kobj)
 {
 	struct request_queue *q =
 		container_of(kobj, struct request_queue, kobj);
 
-	blk_sync_queue(q);
-
 	blkcg_exit_queue(q);
 
 	if (q->elevator) {
@@ -517,9 +515,7 @@ static void blk_release_queue(struct kobject *kobj)
 	if (q->queue_tags)
 		__blk_queue_free_tags(q);
 
-	if (q->mq_ops)
-		blk_mq_free_queue(q);
-	else
+	if (!q->mq_ops)
 		blk_free_flush_queue(q->fq);
 
 	blk_trace_shutdown(q);
-- 
2.1.2



* [PATCH 2/6] blk-mq: Avoid that __bt_get_word() wraps multiple times
From: Bart Van Assche @ 2014-12-09 15:58 UTC
  To: Jens Axboe
  Cc: Christoph Hellwig, Robert Elliott, Ming Lei, Alexander Gordeev,
	linux-kernel

If __bt_get_word() is called with last_tag != 0, the first
find_next_zero_bit() call fails, the find_next_zero_bit() call after
the wrap-around succeeds but the subsequent test_and_set_bit() call
fails, and the next find_next_zero_bit() call does not find a zero
bit, then a second wrap-around occurs. Avoid this by tracking the
wrap-around state in an additional local variable.
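
Condensed, the search logic after this patch looks as follows (a
sketch, not the literal kernel code: the blk_align_bitmap argument is
simplified to a plain bitmap word, and the kernel's
find_next_zero_bit()/test_and_set_bit() signatures are assumed):

static int __bt_get_word_sketch(unsigned long *word, unsigned int depth,
                                unsigned int last_tag)
{
        unsigned int org_last_tag = last_tag;
        unsigned int end = depth;
        bool wrap = last_tag != 0;      /* wrap around at most once */
        int tag;

        do {
restart:
                tag = find_next_zero_bit(word, end, last_tag);
                if (tag >= end) {
                        if (wrap) {
                                wrap = false;
                                /*
                                 * Use org_last_tag, not last_tag: the
                                 * latter may have been updated below.
                                 */
                                end = org_last_tag;
                                last_tag = 0;
                                goto restart;
                        }
                        /* The whole word has been searched. */
                        return -1;
                }
                last_tag = tag + 1;
        } while (test_and_set_bit(tag, word));  /* lost a race; retry */

        return tag;
}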

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robert Elliott <elliott@hp.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Alexander Gordeev <agordeev@redhat.com>
Cc: <stable@vger.kernel.org> # v3.13+
---
 block/blk-mq-tag.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 728b9a4..5a3db28 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -137,6 +137,7 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
 static int __bt_get_word(struct blk_align_bitmap *bm, unsigned int last_tag)
 {
 	int tag, org_last_tag, end;
+	bool wrap = last_tag != 0;
 
 	org_last_tag = last_tag;
 	end = bm->depth;
@@ -148,8 +149,9 @@ restart:
 			 * We started with an offset, start from 0 to
 			 * exhaust the map.
 			 */
-			if (org_last_tag && last_tag) {
-				end = last_tag;
+			if (wrap) {
+				wrap = false;
+				end = org_last_tag;
 				last_tag = 0;
 				goto restart;
 			}
-- 
2.1.2



* [PATCH 3/6] blk-mq: Fix a race between bt_clear_tag() and bt_get()
From: Bart Van Assche @ 2014-12-09 15:58 UTC
  To: Jens Axboe
  Cc: Christoph Hellwig, Robert Elliott, Ming Lei, Alexander Gordeev,
	linux-kernel

What we need are the following two guarantees:
* Any thread that observes the effect of the test_and_set_bit() call
  in __bt_get_word() also observes the preceding addition of 'current'
  to the appropriate wait list. This is guaranteed by the semantics
  of the spin_unlock() operation performed by prepare_to_wait().
  Hence the conversion of test_and_set_bit_lock() into
  test_and_set_bit().
* The wait lists are examined by bt_clear_tag() only after the tag
  bit has been cleared. clear_bit_unlock() guarantees that any thread
  that observes that the bit has been cleared also observes the store
  operations preceding clear_bit_unlock(). However,
  clear_bit_unlock() does not prevent the wait lists from being
  examined before the tag bit is cleared. Hence the addition of a
  memory barrier between clear_bit() and the wait list examination;
  see the sketch below.
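
As an illustration, the two sides of this handshake look roughly as
follows (a condensed sketch assuming the kernel's wait queue and
bitops APIs; the function names are made up for the example):

static void tag_waiter_side(wait_queue_head_t *wq, unsigned long *word,
                            int bit)
{
        DEFINE_WAIT(wait);

        /*
         * prepare_to_wait() acquires and releases the wait queue lock,
         * so any CPU that observes the bit set below also observes the
         * addition of 'current' to the wait list.
         */
        prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
        if (test_and_set_bit(bit, word))
                io_schedule();
        finish_wait(wq, &wait);
}

static void tag_freer_side(wait_queue_head_t *wq, unsigned long *word,
                           int bit)
{
        clear_bit(bit, word);
        /*
         * Order the clear_bit() store before the wait list load below;
         * without this barrier the freer could miss a waiter that is
         * about to go to sleep and hence fail to wake it up.
         */
        smp_mb();
        if (waitqueue_active(wq))
                wake_up(wq);
}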

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robert Elliott <elliott@hp.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Alexander Gordeev <agordeev@redhat.com>
Cc: <stable@vger.kernel.org> # v3.13+
---
 block/blk-mq-tag.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 5a3db28..70d9f9e 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -158,7 +158,7 @@ restart:
 			return -1;
 		}
 		last_tag = tag + 1;
-	} while (test_and_set_bit_lock(tag, &bm->word));
+	} while (test_and_set_bit(tag, &bm->word));
 
 	return tag;
 }
@@ -342,11 +342,10 @@ static void bt_clear_tag(struct blk_mq_bitmap_tags *bt, unsigned int tag)
 	struct bt_wait_state *bs;
 	int wait_cnt;
 
-	/*
-	 * The unlock memory barrier need to order access to req in free
-	 * path and clearing tag bit
-	 */
-	clear_bit_unlock(TAG_TO_BIT(bt, tag), &bt->map[index].word);
+	clear_bit(TAG_TO_BIT(bt, tag), &bt->map[index].word);
+
+	/* Ensure that the wait list checks occur after clear_bit(). */
+	smp_mb();
 
 	bs = bt_wake_ptr(bt);
 	if (!bs)
-- 
2.1.2



* [PATCH 4/6] blk-mq: Avoid that I/O hangs in bt_get()
From: Bart Van Assche @ 2014-12-09 15:58 UTC
  To: Jens Axboe
  Cc: Christoph Hellwig, Robert Elliott, Ming Lei, Alexander Gordeev,
	linux-kernel

Avoid that bt_get() hangs if there are fewer hardware queues than CPU
threads. The symptoms of the hang were as follows:
* All tags for a particular hardware queue had been allocated.
* (nr_tags) commands were pending for that hardware queue.
* No commands were pending for the software queues associated with
  that hardware queue.

The call stack that corresponds to the hang is as follows:

io_schedule+0x9c/0x130
bt_get+0xef/0x180
blk_mq_get_tag+0x9f/0xd0
__blk_mq_alloc_request+0x16/0x1f0
blk_mq_map_request+0x123/0x130
blk_mq_make_request+0x69/0x280
generic_make_request+0xc0/0x110
submit_bio+0x64/0x130
do_blockdev_direct_IO+0x1dc8/0x2da0
__blockdev_direct_IO+0x47/0x50
blkdev_direct_IO+0x49/0x50
generic_file_read_iter+0x546/0x610
blkdev_read_iter+0x32/0x40
aio_run_iocb+0x1f8/0x400
do_io_submit+0x121/0x490
SyS_io_submit+0xb/0x10
system_call_fastpath+0x12/0x17
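
In this state nothing runs the hardware queue, so the pending commands
never complete and no tags are ever freed. The fix below therefore runs
the hardware queue and retries the tag allocation once before going to
sleep. Condensed, the resulting wait loop in bt_get() looks as follows
(a sketch; unrelated parts of the loop body are elided):

        do {
                prepare_to_wait(&bs->wait, &wait, TASK_UNINTERRUPTIBLE);

                tag = __bt_get(hctx, bt, last_tag);
                if (tag != -1)
                        break;

                /*
                 * Run the hardware queue so that pending requests can
                 * complete and free their tags, then retry once before
                 * sleeping.
                 */
                blk_mq_run_hw_queue(hctx, false);

                tag = __bt_get(hctx, bt, last_tag);
                if (tag != -1)
                        break;

                blk_mq_put_ctx(data->ctx);
                io_schedule();
                /* re-reading of ctx/hctx/bt after waking up elided */
        } while (1);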

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robert Elliott <elliott@hp.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Alexander Gordeev <agordeev@redhat.com>
Cc: <stable@vger.kernel.org> # v3.13+
---
 block/blk-mq-tag.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 70d9f9e..795996e 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -256,6 +256,12 @@ static int bt_get(struct blk_mq_alloc_data *data,
 		if (tag != -1)
 			break;
 
+		blk_mq_run_hw_queue(hctx, false);
+
+		tag = __bt_get(hctx, bt, last_tag);
+		if (tag != -1)
+			break;
+
 		blk_mq_put_ctx(data->ctx);
 
 		io_schedule();
-- 
2.1.2



* [PATCH 5/6] blk-mq: Use all available hardware queues
From: Bart Van Assche @ 2014-12-09 15:59 UTC
  To: Jens Axboe
  Cc: Christoph Hellwig, Robert Elliott, Ming Lei, Alexander Gordeev,
	linux-kernel

Suppose that a system has two CPU sockets with three cores per socket,
that it does not support hyperthreading, and that a block driver
provides four hardware queues. With the current algorithm this leads
to the following assignment of CPU cores to hardware queues:

  HWQ 0: 0 1
  HWQ 1: 2 3
  HWQ 2: 4 5
  HWQ 3: (none)

This patch changes the queue assignment into:

  HWQ 0: 0 1
  HWQ 1: 2
  HWQ 2: 3 4
  HWQ 3: 5

In other words, this patch has the following three effects:
- All four hardware queues are used instead of only three.
- CPU cores are spread more evenly over hardware queues. For the
  above example the range of the number of CPU cores associated
  with a single HWQ is reduced from [0..2] to [1..2].
- If the number of HWQs is a multiple of the number of CPU sockets,
  it is now guaranteed that all CPU cores associated with a single
  HWQ reside on the same CPU socket.
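
The difference between the two formulas can be verified with a small
standalone userspace program (old_map() is the formula being removed
and new_map() the formula being added; the rest is illustration only):

#include <stdio.h>

/* Old mapping: ceil(nr_cpus / nr_queues) CPU cores per queue, which
 * can leave the last queue(s) without any cores. */
static int old_map(unsigned int nr_cpus, unsigned int nr_queues, int cpu)
{
        return cpu / ((nr_cpus + nr_queues - 1) / nr_queues);
}

/* New mapping: spread the CPU cores evenly over all queues. */
static int new_map(unsigned int nr_cpus, unsigned int nr_queues, int cpu)
{
        return cpu * nr_queues / nr_cpus;
}

int main(void)
{
        for (int cpu = 0; cpu < 6; cpu++)
                printf("CPU %d: old HWQ %d; new HWQ %d\n", cpu,
                       old_map(6, 4, cpu), new_map(6, 4, cpu));
        return 0;
}

Running it reproduces the two tables above.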

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Alexander Gordeev <agordeev@redhat.com>
---
 block/blk-mq-cpumap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
index 1065d7c..8e56455 100644
--- a/block/blk-mq-cpumap.c
+++ b/block/blk-mq-cpumap.c
@@ -17,7 +17,7 @@
 static int cpu_to_queue_index(unsigned int nr_cpus, unsigned int nr_queues,
 			      const int cpu)
 {
-	return cpu / ((nr_cpus + nr_queues - 1) / nr_queues);
+	return cpu * nr_queues / nr_cpus;
 }
 
 static int get_first_sibling(unsigned int cpu)
-- 
2.1.2



* [PATCH 6/6] blk-mq: Micro-optimize bt_get()
From: Bart Van Assche @ 2014-12-09 15:59 UTC
  To: Jens Axboe
  Cc: Christoph Hellwig, Robert Elliott, Ming Lei, Alexander Gordeev,
	linux-kernel

Remove a superfluous finish_wait() call. Convert the two bt_wait_ptr()
calls into a single call.

Note: calling prepare_to_wait() again without an intervening
finish_wait() is safe because a wake-up removes the wait queue entry
from the wait list (autoremove_wake_function() deletes it), after
which prepare_to_wait() re-adds the entry, possibly to a different
wait queue.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robert Elliott <elliott@hp.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Alexander Gordeev <agordeev@redhat.com>
---
 block/blk-mq-tag.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 795996e..2373724 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -248,8 +248,8 @@ static int bt_get(struct blk_mq_alloc_data *data,
 	if (!(data->gfp & __GFP_WAIT))
 		return -1;
 
-	bs = bt_wait_ptr(bt, hctx);
 	do {
+		bs = bt_wait_ptr(bt, hctx);
 		prepare_to_wait(&bs->wait, &wait, TASK_UNINTERRUPTIBLE);
 
 		tag = __bt_get(hctx, bt, last_tag);
@@ -276,8 +276,6 @@ static int bt_get(struct blk_mq_alloc_data *data,
 			hctx = data->hctx;
 			bt = &hctx->tags->bitmap_tags;
 		}
-		finish_wait(&bs->wait, &wait);
-		bs = bt_wait_ptr(bt, hctx);
 	} while (1);
 
 	finish_wait(&bs->wait, &wait);
-- 
2.1.2



* Re: [PATCH 6/6] blk-mq: Micro-optimize bt_get()
From: Jens Axboe @ 2014-12-09 16:06 UTC
  To: Bart Van Assche
  Cc: Christoph Hellwig, Robert Elliott, Ming Lei, Alexander Gordeev,
	linux-kernel

On 12/09/2014 08:59 AM, Bart Van Assche wrote:
> Remove a superfluous finish_wait() call. Convert the two bt_wait_ptr()
> calls into a single call.

I had to double check this, but it looks OK to me. And it's better
than a micro-optimization: it should be a nice win for the tag
contended case.


-- 
Jens Axboe



* Re: [PATCH 4/6] blk-mq: Avoid that I/O hangs in bt_get()
From: Jens Axboe @ 2014-12-09 16:10 UTC
  To: Bart Van Assche
  Cc: Christoph Hellwig, Robert Elliott, Ming Lei, Alexander Gordeev,
	linux-kernel

On 12/09/2014 08:58 AM, Bart Van Assche wrote:
> Avoid that bt_get() hangs if there are fewer hardware queues than CPU
> threads. The symptoms of the hang were as follows:
> * All tags for a particular hardware queue had been allocated.
> * (nr_tags) commands were pending for that hardware queue.
> * No commands were pending for the software queues associated with
>    that hardware queue.

I already queued up the previous one, and added the extra tag get myself.

-- 
Jens Axboe



* Re: [PATCH 5/6] blk-mq: Use all available hardware queues
From: Jens Axboe @ 2014-12-09 16:10 UTC
  To: Bart Van Assche
  Cc: Christoph Hellwig, Robert Elliott, Ming Lei, Alexander Gordeev,
	linux-kernel

On 12/09/2014 08:59 AM, Bart Van Assche wrote:
> Suppose that a system has two CPU sockets with three cores per
> socket, that it does not support hyperthreading, and that a block
> driver provides four hardware queues. With the current algorithm
> this leads to the following assignment of CPU cores to hardware
> queues:
>
>    HWQ 0: 0 1
>    HWQ 1: 2 3
>    HWQ 2: 4 5
>    HWQ 3: (none)
>
> This patch changes the queue assignment into:
>
>    HWQ 0: 0 1
>    HWQ 1: 2
>    HWQ 2: 3 4
>    HWQ 3: 5
>
> In other words, this patch has the following three effects:
> - All four hardware queues are used instead of only three.
> - CPU cores are spread more evenly over hardware queues. For the
>    above example the range of the number of CPU cores associated
>    with a single HWQ is reduced from [0..2] to [1..2].
> - If the number of HWQs is a multiple of the number of CPU sockets,
>    it is now guaranteed that all CPU cores associated with a single
>    HWQ reside on the same CPU socket.

I have thought about this since your last posting, and I think it should 
be a win for most cases to do this, even if we end up with asymmetric 
queue <-> cpu mappings.

-- 
Jens Axboe

