All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@redhat.com>
To: Keith Busch <keith.busch@intel.com>,
	Bart Van Assche <bvanassche@acm.org>,
	Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>,
	Jun'ichi Nomura <j-nomura@ce.jp.nec.com>,
	device-mapper development <dm-devel@redhat.com>
Subject: Re: blk-mq request allocation stalls [was: Re: [PATCH v3 0/8] dm: add request-based blk-mq support]
Date: Fri, 9 Jan 2015 14:49:56 -0500	[thread overview]
Message-ID: <20150109194955.GA32641@redhat.com> (raw)
In-Reply-To: <alpine.LNX.2.00.1501071755110.4026@localhost.lm.intel.com>

[-- Attachment #1: Type: text/plain, Size: 9291 bytes --]

On Wed, Jan 07 2015 at  3:40pm -0500,
Keith Busch <keith.busch@intel.com> wrote:

> On Wed, 7 Jan 2015, Bart Van Assche wrote:
> >On 01/06/15 17:15, Jens Axboe wrote:
> >>blk-mq request allocation is pretty much as optimized/fast as it can be.
> >>The slowdown must be due to one of two reasons:
> >>
> >>- A bug related to running out of requests, perhaps a missing queue run
> >>or something like that.
> >>- A smaller number of available requests, due to the requested queue depth.
> >>
> >>Looking at Barts results, it looks like it's usually fast, but sometimes
> >>very slow. That would seem to indicate it's option #1 above that is the
> >>issue. Bart, since this seems to wait for quite a bit, would it be
> >>possible to cat the 'tags' file for that queue when it is stuck like that?
> >
> >Hello Jens,
> >
> >Thanks for the assistance. Is this the output you were looking for
> 
> I'm a little confused by the later comments given the below data. It says
> multipath_clone_and_map() is stuck at bt_get, but that doesn't block
> unless there are no tags available. The tags should be coming from one
> of dm-1's path queues, and I'm assuming these queues are provided by sdc
> and sdd. All their tags are free, so that looks like a missing wake_up
> when the queue idles.

Like I said in an earlier email, I cannot reproduce Bart's hangs running
mkfs.xfs against a multipath device that is built ontop of a virtio
device in a KVM guest.

But I can hit __bt_get() failures on the virtio-blk device that I'm
using for the root device on this guest.  Bart I'd be interested to see
what you get when running the attached debug patch (likely will just
echo the same type of info you've already provided).

There does appear to be something weird going on with bt_get().  With
the debug patch I'm seeing the following when I simply run "make install" 
of the kernel (it'll run dracut to build the initramfs, etc):

You'll note that in all instances where __bt_get() returns -1 nr_free isn't 0.

[  441.332886] bt_get: __bt_get() returned -1
[  441.336246] queue_num=0, nr_tags=128, reserved_tags=0, bits_per_word=5
[  441.338368] nr_free=128, nr_reserved=0
[  441.340076] active_queues=0
[  441.341636] CPU: 2 PID: 190 Comm: jbd2/vda3-8 Tainted: G        W      3.18.0+ #18
[  441.343897] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  441.345904]  0000000000000000 0000000012f1ab3d ffff8801198cb938 ffffffff8168b8fb
[  441.348283]  0000000000000000 00000000ffffffff ffff8801198cb9b8 ffffffff81307f53
[  441.350645]  0000000000000000 ffffffff81304bd8 0000000000000000 ffff8800358e43b0
[  441.352996] Call Trace:
[  441.354544]  [<ffffffff8168b8fb>] dump_stack+0x4c/0x65
[  441.356472]  [<ffffffff81307f53>] bt_get+0xc3/0x210
[  441.358346]  [<ffffffff81304bd8>] ? blk_mq_queue_enter+0x38/0x220
[  441.360409]  [<ffffffff810bfc20>] ? prepare_to_wait_event+0x110/0x110
[  441.362524]  [<ffffffff813083af>] blk_mq_get_tag+0xbf/0xf0
[  441.364480]  [<ffffffff81303a4b>] __blk_mq_alloc_request+0x1b/0x210
[  441.366536]  [<ffffffff813054f0>] blk_mq_map_request+0xd0/0x240
[  441.368542]  [<ffffffff81306b2a>] blk_sq_make_request+0x9a/0x3b0
[  441.370564]  [<ffffffff812f6f44>] ? generic_make_request_checks+0x2a4/0x410
[  441.372719]  [<ffffffff8118bb95>] ? mempool_alloc_slab+0x15/0x20
[  441.374725]  [<ffffffff812f7190>] generic_make_request+0xe0/0x130
[  441.376746]  [<ffffffff812f7257>] submit_bio+0x77/0x150
[  441.378642]  [<ffffffff812f1e96>] ? bio_alloc_bioset+0x1d6/0x330
[  441.380648]  [<ffffffff8123d6e9>] _submit_bh+0x119/0x170
[  441.382552]  [<ffffffff8123d750>] submit_bh+0x10/0x20
[  441.384460]  [<ffffffffa015ab34>] jbd2_journal_commit_transaction+0x774/0x1b90 [jbd2]
[  441.386763]  [<ffffffff816948f6>] ? _raw_spin_unlock_irqrestore+0x36/0x70
[  441.388903]  [<ffffffff810eccd5>] ? del_timer_sync+0x5/0xf0
[  441.390864]  [<ffffffffa015fa24>] kjournald2+0xd4/0x2a0 [jbd2]
[  441.392860]  [<ffffffff810bfc20>] ? prepare_to_wait_event+0x110/0x110
[  441.394952]  [<ffffffffa015f950>] ? commit_timeout+0x10/0x10 [jbd2]
[  441.397054]  [<ffffffff8109a3f7>] kthread+0x107/0x120
[  441.398948]  [<ffffffff8109a2f0>] ? kthread_create_on_node+0x240/0x240
[  441.401053]  [<ffffffff8169533c>] ret_from_fork+0x7c/0xb0
[  441.403003]  [<ffffffff8109a2f0>] ? kthread_create_on_node+0x240/0x240
[  446.719847]
[  446.719847] bt_get: __bt_get() returned -1
[  446.724288] queue_num=0, nr_tags=128, reserved_tags=0, bits_per_word=5
[  446.727624] nr_free=128, nr_reserved=0
[  446.730070] active_queues=0
[  446.732347] CPU: 2 PID: 190 Comm: jbd2/vda3-8 Tainted: G        W      3.18.0+ #18
[  446.735856] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  446.738912]  0000000000000000 0000000012f1ab3d ffff8801198cb938 ffffffff8168b8fb
[  446.742712]  0000000000000000 00000000ffffffff ffff8801198cb9b8 ffffffff81307f53
[  446.746464]  0000000000000000 ffffffff81304bd8 0000000000000000 ffff8800358e43b0
[  446.750217] Call Trace:
[  446.752287]  [<ffffffff8168b8fb>] dump_stack+0x4c/0x65
[  446.755089]  [<ffffffff81307f53>] bt_get+0xc3/0x210
[  446.757844]  [<ffffffff81304bd8>] ? blk_mq_queue_enter+0x38/0x220
[  446.760964]  [<ffffffff810bfc20>] ? prepare_to_wait_event+0x110/0x110
[  446.764105]  [<ffffffff813083af>] blk_mq_get_tag+0xbf/0xf0
[  446.766956]  [<ffffffff81303a4b>] __blk_mq_alloc_request+0x1b/0x210
[  446.770087]  [<ffffffff813054f0>] blk_mq_map_request+0xd0/0x240
[  446.773151]  [<ffffffff81306b2a>] blk_sq_make_request+0x9a/0x3b0
[  446.776208]  [<ffffffff812f6f44>] ? generic_make_request_checks+0x2a4/0x410
[  446.779576]  [<ffffffff8118bb95>] ? mempool_alloc_slab+0x15/0x20
[  446.782655]  [<ffffffff812f7190>] generic_make_request+0xe0/0x130
[  446.785727]  [<ffffffff812f7257>] submit_bio+0x77/0x150
[  446.788477]  [<ffffffff812f1e96>] ? bio_alloc_bioset+0x1d6/0x330
[  446.791533]  [<ffffffff8123d6e9>] _submit_bh+0x119/0x170
[  446.794317]  [<ffffffff8123d750>] submit_bh+0x10/0x20
[  446.797051]  [<ffffffffa015ab34>] jbd2_journal_commit_transaction+0x774/0x1b90 [jbd2]
[  446.800564]  [<ffffffff816948f6>] ? _raw_spin_unlock_irqrestore+0x36/0x70
[  446.803908]  [<ffffffff810eccd5>] ? del_timer_sync+0x5/0xf0
[  446.806922]  [<ffffffffa015fa24>] kjournald2+0xd4/0x2a0 [jbd2]
[  446.810019]  [<ffffffff810bfc20>] ? prepare_to_wait_event+0x110/0x110
[  446.813177]  [<ffffffffa015f950>] ? commit_timeout+0x10/0x10 [jbd2]
[  446.816362]  [<ffffffff8109a3f7>] kthread+0x107/0x120
[  446.819165]  [<ffffffff8109a2f0>] ? kthread_create_on_node+0x240/0x240
[  446.822359]  [<ffffffff8169533c>] ret_from_fork+0x7c/0xb0
[  446.825195]  [<ffffffff8109a2f0>] ? kthread_create_on_node+0x240/0x240
[  471.843384]
[  471.843384] bt_get: __bt_get() returned -1
[  471.847400] queue_num=0, nr_tags=128, reserved_tags=0, bits_per_word=5
[  471.849502] nr_free=127, nr_reserved=0
[  471.851226] active_queues=0
[  471.852765] CPU: 3 PID: 6 Comm: kworker/u8:0 Tainted: G        W      3.18.0+ #18
[  471.855007] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  471.857018] Workqueue: writeback bdi_writeback_workfn (flush-253:0)
[  471.859133]  0000000000000000 00000000bbb0d8f3 ffff88011ab8b6c8 ffffffff8168b8fb
[  471.861481]  0000000000000000 00000000ffffffff ffff88011ab8b748 ffffffff81307f53
[  471.863826]  0000000000000000 ffffffff81304bd8 0000000000000000 ffff88011ab80000
[  471.866181] Call Trace:
[  471.867703]  [<ffffffff8168b8fb>] dump_stack+0x4c/0x65
[  471.869628]  [<ffffffff81307f53>] bt_get+0xc3/0x210
[  471.871494]  [<ffffffff81304bd8>] ? blk_mq_queue_enter+0x38/0x220
[  471.873539]  [<ffffffff810bfc20>] ? prepare_to_wait_event+0x110/0x110
[  471.875640]  [<ffffffff813083af>] blk_mq_get_tag+0xbf/0xf0
[  471.877598]  [<ffffffff81303a4b>] __blk_mq_alloc_request+0x1b/0x210
[  471.879653]  [<ffffffff813054f0>] blk_mq_map_request+0xd0/0x240
[  471.881645]  [<ffffffff81306b2a>] blk_sq_make_request+0x9a/0x3b0
[  471.883653]  [<ffffffff812f6f44>] ? generic_make_request_checks+0x2a4/0x410
[  471.885798]  [<ffffffff812f7190>] generic_make_request+0xe0/0x130
[  471.887807]  [<ffffffff812f7257>] submit_bio+0x77/0x150
[  471.889699]  [<ffffffffa017ebb5>] ? ext4_writepages+0xa75/0xde0 [ext4]
[  471.891792]  [<ffffffffa01824c9>] ext4_io_submit+0x29/0x50 [ext4]
[  471.893819]  [<ffffffffa017eaab>] ext4_writepages+0x96b/0xde0 [ext4]
[  471.895879]  [<ffffffff81195251>] do_writepages+0x21/0x50
[  471.897795]  [<ffffffff81230b75>] __writeback_single_inode+0x55/0x2c0
[  471.899866]  [<ffffffff81231b36>] writeback_sb_inodes+0x2a6/0x4a0
[  471.901887]  [<ffffffff81207ed4>] ? grab_super_passive+0x44/0x90
[  471.903898]  [<ffffffff81231dcf>] __writeback_inodes_wb+0x9f/0xd0
[  471.905919]  [<ffffffff81232733>] wb_writeback+0x323/0x350
[  471.907845]  [<ffffffff81234b3d>] bdi_writeback_workfn+0x32d/0x4f0
[  471.909878]  [<ffffffff81094471>] process_one_work+0x1d1/0x520
[  471.911858]  [<ffffffff810943fa>] ? process_one_work+0x15a/0x520
[  471.913867]  [<ffffffff810948db>] worker_thread+0x11b/0x4a0
[  471.915809]  [<ffffffff810947c0>] ? process_one_work+0x520/0x520
[  471.917820]  [<ffffffff8109a3f7>] kthread+0x107/0x120
[  471.919677]  [<ffffffff8109a2f0>] ? kthread_create_on_node+0x240/0x240
[  471.921740]  [<ffffffff8169533c>] ret_from_fork+0x7c/0xb0
[  471.923630]  [<ffffffff8109a2f0>] ? kthread_create_on_node+0x240/0x240

[-- Attachment #2: blk-mq-debug-bt_get.patch --]
[-- Type: text/plain, Size: 1659 bytes --]

 block/blk-mq-tag.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 32e8dbb..8d4b4f0 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -232,6 +232,27 @@ static struct bt_wait_state *bt_wait_ptr(struct blk_mq_bitmap_tags *bt,
 	return bs;
 }
 
+static unsigned int bt_unused_tags(struct blk_mq_bitmap_tags *bt);
+
+static void print_hctx_tags_usage(struct blk_mq_hw_ctx *hctx)
+{
+	unsigned int free, res;
+	struct blk_mq_tags *tags = hctx->tags;
+
+	if (!tags)
+		return;
+
+	printk("queue_num=%d, nr_tags=%u, reserved_tags=%u, bits_per_word=%u\n",
+	       hctx->queue_num, tags->nr_tags, tags->nr_reserved_tags,
+	       tags->bitmap_tags.bits_per_word);
+
+	free = bt_unused_tags(&tags->bitmap_tags);
+	res = bt_unused_tags(&tags->breserved_tags);
+
+	printk("nr_free=%u, nr_reserved=%u\n", free, res);
+	printk("active_queues=%u\n", atomic_read(&tags->active_queues));
+}
+
 static int bt_get(struct blk_mq_alloc_data *data,
 		struct blk_mq_bitmap_tags *bt,
 		struct blk_mq_hw_ctx *hctx,
@@ -245,6 +266,10 @@ static int bt_get(struct blk_mq_alloc_data *data,
 	if (tag != -1)
 		return tag;
 
+	printk("\n%s: __bt_get() returned -1\n", __func__);
+	print_hctx_tags_usage(hctx);
+	dump_stack();
+
 	if (!(data->gfp & __GFP_WAIT))
 		return -1;
 
@@ -256,6 +281,9 @@ static int bt_get(struct blk_mq_alloc_data *data,
 		if (tag != -1)
 			break;
 
+		printk("%s: __bt_get() _still_ returned -1\n", __func__);
+		print_hctx_tags_usage(hctx);
+
 		/*
 		 * We're out of tags on this hardware queue, kick any
 		 * pending IO submits before going to sleep waiting for

[-- Attachment #3: Type: text/plain, Size: 0 bytes --]



  reply	other threads:[~2015-01-09 19:49 UTC|newest]

Thread overview: 95+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-17  3:59 [PATCH v3 0/8] dm: add request-based blk-mq support Mike Snitzer
2014-12-17  3:59 ` [PATCH v3 1/8] block: require blk_rq_prep_clone() be given an initialized clone request Mike Snitzer
2014-12-17  3:59 ` [PATCH v3 2/8] block: initialize bio member of blk-mq request to NULL Mike Snitzer
2014-12-17  3:59 ` [PATCH v3 3/8] block: add blk-mq support to blk_insert_cloned_request() Mike Snitzer
2014-12-17  4:00 ` [PATCH v3 4/8] block: mark blk-mq devices as stackable Mike Snitzer
2014-12-17  4:00 ` [PATCH v3 5/8] dm: remove exports for request-based interfaces without external callers Mike Snitzer
2014-12-17  4:00 ` [PATCH v3 6/8] dm: split request structure out from dm_rq_target_io structure Mike Snitzer
2014-12-17  4:00 ` [PATCH v3 7/8] dm: submit stacked requests in irq enabled context Mike Snitzer
2014-12-17  4:00 ` [PATCH v3 8/8] dm: allocate requests from target when stacking on blk-mq devices Mike Snitzer
2014-12-17 22:35   ` Mike Snitzer
2014-12-17 21:42 ` [PATCH v3 0/8] dm: add request-based blk-mq support Keith Busch
2014-12-17 21:43   ` Jens Axboe
2014-12-17 23:06     ` Mike Snitzer
2014-12-18  1:41       ` Keith Busch
2014-12-18  4:58         ` Mike Snitzer
2014-12-19 14:32       ` Bart Van Assche
2014-12-19 15:38         ` Mike Snitzer
2014-12-19 17:14           ` Mike Snitzer
2014-12-22 15:28             ` Bart Van Assche
2014-12-22 18:49               ` Mike Snitzer
2014-12-23 16:24                 ` Bart Van Assche
2014-12-23 17:13                   ` Mike Snitzer
2014-12-23 21:42                     ` Mike Snitzer
2014-12-24 13:02                       ` Bart Van Assche
2014-12-24 18:21                         ` Mike Snitzer
2014-12-24 18:55                           ` Mike Snitzer
2014-12-24 19:26                             ` Mike Snitzer
2015-01-02 17:53                               ` Bart Van Assche
2015-01-05 21:35                                 ` Mike Snitzer
2015-01-06  8:59                                   ` Christoph Hellwig
2015-01-06  9:31                                   ` Bart Van Assche
2015-01-06 16:05                                     ` blk-mq request allocation stalls [was: Re: [PATCH v3 0/8] dm: add request-based blk-mq support] Mike Snitzer
2015-01-06 16:15                                       ` Jens Axboe
2015-01-07 10:33                                         ` Bart Van Assche
2015-01-07 15:32                                           ` Jens Axboe
2015-01-07 16:15                                             ` Mike Snitzer
2015-01-07 16:18                                               ` Jens Axboe
2015-01-07 16:22                                               ` Mike Snitzer
2015-01-07 16:24                                                 ` Jens Axboe
2015-01-07 17:18                                                   ` Mike Snitzer
2015-01-07 17:35                                                     ` Jens Axboe
2015-01-07 20:09                                                       ` Mike Snitzer
2015-01-07 20:40                                           ` Keith Busch
2015-01-09 19:49                                             ` Mike Snitzer [this message]
2015-01-09 21:07                                               ` Jens Axboe
2015-01-09 21:11                                                 ` Jens Axboe
2015-01-09 21:40                                                   ` Mike Snitzer
2015-01-09 21:56                                                     ` Jens Axboe
2015-01-09 22:25                                                       ` Mike Snitzer
2015-01-10  0:27                                                         ` Jens Axboe
2015-01-10  1:48                                                           ` Mike Snitzer
2015-01-10  1:59                                                             ` Jens Axboe
2015-01-10  3:10                                                               ` Mike Snitzer
2015-01-12 14:46                                                                 ` blk-mq request allocation stalls Bart Van Assche
2015-01-12 15:42                                                                   ` Jens Axboe
2015-01-12 16:12                                                                     ` Bart Van Assche
2015-01-12 16:34                                                                       ` Jens Axboe
2015-01-12 16:58                                                                         ` Mike Snitzer
2015-01-12 16:59                                                                           ` Jens Axboe
2015-01-12 17:04                                                                         ` Bart Van Assche
2015-01-12 17:09                                                                           ` Jens Axboe
2015-01-12 17:53                                                                             ` Keith Busch
2015-01-12 18:12                                                                               ` Jens Axboe
2015-01-12 18:22                                                                                 ` Keith Busch
2015-01-12 18:35                                                                                   ` Keith Busch
2015-01-12 19:11                                                                                     ` Mike Snitzer
2015-01-12 20:21                                                                                       ` Mike Snitzer
2015-01-13 12:29                                                                                         ` Bart Van Assche
2015-01-13 14:17                                                                                           ` Mike Snitzer
2015-01-13 14:28                                                                                             ` dm + blk-mq soft lockup complaint Bart Van Assche
2015-01-13 16:20                                                                                               ` Mike Snitzer
2015-01-14  9:16                                                                                                 ` Bart Van Assche
2015-01-14  9:16                                                                                                   ` Bart Van Assche
2015-01-14 18:59                                                                                                   ` Mike Snitzer
2015-01-15  8:11                                                                                                     ` Bart Van Assche
2015-01-15 15:43                                                                                                       ` Mike Snitzer
2015-01-15 15:55                                                                                                         ` Bart Van Assche
2015-01-13 14:59                                                                                     ` blk-mq request allocation stalls Jens Axboe
2015-01-13 15:11                                                                                       ` Keith Busch
2015-01-13 15:27                                                                                         ` Keith Busch
2015-01-13 15:41                                                                                         ` Mike Snitzer
2015-01-13 15:14                                                                                       ` Mike Snitzer
2015-01-27 18:42                                                                                       ` blk-mq DM changes for 3.20 [was: Re: blk-mq request allocation stalls] Mike Snitzer
2015-01-28 16:42                                                                                         ` Jens Axboe
2015-01-28 17:44                                                                                           ` Mike Snitzer
2015-01-28 17:49                                                                                             ` Jens Axboe
2015-01-28 18:10                                                                                               ` Mike Snitzer
2015-01-29 22:43                                                                                               ` blk-mq DM changes for 3.20 [was: Re: blk-mq request allocation stalls]X Keith Busch
2015-01-29 23:09                                                                                                 ` Mike Snitzer
2015-01-29 23:44                                                                                                   ` Keith Busch
2015-01-30  0:32                                                                                                     ` Mike Snitzer
2015-01-12 19:05                                                                                   ` blk-mq request allocation stalls Jens Axboe
2015-01-12 19:07                                                                                 ` Mike Snitzer
2015-01-12 18:19                                                                           ` Mike Snitzer
2014-12-17 22:51   ` [PATCH v3 0/8] dm: add request-based blk-mq support Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150109194955.GA32641@redhat.com \
    --to=snitzer@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=dm-devel@redhat.com \
    --cc=hch@infradead.org \
    --cc=j-nomura@ce.jp.nec.com \
    --cc=keith.busch@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.