* Re: [BUG] bcache_writebac blocked, IO on bcache device hung
From: Eric Wheeler @ 2016-04-11 1:55 UTC
To: Sebastian Roesner; +Cc: linux-bcache
On Sun, 10 Apr 2016, Sebastian Roesner wrote:
> Hello,
>
> I hit an issue with bcache on kernel 4.5.0. I'm not sure it was purely
> bcache related, but IO on the bcache device stopped working while other
> volumes still worked fine.
>
> After bcache blocked, the same message showed up for dmcrypt_write. On top
> of the bcache device I run LVM and encrypt its LVs.
>
>
> └─sda2 8:34 part
> └─md1 9:1 raid1
> └─bcache0 252:0 disk
> ├─storage-XXXXXXXXXXXXX_crypt 253:0 lvm
> │ └─XXXXXXXXXXXX 253:137 crypt
> ├─storage-XXXXXXXXXXXX_crypt 253:1 lvm
> │ └─XXXXXXXXXXXX 253:122 crypt
> [..]
>
> Bcache was patched with the patches from
>
> https://bitbucket.org/ewheelerinc/linux v4.5-rc6-bcache-fixes
> https://bitbucket.org/ewheelerinc/linux v4.5-rc7-bcache-fixes
>
> Trace:
>
> INFO: task bcache_writebac:10061 blocked for more than 120 seconds.
> Not tainted 4.5.0-kvmhost #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> bcache_writebac D ffff88081fc94440 0 10061 2 0x00000000
> ffff8807f83d2400 ffff8807fc2ae180 000000020000a07c ffffffff810036a4
> ffff8800cb444000 ffffffff00000052 ffff8807f83d2400 ffff8807f7df0bc0
> 0000000000000000 ffff8807f7df0000 ffffffff81410370 ffff8807f7df0ad8
> Call Trace:
> [<ffffffff810036a4>] ? __switch_to+0x1c8/0x36e
> [<ffffffff81410370>] ? schedule+0x7a/0x87
> [<ffffffff81411c6c>] ? rwsem_down_write_failed+0x241/0x2b0
> [<ffffffff81218763>] ? call_rwsem_down_write_failed+0x13/0x20
> [<ffffffff81411539>] ? down_write+0x24/0x33
> [<ffffffffa044ccac>] ? bch_writeback_thread+0x48/0x6bc [bcache]
> [<ffffffffa044cc64>] ? write_dirty_finish+0x1d4/0x1d4 [bcache]
> [<ffffffff8105efa2>] ? kthread+0x99/0xa1
> [<ffffffff8105ef09>] ? kthread_parkme+0x16/0x16
> [<ffffffff814129df>] ? ret_from_fork+0x3f/0x70
> [<ffffffff8105ef09>] ? kthread_parkme+0x16/0x16
Please recompile with the lockdep debugging options noted here:
http://stackoverflow.com/questions/20892822/how-to-use-lockdep-feature-in-linux-kernel-for-deadlock-detection
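For reference, here is roughly the set of options to flip (a sketch using
scripts/config from the top of the source tree; CONFIG_PROVE_LOCKING pulls in
most of the rest, so treat the exact list as a starting point and double-check
against your .config):

    scripts/config --enable PROVE_LOCKING \
                   --enable DEBUG_LOCK_ALLOC \
                   --enable DEBUG_LOCKDEP \
                   --enable DEBUG_ATOMIC_SLEEP
    make olddefconfig && make -j"$(nproc)"

Then reproduce the hang and watch dmesg for lockdep splats.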
Also, try this patch from Ming Lei and let us know if that solves it:
Fixes: 54efd50 ("block: make generic_make_request handle arbitrarily sized bios")
Reported-by: Sebastian Roesner <sroesner-kernelorg@roesner-online.de>
Reported-by: Eric Wheeler <bcache@lists.ewheeler.net>
Cc: stable@vger.kernel.org (4.3+)
Cc: Shaohua Li <shli@fb.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
V1:
- Kent pointed out that limiting the split by max IO size alone can't
  cover the case of non-full bvecs/pages
The issue can be reproduced by the following approach:
- create one raid1 over two virtio-blk devices
- build a bcache device over the above raid1 and another cache device,
  with the bucket size set to 2 MB
- set the cache mode to writeback
- run random writes over ext4 on the bcache device
- the crash can then be triggered (a rough shell sketch of these steps
  follows below)
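In case it helps reproduction, a rough shell sketch of the steps above
(the device names /dev/vdb, /dev/vdc and /dev/vdd are placeholders for the
two virtio-blk disks and the cache device; depending on udev you may also
have to echo the devices into /sys/fs/bcache/register):

    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/vdb /dev/vdc
    make-bcache --bucket 2M -C /dev/vdd -B /dev/md0
    echo writeback > /sys/block/bcache0/bcache/cache_mode
    mkfs.ext4 /dev/bcache0
    mount /dev/bcache0 /mnt
    fio --name=randwrite --directory=/mnt --rw=randwrite --bs=4k \
        --size=2g --numjobs=4 --time_based --runtime=300 --group_reporting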
block/blk-merge.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 2613531..7b96471 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -94,8 +94,10 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 	bool do_split = true;
 	struct bio *new = NULL;
 	const unsigned max_sectors = get_max_io_size(q, bio);
+	unsigned bvecs = 0;
 
 	bio_for_each_segment(bv, bio, iter) {
+		bvecs++;
 		/*
 		 * If the queue doesn't support SG gaps and adding this
 		 * offset would create a gap, disallow it.
@@ -103,6 +105,23 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 		if (bvprvp && bvec_gap_to_prev(q, bvprvp, bv.bv_offset))
 			goto split;
 
+		/*
+		 * With arbitrary bio size, the incoming bio may be very
+		 * big. We have to split the bio into small bios so that
+		 * each holds at most BIO_MAX_PAGES bvecs because
+		 * bio_clone() can fail to allocate big bvecs.
+		 *
+		 * It would be better to apply the limit per request
+		 * queue in which bio_clone() is involved, instead of
+		 * globally. The biggest blocker is bio_clone() in bio
+		 * bounce.
+		 *
+		 * TODO: deal with bio bounce's bio_clone() gracefully
+		 * and convert the global limit into per-queue limit.
+		 */
+		if (bvecs >= BIO_MAX_PAGES)
+			goto split;
+
 		if (sectors + (bv.bv_len >> 9) > max_sectors) {
 			/*
 			 * Consider this a new segment if we're splitting in
--
1.9.1
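For context on the constant: BIO_MAX_PAGES is 256 in this tree, so with
4 KiB pages each split bio carries at most 256 bvecs (about 1 MiB of
payload), which keeps the biovec allocation in bio_clone() within what the
bvec pools can serve. If you want to test, a rough sketch of applying it on
top of a 4.5 tree (the patch file name is just a placeholder for wherever
you save this mail's diff):

    cd linux
    patch -p1 < bvec-split-fix.patch
    make -j"$(nproc)" && make modules_install && make install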
--
Eric Wheeler
> INFO: task dmcrypt_write:11119 blocked for more than 120 seconds.
> Not tainted 4.5.0-kvmhost #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> dmcrypt_write D ffff88081fc54440 0 11119 2 0x00000000
> ffff8807ebe0ce00 ffff8807fc2aee80 ffff8807f929eac0 0000000002011200
> ffff8800ce50c000 ffff8807ebe0ce00 0000000000000000 ffff8807bcd33020
> 0000000000000001 0000000000000001 ffffffff81410370 ffff8807f7df0ad8
> Call Trace:
> [<ffffffff81410370>] ? schedule+0x7a/0x87
> [<ffffffff81411a15>] ? rwsem_down_read_failed+0xc6/0xdc
> [<ffffffff810e71c8>] ? mempool_alloc+0x61/0x12d
> [<ffffffff81218734>] ? call_rwsem_down_read_failed+0x14/0x30
> [<ffffffff81411513>] ? down_read+0x17/0x19
> [<ffffffffa04441d9>] ? cached_dev_make_request+0x411/0x738 [bcache]
> [<ffffffff811eb860>] ? generic_make_request+0xb5/0x155
> [<ffffffffa0314917>] ? dmcrypt_write+0x131/0x160 [dm_crypt]
> [<ffffffff8106674c>] ? try_to_wake_up+0x1b5/0x1b5
> [<ffffffffa03147e6>] ? crypt_iv_benbi_gen+0x37/0x37 [dm_crypt]
> [<ffffffff8105efa2>] ? kthread+0x99/0xa1
> [<ffffffff8105ef09>] ? kthread_parkme+0x16/0x16
> [<ffffffff814129df>] ? ret_from_fork+0x3f/0x70
> [<ffffffff8105ef09>] ? kthread_parkme+0x16/0x16
> INFO: task dmcrypt_write:11609 blocked for more than 120 seconds.
> Not tainted 4.5.0-kvmhost #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> dmcrypt_write D ffff88081fc54440 0 11609 2 0x00000000
> ffff8807f04e4e80 ffff8807fc2aee80 ffff8807f929eac0 0000000002011200
> ffff8807eb2b4000 ffff8807f04e4e80 0000000000000000 ffff8807bab51020
> 0000000000000001 0000000000000001 ffffffff81410370 ffff8807f7df0ad8
> Call Trace:
> [<ffffffff81410370>] ? schedule+0x7a/0x87
> [<ffffffff81411a15>] ? rwsem_down_read_failed+0xc6/0xdc
> [<ffffffff810e71c8>] ? mempool_alloc+0x61/0x12d
> [<ffffffff81218734>] ? call_rwsem_down_read_failed+0x14/0x30
> [<ffffffff81411513>] ? down_read+0x17/0x19
> [<ffffffffa04441d9>] ? cached_dev_make_request+0x411/0x738 [bcache]
> [<ffffffff811eb860>] ? generic_make_request+0xb5/0x155
> [<ffffffffa0314917>] ? dmcrypt_write+0x131/0x160 [dm_crypt]
> [<ffffffff8106674c>] ? try_to_wake_up+0x1b5/0x1b5
> [<ffffffffa03147e6>] ? crypt_iv_benbi_gen+0x37/0x37 [dm_crypt]
> [<ffffffff8105efa2>] ? kthread+0x99/0xa1
> [<ffffffff8105ef09>] ? kthread_parkme+0x16/0x16
> [<ffffffff814129df>] ? ret_from_fork+0x3f/0x70
> [<ffffffff8105ef09>] ? kthread_parkme+0x16/0x16
>
> Sebastian