public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* kernel BUG at drivers/block/virtio_blk.c:172!
@ 2014-11-07 13:04 Jeff Layton
  2014-11-10  9:59 ` Rusty Russell
  0 siblings, 1 reply; 9+ messages in thread
From: Jeff Layton @ 2014-11-07 13:04 UTC (permalink / raw)
  To: Rusty Russell, Michael S. Tsirkin,
	Dave Chinner
  Cc: xfs, virtualization

In the latest Fedora rawhide kernel in the repos, I'm seeing the
following oops when mounting xfs. rc2-ish kernels seem to be fine:

[   64.669633] ------------[ cut here ]------------
[   64.670008] kernel BUG at drivers/block/virtio_blk.c:172!
[   64.670008] invalid opcode: 0000 [#1] SMP 
[   64.670008] Modules linked in: xfs libcrc32c snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm ppdev snd_timer snd virtio_net virtio_balloon soundcore serio_raw parport_pc virtio_console pvpanic parport i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc qxl virtio_blk drm_kms_helper ttm drm ata_generic virtio_pci virtio_ring virtio pata_acpi
[   64.670008] CPU: 1 PID: 705 Comm: mount Not tainted 3.18.0-0.rc3.git2.1.fc22.x86_64 #1
[   64.670008] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   64.670008] task: ffff8800d94a4ec0 ti: ffff8800d9f38000 task.ti: ffff8800d9f38000
[   64.670008] RIP: 0010:[<ffffffffa00287c0>]  [<ffffffffa00287c0>] virtio_queue_rq+0x290/0x2a0 [virtio_blk]
[   64.670008] RSP: 0018:ffff8800d9f3b778  EFLAGS: 00010202
[   64.670008] RAX: 0000000000000082 RBX: ffff8800d8375700 RCX: dead000000200200
[   64.670008] RDX: 0000000000000001 RSI: ffff8800d8375700 RDI: ffff8800d82c4c00
[   64.670008] RBP: ffff8800d9f3b7b8 R08: ffff8800d8375700 R09: 0000000000000001
[   64.670008] R10: 0000000000000001 R11: 0000000000000004 R12: ffff8800d9f3b7e0
[   64.670008] R13: ffff8800d82c4c00 R14: ffff880118629200 R15: 0000000000000000
[   64.670008] FS:  00007f5c64dfd840(0000) GS:ffff88011b000000(0000) knlGS:0000000000000000
[   64.670008] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   64.670008] CR2: 00007fffe6458fb8 CR3: 00000000d06d3000 CR4: 00000000000006e0
[   64.670008] Stack:
[   64.670008]  ffff880000000001 ffff8800d8375870 0000000000000001 ffff8800d82c4c00
[   64.670008]  ffff8800d9f3b7e0 0000000000000000 ffff8800d8375700 ffff8800d82c4c48
[   64.670008]  ffff8800d9f3b828 ffffffff813ec258 ffff8800d82c8000 0000000000000001
[   64.670008] Call Trace:
[   64.670008]  [<ffffffff813ec258>] __blk_mq_run_hw_queue+0x1c8/0x330
[   64.670008]  [<ffffffff813ecd80>] blk_mq_run_hw_queue+0x70/0x90
[   64.670008]  [<ffffffff813ee0cd>] blk_sq_make_request+0x24d/0x5c0
[   64.670008]  [<ffffffff813dec68>] generic_make_request+0xf8/0x150
[   64.670008]  [<ffffffff813ded38>] submit_bio+0x78/0x190
[   64.670008]  [<ffffffffa02fc27e>] _xfs_buf_ioapply+0x2be/0x5f0 [xfs]
[   64.670008]  [<ffffffffa0333628>] ? xlog_bread_noalign+0xa8/0xe0 [xfs]
[   64.670008]  [<ffffffffa02ffe21>] xfs_buf_submit_wait+0x91/0x840 [xfs]
[   64.670008]  [<ffffffffa0333628>] xlog_bread_noalign+0xa8/0xe0 [xfs]
[   64.670008]  [<ffffffffa0333ea7>] xlog_bread+0x27/0x60 [xfs]
[   64.670008]  [<ffffffffa03357f3>] xlog_find_verify_cycle+0xf3/0x1b0 [xfs]
[   64.670008]  [<ffffffffa0335de5>] xlog_find_head+0x2f5/0x3e0 [xfs]
[   64.670008]  [<ffffffffa0335f0c>] xlog_find_tail+0x3c/0x410 [xfs]
[   64.670008]  [<ffffffffa033b12d>] xlog_recover+0x2d/0x120 [xfs]
[   64.670008]  [<ffffffffa033cfdb>] ? xfs_trans_ail_init+0xcb/0x100 [xfs]
[   64.670008]  [<ffffffffa0329c3d>] xfs_log_mount+0xdd/0x2c0 [xfs]
[   64.670008]  [<ffffffffa031f744>] xfs_mountfs+0x514/0x9c0 [xfs]
[   64.670008]  [<ffffffffa0320c8d>] ? xfs_mru_cache_create+0x18d/0x1f0 [xfs]
[   64.670008]  [<ffffffffa0322ed0>] xfs_fs_fill_super+0x330/0x3b0 [xfs]
[   64.670008]  [<ffffffff8126d4ac>] mount_bdev+0x1bc/0x1f0
[   64.670008]  [<ffffffffa0322ba0>] ? xfs_parseargs+0xbe0/0xbe0 [xfs]
[   64.670008]  [<ffffffffa0320fd5>] xfs_fs_mount+0x15/0x20 [xfs]
[   64.670008]  [<ffffffff8126de58>] mount_fs+0x38/0x1c0
[   64.670008]  [<ffffffff81202c15>] ? __alloc_percpu+0x15/0x20
[   64.670008]  [<ffffffff812908f8>] vfs_kern_mount+0x68/0x160
[   64.670008]  [<ffffffff81293d6c>] do_mount+0x22c/0xc20
[   64.670008]  [<ffffffff8120d92e>] ? might_fault+0x5e/0xc0
[   64.670008]  [<ffffffff811fcf1b>] ? memdup_user+0x4b/0x90
[   64.670008]  [<ffffffff81294a8e>] SyS_mount+0x9e/0x100
[   64.670008]  [<ffffffff8185e169>] system_call_fastpath+0x12/0x17
[   64.670008] Code: 00 00 c7 86 78 01 00 00 02 00 00 00 48 c7 86 80 01 00 00 00 00 00 00 89 86 7c 01 00 00 e9 02 fe ff ff 66 0f 1f 84 00 00 00 00 00 <0f> 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 
[   64.670008] RIP  [<ffffffffa00287c0>] virtio_queue_rq+0x290/0x2a0 [virtio_blk]
[   64.670008]  RSP <ffff8800d9f3b778>
[   64.715347] ---[ end trace c0ff4a0f2fb21f7f ]---

It's reliably reproducible and I don't see this oops when I convert the
same block device to ext4 and mount it. In this setup, the KVM guest
has a virtio block device that has a LVM2 PV on it with an LV on it
that contains the filesystem.

Let me know if you need any other info to chase this down.

Thanks!
-- 
Jeff Layton <jlayton@poochiereds.net>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: kernel BUG at drivers/block/virtio_blk.c:172!
  2014-11-07 13:04 kernel BUG at drivers/block/virtio_blk.c:172! Jeff Layton
@ 2014-11-10  9:59 ` Rusty Russell
  2014-11-10 23:31   ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Rusty Russell @ 2014-11-10  9:59 UTC (permalink / raw)
  To: Jeff Layton, Michael S. Tsirkin, Dave Chinner, Jens Axboe
  Cc: xfs, virtualization

Jeff Layton <jlayton@poochiereds.net> writes:

> In the latest Fedora rawhide kernel in the repos, I'm seeing the
> following oops when mounting xfs. rc2-ish kernels seem to be fine:
>
> [   64.669633] ------------[ cut here ]------------
> [   64.670008] kernel BUG at drivers/block/virtio_blk.c:172!

Hmm, that's:

	BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems);

But during our probe routine we said:

	/* We can handle whatever the host told us to handle. */
	blk_queue_max_segments(q, vblk->sg_elems-2);

Jens?

Thanks,
Rusty.

> [   64.670008] invalid opcode: 0000 [#1] SMP 
> [   64.670008] Modules linked in: xfs libcrc32c snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm ppdev snd_timer snd virtio_net virtio_balloon soundcore serio_raw parport_pc virtio_console pvpanic parport i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc qxl virtio_blk drm_kms_helper ttm drm ata_generic virtio_pci virtio_ring virtio pata_acpi
> [   64.670008] CPU: 1 PID: 705 Comm: mount Not tainted 3.18.0-0.rc3.git2.1.fc22.x86_64 #1
> [   64.670008] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [   64.670008] task: ffff8800d94a4ec0 ti: ffff8800d9f38000 task.ti: ffff8800d9f38000
> [   64.670008] RIP: 0010:[<ffffffffa00287c0>]  [<ffffffffa00287c0>] virtio_queue_rq+0x290/0x2a0 [virtio_blk]
> [   64.670008] RSP: 0018:ffff8800d9f3b778  EFLAGS: 00010202
> [   64.670008] RAX: 0000000000000082 RBX: ffff8800d8375700 RCX: dead000000200200
> [   64.670008] RDX: 0000000000000001 RSI: ffff8800d8375700 RDI: ffff8800d82c4c00
> [   64.670008] RBP: ffff8800d9f3b7b8 R08: ffff8800d8375700 R09: 0000000000000001
> [   64.670008] R10: 0000000000000001 R11: 0000000000000004 R12: ffff8800d9f3b7e0
> [   64.670008] R13: ffff8800d82c4c00 R14: ffff880118629200 R15: 0000000000000000
> [   64.670008] FS:  00007f5c64dfd840(0000) GS:ffff88011b000000(0000) knlGS:0000000000000000
> [   64.670008] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [   64.670008] CR2: 00007fffe6458fb8 CR3: 00000000d06d3000 CR4: 00000000000006e0
> [   64.670008] Stack:
> [   64.670008]  ffff880000000001 ffff8800d8375870 0000000000000001 ffff8800d82c4c00
> [   64.670008]  ffff8800d9f3b7e0 0000000000000000 ffff8800d8375700 ffff8800d82c4c48
> [   64.670008]  ffff8800d9f3b828 ffffffff813ec258 ffff8800d82c8000 0000000000000001
> [   64.670008] Call Trace:
> [   64.670008]  [<ffffffff813ec258>] __blk_mq_run_hw_queue+0x1c8/0x330
> [   64.670008]  [<ffffffff813ecd80>] blk_mq_run_hw_queue+0x70/0x90
> [   64.670008]  [<ffffffff813ee0cd>] blk_sq_make_request+0x24d/0x5c0
> [   64.670008]  [<ffffffff813dec68>] generic_make_request+0xf8/0x150
> [   64.670008]  [<ffffffff813ded38>] submit_bio+0x78/0x190
> [   64.670008]  [<ffffffffa02fc27e>] _xfs_buf_ioapply+0x2be/0x5f0 [xfs]
> [   64.670008]  [<ffffffffa0333628>] ? xlog_bread_noalign+0xa8/0xe0 [xfs]
> [   64.670008]  [<ffffffffa02ffe21>] xfs_buf_submit_wait+0x91/0x840 [xfs]
> [   64.670008]  [<ffffffffa0333628>] xlog_bread_noalign+0xa8/0xe0 [xfs]
> [   64.670008]  [<ffffffffa0333ea7>] xlog_bread+0x27/0x60 [xfs]
> [   64.670008]  [<ffffffffa03357f3>] xlog_find_verify_cycle+0xf3/0x1b0 [xfs]
> [   64.670008]  [<ffffffffa0335de5>] xlog_find_head+0x2f5/0x3e0 [xfs]
> [   64.670008]  [<ffffffffa0335f0c>] xlog_find_tail+0x3c/0x410 [xfs]
> [   64.670008]  [<ffffffffa033b12d>] xlog_recover+0x2d/0x120 [xfs]
> [   64.670008]  [<ffffffffa033cfdb>] ? xfs_trans_ail_init+0xcb/0x100 [xfs]
> [   64.670008]  [<ffffffffa0329c3d>] xfs_log_mount+0xdd/0x2c0 [xfs]
> [   64.670008]  [<ffffffffa031f744>] xfs_mountfs+0x514/0x9c0 [xfs]
> [   64.670008]  [<ffffffffa0320c8d>] ? xfs_mru_cache_create+0x18d/0x1f0 [xfs]
> [   64.670008]  [<ffffffffa0322ed0>] xfs_fs_fill_super+0x330/0x3b0 [xfs]
> [   64.670008]  [<ffffffff8126d4ac>] mount_bdev+0x1bc/0x1f0
> [   64.670008]  [<ffffffffa0322ba0>] ? xfs_parseargs+0xbe0/0xbe0 [xfs]
> [   64.670008]  [<ffffffffa0320fd5>] xfs_fs_mount+0x15/0x20 [xfs]
> [   64.670008]  [<ffffffff8126de58>] mount_fs+0x38/0x1c0
> [   64.670008]  [<ffffffff81202c15>] ? __alloc_percpu+0x15/0x20
> [   64.670008]  [<ffffffff812908f8>] vfs_kern_mount+0x68/0x160
> [   64.670008]  [<ffffffff81293d6c>] do_mount+0x22c/0xc20
> [   64.670008]  [<ffffffff8120d92e>] ? might_fault+0x5e/0xc0
> [   64.670008]  [<ffffffff811fcf1b>] ? memdup_user+0x4b/0x90
> [   64.670008]  [<ffffffff81294a8e>] SyS_mount+0x9e/0x100
> [   64.670008]  [<ffffffff8185e169>] system_call_fastpath+0x12/0x17
> [   64.670008] Code: 00 00 c7 86 78 01 00 00 02 00 00 00 48 c7 86 80 01 00 00 00 00 00 00 89 86 7c 01 00 00 e9 02 fe ff ff 66 0f 1f 84 00 00 00 00 00 <0f> 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 
> [   64.670008] RIP  [<ffffffffa00287c0>] virtio_queue_rq+0x290/0x2a0 [virtio_blk]
> [   64.670008]  RSP <ffff8800d9f3b778>
> [   64.715347] ---[ end trace c0ff4a0f2fb21f7f ]---
>
> It's reliably reproducible and I don't see this oops when I convert the
> same block device to ext4 and mount it. In this setup, the KVM guest
> has a virtio block device that has a LVM2 PV on it with an LV on it
> that contains the filesystem.
>
> Let me know if you need any other info to chase this down.
>
> Thanks!
> -- 
> Jeff Layton <jlayton@poochiereds.net>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: kernel BUG at drivers/block/virtio_blk.c:172!
  2014-11-10  9:59 ` Rusty Russell
@ 2014-11-10 23:31   ` Jens Axboe
  2014-11-11  0:56     ` Ming Lei
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2014-11-10 23:31 UTC (permalink / raw)
  To: Rusty Russell, Jeff Layton, Michael S. Tsirkin, Dave Chinner
  Cc: Ming Lei, xfs, virtualization

On 2014-11-10 02:59, Rusty Russell wrote:
> Jeff Layton <jlayton@poochiereds.net> writes:
>
>> In the latest Fedora rawhide kernel in the repos, I'm seeing the
>> following oops when mounting xfs. rc2-ish kernels seem to be fine:
>>
>> [   64.669633] ------------[ cut here ]------------
>> [   64.670008] kernel BUG at drivers/block/virtio_blk.c:172!
>
> Hmm, that's:
>
> 	BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems);
>
> But during our probe routine we said:
>
> 	/* We can handle whatever the host told us to handle. */
> 	blk_queue_max_segments(q, vblk->sg_elems-2);
>
> Jens?

Known, I'm afraid, Ming is looking into it.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: kernel BUG at drivers/block/virtio_blk.c:172!
  2014-11-10 23:31   ` Jens Axboe
@ 2014-11-11  0:56     ` Ming Lei
  2014-11-11 15:42       ` Dongsu Park
  0 siblings, 1 reply; 9+ messages in thread
From: Ming Lei @ 2014-11-11  0:56 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Michael S. Tsirkin, Rusty Russell, xfs, Jeff Layton,
	Linux Virtualization

On Tue, Nov 11, 2014 at 7:31 AM, Jens Axboe <axboe@kernel.dk> wrote:
> On 2014-11-10 02:59, Rusty Russell wrote:
>>
>> Jeff Layton <jlayton@poochiereds.net> writes:
>>
>>> In the latest Fedora rawhide kernel in the repos, I'm seeing the
>>> following oops when mounting xfs. rc2-ish kernels seem to be fine:
>>>
>>> [   64.669633] ------------[ cut here ]------------
>>> [   64.670008] kernel BUG at drivers/block/virtio_blk.c:172!
>>
>>
>> Hmm, that's:
>>
>>         BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems);
>>
>> But during our probe routine we said:
>>
>>         /* We can handle whatever the host told us to handle. */
>>         blk_queue_max_segments(q, vblk->sg_elems-2);
>>
>> Jens?
>
>
> Known, I'm afraid, Ming is looking into it.

There is one obvious bug, which should have been fixed by the patch below
("0001-block-blk-merge-fix-blk_recount_segments.patch"):

http://marc.info/?l=linux-virtualization&m=141562191719405&q=p3

And there might be another one. If the bug can still be triggered even
with the above fix, I'd appreciate it if someone could post the log
printed by the patch ("blk-seg.patch") in the link below:

http://marc.info/?l=linux-virtualization&m=141473040618467&q=p3


Thanks,
-- 
Ming Lei


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: kernel BUG at drivers/block/virtio_blk.c:172!
  2014-11-11  0:56     ` Ming Lei
@ 2014-11-11 15:42       ` Dongsu Park
  2014-11-11 16:42         ` Ming Lei
  0 siblings, 1 reply; 9+ messages in thread
From: Dongsu Park @ 2014-11-11 15:42 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Michael S. Tsirkin, Rusty Russell, xfs, Lukas Czerner,
	Jeff Layton, Linux Virtualization, Christoph Hellwig

Hi Ming,

On 11.11.2014 08:56, Ming Lei wrote:
> On Tue, Nov 11, 2014 at 7:31 AM, Jens Axboe <axboe@kernel.dk> wrote:
> > Known, I'm afraid, Ming is looking into it.

Actually I had also tried to reproduce this bug, without success.
But today, by coincidence, I happened to find out how to trigger it
while testing other things.

Try to run xfstests/generic/034. You'll see the crash immediately.
Tested on a QEMU VM with kernel 3.18-rc4, virtio-blk, dm-flakey and xfs.

> There is one obvious bug, which should have been fixed by the patch below
> ("0001-block-blk-merge-fix-blk_recount_segments.patch"):
> 
> http://marc.info/?l=linux-virtualization&m=141562191719405&q=p3

This patch didn't bring anything to me, as Lukas said.

> And there might be another one. If the bug can still be triggered even
> with the above fix, I'd appreciate it if someone could post the log
> printed by the patch ("blk-seg.patch") in the link below:
> 
> http://marc.info/?l=linux-virtualization&m=141473040618467&q=p3

"blk_recount_segments: 1-1-1 vcnt-128 segs-128"

As far as I understand so far, the reason is that bi_phys_segments
sometimes gets bigger than queue_max_segments() after blk_recount_segments().
That happens regardless of whether the segments are recalculated.

I'm not completely sure about what to do, but how about the attached patch?
It seems to work, according to several xfstests runs.

Cheers,
Dongsu

----

From 1db98323931eb9ab430116c4d909d22222c16e22 Mon Sep 17 00:00:00 2001
From: Dongsu Park <dongsu.park@profitbricks.com>
Date: Tue, 11 Nov 2014 13:10:59 +0100
Subject: [RFC PATCH] blk-merge: make bi_phys_segments consider also
 queue_max_segments()

When recounting the number of physical segments, the maximum segment
count of the request_queue must also be taken into account.
Otherwise bio->bi_phys_segments can get bigger than
queue_max_segments(), which results in virtio_queue_rq() seeing
a req->nr_phys_segments greater than expected. Although the
initial queue_max_segments was set to (vblk->sg_elems - 2), a request
comes in with a larger nr_phys_segments, which triggers the
BUG_ON() condition.

This commit should fix a kernel crash in virtio_blk, which occurs
especially frequently when it runs with blk-mq, device mapper, and xfs.
The simplest way to reproduce this bug is to run xfstests/generic/034.
Note, test 034 requires dm-flakey to be turned on in the kernel config.

See the kernel trace below:
------------[ cut here ]------------
kernel BUG at drivers/block/virtio_blk.c:172!
invalid opcode: 0000 [#1] SMP
CPU: 1 PID: 3343 Comm: mount Not tainted 3.18.0-rc4+ #55
RIP: 0010:[<ffffffff81561027>]
 [<ffffffff81561027>] virtio_queue_rq+0x277/0x280
Call Trace:
 [<ffffffff8142e908>] __blk_mq_run_hw_queue+0x1a8/0x300
 [<ffffffff8142f00d>] blk_mq_run_hw_queue+0x6d/0x90
 [<ffffffff8143003e>] blk_sq_make_request+0x23e/0x360
 [<ffffffff81422e20>] generic_make_request+0xc0/0x110
 [<ffffffff81422ed9>] submit_bio+0x69/0x130
 [<ffffffff812f013d>] _xfs_buf_ioapply+0x2bd/0x410
 [<ffffffff81315f38>] ? xlog_bread_noalign+0xa8/0xe0
 [<ffffffff812f1bd1>] xfs_buf_submit_wait+0x61/0x1d0
 [<ffffffff81315f38>] xlog_bread_noalign+0xa8/0xe0
 [<ffffffff81316917>] xlog_bread+0x27/0x60
 [<ffffffff8131ad11>] xlog_find_verify_cycle+0xe1/0x190
 [<ffffffff8131b291>] xlog_find_head+0x2d1/0x3c0
 [<ffffffff8131b3ad>] xlog_find_tail+0x2d/0x3f0
 [<ffffffff8131b78e>] xlog_recover+0x1e/0xf0
 [<ffffffff8130fbac>] xfs_log_mount+0x24c/0x2c0
 [<ffffffff813075db>] xfs_mountfs+0x44b/0x7a0
 [<ffffffff8130a98a>] xfs_fs_fill_super+0x2ba/0x330
 [<ffffffff811cea64>] mount_bdev+0x194/0x1d0
 [<ffffffff8130a6d0>] ? xfs_parseargs+0xbe0/0xbe0
 [<ffffffff813089a5>] xfs_fs_mount+0x15/0x20
 [<ffffffff811cf389>] mount_fs+0x39/0x1b0
 [<ffffffff8117bf75>] ? __alloc_percpu+0x15/0x20
 [<ffffffff811e9887>] vfs_kern_mount+0x67/0x110
 [<ffffffff811ec584>] do_mount+0x204/0xad0
 [<ffffffff811ed18b>] SyS_mount+0x8b/0xe0
 [<ffffffff81788e12>] system_call_fastpath+0x12/0x17
RIP [<ffffffff81561027>] virtio_queue_rq+0x277/0x280
---[ end trace ae3ec6426f011b5d ]---

Signed-off-by: Dongsu Park <dongsu.park@profitbricks.com>
Tested-by: Dongsu Park <dongsu.park@profitbricks.com>
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Lukas Czerner <lczerner@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: virtualization@lists.linux-foundation.org
---
 block/blk-merge.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index b3ac40a..d808601 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -103,13 +103,16 @@ void blk_recount_segments(struct request_queue *q, struct bio *bio)
 
 	if (no_sg_merge && !bio_flagged(bio, BIO_CLONED) &&
 			merge_not_need)
-		bio->bi_phys_segments = bio->bi_vcnt;
+		bio->bi_phys_segments = min_t(unsigned int, bio->bi_vcnt,
+				queue_max_segments(q));
 	else {
 		struct bio *nxt = bio->bi_next;
 
 		bio->bi_next = NULL;
-		bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio,
-				no_sg_merge && merge_not_need);
+		bio->bi_phys_segments = min_t(unsigned int,
+				__blk_recalc_rq_segments(q, bio, no_sg_merge
+					&& merge_not_need),
+				queue_max_segments(q));
 		bio->bi_next = nxt;
 	}
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: kernel BUG at drivers/block/virtio_blk.c:172!
  2014-11-11 15:42       ` Dongsu Park
@ 2014-11-11 16:42         ` Ming Lei
  2014-11-12 18:18           ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Ming Lei @ 2014-11-11 16:42 UTC (permalink / raw)
  To: Dongsu Park
  Cc: Jens Axboe, Michael S. Tsirkin, Rusty Russell, xfs, Lukas Czerner,
	Jeff Layton, Linux Virtualization, Christoph Hellwig

[-- Attachment #1: Type: text/plain, Size: 6330 bytes --]

On Tue, Nov 11, 2014 at 11:42 PM, Dongsu Park
<dongsu.park@profitbricks.com> wrote:
> Hi Ming,
>
> On 11.11.2014 08:56, Ming Lei wrote:
>> On Tue, Nov 11, 2014 at 7:31 AM, Jens Axboe <axboe@kernel.dk> wrote:
>> > Known, I'm afraid, Ming is looking into it.
>
> Actually I had also tried to reproduce this bug, without success.
> But today, by coincidence, I happened to find out how to trigger it
> while testing other things.
>
> Try to run xfstests/generic/034. You'll see the crash immediately.
> Tested on a QEMU VM with kernel 3.18-rc4, virtio-blk, dm-flakey and xfs.
>
>> There is one obvious bug, which should have been fixed by the patch below
>> ("0001-block-blk-merge-fix-blk_recount_segments.patch"):
>>
>> http://marc.info/?l=linux-virtualization&m=141562191719405&q=p3
>
> This patch didn't bring anything to me, as Lukas said.
>
>> And there might be another one. If the bug can still be triggered even
>> with the above fix, I'd appreciate it if someone could post the log
>> printed by the patch ("blk-seg.patch") in the link below:
>>
>> http://marc.info/?l=linux-virtualization&m=141473040618467&q=p3
>
> "blk_recount_segments: 1-1-1 vcnt-128 segs-128"
>
> As far as I understand so far, the reason is that bi_phys_segments
> sometimes gets bigger than queue_max_segments() after blk_recount_segments().
> That happens regardless of whether the segments are recalculated.

Now I see the problem: bi_vcnt can't be used for a cloned bio at all,
so the patch I sent last time is wrong too.

> I'm not completely sure about what to do, but how about the attached patch?
> It seems to work, according to several xfstests runs.
>
> Cheers,
> Dongsu
>
> ----
>
> From 1db98323931eb9ab430116c4d909d22222c16e22 Mon Sep 17 00:00:00 2001
> From: Dongsu Park <dongsu.park@profitbricks.com>
> Date: Tue, 11 Nov 2014 13:10:59 +0100
> Subject: [RFC PATCH] blk-merge: make bi_phys_segments consider also
>  queue_max_segments()
>
> When recounting the number of physical segments, the maximum segment
> count of the request_queue must also be taken into account.
> Otherwise bio->bi_phys_segments can get bigger than
> queue_max_segments(), which results in virtio_queue_rq() seeing
> a req->nr_phys_segments greater than expected. Although the
> initial queue_max_segments was set to (vblk->sg_elems - 2), a request
> comes in with a larger nr_phys_segments, which triggers the
> BUG_ON() condition.
>
> This commit should fix a kernel crash in virtio_blk, which occurs
> especially frequently when it runs with blk-mq, device mapper, and xfs.
> The simplest way to reproduce this bug is to run xfstests/generic/034.
> Note, test 034 requires dm-flakey to be turned on in the kernel config.
>
> See the kernel trace below:
> ------------[ cut here ]------------
> kernel BUG at drivers/block/virtio_blk.c:172!
> invalid opcode: 0000 [#1] SMP
> CPU: 1 PID: 3343 Comm: mount Not tainted 3.18.0-rc4+ #55
> RIP: 0010:[<ffffffff81561027>]
>  [<ffffffff81561027>] virtio_queue_rq+0x277/0x280
> Call Trace:
>  [<ffffffff8142e908>] __blk_mq_run_hw_queue+0x1a8/0x300
>  [<ffffffff8142f00d>] blk_mq_run_hw_queue+0x6d/0x90
>  [<ffffffff8143003e>] blk_sq_make_request+0x23e/0x360
>  [<ffffffff81422e20>] generic_make_request+0xc0/0x110
>  [<ffffffff81422ed9>] submit_bio+0x69/0x130
>  [<ffffffff812f013d>] _xfs_buf_ioapply+0x2bd/0x410
>  [<ffffffff81315f38>] ? xlog_bread_noalign+0xa8/0xe0
>  [<ffffffff812f1bd1>] xfs_buf_submit_wait+0x61/0x1d0
>  [<ffffffff81315f38>] xlog_bread_noalign+0xa8/0xe0
>  [<ffffffff81316917>] xlog_bread+0x27/0x60
>  [<ffffffff8131ad11>] xlog_find_verify_cycle+0xe1/0x190
>  [<ffffffff8131b291>] xlog_find_head+0x2d1/0x3c0
>  [<ffffffff8131b3ad>] xlog_find_tail+0x2d/0x3f0
>  [<ffffffff8131b78e>] xlog_recover+0x1e/0xf0
>  [<ffffffff8130fbac>] xfs_log_mount+0x24c/0x2c0
>  [<ffffffff813075db>] xfs_mountfs+0x44b/0x7a0
>  [<ffffffff8130a98a>] xfs_fs_fill_super+0x2ba/0x330
>  [<ffffffff811cea64>] mount_bdev+0x194/0x1d0
>  [<ffffffff8130a6d0>] ? xfs_parseargs+0xbe0/0xbe0
>  [<ffffffff813089a5>] xfs_fs_mount+0x15/0x20
>  [<ffffffff811cf389>] mount_fs+0x39/0x1b0
>  [<ffffffff8117bf75>] ? __alloc_percpu+0x15/0x20
>  [<ffffffff811e9887>] vfs_kern_mount+0x67/0x110
>  [<ffffffff811ec584>] do_mount+0x204/0xad0
>  [<ffffffff811ed18b>] SyS_mount+0x8b/0xe0
>  [<ffffffff81788e12>] system_call_fastpath+0x12/0x17
> RIP [<ffffffff81561027>] virtio_queue_rq+0x277/0x280
> ---[ end trace ae3ec6426f011b5d ]---
>
> Signed-off-by: Dongsu Park <dongsu.park@profitbricks.com>
> Tested-by: Dongsu Park <dongsu.park@profitbricks.com>
> Cc: Ming Lei <tom.leiming@gmail.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Rusty Russell <rusty@rustcorp.com.au>
> Cc: Jeff Layton <jlayton@poochiereds.net>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Lukas Czerner <lczerner@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: virtualization@lists.linux-foundation.org
> ---
>  block/blk-merge.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index b3ac40a..d808601 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -103,13 +103,16 @@ void blk_recount_segments(struct request_queue *q, struct bio *bio)
>
>         if (no_sg_merge && !bio_flagged(bio, BIO_CLONED) &&
>                         merge_not_need)
> -               bio->bi_phys_segments = bio->bi_vcnt;
> +               bio->bi_phys_segments = min_t(unsigned int, bio->bi_vcnt,
> +                               queue_max_segments(q));
>         else {
>                 struct bio *nxt = bio->bi_next;
>
>                 bio->bi_next = NULL;
> -               bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio,
> -                               no_sg_merge && merge_not_need);
> +               bio->bi_phys_segments = min_t(unsigned int,
> +                               __blk_recalc_rq_segments(q, bio, no_sg_merge
> +                                       && merge_not_need),
> +                               queue_max_segments(q));
>                 bio->bi_next = nxt;
>         }

The above change may cause some data not to be written to or read from
the device; when the segment count would exceed the limit, we have to
actually merge segments rather than just clamp the count.

The attached patch should fix the problem, and I hope it is the last one, :-)


Thanks,
-- 
Ming Lei

[-- Attachment #2: 0001-block-blk-merge-fix-blk_recount_segments.patch --]
[-- Type: text/x-patch, Size: 1633 bytes --]

From 9cedeb8cfd420ecfcd3e2b2e0bcb699d35ae2a03 Mon Sep 17 00:00:00 2001
From: Ming Lei <tom.leiming@gmail.com>
Date: Wed, 12 Nov 2014 00:15:41 +0800
Subject: [PATCH] block: blk-merge: fix blk_recount_segments()

For a cloned bio, bio->bi_vcnt can't be used at all, and we have
to resort to bio_segments() to figure out how many segments
there are in the bio.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 block/blk-merge.c |   19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index b3ac40a..89b97b5 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -97,19 +97,22 @@ void blk_recalc_rq_segments(struct request *rq)
 
 void blk_recount_segments(struct request_queue *q, struct bio *bio)
 {
-	bool no_sg_merge = !!test_bit(QUEUE_FLAG_NO_SG_MERGE,
-			&q->queue_flags);
-	bool merge_not_need = bio->bi_vcnt < queue_max_segments(q);
+	unsigned short seg_cnt;
+
+	/* estimate segment number by bi_vcnt for non-cloned bio */
+	if (bio_flagged(bio, BIO_CLONED))
+		seg_cnt = bio_segments(bio);
+	else
+		seg_cnt = bio->bi_vcnt;
 
-	if (no_sg_merge && !bio_flagged(bio, BIO_CLONED) &&
-			merge_not_need)
-		bio->bi_phys_segments = bio->bi_vcnt;
+	if (test_bit(QUEUE_FLAG_NO_SG_MERGE, &q->queue_flags) &&
+			(seg_cnt < queue_max_segments(q)))
+		bio->bi_phys_segments = seg_cnt;
 	else {
 		struct bio *nxt = bio->bi_next;
 
 		bio->bi_next = NULL;
-		bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio,
-				no_sg_merge && merge_not_need);
+		bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, false);
 		bio->bi_next = nxt;
 	}
 
-- 
1.7.9.5



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: kernel BUG at drivers/block/virtio_blk.c:172!
  2014-11-11 16:42         ` Ming Lei
@ 2014-11-12 18:18           ` Jens Axboe
  2014-11-12 21:02             ` Jeff Layton
  2014-11-13 11:04             ` Dongsu Park
  0 siblings, 2 replies; 9+ messages in thread
From: Jens Axboe @ 2014-11-12 18:18 UTC (permalink / raw)
  To: Ming Lei, Dongsu Park
  Cc: Michael S. Tsirkin, Rusty Russell, xfs, Lukas Czerner,
	Jeff Layton, Linux Virtualization, Christoph Hellwig

On 11/11/2014 09:42 AM, Ming Lei wrote:
> The attached patch should fix the problem, and hope it is the last one, :-)

Dongsu and Jeff, have either of you tested this variant? I think this is
the last one, or at least I hope so as well...

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: kernel BUG at drivers/block/virtio_blk.c:172!
  2014-11-12 18:18           ` Jens Axboe
@ 2014-11-12 21:02             ` Jeff Layton
  2014-11-13 11:04             ` Dongsu Park
  1 sibling, 0 replies; 9+ messages in thread
From: Jeff Layton @ 2014-11-12 21:02 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Michael S. Tsirkin, Ming Lei, Rusty Russell, xfs, Lukas Czerner,
	Linux Virtualization, Christoph Hellwig, Dongsu Park

On Wed, 12 Nov 2014 11:18:32 -0700
Jens Axboe <axboe@kernel.dk> wrote:

> On 11/11/2014 09:42 AM, Ming Lei wrote:
> > The attached patch should fix the problem, and hope it is the last one, :-)
> 
> Dongsu and Jeff, have either of you tested this variant? I think this is
> the last one, or at least I hope so as well...
> 

Yes, thanks! That patch seems to fix the problem for me. You can add:

Tested-by: Jeff Layton <jlayton@poochiereds.net>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: kernel BUG at drivers/block/virtio_blk.c:172!
  2014-11-12 18:18           ` Jens Axboe
  2014-11-12 21:02             ` Jeff Layton
@ 2014-11-13 11:04             ` Dongsu Park
  1 sibling, 0 replies; 9+ messages in thread
From: Dongsu Park @ 2014-11-13 11:04 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Michael S. Tsirkin, Ming Lei, Rusty Russell, xfs, Lukas Czerner,
	Jeff Layton, Linux Virtualization, Christoph Hellwig

On 12.11.2014 11:18, Jens Axboe wrote:
> On 11/11/2014 09:42 AM, Ming Lei wrote:
> > The attached patch should fix the problem, and hope it is the last one, :-)
> 
> Dongsu and Jeff, have either of you tested this variant? I think this is
> the last one, or at least I hope so as well...

Yes, I've just tested it again with Ming's patch.
It passed a full cycle of xfstests: no crash, no particular regression.
The code in blk_recount_segments() seems to make sense too.

Thanks! ;-)
Dongsu


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-11-13 11:04 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-07 13:04 kernel BUG at drivers/block/virtio_blk.c:172! Jeff Layton
2014-11-10  9:59 ` Rusty Russell
2014-11-10 23:31   ` Jens Axboe
2014-11-11  0:56     ` Ming Lei
2014-11-11 15:42       ` Dongsu Park
2014-11-11 16:42         ` Ming Lei
2014-11-12 18:18           ` Jens Axboe
2014-11-12 21:02             ` Jeff Layton
2014-11-13 11:04             ` Dongsu Park

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox