From: Anand Jain <anand.jain@oracle.com>
To: dsterba@suse.cz, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 13/13] btrfs: optimize check for stale device
Date: Wed, 23 Mar 2016 00:43:30 +0800
Message-ID: <56F17632.8080901@oracle.com>
In-Reply-To: <20160322122119.GJ8095@twin.jikos.cz>
On 03/22/2016 08:21 PM, David Sterba wrote:
> On Fri, Feb 19, 2016 at 03:10:16PM +0800, Anand Jain wrote:
>>> I see crashes with btrfs/011 on a non-debugging config
>>>
>>> [ 641.714363] BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
>>> [ 641.716057] IP: [<ffffffffa0152eb6>] scrub_setup_ctx.isra.19+0x1f6/0x260 [btrfs]
>>> [ 641.717036] PGD 720c1067 PUD 720c2067 PMD 0
>>> [ 641.717749] Oops: 0000 [#1] PREEMPT SMP
>> ::
>>> [ 641.723163] CPU: 0 PID: 27766 Comm: btrfs Not tainted 4.5.0-rc3-next-20160212-1.g38290f0-vanilla #1
>>> [ 641.724420] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
>>> [ 641.725723] task: ffff8800742481c0 ti: ffff880071d10000 task.ti: ffff880071d10000
>>> [ 641.726954] RIP: 0010:[<ffffffffa0152eb6>] [<ffffffffa0152eb6>] scrub_setup_ctx.isra.19+0x1f6/0x260 [btrfs]
>>> [ 641.728404] RSP: 0018:ffff880071d13ce8 EFLAGS: 00010202
>>> [ 641.729413] RAX: ffff88007231e800 RBX: ffff88007231e800 RCX: 0000000000000000
>>> [ 641.730610] RDX: ffffffffa0195638 RSI: ffffffffa017c5a8 RDI: ffff88007231ea80
>>> [ 641.731832] RBP: ffff880071d13d18 R08: 0000000000000000 R09: ffff88007204ea00
>>> [ 641.733085] R10: 0000000000000008 R11: 0000000000000000 R12: 0000000000000000
>>> [ 641.734307] R13: 0000000000000001 R14: ffff88007231e9f8 R15: 000000000000003f
>>> [ 641.735544] FS: 00007f03ed36d8c0(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
>>> [ 641.736883] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 641.738022] CR2: 0000000000000068 CR3: 00000000720c0000 CR4: 00000000000006f0
>>> [ 641.739325] Stack:
>>> [ 641.740156] ffff8800724d4000 ffff8800724d4000 0000000000000000 ffff8800722ef000
>>> [ 641.741735] 0000000000000000 ffff8800724d4fc8 ffff880071d13d98 ffffffffa01566fd
>>> [ 641.743163] ffff88007b127000 0000001900000000 ffff8800724d4ce8 0000000000000000
>>> [ 641.744599] Call Trace:
>>> [ 641.745553] [<ffffffffa01566fd>] btrfs_scrub_dev+0x13d/0x510 [btrfs]
>>> [ 641.746894] [<ffffffffa0169ca9>] btrfs_dev_replace_start+0x279/0x3f0 [btrfs]
>>> [ 641.748282] [<ffffffffa0132839>] btrfs_ioctl+0x1869/0x2070 [btrfs]
>>> [ 641.749587] [<ffffffff8106d553>] ? pte_alloc_one+0x33/0x40
>>> [ 641.750850] [<ffffffff81222516>] do_vfs_ioctl+0x96/0x590
>>> [ 641.752128] [<ffffffff810682d1>] ? __do_page_fault+0x181/0x450
>>> [ 641.753432] [<ffffffff81222a89>] SyS_ioctl+0x79/0x90
>>> [ 641.754663] [<ffffffff816d4336>] entry_SYSCALL_64_fastpath+0x1e/0xa8
>>> [ 641.756037] Code: 00 48 c7 c2 38 56 19 a0 48 c7 c6 a8 c5 17 a0 e8 21 39 f7 e0 45 85 ed 48 c7 83 68 02 00 00 00 00 00 00 48 89 d8 0f 84 03 ff ff ff <49> 83 7c 24 68 00 74 40 c7 83 78 02 00 00 20 00 00 00 4c 89 a3
>>> [ 641.760392] RIP [<ffffffffa0152eb6>] scrub_setup_ctx.isra.19+0x1f6/0x260 [btrfs]
>>> [ 641.761970] RSP <ffff880071d13ce8>
>>> [ 641.763190] CR2: 0000000000000068
>>> [ 641.767218] ---[ end trace f46d4e6a90bda310 ]---
>>>
>>> the dereference happens at offset 0x68 which matches bdev in
>>> btrfs_device, so this patch is my best guess at the moment. I'm not able
>>> to reproduce it directly so I need to wait for a rebuild and repeat.
>>
>>
>> Looks like dev was fine when find_device was called, but
>> later it was null when ->bdev was accessed.
>>
>> I couldn't reproduce it here. There are 10 workouts within btrfs/011;
>> any idea which workout caused this? As of now I am guessing:
>>
>> workout "-m dup -d single" 1 cancel quick
>>
>> digging more.
>
> I was not able to reproduce the crash since. All ok on a physical machine,
> in a virtual machine in kvm the test runs for a long time and then
> freezes (serial console, ssh). The kvm process eats 100% cpu, not
> possible to debug it directly. The branch stays in my for-next and is
> on the way to 4.7, we'll see if we can reproduce it.
Agreed. Thanks Dave.
Thread overview: 27+ messages
2016-02-13 2:01 [PATCH resend 00/13] misc patches plus Introduce device delete by devid Anand Jain
2016-02-13 2:01 ` [PATCH v2 01/13] btrfs: pass the error code to the btrfs_std_error and log ret Anand Jain
2016-02-13 2:01 ` [PATCH 02/13] btrfs: create a helper function to read the disk super Anand Jain
2016-02-13 2:01 ` [PATCH v2 03/13] btrfs: maintain consistency in logging to help debugging Anand Jain
2016-02-13 2:01 ` [PATCH v2 04/13] btrfs: device path change must be logged Anand Jain
2016-02-13 2:01 ` [PATCH 05/13] Btrfs: fix fs logging for multi device Anand Jain
2016-02-13 2:01 ` [PATCH v2 06/13] btrfs: create helper function __check_raid_min_devices() Anand Jain
2016-02-15 14:51 ` David Sterba
2016-02-13 2:01 ` [PATCH 07/13] btrfs: clean up and optimize __check_raid_min_device() Anand Jain
2016-02-13 2:01 ` [PATCH v2 08/13] btrfs: create helper btrfs_find_device_by_user_input() Anand Jain
2016-02-13 2:01 ` [PATCH 09/13] btrfs: make use of btrfs_find_device_by_user_input() Anand Jain
2016-02-15 16:47 ` David Sterba
2016-02-15 16:53 ` David Sterba
2016-02-13 2:01 ` [PATCH v2 10/13] btrfs: enhance btrfs_find_device_by_user_input() to check device path Anand Jain
2016-02-13 2:01 ` [PATCH v2 11/13] btrfs: make use of btrfs_scratch_superblocks() in btrfs_rm_device() Anand Jain
2016-02-13 2:01 ` [PATCH v4 12/13] btrfs: introduce device delete by devid Anand Jain
2016-02-17 10:49 ` David Sterba
2016-02-18 6:59 ` Anand Jain
2016-02-18 9:53 ` David Sterba
2016-02-13 2:01 ` [PATCH 13/13] btrfs: optimize check for stale device Anand Jain
2016-02-18 15:13 ` David Sterba
2016-02-19 7:10 ` Anand Jain
2016-02-19 9:15 ` Anand Jain
2016-03-22 12:21 ` David Sterba
2016-03-22 16:43 ` Anand Jain [this message]
2016-03-09 9:54 ` Anand Jain
2016-03-09 16:33 ` David Sterba