linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Josef Bacik <jbacik@fb.com>
To: <fdmanana@gmail.com>
Cc: Filipe Manana <fdmanana@suse.com>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 2/6] Btrfs: fix crash caused by block group removal
Date: Wed, 26 Nov 2014 11:24:39 -0500	[thread overview]
Message-ID: <5475FEC7.2050606@fb.com> (raw)
In-Reply-To: <CAL3q7H4-gUg8W2zFR15Uk38b6S4K9P66mdf7JUMBni8WSjT75Q@mail.gmail.com>

On 11/26/2014 11:09 AM, Filipe David Manana wrote:
> On Wed, Nov 26, 2014 at 3:57 PM, Josef Bacik <jbacik@fb.com> wrote:
>> On 11/26/2014 10:28 AM, Filipe Manana wrote:
>>>
>>> If we remove a block group (because it became empty), we might have left
>>> a caching_ctl structure in fs_info->caching_block_groups that points to
>>> the block group and is accessed at transaction commit time. This results
>>> in accessing an invalid or incorrect block group. This issue became
>>> visible
>>> after Josef's patch "Btrfs: remove empty block groups automatically".
>>>
>>> So if the block group is removed make sure we don't leave a dangling
>>> caching_ctl in caching_block_groups.
>>>
>>> Sample crash trace:
>>>
>>> [58380.439449] BUG: unable to handle kernel paging request at
>>> ffff8801446eaeb8
>>> [58380.439707] IP: [<ffffffffa03f6d05>]
>>> block_group_cache_done.isra.21+0xc/0x1c [btrfs]
>>> [58380.440879] PGD 1acb067 PUD 23f5ff067 PMD 23f5db067 PTE
>>> 80000001446ea060
>>> [58380.441220] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
>>> [58380.441486] Modules linked in: btrfs crc32c_generic xor raid6_pq nfsd
>>> auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc loop psmouse
>>> processor i2c_piix4 parport_pc parport pcspkr serio_raw evdev i2ccore
>>> thermal_sys microcode button ext4 crc16 jbd2 mbcache sr_mod cdrom
>>> ata_generic sg sd_mod crc_t10dif crct10dif_generic crct10dif_common
>>> virtio_scsi floppy ata_piix e1000 libata virtio_pci scsi_mod virtio_ring
>>> virtio [last unloaded: btrfs]
>>> [58380.443238] CPU: 3 PID: 25728 Comm: btrfs-transacti Tainted: G        W
>>> 3.17.0-rc5-btrfs-next-1+ #1
>>> [58380.443238] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>> rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
>>> [58380.443238] task: ffff88013ac82090 ti: ffff88013896c000 task.ti:
>>> ffff88013896c000
>>> [58380.443238] RIP: 0010:[<ffffffffa03f6d05>]  [<ffffffffa03f6d05>]
>>> block_group_cache_done.isra.21+0xc/0x1c [btrfs]
>>> [58380.443238] RSP: 0018:ffff88013896fdd8  EFLAGS: 00010283
>>> [58380.443238] RAX: ffff880222cae850 RBX: ffff880119ba74c0 RCX:
>>> 0000000000000000
>>> [58380.443238] RDX: 0000000000000000 RSI: ffff880185e16800 RDI:
>>> ffff8801446eaeb8
>>> [58380.443238] RBP: ffff88013896fdd8 R08: ffff8801a9ca9fa8 R09:
>>> ffff88013896fc60
>>> [58380.443238] R10: ffff88013896fd28 R11: 0000000000000000 R12:
>>> ffff880222cae000
>>> [58380.443238] R13: ffff880222cae850 R14: ffff880222cae6b0 R15:
>>> ffff8801446eae00
>>> [58380.443238] FS:  0000000000000000(0000) GS:ffff88023ed80000(0000)
>>> knlGS:0000000000000000
>>> [58380.443238] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> [58380.443238] CR2: ffff8801446eaeb8 CR3: 0000000001811000 CR4:
>>> 00000000000006e0
>>> [58380.443238] Stack:
>>> [58380.443238]  ffff88013896fe18 ffffffffa03fe2d5 ffff880222cae850
>>> ffff880185e16800
>>> [58380.443238]  ffff88000dc41c20 0000000000000000 ffff8801a9ca9f00
>>> 0000000000000000
>>> [58380.443238]  ffff88013896fe80 ffffffffa040fbcf ffff88018b0dcdb0
>>> ffff88013ac82090
>>> [58380.443238] Call Trace:
>>> [58380.443238]  [<ffffffffa03fe2d5>] btrfs_prepare_extent_commit+0x5a/0xd7
>>> [btrfs]
>>> [58380.443238]  [<ffffffffa040fbcf>] btrfs_commit_transaction+0x45c/0x882
>>> [btrfs]
>>> [58380.443238]  [<ffffffffa040c058>] transaction_kthread+0xf2/0x1a4
>>> [btrfs]
>>> [58380.443238]  [<ffffffffa040bf66>] ?
>>> btrfs_cleanup_transaction+0x3d8/0x3d8 [btrfs]
>>> [58380.443238]  [<ffffffff8105966b>] kthread+0xb7/0xbf
>>> [58380.443238]  [<ffffffff810595b4>] ? __kthread_parkme+0x67/0x67
>>> [58380.443238]  [<ffffffff813ebeac>] ret_from_fork+0x7c/0xb0
>>> [58380.443238]  [<ffffffff810595b4>] ? __kthread_parkme+0x67/0x67
>>>
>>> Signed-off-by: Filipe Manana <fdmanana@suse.com>
>>> ---
>>>    fs/btrfs/ctree.h       |  1 +
>>>    fs/btrfs/extent-tree.c | 27 +++++++++++++++++++++++++++
>>>    2 files changed, 28 insertions(+)
>>>
>>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>>> index d3ccd09..7f40a65 100644
>>> --- a/fs/btrfs/ctree.h
>>> +++ b/fs/btrfs/ctree.h
>>> @@ -1277,6 +1277,7 @@ struct btrfs_block_group_cache {
>>>          unsigned int ro:1;
>>>          unsigned int dirty:1;
>>>          unsigned int iref:1;
>>> +       unsigned int has_caching_ctl:1;
>>>
>>
>> Don't do this, just unconditionally call get_caching_control in
>> btrfs_remove_block_group, then if we get something we can do stuff,
>> otherwise we can just continue.  Thanks,
>
> That's what I initially thought too. However, get_caching_control only
> returns us the caching_ctl if block_group->cached ==
> BTRFS_CACHE_STARTED, so it's not enough to use it exclusively. The
> has_caching_ctl flag is just to avoid holding the semaphore and search
> through the list (since block_group->caching_ctl can be NULL but a
> caching_ctl point to the block group can be in the list).
>

Oh God that's not good, we need to change get_caching_control to return 
if there is a caching control at all, since the other users want to wait 
for the fast caching to finish too.  So change that and then use it 
unconditionally.  I bet this has been causing us the random early ENOSPC 
problems.  Thanks,

Josef


  reply	other threads:[~2014-11-26 16:24 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-26 15:28 [PATCH 0/6] Btrfs: fixes for block group remove/allocation and trim/discard Filipe Manana
2014-11-26 15:28 ` [PATCH 1/6] Btrfs: fix invalid block group rbtree access after bg is removed Filipe Manana
2014-11-26 15:58   ` Josef Bacik
2014-11-26 15:28 ` [PATCH 2/6] Btrfs: fix crash caused by block group removal Filipe Manana
2014-11-26 15:57   ` Josef Bacik
2014-11-26 16:09     ` Filipe David Manana
2014-11-26 16:24       ` Josef Bacik [this message]
2014-11-26 16:34         ` Filipe David Manana
2014-11-26 16:41           ` Josef Bacik
2014-11-26 15:28 ` [PATCH 3/6] Btrfs: fix freeing used extents after removing empty block group Filipe Manana
2014-11-26 16:02   ` Josef Bacik
2014-11-26 15:28 ` [PATCH 4/6] Btrfs: fix race between fs trimming and block group remove/allocation Filipe Manana
2014-11-26 16:15   ` Josef Bacik
2014-11-26 16:25     ` Filipe David Manana
2014-11-26 16:30       ` Josef Bacik
2014-11-26 17:19   ` [PATCH v2 " Filipe Manana
2014-11-27  1:16   ` [PATCH v3 " Filipe Manana
2014-11-27 21:14   ` [PATCH v4 " Filipe Manana
2014-11-26 15:28 ` [PATCH 5/6] Btrfs: fix race between writing free space cache and trimming Filipe Manana
2014-12-01 17:04   ` [PATCH v2 " Filipe Manana
2014-11-26 15:28 ` [PATCH 6/6] Btrfs: make btrfs_abort_transaction consider existence of new block groups Filipe Manana
2014-11-26 16:07   ` Josef Bacik
2014-11-26 16:15     ` Filipe David Manana
2014-11-26 16:19       ` Josef Bacik
2014-11-26 16:29         ` Filipe David Manana
2014-11-26 16:32           ` Josef Bacik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5475FEC7.2050606@fb.com \
    --to=jbacik@fb.com \
    --cc=fdmanana@gmail.com \
    --cc=fdmanana@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).