From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: Josef Bacik <josef@toxicpanda.com>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com,
Qu Wenruo <wqu@suse.com>
Subject: Re: [PATCH v5 13/52] btrfs: handle errors from select_reloc_root()
Date: Tue, 8 Dec 2020 09:44:46 -0500
Message-ID: <20201208144446.GH31381@hungrycats.org>
In-Reply-To: <5daa486a9ce06876bdef92a50f1a11d4eee9da67.1607349282.git.josef@toxicpanda.com>

On Mon, Dec 07, 2020 at 08:57:05AM -0500, Josef Bacik wrote:
> Currently select_reloc_root() doesn't return an error, but followup
> patches will make it possible for it to return an error. We do have
> proper error recovery in do_relocation however, so handle the
> possibility of select_reloc_root() having an error properly instead of
> BUG_ON(!root). I've also adjusted select_reloc_root() to return
> ERR_PTR(-ENOENT) if we don't find a root, instead of NULL, to make the
> error case easier to deal with. I've replaced the BUG_ON(!root) with an
> ASSERT(ret != -ENOENT), as this indicates we messed up the backref
> walking code, but could indicate corruption so we do not want to have a
> BUG_ON() here.
>
> Reviewed-by: Qu Wenruo <wqu@suse.com>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
> fs/btrfs/relocation.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 4333ee329290..66515ccc04fe 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -2027,7 +2027,7 @@ struct btrfs_root *select_reloc_root(struct btrfs_trans_handle *trans,
>  			break;
>  	}
>  	if (!root)
> -		return NULL;
> +		return ERR_PTR(-ENOENT);
>
>  	next = node;
>  	/* setup backref node path for btrfs_reloc_cow_block */
> @@ -2198,7 +2198,18 @@ static int do_relocation(struct btrfs_trans_handle *trans,
>
>  		upper = edge->node[UPPER];
>  		root = select_reloc_root(trans, rc, upper, edges);
> -		BUG_ON(!root);
> +		if (IS_ERR(root)) {
> +			ret = PTR_ERR(root);
> +
> +			/*
> +			 * This can happen if there's fs corruption, but if we
> +			 * have ASSERT()'s on then we're developers and we
> +			 * likely made a logic mistake in the backref code, so
> +			 * check for this error condition.
> +			 */
> +			ASSERT(ret != -ENOENT);
I hit this assertion once last night. The test VM kept going after a reboot.
[17579.428311][T10916] BTRFS info (device dm-0): balance: start -mlimit=1 -slimit=1
[17579.773705][T10916] BTRFS info (device dm-0): relocating block group 3262427168768 flags metadata|raid1
[17581.153861][T10916] assertion failed: ret != -ENOENT, in fs/btrfs/relocation.c:2369
[17581.157815][T10916] ------------[ cut here ]------------
[17581.161214][T10916] kernel BUG at fs/btrfs/ctree.h:3368!
[17581.162817][T10916] invalid opcode: 0000 [#1] SMP KASAN PTI
[17581.167786][T10916] CPU: 3 PID: 10916 Comm: btrfs Tainted: G W 5.10.0-69dfffde3fb6-for-next+ #22
[17581.174638][T10916] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[17581.177226][T10916] RIP: 0010:assertfail+0x18/0x1a
[17581.178622][T10916] Code: a0 46 aa fe 48 8b 3d a9 bd fd 02 e8 94 46 aa fe 5d c3 55 89 d1 48 89 f2 48 89 fe 48 c7 c7 00 20 24 96 48 89 e5 e8 74 36 fe ff <0f> 0b 48 89 df e8 00 2b b4 fe 48 8b 85 98 fe ff ff 44 89 ea 48 c7
[17581.187531][T10916] RSP: 0018:ffffc90001757400 EFLAGS: 00010286
[17581.189711][T10916] RAX: 000000000000003f RBX: ffff888012dabf00 RCX: ffffffff94265ba2
[17581.192629][T10916] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff8881f5fff28c
[17581.195411][T10916] RBP: ffffc90001757400 R08: ffffed103ec015b5 R09: ffffed103ec015b5
[17581.198061][T10916] R10: ffff8881f600ada7 R11: ffffed103ec015b4 R12: ffff888012dab668
[17581.201387][T10916] R13: 0000000000000004 R14: 00000000fffffffe R15: ffff888005f18f00
[17581.203965][T10916] FS: 00007f55e295b8c0(0000) GS:ffff8881f5e00000(0000) knlGS:0000000000000000
[17581.207716][T10916] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[17581.209584][T10916] CR2: 0000560ee9649da4 CR3: 00000000720d4005 CR4: 0000000000170ee0
[17581.214157][T10916] Call Trace:
[17581.215835][T10916] do_relocation.cold.55+0x9c/0xb9
[17581.218719][T10916] ? select_reloc_root+0x6e0/0x6e0
[17581.221899][T10916] ? walk_up_backref+0x91/0xd0
[17581.224508][T10916] ? __asan_loadN+0xf/0x20
[17581.234986][T10916] ? memcpy+0x4d/0x60
[17581.242655][T10916] ? read_extent_buffer+0xd1/0x100
[17581.250560][T10916] relocate_tree_blocks+0xa70/0xb90
[17581.253642][T10916] ? do_relocation+0xc10/0xc10
[17581.256300][T10916] ? kmem_cache_alloc_trace+0x6a3/0xcb0
[17581.259164][T10916] ? free_extent_buffer.part.52+0xd7/0x140
[17581.265002][T10916] ? rb_insert_color+0x342/0x360
[17581.267469][T10916] relocate_block_group+0x2eb/0x780
[17581.271941][T10916] ? merge_reloc_roots+0x4a0/0x4a0
[17581.273800][T10916] btrfs_relocate_block_group+0x26e/0x4c0
[17581.276285][T10916] btrfs_relocate_chunk+0x52/0x120
[17581.280022][T10916] btrfs_balance+0xe2e/0x1900
[17581.284864][T10916] ? __kasan_check_read+0x11/0x20
[17581.288259][T10916] ? lock_acquire+0xd0/0x550
[17581.291309][T10916] ? btrfs_relocate_chunk+0x120/0x120
[17581.295542][T10916] ? kasan_unpoison_shadow+0x40/0x50
[17581.298465][T10916] ? kmem_cache_alloc_trace+0x6a3/0xcb0
[17581.301092][T10916] ? _copy_from_user+0x83/0xc0
[17581.303427][T10916] btrfs_ioctl_balance+0x3a7/0x460
[17581.305850][T10916] btrfs_ioctl+0x24c8/0x4360
[17581.307699][T10916] ? __kasan_check_read+0x11/0x20
[17581.310098][T10916] ? lock_release+0xc8/0x640
[17581.312712][T10916] ? lru_cache_add+0x178/0x250
[17581.314991][T10916] ? btrfs_ioctl_get_supported_features+0x30/0x30
[17581.318654][T10916] ? lock_downgrade+0x3f0/0x3f0
[17581.320972][T10916] ? handle_mm_fault+0x159e/0x2150
[17581.323839][T10916] ? __kasan_check_read+0x11/0x20
[17581.326814][T10916] ? lock_release+0xc8/0x640
[17581.329251][T10916] ? do_user_addr_fault+0x299/0x5a0
[17581.332161][T10916] ? do_raw_spin_unlock+0xa8/0x140
[17581.334825][T10916] ? lock_downgrade+0x3f0/0x3f0
[17581.337282][T10916] ? _raw_spin_unlock+0x22/0x30
[17581.339972][T10916] ? handle_mm_fault+0xad6/0x2150
[17581.342585][T10916] ? do_vfs_ioctl+0xfc/0x9d0
[17581.345189][T10916] ? ioctl_file_clone+0xe0/0xe0
[17581.348274][T10916] ? __kasan_check_write+0x14/0x20
[17581.350976][T10916] ? up_read+0x176/0x4f0
[17581.353281][T10916] ? down_write_nested+0x2d0/0x2d0
[17581.356263][T10916] ? vmacache_find+0xc9/0x120
[17581.358822][T10916] ? __kasan_check_read+0x11/0x20
[17581.361699][T10916] ? __fget_light+0xae/0x110
[17581.364329][T10916] __x64_sys_ioctl+0xc3/0x100
[17581.367294][T10916] do_syscall_64+0x37/0x80
[17581.370340][T10916] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[17581.373984][T10916] RIP: 0033:0x7f55e2a4e427
[17581.376697][T10916] Code: 00 00 90 48 8b 05 69 aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 aa 0c 00 f7 d8 64 89 01 48
[17581.388513][T10916] RSP: 002b:00007ffdb30a1438 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
[17581.393564][T10916] RAX: ffffffffffffffda RBX: 00007ffdb30a14d8 RCX: 00007f55e2a4e427
[17581.398315][T10916] RDX: 00007ffdb30a14d8 RSI: 00000000c4009420 RDI: 0000000000000003
[17581.402711][T10916] RBP: 0000000000000003 R08: 0000000000000003 R09: 0000000000000078
[17581.407348][T10916] R10: fffffffffffff59d R11: 0000000000000202 R12: 0000000000000001
[17581.412074][T10916] R13: 0000000000000000 R14: 00007ffdb30a3a34 R15: 0000000000000001
[17581.417008][T10916] Modules linked in:
[17581.420470][T10916] ---[ end trace 2aee413c08ff01b0 ]---
[17581.424847][T10916] RIP: 0010:assertfail+0x18/0x1a
[17581.428219][T10916] Code: a0 46 aa fe 48 8b 3d a9 bd fd 02 e8 94 46 aa fe 5d c3 55 89 d1 48 89 f2 48 89 fe 48 c7 c7 00 20 24 96 48 89 e5 e8 74 36 fe ff <0f> 0b 48 89 df e8 00 2b b4 fe 48 8b 85 98 fe ff ff 44 89 ea 48 c7
[17581.441219][T10916] RSP: 0018:ffffc90001757400 EFLAGS: 00010286
[17581.445067][T10916] RAX: 000000000000003f RBX: ffff888012dabf00 RCX: ffffffff94265ba2
[17581.448863][T10916] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff8881f5fff28c
[17581.454668][T10916] RBP: ffffc90001757400 R08: ffffed103ec015b5 R09: ffffed103ec015b5
[17581.459261][T10916] R10: ffff8881f600ada7 R11: ffffed103ec015b4 R12: ffff888012dab668
[17581.464851][T10916] R13: 0000000000000004 R14: 00000000fffffffe R15: ffff888005f18f00
[17581.468591][T10916] FS: 00007f55e295b8c0(0000) GS:ffff8881f5e00000(0000) knlGS:0000000000000000
[17581.473633][T10916] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[17581.477009][T10916] CR2: 0000560ee9649da4 CR3: 00000000720d4005 CR4: 0000000000170ee0
> +			goto next;
> +		}
>
>  		if (upper->eb && !upper->locked) {
>  			if (!lowest) {
> --
> 2.26.2
>
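
For anyone reading the trace without the series applied, here is a minimal
userspace sketch of the NULL-to-ERR_PTR conversion the patch describes. It is
not the kernel code: the macros are simplified stand-ins for the helpers in
include/linux/err.h, and demo_root, demo_select_root() and demo_relocate()
are hypothetical names invented for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error codes occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct btrfs_root. */
struct demo_root {
	int id;
};

static struct demo_root demo_roots[] = { { 5 }, { 256 } };

/*
 * Hypothetical lookup. Like the patched select_reloc_root(), it reports
 * "not found" as ERR_PTR(-ENOENT) rather than NULL, so the caller needs
 * only one kind of failure check.
 */
static struct demo_root *demo_select_root(int id)
{
	size_t i;

	for (i = 0; i < sizeof(demo_roots) / sizeof(demo_roots[0]); i++) {
		if (demo_roots[i].id == id)
			return &demo_roots[i];
	}
	return ERR_PTR(-ENOENT);
}

/* Hypothetical caller: propagate the error instead of BUG_ON(!root). */
static int demo_relocate(int id)
{
	struct demo_root *root = demo_select_root(id);

	if (IS_ERR(root))
		return (int)PTR_ERR(root);
	return 0;
}
```

In this sketch demo_relocate() simply returns -ENOENT for an unknown id. In
the patch, the ASSERT(ret != -ENOENT) sits between the PTR_ERR() and the
"goto next", so a kernel built with CONFIG_BTRFS_ASSERT stops there, as in
the trace above, while a kernel built without it carries on to the error
path.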