All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] btrfs: some fixes around quota disable, free space tree disable and relocation
@ 2023-06-19 16:21 fdmanana
  2023-06-19 16:21 ` [PATCH 1/4] btrfs: fix race when deleting quota root from the dirty cow roots list fdmanana
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: fdmanana @ 2023-06-19 16:21 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

These fix some bugs related to quota disable and free space tree disable,
as well as a race between quota disable and relocation that makes
relocation fail with -ENOENT. More details on the change logs.

Filipe Manana (4):
  btrfs: fix race when deleting quota root from the dirty cow roots list
  btrfs: fix race when deleting free space root from the dirty cow roots list
  btrfs: add comment to struct btrfs_fs_info::dirty_cowonly_roots
  btrfs: fix race between quota disable and relocation

 fs/btrfs/free-space-tree.c |  3 +++
 fs/btrfs/fs.h              |  1 +
 fs/btrfs/qgroup.c          | 20 +++++++++++++++++---
 3 files changed, 21 insertions(+), 3 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/4] btrfs: fix race when deleting quota root from the dirty cow roots list
  2023-06-19 16:21 [PATCH 0/4] btrfs: some fixes around quota disable, free space tree disable and relocation fdmanana
@ 2023-06-19 16:21 ` fdmanana
  2023-06-19 16:21 ` [PATCH 2/4] btrfs: fix race when deleting free space " fdmanana
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: fdmanana @ 2023-06-19 16:21 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

When disabling quotas we are deleting the quota root from the list
fs_info->dirty_cowonly_roots without taking the lock that protects it,
which is struct btrfs_fs_info::trans_lock. This unsynchronized list
manipulation may cause chaos if there's another concurrent manipulation
of this list, such as when adding a root to it with
ctree.c:add_root_to_dirty_list().

This can result in all sorts of weird failures caused by a race, such as
the following crash:

  [337571.278245] general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] PREEMPT SMP PTI
  [337571.278933] CPU: 1 PID: 115447 Comm: btrfs Tainted: G        W          6.4.0-rc6-btrfs-next-134+ #1
  [337571.279153] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
  [337571.279572] RIP: 0010:commit_cowonly_roots+0x11f/0x250 [btrfs]
  [337571.279928] Code: 85 38 06 00 (...)
  [337571.280363] RSP: 0018:ffff9f63446efba0 EFLAGS: 00010206
  [337571.280582] RAX: ffff942d98ec2638 RBX: ffff9430b82b4c30 RCX: 0000000449e1c000
  [337571.280798] RDX: dead000000000100 RSI: ffff9430021e4900 RDI: 0000000000036070
  [337571.281015] RBP: ffff942d98ec2000 R08: ffff942d98ec2000 R09: 000000000000015b
  [337571.281254] R10: 0000000000000009 R11: 0000000000000001 R12: ffff942fe8fbf600
  [337571.281476] R13: ffff942dabe23040 R14: ffff942dabe20800 R15: ffff942d92cf3b48
  [337571.281723] FS:  00007f478adb7340(0000) GS:ffff94349fa40000(0000) knlGS:0000000000000000
  [337571.281950] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [337571.282184] CR2: 00007f478ab9a3d5 CR3: 000000001e02c001 CR4: 0000000000370ee0
  [337571.282416] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [337571.282647] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [337571.282874] Call Trace:
  [337571.283101]  <TASK>
  [337571.283327]  ? __die_body+0x1b/0x60
  [337571.283570]  ? die_addr+0x39/0x60
  [337571.283796]  ? exc_general_protection+0x22e/0x430
  [337571.284022]  ? asm_exc_general_protection+0x22/0x30
  [337571.284251]  ? commit_cowonly_roots+0x11f/0x250 [btrfs]
  [337571.284531]  btrfs_commit_transaction+0x42e/0xf90 [btrfs]
  [337571.284803]  ? _raw_spin_unlock+0x15/0x30
  [337571.285031]  ? release_extent_buffer+0x103/0x130 [btrfs]
  [337571.285305]  reset_balance_state+0x152/0x1b0 [btrfs]
  [337571.285578]  btrfs_balance+0xa50/0x11e0 [btrfs]
  [337571.285864]  ? __kmem_cache_alloc_node+0x14a/0x410
  [337571.286086]  btrfs_ioctl+0x249a/0x3320 [btrfs]
  [337571.286358]  ? mod_objcg_state+0xd2/0x360
  [337571.286577]  ? refill_obj_stock+0xb0/0x160
  [337571.286798]  ? seq_release+0x25/0x30
  [337571.287016]  ? __rseq_handle_notify_resume+0x3ba/0x4b0
  [337571.287235]  ? percpu_counter_add_batch+0x2e/0xa0
  [337571.287455]  ? __x64_sys_ioctl+0x88/0xc0
  [337571.287675]  __x64_sys_ioctl+0x88/0xc0
  [337571.287901]  do_syscall_64+0x38/0x90
  [337571.288126]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
  [337571.288352] RIP: 0033:0x7f478aaffe9b

So fix this by locking struct btrfs_fs_info::trans_lock before deleting
the quota root from that list.

Fixes: bed92eae26cc ("Btrfs: qgroup implementation and prototypes")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/qgroup.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index f41da7ac360d..f8735b31da16 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1301,7 +1301,9 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
 		goto out;
 	}
 
+	spin_lock(&fs_info->trans_lock);
 	list_del(&quota_root->dirty_list);
+	spin_unlock(&fs_info->trans_lock);
 
 	btrfs_tree_lock(quota_root->node);
 	btrfs_clear_buffer_dirty(trans, quota_root->node);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/4] btrfs: fix race when deleting free space root from the dirty cow roots list
  2023-06-19 16:21 [PATCH 0/4] btrfs: some fixes around quota disable, free space tree disable and relocation fdmanana
  2023-06-19 16:21 ` [PATCH 1/4] btrfs: fix race when deleting quota root from the dirty cow roots list fdmanana
@ 2023-06-19 16:21 ` fdmanana
  2023-06-19 16:21 ` [PATCH 3/4] btrfs: add comment to struct btrfs_fs_info::dirty_cowonly_roots fdmanana
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: fdmanana @ 2023-06-19 16:21 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

When deleting the free space tree we are deleting the free space root
from the list fs_info->dirty_cowonly_roots without taking the lock that
protects it, which is struct btrfs_fs_info::trans_lock.
This unsynchronized list manipulation may cause chaos if there's another
concurrent manipulation of this list, such as when adding a root to it
with ctree.c:add_root_to_dirty_list().

This can result in all sorts of weird failures caused by a race, such as
the following crash:

      [337571.278245] general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] PREEMPT SMP PTI
      [337571.278933] CPU: 1 PID: 115447 Comm: btrfs Tainted: G        W          6.4.0-rc6-btrfs-next-134+ #1
      [337571.279153] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
      [337571.279572] RIP: 0010:commit_cowonly_roots+0x11f/0x250 [btrfs]
      [337571.279928] Code: 85 38 06 00 (...)
      [337571.280363] RSP: 0018:ffff9f63446efba0 EFLAGS: 00010206
      [337571.280582] RAX: ffff942d98ec2638 RBX: ffff9430b82b4c30 RCX: 0000000449e1c000
      [337571.280798] RDX: dead000000000100 RSI: ffff9430021e4900 RDI: 0000000000036070
      [337571.281015] RBP: ffff942d98ec2000 R08: ffff942d98ec2000 R09: 000000000000015b
      [337571.281254] R10: 0000000000000009 R11: 0000000000000001 R12: ffff942fe8fbf600
      [337571.281476] R13: ffff942dabe23040 R14: ffff942dabe20800 R15: ffff942d92cf3b48
      [337571.281723] FS:  00007f478adb7340(0000) GS:ffff94349fa40000(0000) knlGS:0000000000000000
      [337571.281950] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [337571.282184] CR2: 00007f478ab9a3d5 CR3: 000000001e02c001 CR4: 0000000000370ee0
      [337571.282416] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [337571.282647] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [337571.282874] Call Trace:
      [337571.283101]  <TASK>
      [337571.283327]  ? __die_body+0x1b/0x60
      [337571.283570]  ? die_addr+0x39/0x60
      [337571.283796]  ? exc_general_protection+0x22e/0x430
      [337571.284022]  ? asm_exc_general_protection+0x22/0x30
      [337571.284251]  ? commit_cowonly_roots+0x11f/0x250 [btrfs]
      [337571.284531]  btrfs_commit_transaction+0x42e/0xf90 [btrfs]
      [337571.284803]  ? _raw_spin_unlock+0x15/0x30
      [337571.285031]  ? release_extent_buffer+0x103/0x130 [btrfs]
      [337571.285305]  reset_balance_state+0x152/0x1b0 [btrfs]
      [337571.285578]  btrfs_balance+0xa50/0x11e0 [btrfs]
      [337571.285864]  ? __kmem_cache_alloc_node+0x14a/0x410
      [337571.286086]  btrfs_ioctl+0x249a/0x3320 [btrfs]
      [337571.286358]  ? mod_objcg_state+0xd2/0x360
      [337571.286577]  ? refill_obj_stock+0xb0/0x160
      [337571.286798]  ? seq_release+0x25/0x30
      [337571.287016]  ? __rseq_handle_notify_resume+0x3ba/0x4b0
      [337571.287235]  ? percpu_counter_add_batch+0x2e/0xa0
      [337571.287455]  ? __x64_sys_ioctl+0x88/0xc0
      [337571.287675]  __x64_sys_ioctl+0x88/0xc0
      [337571.287901]  do_syscall_64+0x38/0x90
      [337571.288126]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
      [337571.288352] RIP: 0033:0x7f478aaffe9b

So fix this by locking struct btrfs_fs_info::trans_lock before deleting
the free space root from that list.

Fixes: a5ed91828518 ("Btrfs: implement the free space B-tree")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/free-space-tree.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/free-space-tree.c b/fs/btrfs/free-space-tree.c
index b21da1446f2a..045ddce32eca 100644
--- a/fs/btrfs/free-space-tree.c
+++ b/fs/btrfs/free-space-tree.c
@@ -1280,7 +1280,10 @@ int btrfs_delete_free_space_tree(struct btrfs_fs_info *fs_info)
 		goto abort;
 
 	btrfs_global_root_delete(free_space_root);
+
+	spin_lock(&fs_info->trans_lock);
 	list_del(&free_space_root->dirty_list);
+	spin_unlock(&fs_info->trans_lock);
 
 	btrfs_tree_lock(free_space_root->node);
 	btrfs_clear_buffer_dirty(trans, free_space_root->node);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/4] btrfs: add comment to struct btrfs_fs_info::dirty_cowonly_roots
  2023-06-19 16:21 [PATCH 0/4] btrfs: some fixes around quota disable, free space tree disable and relocation fdmanana
  2023-06-19 16:21 ` [PATCH 1/4] btrfs: fix race when deleting quota root from the dirty cow roots list fdmanana
  2023-06-19 16:21 ` [PATCH 2/4] btrfs: fix race when deleting free space " fdmanana
@ 2023-06-19 16:21 ` fdmanana
  2023-06-19 16:21 ` [PATCH 4/4] btrfs: fix race between quota disable and relocation fdmanana
  2023-06-19 18:35 ` [PATCH 0/4] btrfs: some fixes around quota disable, free space tree " David Sterba
  4 siblings, 0 replies; 6+ messages in thread
From: fdmanana @ 2023-06-19 16:21 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

Add a comment to struct btrfs_fs_info::dirty_cowonly_roots to mention
that struct btrfs_fs_info::trans_lock is the lock that protects that
list.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/fs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 396e2a463480..203d2a267828 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -576,6 +576,7 @@ struct btrfs_fs_info {
 	s32 dirty_metadata_batch;
 	s32 delalloc_batch;
 
+	/* Protected by 'trans_lock'. */
 	struct list_head dirty_cowonly_roots;
 
 	struct btrfs_fs_devices *fs_devices;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 4/4] btrfs: fix race between quota disable and relocation
  2023-06-19 16:21 [PATCH 0/4] btrfs: some fixes around quota disable, free space tree disable and relocation fdmanana
                   ` (2 preceding siblings ...)
  2023-06-19 16:21 ` [PATCH 3/4] btrfs: add comment to struct btrfs_fs_info::dirty_cowonly_roots fdmanana
@ 2023-06-19 16:21 ` fdmanana
  2023-06-19 18:35 ` [PATCH 0/4] btrfs: some fixes around quota disable, free space tree " David Sterba
  4 siblings, 0 replies; 6+ messages in thread
From: fdmanana @ 2023-06-19 16:21 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

If we disable quotas while we have a relocation of a metadata block group
that has extents belonging to the quota root, we can cause the relocation
to fail with -ENOENT. This is because relocation builds backref nodes for
extents of the quota root and later needs to walk the backrefs and access
the quota root - however if in between a task disables quotas, it results
in deleting the quota root from the root tree (with btrfs_del_root(),
called from btrfs_quota_disable().

This can be sporadically triggered by test case btrfs/255 from fstests:

  $ ./check btrfs/255
  FSTYP         -- btrfs
  PLATFORM      -- Linux/x86_64 debian0 6.4.0-rc6-btrfs-next-134+ #1 SMP PREEMPT_DYNAMIC Thu Jun 15 11:59:28 WEST 2023
  MKFS_OPTIONS  -- /dev/sdc
  MOUNT_OPTIONS -- /dev/sdc /home/fdmanana/btrfs-tests/scratch_1

  btrfs/255 6s ... _check_dmesg: something found in dmesg (see /home/fdmanana/git/hub/xfstests/results//btrfs/255.dmesg)
  - output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/255.out.bad)
      --- tests/btrfs/255.out	2023-03-02 21:47:53.876609426 +0000
      +++ /home/fdmanana/git/hub/xfstests/results//btrfs/255.out.bad	2023-06-16 10:20:39.267563212 +0100
      @@ -1,2 +1,4 @@
       QA output created by 255
      +ERROR: error during balancing '/home/fdmanana/btrfs-tests/scratch_1': No such file or directory
      +There may be more info in syslog - try dmesg | tail
       Silence is golden
      ...
      (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/255.out /home/fdmanana/git/hub/xfstests/results//btrfs/255.out.bad'  to see the entire diff)
  Ran: btrfs/255
  Failures: btrfs/255
  Failed 1 of 1 tests

To fix this make the quota disable operation take the cleaner mutex, as
relocation of a block group also takes this mutex. This is also what we
do when deleting a subvolume/snapshot, we take the cleaner mutex in the
cleaner kthread (at cleaner_kthread()) and then we call btrfs_del_root()
at btrfs_drop_snapshot() while under the protection of the cleaner mutex.

Fixes: bed92eae26cc ("Btrfs: qgroup implementation and prototypes")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/qgroup.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index f8735b31da16..da1f84a0eb29 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1232,12 +1232,23 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
 	int ret = 0;
 
 	/*
-	 * We need to have subvol_sem write locked, to prevent races between
-	 * concurrent tasks trying to disable quotas, because we will unlock
-	 * and relock qgroup_ioctl_lock across BTRFS_FS_QUOTA_ENABLED changes.
+	 * We need to have subvol_sem write locked to prevent races with
+	 * snapshot creation.
 	 */
 	lockdep_assert_held_write(&fs_info->subvol_sem);
 
+	/*
+	 * Lock the cleaner mutex to prevent races with concurrent relocation,
+	 * because relocation may be building backrefs for blocks of the quota
+	 * root while we are deleting the root. This is like dropping fs roots
+	 * of deleted snapshots/subvolumes, we need the same protection.
+	 *
+	 * This also prevents races between concurrent tasks trying to disable
+	 * quotas, because we will unlock and relock qgroup_ioctl_lock across
+	 * BTRFS_FS_QUOTA_ENABLED changes.
+	 */
+	mutex_lock(&fs_info->cleaner_mutex);
+
 	mutex_lock(&fs_info->qgroup_ioctl_lock);
 	if (!fs_info->quota_root)
 		goto out;
@@ -1319,6 +1330,7 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
 		btrfs_end_transaction(trans);
 	else if (trans)
 		ret = btrfs_end_transaction(trans);
+	mutex_unlock(&fs_info->cleaner_mutex);
 
 	return ret;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/4] btrfs: some fixes around quota disable, free space tree disable and relocation
  2023-06-19 16:21 [PATCH 0/4] btrfs: some fixes around quota disable, free space tree disable and relocation fdmanana
                   ` (3 preceding siblings ...)
  2023-06-19 16:21 ` [PATCH 4/4] btrfs: fix race between quota disable and relocation fdmanana
@ 2023-06-19 18:35 ` David Sterba
  4 siblings, 0 replies; 6+ messages in thread
From: David Sterba @ 2023-06-19 18:35 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs

On Mon, Jun 19, 2023 at 05:21:46PM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> These fix some bugs related to quota disable and free space tree disable,
> as well as a race between quota disable and relocation that makes
> relocation fail with -ENOENT. More details on the change logs.
> 
> Filipe Manana (4):
>   btrfs: fix race when deleting quota root from the dirty cow roots list
>   btrfs: fix race when deleting free space root from the dirty cow roots list
>   btrfs: add comment to struct btrfs_fs_info::dirty_cowonly_roots
>   btrfs: fix race between quota disable and relocation

Added to misc-next, thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-06-19 18:42 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-19 16:21 [PATCH 0/4] btrfs: some fixes around quota disable, free space tree disable and relocation fdmanana
2023-06-19 16:21 ` [PATCH 1/4] btrfs: fix race when deleting quota root from the dirty cow roots list fdmanana
2023-06-19 16:21 ` [PATCH 2/4] btrfs: fix race when deleting free space " fdmanana
2023-06-19 16:21 ` [PATCH 3/4] btrfs: add comment to struct btrfs_fs_info::dirty_cowonly_roots fdmanana
2023-06-19 16:21 ` [PATCH 4/4] btrfs: fix race between quota disable and relocation fdmanana
2023-06-19 18:35 ` [PATCH 0/4] btrfs: some fixes around quota disable, free space tree " David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.