stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 6.12.y] btrfs: fix block group refcount race in btrfs_create_pending_block_groups()
@ 2025-07-18  2:59 alvalan9
  2025-07-18 13:29 ` Sasha Levin
  0 siblings, 1 reply; 2+ messages in thread
From: alvalan9 @ 2025-07-18  2:59 UTC (permalink / raw)
  To: gregkh, stable
  Cc: Boris Burkov, Qu Wenruo, Filipe Manana, David Sterba, Alva Lan

From: Boris Burkov <boris@bur.io>

[ Upstream commit 2d8e5168d48a91e7a802d3003e72afb4304bebfa ]

Block group creation is done in two phases, which results in a slightly
unintuitive property: a block group can be allocated/deallocated from
after btrfs_make_block_group() adds it to the space_info with
btrfs_add_bg_to_space_info(), but before creation is completely completed
in btrfs_create_pending_block_groups(). As a result, it is possible for a
block group to go unused and have 'btrfs_mark_bg_unused' called on it
concurrently with 'btrfs_create_pending_block_groups'. This causes a
number of issues, which were fixed with the block group flag
'BLOCK_GROUP_FLAG_NEW'.

However, this fix is not quite complete. Since it does not use the
unused_bg_lock, it is possible for the following race to occur:

btrfs_create_pending_block_groups            btrfs_mark_bg_unused
                                           if list_empty // false
        list_del_init
        clear_bit
                                           else if (test_bit) // true
                                                list_move_tail

And we get into the exact same broken ref count and invalid new_bgs
state for transaction cleanup that BLOCK_GROUP_FLAG_NEW was designed to
prevent.

The broken refcount aspect will result in a warning like:

  [1272.943527] refcount_t: underflow; use-after-free.
  [1272.943967] WARNING: CPU: 1 PID: 61 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110
  [1272.944731] Modules linked in: btrfs virtio_net xor zstd_compress raid6_pq null_blk [last unloaded: btrfs]
  [1272.945550] CPU: 1 UID: 0 PID: 61 Comm: kworker/u32:1 Kdump: loaded Tainted: G        W          6.14.0-rc5+ #108
  [1272.946368] Tainted: [W]=WARN
  [1272.946585] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
  [1272.947273] Workqueue: btrfs_discard btrfs_discard_workfn [btrfs]
  [1272.947788] RIP: 0010:refcount_warn_saturate+0xba/0x110
  [1272.949532] RSP: 0018:ffffbf1200247df0 EFLAGS: 00010282
  [1272.949901] RAX: 0000000000000000 RBX: ffffa14b00e3f800 RCX: 0000000000000000
  [1272.950437] RDX: 0000000000000000 RSI: ffffbf1200247c78 RDI: 00000000ffffdfff
  [1272.950986] RBP: ffffa14b00dc2860 R08: 00000000ffffdfff R09: ffffffff90526268
  [1272.951512] R10: ffffffff904762c0 R11: 0000000063666572 R12: ffffa14b00dc28c0
  [1272.952024] R13: 0000000000000000 R14: ffffa14b00dc2868 R15: 000001285dcd12c0
  [1272.952850] FS:  0000000000000000(0000) GS:ffffa14d33c40000(0000) knlGS:0000000000000000
  [1272.953458] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [1272.953931] CR2: 00007f838cbda000 CR3: 000000010104e000 CR4: 00000000000006f0
  [1272.954474] Call Trace:
  [1272.954655]  <TASK>
  [1272.954812]  ? refcount_warn_saturate+0xba/0x110
  [1272.955173]  ? __warn.cold+0x93/0xd7
  [1272.955487]  ? refcount_warn_saturate+0xba/0x110
  [1272.955816]  ? report_bug+0xe7/0x120
  [1272.956103]  ? handle_bug+0x53/0x90
  [1272.956424]  ? exc_invalid_op+0x13/0x60
  [1272.956700]  ? asm_exc_invalid_op+0x16/0x20
  [1272.957011]  ? refcount_warn_saturate+0xba/0x110
  [1272.957399]  btrfs_discard_cancel_work.cold+0x26/0x2b [btrfs]
  [1272.957853]  btrfs_put_block_group.cold+0x5d/0x8e [btrfs]
  [1272.958289]  btrfs_discard_workfn+0x194/0x380 [btrfs]
  [1272.958729]  process_one_work+0x130/0x290
  [1272.959026]  worker_thread+0x2ea/0x420
  [1272.959335]  ? __pfx_worker_thread+0x10/0x10
  [1272.959644]  kthread+0xd7/0x1c0
  [1272.959872]  ? __pfx_kthread+0x10/0x10
  [1272.960172]  ret_from_fork+0x30/0x50
  [1272.960474]  ? __pfx_kthread+0x10/0x10
  [1272.960745]  ret_from_fork_asm+0x1a/0x30
  [1272.961035]  </TASK>
  [1272.961238] ---[ end trace 0000000000000000 ]---

Though we have seen them in the async discard workfn as well. It is
most likely to happen after a relocation finishes which cancels discard,
tears down the block group, etc.

Fix this fully by taking the lock around the list_del_init + clear_bit
so that the two are done atomically.

Fixes: 0657b20c5a76 ("btrfs: fix use-after-free of new block group that became unused")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Alva Lan <alvalan9@foxmail.com>
---
 fs/btrfs/block-group.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index aa8656c8b7e7..dd35e29d8082 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2780,8 +2780,11 @@ void btrfs_create_pending_block_groups(struct btrfs_trans_handle *trans)
 		/* Already aborted the transaction if it failed. */
 next:
 		btrfs_dec_delayed_refs_rsv_bg_inserts(fs_info);
+
+		spin_lock(&fs_info->unused_bgs_lock);
 		list_del_init(&block_group->bg_list);
 		clear_bit(BLOCK_GROUP_FLAG_NEW, &block_group->runtime_flags);
+		spin_unlock(&fs_info->unused_bgs_lock);
 
 		/*
 		 * If the block group is still unused, add it to the list of
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH 6.12.y] btrfs: fix block group refcount race in btrfs_create_pending_block_groups()
  2025-07-18  2:59 [PATCH 6.12.y] btrfs: fix block group refcount race in btrfs_create_pending_block_groups() alvalan9
@ 2025-07-18 13:29 ` Sasha Levin
  0 siblings, 0 replies; 2+ messages in thread
From: Sasha Levin @ 2025-07-18 13:29 UTC (permalink / raw)
  To: stable; +Cc: Sasha Levin

[ Sasha's backport helper bot ]

Hi,

✅ All tests passed successfully. No issues detected.
No action required from the submitter.

The upstream commit SHA1 provided is correct: 2d8e5168d48a91e7a802d3003e72afb4304bebfa

WARNING: Author mismatch between patch and upstream commit:
Backport author: <alvalan9@foxmail.com>
Commit author: Boris Burkov <boris@bur.io>

Status in newer kernel trees:
6.15.y | Present (exact SHA1)

Note: The patch differs from the upstream commit:
---
1:  2d8e5168d48a ! 1:  7e8e9017eb32 btrfs: fix block group refcount race in btrfs_create_pending_block_groups()
    @@ Metadata
      ## Commit message ##
         btrfs: fix block group refcount race in btrfs_create_pending_block_groups()
     
    +    [ Upstream commit 2d8e5168d48a91e7a802d3003e72afb4304bebfa ]
    +
         Block group creation is done in two phases, which results in a slightly
         unintuitive property: a block group can be allocated/deallocated from
         after btrfs_make_block_group() adds it to the space_info with
    @@ Commit message
         Reviewed-by: Filipe Manana <fdmanana@suse.com>
         Signed-off-by: Boris Burkov <boris@bur.io>
         Signed-off-by: David Sterba <dsterba@suse.com>
    +    Signed-off-by: Alva Lan <alvalan9@foxmail.com>
     
      ## fs/btrfs/block-group.c ##
     @@ fs/btrfs/block-group.c: void btrfs_create_pending_block_groups(struct btrfs_trans_handle *trans)

---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| origin/linux-6.12.y       | Success     | Success    |

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2025-07-18 13:29 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-18  2:59 [PATCH 6.12.y] btrfs: fix block group refcount race in btrfs_create_pending_block_groups() alvalan9
2025-07-18 13:29 ` Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).