public inbox for linux-btrfs@vger.kernel.org
* [PATCH v2 0/5] btrfs: block_group refcounting fixes
@ 2025-03-10 20:07 Boris Burkov
  2025-03-10 20:07 ` [PATCH v2 1/5] btrfs: fix bg refcount race in btrfs_create_pending_block_groups Boris Burkov
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Boris Burkov @ 2025-03-10 20:07 UTC
  To: linux-btrfs, kernel-team

We have observed a number of WARNINGs in the Meta fleet which are the
result of a block_group refcount underflowing. The refcount error
can happen at any point in the block group's lifetime, so it is hard to
conclude that we have reproduced/fixed all the bugs. Still, I believe I
have found a few here that will hopefully improve things.

The main thrust of this patch series is that we need to take the
fs_info->unused_bgs_lock spin lock when modifying the bg_list of a
block_group. There are a number of code paths where we atomically check
that list_head for emptiness and then add/del and get/put appropriately.
If any other thread modifies the list in between without taking the
lock, that logic breaks. This is most obviously evident with
btrfs_mark_bg_unused.
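
To illustrate the pattern described above, here is a minimal userspace
sketch, not the kernel code. It assumes a pthread mutex standing in for
fs_info->unused_bgs_lock, a plain int (only touched under that lock)
standing in for the block_group refcount, and a hand-rolled list_head.
The struct and the helper names link_bg_list/unlink_bg_list are
hypothetical and are not the signatures used in the patches.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	init_list_head(entry);
}

/* Hypothetical stand-in for a block group; not the btrfs struct. */
struct block_group {
	int refs;                 /* assumption: only modified under the lock */
	struct list_head bg_list; /* empty <=> not linked on any list */
};

/* Assumption: models fs_info->unused_bgs_lock and the unused list. */
static pthread_mutex_t unused_bgs_lock = PTHREAD_MUTEX_INITIALIZER;
static struct list_head unused_bgs;

/*
 * Check for emptiness, take a reference and link the block group, all
 * under one lock, so no other thread can change bg_list in between.
 * Returns true if the block group was added.
 */
static bool link_bg_list(struct block_group *bg, struct list_head *list)
{
	bool added = false;

	pthread_mutex_lock(&unused_bgs_lock);
	if (list_empty(&bg->bg_list)) {
		bg->refs++; /* the list now holds its own reference */
		list_add_tail(&bg->bg_list, list);
		added = true;
	}
	pthread_mutex_unlock(&unused_bgs_lock);
	return added;
}

/*
 * Mirror image: unlink and drop the list's reference under the same lock.
 * Returns true if the block group was on a list.
 */
static bool unlink_bg_list(struct block_group *bg)
{
	bool removed = false;

	pthread_mutex_lock(&unused_bgs_lock);
	if (!list_empty(&bg->bg_list)) {
		list_del_init(&bg->bg_list);
		assert(bg->refs > 0); /* an underflow here is the bug class at issue */
		bg->refs--;
		removed = true;
	}
	pthread_mutex_unlock(&unused_bgs_lock);
	return removed;
}
```

In this model, a caller initializes unused_bgs and bg->bg_list with
init_list_head() first. A second link_bg_list() on an already linked
block group is then a no-op that takes no extra reference, and a second
unlink_bg_list() drops none, which is the property an unlocked
list_del elsewhere would violate.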

I could imagine universally protecting bg_list's empty/not-empty nature
with a lock with smaller scope, but this is already the locking strategy
being used to synchronize reclaim/unused lists, so it seems reasonable
to just re-use it.

In addition, I attempted to simplify the refcounting logic in the
discard workfn, as the last time I fixed a bug in there, I made it far
too subtle. Hopefully this more explicit variant is easier to analyze in
the future.
--
Changelog
v2:
- fix mistaken placement of a btrfs_block_group put in the 2nd
  (locking) patch, when it ought to be in the 4th (ref-counting) patch.
- improve several commit messages with more details and using full
  function names instead of shorthand.
- add comments about over-paranoid locking.
- rename second patch to reflect that it is hardening rather than
  fixing any bugs.
- fix bad comment and variable names in btrfs_link_bg_list.


Boris Burkov (5):
  btrfs: fix bg refcount race in btrfs_create_pending_block_groups
  btrfs: harden bg->bg_list against list_del races
  btrfs: make discard_workfn block_group ref explicit
  btrfs: explicitly ref count block_group on new_bgs list
  btrfs: codify pattern for adding block_group to bg_list

 fs/btrfs/block-group.c | 57 +++++++++++++++++++++++++-----------------
 fs/btrfs/discard.c     | 34 ++++++++++++-------------
 fs/btrfs/extent-tree.c |  8 ++++++
 fs/btrfs/transaction.c | 13 ++++++++++
 4 files changed, 71 insertions(+), 41 deletions(-)

-- 
2.48.1



Thread overview: 9+ messages
2025-03-10 20:07 [PATCH v2 0/5] btrfs: block_group refcounting fixes Boris Burkov
2025-03-10 20:07 ` [PATCH v2 1/5] btrfs: fix bg refcount race in btrfs_create_pending_block_groups Boris Burkov
2025-03-11 19:02   ` David Sterba
2025-03-10 20:07 ` [PATCH v2 2/5] btrfs: harden bg->bg_list against list_del races Boris Burkov
2025-03-11 11:35   ` Filipe Manana
2025-03-10 20:07 ` [PATCH v2 3/5] btrfs: make discard_workfn block_group ref explicit Boris Burkov
2025-03-10 20:07 ` [PATCH v2 4/5] btrfs: explicitly ref count block_group on new_bgs list Boris Burkov
2025-03-10 20:07 ` [PATCH v2 5/5] btrfs: codify pattern for adding block_group to bg_list Boris Burkov
2025-03-11 19:03   ` David Sterba