From: Boris Burkov <boris@bur.io>
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v2 0/5] btrfs: block_group refcounting fixes
Date: Mon, 10 Mar 2025 13:07:00 -0700 [thread overview]
Message-ID: <cover.1741636986.git.boris@bur.io> (raw)
We have observed a number of WARNINGs in the Meta fleet which are the
result of a block_group refcount underflowing. The refcount error
can happen at any point in the block group's lifetime, so it is hard to
conclude that we have reproduced/fixed all the bugs, I believe I have
found a few here that will hopefully improve things.
The main thrust of this patch series is that we need to take the
fs_info->unused_bgs_lock spin lock when modifying the bg_list of a
block_group. There are a number of code paths where we atomically check
that list_head for emptiness and then add/del get/put appropriately.
If any other thread messes with it in between without locking, then that
logic gets messed up. This is most obviously evident with
btrfs_mark_bg_unused.
I could imagine universally protecting bg_list's empty/not-empty nature
with a lock with smaller scope, but this is already the locking strategy
being used to synchronize reclaim/unused lists, so it seems reasonable
to just re-use it.
In addition, I attempted to simplify the refcounting logic in the
discard workfn, as the last time I fixed a bug in there, I made it far
too subtle. Hopefully this more explicit variant is easier to analyze in
the future.
--
Changelog
v2:
- fix mistaken placement of a btrfs_block_group put in the 2nd
(locking) patch, when it ought to be in the 4th (ref-counting) patch.
- improve several commit messages with more details and using full
function names instead of shorthand.
- add comments about over-paranoid locking.
- rename second patch to reflect that it is hardening rather than
fixing any bugs.
- fix bad comment and variable names in btrfs_link_bg_list.
Boris Burkov (5):
btrfs: fix bg refcount race in btrfs_create_pending_block_groups
btrfs: harden bg->bg_list against list_del races
btrfs: make discard_workfn block_group ref explicit
btrfs: explicitly ref count block_group on new_bgs list
btrfs: codify pattern for adding block_group to bg_list
fs/btrfs/block-group.c | 57 +++++++++++++++++++++++++-----------------
fs/btrfs/discard.c | 34 ++++++++++++-------------
fs/btrfs/extent-tree.c | 8 ++++++
fs/btrfs/transaction.c | 13 ++++++++++
4 files changed, 71 insertions(+), 41 deletions(-)
--
2.48.1
next reply other threads:[~2025-03-10 20:06 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-10 20:07 Boris Burkov [this message]
2025-03-10 20:07 ` [PATCH v2 1/5] btrfs: fix bg refcount race in btrfs_create_pending_block_groups Boris Burkov
2025-03-11 19:02 ` David Sterba
2025-03-10 20:07 ` [PATCH v2 2/5] btrfs: harden bg->bg_list against list_del races Boris Burkov
2025-03-11 11:35 ` Filipe Manana
2025-03-10 20:07 ` [PATCH v2 3/5] btrfs: make discard_workfn block_group ref explicit Boris Burkov
2025-03-10 20:07 ` [PATCH v2 4/5] btrfs: explicitly ref count block_group on new_bgs list Boris Burkov
2025-03-10 20:07 ` [PATCH v2 5/5] btrfs: codify pattern for adding block_group to bg_list Boris Burkov
2025-03-11 19:03 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1741636986.git.boris@bur.io \
--to=boris@bur.io \
--cc=kernel-team@fb.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox