From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Boris Burkov <boris@bur.io>, Filipe Manana <fdmanana@suse.com>,
David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>,
clm@fb.com, josef@toxicpanda.com, linux-btrfs@vger.kernel.org
Subject: [PATCH AUTOSEL 6.14 093/642] btrfs: make btrfs_discard_workfn() block_group ref explicit
Date: Mon, 5 May 2025 18:05:09 -0400 [thread overview]
Message-ID: <20250505221419.2672473-93-sashal@kernel.org> (raw)
In-Reply-To: <20250505221419.2672473-1-sashal@kernel.org>
From: Boris Burkov <boris@bur.io>
[ Upstream commit 895c6721d310c036dcfebb5ab845822229fa35eb ]
Currently, the async discard machinery owns a ref to the block_group
when the block_group is queued on a discard list. However, to handle
races with discard cancellation and the discard workfn, we have a
specific logic to detect that the block_group is *currently* running in
the workfn, to protect the workfn's usage amidst cancellation.
As far as I can tell, this doesn't have any overt bugs (though
finish_discard_pass() and remove_from_discard_list() racing can have a
surprising outcome for the caller of remove_from_discard_list() in that
it is again added at the end).
But it is needlessly complicated to rely on locking and the nullity of
discard_ctl->block_group. Simplify this significantly by just taking a
refcount while we are in the workfn and unconditionally drop it in both
the remove and workfn paths, regardless of if they race.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/btrfs/discard.c | 34 ++++++++++++++++------------------
1 file changed, 16 insertions(+), 18 deletions(-)
diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
index e815d165cccc2..d6eef4bd9e9d4 100644
--- a/fs/btrfs/discard.c
+++ b/fs/btrfs/discard.c
@@ -167,13 +167,7 @@ static bool remove_from_discard_list(struct btrfs_discard_ctl *discard_ctl,
block_group->discard_eligible_time = 0;
queued = !list_empty(&block_group->discard_list);
list_del_init(&block_group->discard_list);
- /*
- * If the block group is currently running in the discard workfn, we
- * don't want to deref it, since it's still being used by the workfn.
- * The workfn will notice this case and deref the block group when it is
- * finished.
- */
- if (queued && !running)
+ if (queued)
btrfs_put_block_group(block_group);
spin_unlock(&discard_ctl->lock);
@@ -260,9 +254,10 @@ static struct btrfs_block_group *peek_discard_list(
block_group->discard_cursor = block_group->start;
block_group->discard_state = BTRFS_DISCARD_EXTENTS;
}
- discard_ctl->block_group = block_group;
}
if (block_group) {
+ btrfs_get_block_group(block_group);
+ discard_ctl->block_group = block_group;
*discard_state = block_group->discard_state;
*discard_index = block_group->discard_index;
}
@@ -493,9 +488,20 @@ static void btrfs_discard_workfn(struct work_struct *work)
block_group = peek_discard_list(discard_ctl, &discard_state,
&discard_index, now);
- if (!block_group || !btrfs_run_discard_work(discard_ctl))
+ if (!block_group)
return;
+ if (!btrfs_run_discard_work(discard_ctl)) {
+ spin_lock(&discard_ctl->lock);
+ btrfs_put_block_group(block_group);
+ discard_ctl->block_group = NULL;
+ spin_unlock(&discard_ctl->lock);
+ return;
+ }
if (now < block_group->discard_eligible_time) {
+ spin_lock(&discard_ctl->lock);
+ btrfs_put_block_group(block_group);
+ discard_ctl->block_group = NULL;
+ spin_unlock(&discard_ctl->lock);
btrfs_discard_schedule_work(discard_ctl, false);
return;
}
@@ -547,15 +553,7 @@ static void btrfs_discard_workfn(struct work_struct *work)
spin_lock(&discard_ctl->lock);
discard_ctl->prev_discard = trimmed;
discard_ctl->prev_discard_time = now;
- /*
- * If the block group was removed from the discard list while it was
- * running in this workfn, then we didn't deref it, since this function
- * still owned that reference. But we set the discard_ctl->block_group
- * back to NULL, so we can use that condition to know that now we need
- * to deref the block_group.
- */
- if (discard_ctl->block_group == NULL)
- btrfs_put_block_group(block_group);
+ btrfs_put_block_group(block_group);
discard_ctl->block_group = NULL;
__btrfs_discard_schedule_work(discard_ctl, now, false);
spin_unlock(&discard_ctl->lock);
--
2.39.5
next parent reply other threads:[~2025-05-05 22:18 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20250505221419.2672473-1-sashal@kernel.org>
2025-05-05 22:05 ` Sasha Levin [this message]
2025-05-05 22:05 ` [PATCH AUTOSEL 6.14 094/642] btrfs: avoid linker error in btrfs_find_create_tree_block() Sasha Levin
2025-05-05 22:05 ` [PATCH AUTOSEL 6.14 095/642] btrfs: run btrfs_error_commit_super() early Sasha Levin
2025-05-05 22:05 ` [PATCH AUTOSEL 6.14 096/642] btrfs: fix non-empty delayed iputs list on unmount due to async workers Sasha Levin
2025-05-05 22:05 ` [PATCH AUTOSEL 6.14 097/642] btrfs: properly limit inline data extent according to block size Sasha Levin
2025-05-05 22:05 ` [PATCH AUTOSEL 6.14 098/642] btrfs: allow buffered write to avoid full page read if it's block aligned Sasha Levin
2025-05-05 22:05 ` [PATCH AUTOSEL 6.14 099/642] btrfs: prevent inline data extents read from touching blocks beyond its range Sasha Levin
2025-05-06 13:19 ` David Sterba
2025-05-20 14:15 ` Sasha Levin
2025-05-05 22:05 ` [PATCH AUTOSEL 6.14 100/642] btrfs: get zone unusable bytes while holding lock at btrfs_reclaim_bgs_work() Sasha Levin
2025-05-05 22:05 ` [PATCH AUTOSEL 6.14 101/642] btrfs: send: return -ENAMETOOLONG when attempting a path that is too long Sasha Levin
2025-05-05 22:05 ` [PATCH AUTOSEL 6.14 102/642] btrfs: zoned: exit btrfs_can_activate_zone if BTRFS_FS_NEED_ZONE_FINISH is set Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250505221419.2672473-93-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=boris@bur.io \
--cc=clm@fb.com \
--cc=dsterba@suse.com \
--cc=fdmanana@suse.com \
--cc=josef@toxicpanda.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox