From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Filipe Manana <fdmanana@suse.com>,
David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>,
clm@fb.com, josef@toxicpanda.com, linux-btrfs@vger.kernel.org
Subject: [PATCH AUTOSEL 6.5 05/18] btrfs: fix race when refilling delayed refs block reserve
Date: Sat, 7 Oct 2023 20:48:39 -0400 [thread overview]
Message-ID: <20231008004853.3767621-5-sashal@kernel.org> (raw)
In-Reply-To: <20231008004853.3767621-1-sashal@kernel.org>
From: Filipe Manana <fdmanana@suse.com>
[ Upstream commit 2ed45c0f1879079b30248568c515cf60fc668d8a ]
If we have two (or more) tasks attempting to refill the delayed refs block
reserve we can end up with the delayed block reserve being over reserved,
that is, with a reserved space greater than its size. If this happens, we
are holding to more reserved space than necessary for a while.
The race happens like this:
1) The delayed refs block reserve has a size of 8M and a reserved space of
6M for example;
2) Task A calls btrfs_delayed_refs_rsv_refill();
3) Task B also calls btrfs_delayed_refs_rsv_refill();
4) Task A sees there's a 2M difference between the size and the reserved
space of the delayed refs rsv, so it will reserve 2M of space by
calling btrfs_reserve_metadata_bytes();
5) Task B also sees that 2M difference, and like task A, it reserves
another 2M of metadata space;
6) Both task A and task B increase the reserved space of block reserve
by 2M, by calling btrfs_block_rsv_add_bytes(), so the block reserve
ends up with a size of 8M and a reserved space of 10M;
7) The extra, over reserved space will eventually be freed by some task
calling btrfs_delayed_refs_rsv_release() -> btrfs_block_rsv_release()
-> block_rsv_release_bytes(), as there we will detect the over reserve
and release that space.
So fix this by checking if we still need to add space to the delayed refs
block reserve after reserving the metadata space, and if we don't, just
release that space immediately.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/btrfs/delayed-ref.c | 37 ++++++++++++++++++++++++++++++++++---
1 file changed, 34 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
index 6a13cf00218bc..1043f66cc130d 100644
--- a/fs/btrfs/delayed-ref.c
+++ b/fs/btrfs/delayed-ref.c
@@ -163,6 +163,8 @@ int btrfs_delayed_refs_rsv_refill(struct btrfs_fs_info *fs_info,
struct btrfs_block_rsv *block_rsv = &fs_info->delayed_refs_rsv;
u64 limit = btrfs_calc_delayed_ref_bytes(fs_info, 1);
u64 num_bytes = 0;
+ u64 refilled_bytes;
+ u64 to_free;
int ret = -ENOSPC;
spin_lock(&block_rsv->lock);
@@ -178,9 +180,38 @@ int btrfs_delayed_refs_rsv_refill(struct btrfs_fs_info *fs_info,
ret = btrfs_reserve_metadata_bytes(fs_info, block_rsv, num_bytes, flush);
if (ret)
return ret;
- btrfs_block_rsv_add_bytes(block_rsv, num_bytes, false);
- trace_btrfs_space_reservation(fs_info, "delayed_refs_rsv",
- 0, num_bytes, 1);
+
+ /*
+ * We may have raced with someone else, so check again if we the block
+ * reserve is still not full and release any excess space.
+ */
+ spin_lock(&block_rsv->lock);
+ if (block_rsv->reserved < block_rsv->size) {
+ u64 needed = block_rsv->size - block_rsv->reserved;
+
+ if (num_bytes >= needed) {
+ block_rsv->reserved += needed;
+ block_rsv->full = true;
+ to_free = num_bytes - needed;
+ refilled_bytes = needed;
+ } else {
+ block_rsv->reserved += num_bytes;
+ to_free = 0;
+ refilled_bytes = num_bytes;
+ }
+ } else {
+ to_free = num_bytes;
+ refilled_bytes = 0;
+ }
+ spin_unlock(&block_rsv->lock);
+
+ if (to_free > 0)
+ btrfs_space_info_free_bytes_may_use(fs_info, block_rsv->space_info,
+ to_free);
+
+ if (refilled_bytes > 0)
+ trace_btrfs_space_reservation(fs_info, "delayed_refs_rsv", 0,
+ refilled_bytes, 1);
return 0;
}
--
2.40.1
next parent reply other threads:[~2023-10-08 0:49 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20231008004853.3767621-1-sashal@kernel.org>
2023-10-08 0:48 ` Sasha Levin [this message]
2023-10-08 0:48 ` [PATCH AUTOSEL 6.5 06/18] btrfs: prevent transaction block reserve underflow when starting transaction Sasha Levin
2023-10-08 0:48 ` [PATCH AUTOSEL 6.5 07/18] btrfs: return -EUCLEAN for delayed tree ref with a ref count not equals to 1 Sasha Levin
2023-10-08 0:48 ` [PATCH AUTOSEL 6.5 08/18] btrfs: initialize start_slot in btrfs_log_prealloc_extents Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231008004853.3767621-5-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=clm@fb.com \
--cc=dsterba@suse.com \
--cc=fdmanana@suse.com \
--cc=josef@toxicpanda.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).