* [PATCH AUTOSEL 6.5 05/18] btrfs: fix race when refilling delayed refs block reserve
[not found] <20231008004853.3767621-1-sashal@kernel.org>
@ 2023-10-08 0:48 ` Sasha Levin
2023-10-08 0:48 ` [PATCH AUTOSEL 6.5 06/18] btrfs: prevent transaction block reserve underflow when starting transaction Sasha Levin
` (2 subsequent siblings)
3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2023-10-08 0:48 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Filipe Manana, David Sterba, Sasha Levin, clm, josef, linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
[ Upstream commit 2ed45c0f1879079b30248568c515cf60fc668d8a ]
If we have two (or more) tasks attempting to refill the delayed refs block
reserve we can end up with the delayed block reserve being over reserved,
that is, with a reserved space greater than its size. If this happens, we
are holding to more reserved space than necessary for a while.
The race happens like this:
1) The delayed refs block reserve has a size of 8M and a reserved space of
6M for example;
2) Task A calls btrfs_delayed_refs_rsv_refill();
3) Task B also calls btrfs_delayed_refs_rsv_refill();
4) Task A sees there's a 2M difference between the size and the reserved
space of the delayed refs rsv, so it will reserve 2M of space by
calling btrfs_reserve_metadata_bytes();
5) Task B also sees that 2M difference, and like task A, it reserves
another 2M of metadata space;
6) Both task A and task B increase the reserved space of block reserve
by 2M, by calling btrfs_block_rsv_add_bytes(), so the block reserve
ends up with a size of 8M and a reserved space of 10M;
7) The extra, over reserved space will eventually be freed by some task
calling btrfs_delayed_refs_rsv_release() -> btrfs_block_rsv_release()
-> block_rsv_release_bytes(), as there we will detect the over reserve
and release that space.
So fix this by checking if we still need to add space to the delayed refs
block reserve after reserving the metadata space, and if we don't, just
release that space immediately.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/btrfs/delayed-ref.c | 37 ++++++++++++++++++++++++++++++++++---
1 file changed, 34 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
index 6a13cf00218bc..1043f66cc130d 100644
--- a/fs/btrfs/delayed-ref.c
+++ b/fs/btrfs/delayed-ref.c
@@ -163,6 +163,8 @@ int btrfs_delayed_refs_rsv_refill(struct btrfs_fs_info *fs_info,
struct btrfs_block_rsv *block_rsv = &fs_info->delayed_refs_rsv;
u64 limit = btrfs_calc_delayed_ref_bytes(fs_info, 1);
u64 num_bytes = 0;
+ u64 refilled_bytes;
+ u64 to_free;
int ret = -ENOSPC;
spin_lock(&block_rsv->lock);
@@ -178,9 +180,38 @@ int btrfs_delayed_refs_rsv_refill(struct btrfs_fs_info *fs_info,
ret = btrfs_reserve_metadata_bytes(fs_info, block_rsv, num_bytes, flush);
if (ret)
return ret;
- btrfs_block_rsv_add_bytes(block_rsv, num_bytes, false);
- trace_btrfs_space_reservation(fs_info, "delayed_refs_rsv",
- 0, num_bytes, 1);
+
+ /*
+ * We may have raced with someone else, so check again if we the block
+ * reserve is still not full and release any excess space.
+ */
+ spin_lock(&block_rsv->lock);
+ if (block_rsv->reserved < block_rsv->size) {
+ u64 needed = block_rsv->size - block_rsv->reserved;
+
+ if (num_bytes >= needed) {
+ block_rsv->reserved += needed;
+ block_rsv->full = true;
+ to_free = num_bytes - needed;
+ refilled_bytes = needed;
+ } else {
+ block_rsv->reserved += num_bytes;
+ to_free = 0;
+ refilled_bytes = num_bytes;
+ }
+ } else {
+ to_free = num_bytes;
+ refilled_bytes = 0;
+ }
+ spin_unlock(&block_rsv->lock);
+
+ if (to_free > 0)
+ btrfs_space_info_free_bytes_may_use(fs_info, block_rsv->space_info,
+ to_free);
+
+ if (refilled_bytes > 0)
+ trace_btrfs_space_reservation(fs_info, "delayed_refs_rsv", 0,
+ refilled_bytes, 1);
return 0;
}
--
2.40.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH AUTOSEL 6.5 06/18] btrfs: prevent transaction block reserve underflow when starting transaction
[not found] <20231008004853.3767621-1-sashal@kernel.org>
2023-10-08 0:48 ` [PATCH AUTOSEL 6.5 05/18] btrfs: fix race when refilling delayed refs block reserve Sasha Levin
@ 2023-10-08 0:48 ` Sasha Levin
2023-10-08 0:48 ` [PATCH AUTOSEL 6.5 07/18] btrfs: return -EUCLEAN for delayed tree ref with a ref count not equals to 1 Sasha Levin
2023-10-08 0:48 ` [PATCH AUTOSEL 6.5 08/18] btrfs: initialize start_slot in btrfs_log_prealloc_extents Sasha Levin
3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2023-10-08 0:48 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Filipe Manana, Josef Bacik, David Sterba, Sasha Levin, clm,
linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
[ Upstream commit a7ddeeb079505961355cf0106154da0110f1fdff ]
When starting a transaction, with a non-zero number of items, we reserve
metadata space for that number of items and for delayed refs by doing a
call to btrfs_block_rsv_add(), with the transaction block reserve passed
as the block reserve argument. This reserves metadata space and adds it
to the transaction block reserve. Later we migrate the space we reserved
for delayed references from the transaction block reserve into the delayed
refs block reserve, by calling btrfs_migrate_to_delayed_refs_rsv().
btrfs_migrate_to_delayed_refs_rsv() decrements the number of bytes to
migrate from the source block reserve, and this however may result in an
underflow in case the space added to the transaction block reserve ended
up being used by another task that has not reserved enough space for its
own use - examples are tasks doing reflinks or hole punching because they
end up calling btrfs_replace_file_extents() -> btrfs_drop_extents() and
may need to modify/COW a variable number of leaves/paths, so they keep
trying to use space from the transaction block reserve when they need to
COW an extent buffer, and may end up trying to use more space then they
have reserved (1 unit/path only for removing file extent items).
This can be avoided by simply reserving space first without adding it to
the transaction block reserve, then add the space for delayed refs to the
delayed refs block reserve and finally add the remaining reserved space
to the transaction block reserve. This also makes the code a bit shorter
and simpler. So just do that.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/btrfs/delayed-ref.c | 9 +--------
fs/btrfs/delayed-ref.h | 1 -
fs/btrfs/transaction.c | 6 +++---
3 files changed, 4 insertions(+), 12 deletions(-)
diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
index 1043f66cc130d..9fe4ccca50a06 100644
--- a/fs/btrfs/delayed-ref.c
+++ b/fs/btrfs/delayed-ref.c
@@ -103,24 +103,17 @@ void btrfs_update_delayed_refs_rsv(struct btrfs_trans_handle *trans)
* Transfer bytes to our delayed refs rsv.
*
* @fs_info: the filesystem
- * @src: source block rsv to transfer from
* @num_bytes: number of bytes to transfer
*
- * This transfers up to the num_bytes amount from the src rsv to the
+ * This transfers up to the num_bytes amount, previously reserved, to the
* delayed_refs_rsv. Any extra bytes are returned to the space info.
*/
void btrfs_migrate_to_delayed_refs_rsv(struct btrfs_fs_info *fs_info,
- struct btrfs_block_rsv *src,
u64 num_bytes)
{
struct btrfs_block_rsv *delayed_refs_rsv = &fs_info->delayed_refs_rsv;
u64 to_free = 0;
- spin_lock(&src->lock);
- src->reserved -= num_bytes;
- src->size -= num_bytes;
- spin_unlock(&src->lock);
-
spin_lock(&delayed_refs_rsv->lock);
if (delayed_refs_rsv->size > delayed_refs_rsv->reserved) {
u64 delta = delayed_refs_rsv->size -
diff --git a/fs/btrfs/delayed-ref.h b/fs/btrfs/delayed-ref.h
index b8e14b0ba5f16..fd9bf2b709c0e 100644
--- a/fs/btrfs/delayed-ref.h
+++ b/fs/btrfs/delayed-ref.h
@@ -407,7 +407,6 @@ void btrfs_update_delayed_refs_rsv(struct btrfs_trans_handle *trans);
int btrfs_delayed_refs_rsv_refill(struct btrfs_fs_info *fs_info,
enum btrfs_reserve_flush_enum flush);
void btrfs_migrate_to_delayed_refs_rsv(struct btrfs_fs_info *fs_info,
- struct btrfs_block_rsv *src,
u64 num_bytes);
bool btrfs_check_space_for_delayed_refs(struct btrfs_fs_info *fs_info);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 5bbd288b9cb54..0554ca2a8e3ba 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -625,14 +625,14 @@ start_transaction(struct btrfs_root *root, unsigned int num_items,
reloc_reserved = true;
}
- ret = btrfs_block_rsv_add(fs_info, rsv, num_bytes, flush);
+ ret = btrfs_reserve_metadata_bytes(fs_info, rsv, num_bytes, flush);
if (ret)
goto reserve_fail;
if (delayed_refs_bytes) {
- btrfs_migrate_to_delayed_refs_rsv(fs_info, rsv,
- delayed_refs_bytes);
+ btrfs_migrate_to_delayed_refs_rsv(fs_info, delayed_refs_bytes);
num_bytes -= delayed_refs_bytes;
}
+ btrfs_block_rsv_add_bytes(rsv, num_bytes, true);
if (rsv->space_info->force_alloc)
do_chunk_alloc = true;
--
2.40.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH AUTOSEL 6.5 07/18] btrfs: return -EUCLEAN for delayed tree ref with a ref count not equals to 1
[not found] <20231008004853.3767621-1-sashal@kernel.org>
2023-10-08 0:48 ` [PATCH AUTOSEL 6.5 05/18] btrfs: fix race when refilling delayed refs block reserve Sasha Levin
2023-10-08 0:48 ` [PATCH AUTOSEL 6.5 06/18] btrfs: prevent transaction block reserve underflow when starting transaction Sasha Levin
@ 2023-10-08 0:48 ` Sasha Levin
2023-10-08 0:48 ` [PATCH AUTOSEL 6.5 08/18] btrfs: initialize start_slot in btrfs_log_prealloc_extents Sasha Levin
3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2023-10-08 0:48 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Filipe Manana, Josef Bacik, David Sterba, Sasha Levin, clm,
linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
[ Upstream commit 1bf76df3fee56d6637718e267f7c34ed70d0c7dc ]
When running a delayed tree reference, if we find a ref count different
from 1, we return -EIO. This isn't an IO error, as it indicates either a
bug in the delayed refs code or a memory corruption, so change the error
code from -EIO to -EUCLEAN. Also tag the branch as 'unlikely' as this is
not expected to ever happen, and change the error message to print the
tree block's bytenr without the parenthesis (and there was a missing space
between the 'block' word and the opening parenthesis), for consistency as
that's the style we used everywhere else.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/btrfs/extent-tree.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 0917c5f39e3d0..2cf8d646085c2 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1715,12 +1715,12 @@ static int run_delayed_tree_ref(struct btrfs_trans_handle *trans,
parent = ref->parent;
ref_root = ref->root;
- if (node->ref_mod != 1) {
+ if (unlikely(node->ref_mod != 1)) {
btrfs_err(trans->fs_info,
- "btree block(%llu) has %d references rather than 1: action %d ref_root %llu parent %llu",
+ "btree block %llu has %d references rather than 1: action %d ref_root %llu parent %llu",
node->bytenr, node->ref_mod, node->action, ref_root,
parent);
- return -EIO;
+ return -EUCLEAN;
}
if (node->action == BTRFS_ADD_DELAYED_REF && insert_reserved) {
BUG_ON(!extent_op || !extent_op->update_flags);
--
2.40.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH AUTOSEL 6.5 08/18] btrfs: initialize start_slot in btrfs_log_prealloc_extents
[not found] <20231008004853.3767621-1-sashal@kernel.org>
` (2 preceding siblings ...)
2023-10-08 0:48 ` [PATCH AUTOSEL 6.5 07/18] btrfs: return -EUCLEAN for delayed tree ref with a ref count not equals to 1 Sasha Levin
@ 2023-10-08 0:48 ` Sasha Levin
3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2023-10-08 0:48 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Josef Bacik, Jens Axboe, David Sterba, Sasha Levin, clm,
linux-btrfs
From: Josef Bacik <josef@toxicpanda.com>
[ Upstream commit b4c639f699349880b7918b861e1bd360442ec450 ]
Jens reported a compiler warning when using
CONFIG_CC_OPTIMIZE_FOR_SIZE=y that looks like this
fs/btrfs/tree-log.c: In function ‘btrfs_log_prealloc_extents’:
fs/btrfs/tree-log.c:4828:23: warning: ‘start_slot’ may be used
uninitialized [-Wmaybe-uninitialized]
4828 | ret = copy_items(trans, inode, dst_path, path,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
4829 | start_slot, ins_nr, 1, 0);
| ~~~~~~~~~~~~~~~~~~~~~~~~~
fs/btrfs/tree-log.c:4725:13: note: ‘start_slot’ was declared here
4725 | int start_slot;
| ^~~~~~~~~~
The compiler is incorrect, as we only use this code when ins_len > 0,
and when ins_len > 0 we have start_slot properly initialized. However
we generally find the -Wmaybe-uninitialized warnings valuable, so
initialize start_slot to get rid of the warning.
Reported-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/btrfs/tree-log.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 365a1cc0a3c35..a00e7a0bc713d 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -4722,7 +4722,7 @@ static int btrfs_log_prealloc_extents(struct btrfs_trans_handle *trans,
struct extent_buffer *leaf;
int slot;
int ins_nr = 0;
- int start_slot;
+ int start_slot = 0;
int ret;
if (!(inode->flags & BTRFS_INODE_PREALLOC))
--
2.40.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-10-08 0:49 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20231008004853.3767621-1-sashal@kernel.org>
2023-10-08 0:48 ` [PATCH AUTOSEL 6.5 05/18] btrfs: fix race when refilling delayed refs block reserve Sasha Levin
2023-10-08 0:48 ` [PATCH AUTOSEL 6.5 06/18] btrfs: prevent transaction block reserve underflow when starting transaction Sasha Levin
2023-10-08 0:48 ` [PATCH AUTOSEL 6.5 07/18] btrfs: return -EUCLEAN for delayed tree ref with a ref count not equals to 1 Sasha Levin
2023-10-08 0:48 ` [PATCH AUTOSEL 6.5 08/18] btrfs: initialize start_slot in btrfs_log_prealloc_extents Sasha Levin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).