* [PATCH AUTOSEL 6.3 17/37] btrfs: scrub: try harder to mark RAID56 block groups read-only
From: Sasha Levin @ 2023-05-31 13:39 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Qu Wenruo, David Sterba, Sasha Levin, clm, josef, linux-btrfs
From: Qu Wenruo <wqu@suse.com>
[ Upstream commit 7561551e7ba870b9659083b95feb520fb2dacce3 ]
Currently we allow scrub to proceed even when a block group could not be
marked read-only.
But for RAID56 block groups, if we require the block group to be
read-only, then we're allowed to use cached content from the scrub stripe
to reduce unnecessary RAID56 reads.
So this patch would:

- Make btrfs_inc_block_group_ro() try harder
  During my tests, for cases like btrfs/061 and btrfs/064, we can hit
  ENOSPC from btrfs_inc_block_group_ro() calls during scrub.

  The reason is that if we only have a single data chunk and try to
  scrub it, we won't have any space left for newer data writes.

  But this check should be done by the caller, especially since for
  scrub we only temporarily mark the chunk read-only.
  And newer data writes would always try to allocate a new data chunk
  when needed.

- Return error for scrub if we failed to mark a RAID56 chunk read-only
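
The two decisions described above can be modelled in userspace C. This is
only an illustrative sketch of the control flow in the diff below, not the
kernel code: enum bg_type, should_force_chunk_alloc() and
scrub_may_skip_ro() are hypothetical names invented for this example, and
the real functions take locks and a transaction that are left out here.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the block group type flags. */
enum bg_type { BG_DATA, BG_METADATA, BG_SYSTEM };

/*
 * Models the decision taken after the first inc_block_group_ro() attempt
 * has returned first_ret. Returns true when the function would go on to
 * force a chunk allocation and retry marking the block group read-only.
 */
static bool should_force_chunk_alloc(int first_ret, bool do_chunk_alloc,
				     enum bg_type type)
{
	if (first_ret == 0)
		return false;	/* already marked read-only */
	if (first_ret == -ETXTBSY)
		return false;	/* bail out without retrying */
	/* Only SYSTEM block groups skip the retry when allocation is off. */
	if (!do_chunk_alloc && first_ret == -ENOSPC && type == BG_SYSTEM)
		return false;
	return true;
}

/*
 * Models the scrub caller: may scrub continue without the block group
 * being read-only after btrfs_inc_block_group_ro() returned ret?
 */
static bool scrub_may_skip_ro(int ret, bool is_dev_replace, bool is_raid56)
{
	return ret == -ENOSPC && !is_dev_replace && !is_raid56;
}
```

Under this model, a data block group that hits ENOSPC with do_chunk_alloc
unset now proceeds to the forced allocation, while a RAID56 block group
that still cannot be marked read-only no longer takes the "continue
without RO" branch in scrub.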
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/btrfs/block-group.c | 14 ++++++++++++--
fs/btrfs/scrub.c | 9 ++++++++-
2 files changed, 20 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 5fc670c27f864..58ce5d44ce4d5 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2832,10 +2832,20 @@ int btrfs_inc_block_group_ro(struct btrfs_block_group *cache,
 	}
 	ret = inc_block_group_ro(cache, 0);
-	if (!do_chunk_alloc || ret == -ETXTBSY)
-		goto unlock_out;
 	if (!ret)
 		goto out;
+	if (ret == -ETXTBSY)
+		goto unlock_out;
+
+	/*
+	 * Skip chunk allocation if the bg is SYSTEM, this is to avoid system
+	 * chunk allocation storm to exhaust the system chunk array. Otherwise
+	 * we still want to try our best to mark the block group read-only.
+	 */
+	if (!do_chunk_alloc && ret == -ENOSPC &&
+	    (cache->flags & BTRFS_BLOCK_GROUP_SYSTEM))
+		goto unlock_out;
+
 	alloc_flags = btrfs_get_alloc_profile(fs_info, cache->space_info->flags);
 	ret = btrfs_chunk_alloc(trans, alloc_flags, CHUNK_ALLOC_FORCE);
 	if (ret < 0)
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 69c93ae333f63..3720fd1f593d2 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -4034,13 +4034,20 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx,
 		if (ret == 0) {
 			ro_set = 1;
-		} else if (ret == -ENOSPC && !sctx->is_dev_replace) {
+		} else if (ret == -ENOSPC && !sctx->is_dev_replace &&
+			   !(cache->flags & BTRFS_BLOCK_GROUP_RAID56_MASK)) {
 			/*
 			 * btrfs_inc_block_group_ro return -ENOSPC when it
 			 * failed in creating new chunk for metadata.
 			 * It is not a problem for scrub, because
 			 * metadata are always cowed, and our scrub paused
 			 * commit_transactions.
+			 *
+			 * For RAID56 chunks, we have to mark them read-only
+			 * for scrub, as later we would use our own cache
+			 * out of RAID56 realm.
+			 * Thus we want the RAID56 bg to be marked RO to
+			 * prevent RMW from screwing up our cache.
 			 */
 			ro_set = 0;
 		} else if (ret == -ETXTBSY) {
--
2.39.2
* [PATCH AUTOSEL 6.3 18/37] btrfs: handle memory allocation failure in btrfs_csum_one_bio
From: Sasha Levin @ 2023-05-31 13:40 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Johannes Thumshirn, syzbot+d8941552e21eac774778,
Christoph Hellwig, Anand Jain, David Sterba, Sasha Levin, clm,
josef, linux-btrfs
From: Johannes Thumshirn <johannes.thumshirn@wdc.com>
[ Upstream commit 806570c0bb7b4847828c22c4934fcf2dc8fc572f ]
Since f8a53bb58ec7 ("btrfs: handle checksum generation in the storage
layer") the failures of btrfs_csum_one_bio() are handled via
bio_end_io().
This means we can return BLK_STS_RESOURCE from btrfs_csum_one_bio() in
case the allocation of the ordered sums fails.
This also fixes a syzkaller report, where injecting a failure into the
kvzalloc() call results in a BUG_ON().
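
As a rough userspace illustration of the pattern (not the kernel code),
the sketch below returns a status to the caller when an allocation fails
instead of aborting. enum status, struct sums_model, csum_step() and
failing_alloc() are hypothetical names made up for this example, with
calloc() standing in for kvzalloc() and an injectable allocator standing
in for fault injection.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical status codes standing in for blk_status_t values. */
enum status { STS_OK = 0, STS_RESOURCE = 1 };

/* Hypothetical, heavily reduced stand-in for the ordered sums object. */
struct sums_model {
	size_t len;
};

/* Allocator signature matching calloc(), so failures can be injected. */
typedef void *(*alloc_fn)(size_t nmemb, size_t size);

/* Always fails, to simulate an injected allocation failure. */
static void *failing_alloc(size_t nmemb, size_t size)
{
	(void)nmemb;
	(void)size;
	return NULL;
}

/*
 * Allocate the sums object and record its length. On allocation failure
 * the error is reported to the caller and *out is left untouched, so
 * the NULL pointer is never dereferenced.
 */
static enum status csum_step(alloc_fn alloc, size_t bytes_left,
			     struct sums_model **out)
{
	struct sums_model *s = alloc(1, sizeof(*s));

	if (!s)
		return STS_RESOURCE;

	s->len = bytes_left;
	*out = s;
	return STS_OK;
}
```

Calling csum_step(calloc, 4096, &s) is expected to succeed and set
s->len, while csum_step(failing_alloc, 4096, &s) reports STS_RESOURCE and
leaves s as it was.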
Reported-by: syzbot+d8941552e21eac774778@syzkaller.appspotmail.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/btrfs/file-item.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index a4584c629ba35..9e45b416a9c85 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -847,7 +847,9 @@ blk_status_t btrfs_csum_one_bio(struct btrfs_bio *bbio)
 				sums = kvzalloc(btrfs_ordered_sum_size(fs_info,
 						      bytes_left), GFP_KERNEL);
 				memalloc_nofs_restore(nofs_flag);
-				BUG_ON(!sums); /* -ENOMEM */
+				if (!sums)
+					return BLK_STS_RESOURCE;
+
 				sums->len = bytes_left;
 				ordered = btrfs_lookup_ordered_extent(inode,
 								      offset);
--
2.39.2