public inbox for stable@vger.kernel.org
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Qu Wenruo <wqu@suse.com>, David Sterba <dsterba@suse.com>,
	Sasha Levin <sashal@kernel.org>,
	clm@fb.com, josef@toxicpanda.com, linux-btrfs@vger.kernel.org
Subject: [PATCH AUTOSEL 6.3 17/37] btrfs: scrub: try harder to mark RAID56 block groups read-only
Date: Wed, 31 May 2023 09:39:59 -0400
Message-ID: <20230531134020.3383253-17-sashal@kernel.org>
In-Reply-To: <20230531134020.3383253-1-sashal@kernel.org>

From: Qu Wenruo <wqu@suse.com>

[ Upstream commit 7561551e7ba870b9659083b95feb520fb2dacce3 ]

Currently we allow a block group not to be marked read-only for scrub.

But if we require RAID56 block groups to be read-only during scrub,
then scrub is allowed to use the cached content of its scrub stripes
and avoid unnecessary RAID56 reads.

So this patch would:

- Make btrfs_inc_block_group_ro() try harder
  During my tests, cases like btrfs/061 and btrfs/064 can hit ENOSPC
  from btrfs_inc_block_group_ro() calls during scrub.

  The reason is that if we have only a single data chunk and are trying
  to scrub it, no space is left for any newer data writes.

  But that check should be done by the caller, especially since scrub
  only marks the chunk read-only temporarily, and newer data writes
  always try to allocate a new data chunk when needed.

- Return error for scrub if we failed to mark a RAID56 chunk read-only

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/block-group.c | 14 ++++++++++++--
 fs/btrfs/scrub.c       |  9 ++++++++-
 2 files changed, 20 insertions(+), 3 deletions(-)
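
For readers skimming the block-group.c hunk below, the change in control
flow after the first inc_block_group_ro() attempt can be sketched as a
small userspace model. This is only an illustration built from the hunk,
not kernel code: the enum and the next_step_old()/next_step_new() helpers
are hypothetical names, and is_system_bg stands in for
(cache->flags & BTRFS_BLOCK_GROUP_SYSTEM).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical labels for illustration; they do not exist in the kernel. */
enum ro_next_step {
	RO_RETURN_RESULT,	/* return the first attempt's result as-is */
	RO_FORCE_ALLOC,		/* go on to the forced chunk allocation */
};

/* Model of the branches removed by the patch. */
static enum ro_next_step next_step_old(int first_try, bool do_chunk_alloc)
{
	assert(first_try <= 0);	/* 0 on success, negative errno otherwise */
	if (!do_chunk_alloc || first_try == -ETXTBSY)
		return RO_RETURN_RESULT;
	if (first_try == 0)
		return RO_RETURN_RESULT;
	return RO_FORCE_ALLOC;
}

/* Model of the branches added by the patch. */
static enum ro_next_step next_step_new(int first_try, bool do_chunk_alloc,
				       bool is_system_bg)
{
	assert(first_try <= 0);
	if (first_try == 0)
		return RO_RETURN_RESULT;
	if (first_try == -ETXTBSY)
		return RO_RETURN_RESULT;
	/* Only SYSTEM block groups still skip the allocation on -ENOSPC. */
	if (!do_chunk_alloc && first_try == -ENOSPC && is_system_bg)
		return RO_RETURN_RESULT;
	return RO_FORCE_ALLOC;
}
```

In this model the visible difference is the (-ENOSPC, !do_chunk_alloc)
case: the old flow returned the error directly, while the new flow
continues to the forced allocation unless the block group is SYSTEM.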

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 5fc670c27f864..58ce5d44ce4d5 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2832,10 +2832,20 @@ int btrfs_inc_block_group_ro(struct btrfs_block_group *cache,
 	}
 
 	ret = inc_block_group_ro(cache, 0);
-	if (!do_chunk_alloc || ret == -ETXTBSY)
-		goto unlock_out;
 	if (!ret)
 		goto out;
+	if (ret == -ETXTBSY)
+		goto unlock_out;
+
+	/*
+	 * Skip chunk allocation if the bg is SYSTEM, this is to avoid system
+	 * chunk allocation storm to exhaust the system chunk array.  Otherwise
+	 * we still want to try our best to mark the block group read-only.
+	 */
+	if (!do_chunk_alloc && ret == -ENOSPC &&
+	    (cache->flags & BTRFS_BLOCK_GROUP_SYSTEM))
+		goto unlock_out;
+
 	alloc_flags = btrfs_get_alloc_profile(fs_info, cache->space_info->flags);
 	ret = btrfs_chunk_alloc(trans, alloc_flags, CHUNK_ALLOC_FORCE);
 	if (ret < 0)
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 69c93ae333f63..3720fd1f593d2 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -4034,13 +4034,20 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx,
 
 		if (ret == 0) {
 			ro_set = 1;
-		} else if (ret == -ENOSPC && !sctx->is_dev_replace) {
+		} else if (ret == -ENOSPC && !sctx->is_dev_replace &&
+			   !(cache->flags & BTRFS_BLOCK_GROUP_RAID56_MASK)) {
 			/*
 			 * btrfs_inc_block_group_ro return -ENOSPC when it
 			 * failed in creating new chunk for metadata.
 			 * It is not a problem for scrub, because
 			 * metadata are always cowed, and our scrub paused
 			 * commit_transactions.
+			 *
+			 * For RAID56 chunks, we have to mark them read-only
+			 * for scrub, as later we would use our own cache
+			 * out of RAID56 realm.
+			 * Thus we want the RAID56 bg to be marked RO to
+			 * prevent RMW from screwing up our cache.
 			 */
 			ro_set = 0;
 		} else if (ret == -ETXTBSY) {
-- 
2.39.2
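
As a reading aid for the scrub.c hunk above, here is a minimal userspace
sketch of how the first two branches classify the return value of
btrfs_inc_block_group_ro(). It is an assumption-laden model, not the
kernel code: the enum and classify_ro_result() are hypothetical names,
is_raid56 stands in for (cache->flags & BTRFS_BLOCK_GROUP_RAID56_MASK),
and everything from the -ETXTBSY branch onwards is lumped into a single
"later branches" outcome because the hunk does not show it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical labels for illustration; they do not exist in the kernel. */
enum scrub_ro_result {
	SCRUB_RO_SET,			/* ro_set = 1 */
	SCRUB_RO_SKIPPED,		/* ro_set = 0, scrub continues anyway */
	SCRUB_RO_LATER_BRANCHES,	/* -ETXTBSY and whatever follows it */
};

static enum scrub_ro_result classify_ro_result(int ret, bool is_dev_replace,
					       bool is_raid56)
{
	assert(ret <= 0);	/* 0 on success, negative errno otherwise */
	if (ret == 0)
		return SCRUB_RO_SET;
	/*
	 * After the patch, -ENOSPC is tolerated only for non-RAID56 block
	 * groups outside of dev-replace.
	 */
	if (ret == -ENOSPC && !is_dev_replace && !is_raid56)
		return SCRUB_RO_SKIPPED;
	return SCRUB_RO_LATER_BRANCHES;
}
```

Under this model a RAID56 block group that hits -ENOSPC no longer takes
the "ro_set = 0" path, which matches the commit message's second point
that scrub returns an error when a RAID56 chunk cannot be marked
read-only.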

