From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92D7FC7EE2E for ; Wed, 31 May 2023 13:45:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236453AbjEaNpN (ORCPT ); Wed, 31 May 2023 09:45:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54780 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236667AbjEaNoT (ORCPT ); Wed, 31 May 2023 09:44:19 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7888310FA; Wed, 31 May 2023 06:42:45 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8F10363B52; Wed, 31 May 2023 13:42:29 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3DC01C433A0; Wed, 31 May 2023 13:42:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1685540549; bh=qAKfpel+uMjutzpbQSJ03/v59uj44iW1nUTcXKMD3eI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OnXfrC9I3vyEWKbvwUFnJj0dAQeyOyfYsh8MSuyvlgawP1WjzAP+OVQmEwNt3HMSN sZqWMqR5geeU/e2MLL4/UNsPthaTg4CLx7ckkENVi+O2Kez7hxLZdnSl1+d9pERwaa JdT04pNAe6Lparak+yO9reCBJSWHn9zjv1jg5Nsgu8X+uQtOnk53XEHMpEoaUiMH88 VZc86TaWNqU6lJTT7/h5GgZrl1nTNpz/mB0I00XvVERbA6WRqxt+SsqyspJIZa1IK6 cDveHVBkadm0pyQCLocZn7l5FsJbBFcrAFtCI0hWNOkgRzS+n4Ynf44eRXJ+HI4uOf KjSNWLCF+RTBw== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Qu Wenruo , David Sterba , Sasha Levin , clm@fb.com, josef@toxicpanda.com, linux-btrfs@vger.kernel.org Subject: [PATCH AUTOSEL 6.1 16/33] btrfs: scrub: try harder to mark RAID56 block groups read-only Date: Wed, 31 May 2023 09:41:42 -0400 Message-Id: <20230531134159.3383703-16-sashal@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230531134159.3383703-1-sashal@kernel.org> References: <20230531134159.3383703-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Qu Wenruo [ Upstream commit 7561551e7ba870b9659083b95feb520fb2dacce3 ] Currently we allow a block group not to be marked read-only for scrub. But for RAID56 block groups if we require the block group to be read-only, then we're allowed to use cached content from scrub stripe to reduce unnecessary RAID56 reads. So this patch would: - Make btrfs_inc_block_group_ro() try harder During my tests, for cases like btrfs/061 and btrfs/064, we can hit ENOSPC from btrfs_inc_block_group_ro() calls during scrub. The reason is if we only have one single data chunk, and trying to scrub it, we won't have any space left for any newer data writes. But this check should be done by the caller, especially for scrub cases we only temporarily mark the chunk read-only. And newer data writes would always try to allocate a new data chunk when needed. - Return error for scrub if we failed to mark a RAID56 chunk read-only Signed-off-by: Qu Wenruo Signed-off-by: David Sterba Signed-off-by: Sasha Levin --- fs/btrfs/block-group.c | 14 ++++++++++++-- fs/btrfs/scrub.c | 9 ++++++++- 2 files changed, 20 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index f33ddd5922b8c..74a5c94898b0f 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -2621,10 +2621,20 @@ int btrfs_inc_block_group_ro(struct btrfs_block_group *cache, } ret = inc_block_group_ro(cache, 0); - if (!do_chunk_alloc || ret == -ETXTBSY) - goto unlock_out; if (!ret) goto out; + if (ret == -ETXTBSY) + goto unlock_out; + + /* + * Skip chunk alloction if the bg is SYSTEM, this is to avoid system + * chunk allocation storm to exhaust the system chunk array. Otherwise + * we still want to try our best to mark the block group read-only. + */ + if (!do_chunk_alloc && ret == -ENOSPC && + (cache->flags & BTRFS_BLOCK_GROUP_SYSTEM)) + goto unlock_out; + alloc_flags = btrfs_get_alloc_profile(fs_info, cache->space_info->flags); ret = btrfs_chunk_alloc(trans, alloc_flags, CHUNK_ALLOC_FORCE); if (ret < 0) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index c5d8dc112fd58..1672d4846baaf 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -4017,13 +4017,20 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx, if (ret == 0) { ro_set = 1; - } else if (ret == -ENOSPC && !sctx->is_dev_replace) { + } else if (ret == -ENOSPC && !sctx->is_dev_replace && + !(cache->flags & BTRFS_BLOCK_GROUP_RAID56_MASK)) { /* * btrfs_inc_block_group_ro return -ENOSPC when it * failed in creating new chunk for metadata. * It is not a problem for scrub, because * metadata are always cowed, and our scrub paused * commit_transactions. + * + * For RAID56 chunks, we have to mark them read-only + * for scrub, as later we would use our own cache + * out of RAID56 realm. + * Thus we want the RAID56 bg to be marked RO to + * prevent RMW from screwing up out cache. */ ro_set = 0; } else if (ret == -ETXTBSY) { -- 2.39.2