public inbox for linux-kernel@vger.kernel.org
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Alex Zhuravlev <azhuravlev@whamcloud.com>,
	Andreas Dilger <adilger@whamcloud.com>,
	Artem Blagodarenko <artem.blagodarenko@gmail.com>,
	Sasha Levin <sashal@kernel.org>,
	linux-ext4@vger.kernel.org
Subject: [PATCH AUTOSEL 5.8 05/63] ext4: skip non-loaded groups at cr=0/1 when scanning for good groups
Date: Mon, 24 Aug 2020 12:34:05 -0400
Message-ID: <20200824163504.605538-5-sashal@kernel.org>
In-Reply-To: <20200824163504.605538-1-sashal@kernel.org>

From: Alex Zhuravlev <azhuravlev@whamcloud.com>

[ Upstream commit c1d2c7d47e15482bb23cda83a5021e60f624a09c ]

cr=0 is supposed to be an optimization to save CPU cycles, but if the
in-memory buddy data is not initialized then it makes no sense, as we
have to do synchronous IO that takes a lot of cycles.  Also, at cr=0
mballoc doesn't choose just any available chunk.  cr=1 also skips
groups using a heuristic based on average fragment size.  It's more
useful to skip such groups and switch to cr=2, where groups will be
scanned for available chunks.  However, we always read the first block
group in a flex_bg so metadata blocks will get read into the first
flex_bg if possible.

Using a sparse image and dm-slow, a virtual device of 120TB was
simulated; the image was then formatted and filled using debugfs to
mark ~85% of the available space as busy.  Without the patch, the mount
process couldn't complete in half an hour (according to vmstat it
would take ~10-11 hours).  With the patch applied, mount took ~20
seconds.

Lustre-bug-id: https://jira.whamcloud.com/browse/LU-12988
Signed-off-by: Alex Zhuravlev <azhuravlev@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
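Note (not part of the commit): the skip condition added in the hunk
below can be read as a small predicate.  What follows is a minimal
userspace sketch of that reading, not the kernel code itself.  The
function and parameter names (is_first_in_flex_bg,
skip_uninitialized_group, block_uninit) are hypothetical and are not
kernel APIs; block_uninit is assumed to stand in for
"ext4_has_group_desc_csum(sb) is true and EXT4_BG_BLOCK_UNINIT is set
in bg_flags".

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of the check added by this patch.
 * log_groups_per_flex mirrors sbi->s_log_groups_per_flex.
 */
static bool is_first_in_flex_bg(unsigned int group,
				unsigned int log_groups_per_flex)
{
	/* With flex_bg disabled (log == 0), no group is special. */
	if (!log_groups_per_flex)
		return false;
	/* The low bits of the group number are zero for the first group. */
	return (group & ((1U << log_groups_per_flex) - 1)) == 0;
}

/*
 * Returns true when an uninitialized group would be skipped, i.e.
 * when the patched code returns 0 instead of calling
 * ext4_mb_init_group().
 */
static bool skip_uninitialized_group(int cr, unsigned int group,
				     unsigned int log_groups_per_flex,
				     bool block_uninit)
{
	return cr < 2 &&
	       !is_first_in_flex_bg(group, log_groups_per_flex) &&
	       !block_uninit;
}
```

For example, with s_log_groups_per_flex = 4 (16 groups per flex_bg),
group 5 is skipped at cr=0, while group 16 is still initialized because
it is the first group of its flex_bg.  At cr=2 nothing is skipped.
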
 fs/ext4/mballoc.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index c0a331e2feb02..9ed108b5bd7fd 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2177,6 +2177,7 @@ static int ext4_mb_good_group_nolock(struct ext4_allocation_context *ac,
 {
 	struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group);
 	struct super_block *sb = ac->ac_sb;
+	struct ext4_sb_info *sbi = EXT4_SB(sb);
 	bool should_lock = ac->ac_flags & EXT4_MB_STRICT_CHECK;
 	ext4_grpblk_t free;
 	int ret = 0;
@@ -2195,7 +2196,25 @@ static int ext4_mb_good_group_nolock(struct ext4_allocation_context *ac,
 
 	/* We only do this if the grp has never been initialized */
 	if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
-		ret = ext4_mb_init_group(ac->ac_sb, group, GFP_NOFS);
+		struct ext4_group_desc *gdp =
+			ext4_get_group_desc(sb, group, NULL);
+		int ret;
+
+		/* cr=0/1 is a very optimistic search to find large
+		 * good chunks almost for free.  If buddy data is not
+		 * ready, then this optimization makes no sense.  But
+		 * we never skip the first block group in a flex_bg,
+		 * since this gets used for metadata block allocation,
+		 * and we want to make sure we locate metadata blocks
+		 * in the first block group in the flex_bg if possible.
+		 */
+		if (cr < 2 &&
+		    (!sbi->s_log_groups_per_flex ||
+		     ((group & ((1 << sbi->s_log_groups_per_flex) - 1)) != 0)) &&
+		    !(ext4_has_group_desc_csum(sb) &&
+		      (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT))))
+			return 0;
+		ret = ext4_mb_init_group(sb, group, GFP_NOFS);
 		if (ret)
 			return ret;
 	}
-- 
2.25.1

