public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Ojaswin Mujoo <ojaswin@linux.ibm.com>, Jan Kara <jack@suse.cz>,
	Ritesh Harjani <ritesh.list@gmail.com>,
	Theodore Ts'o <tytso@mit.edu>, Sasha Levin <sashal@kernel.org>,
	adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org
Subject: [PATCH AUTOSEL 6.1 20/49] ext4: Fix best extent lstart adjustment logic in ext4_mb_new_inode_pa()
Date: Thu,  4 May 2023 15:45:57 -0400	[thread overview]
Message-ID: <20230504194626.3807438-20-sashal@kernel.org> (raw)
In-Reply-To: <20230504194626.3807438-1-sashal@kernel.org>

From: Ojaswin Mujoo <ojaswin@linux.ibm.com>

[ Upstream commit 93cdf49f6eca5e23f6546b8f28457b2e6a6961d9 ]

When the length of best extent found is less than the length of goal extent
we need to make sure that the best extent atleast covers the start of the
original request. This is done by adjusting the ac_b_ex.fe_logical (logical
start) of the extent.

While doing so, the current logic sometimes results in the best extent's
logical range overflowing the goal extent. Since this best extent is later
added to the inode preallocation list, we have a possibility of introducing
overlapping preallocations. This is discussed in detail here [1].

As per Jan's suggestion, to fix this, replace the existing logic with the
below logic for adjusting best extent as it keeps fragmentation in check
while ensuring logical range of best extent doesn't overflow out of goal
extent:

1. Check if best extent can be kept at end of goal range and still cover
   original start.
2. Else, check if best extent can be kept at start of goal range and still
   cover original start.
3. Else, keep the best extent at start of original request.

Also, add a few extra BUG_ONs that might help catch errors faster.

[1] https://lore.kernel.org/r/Y+OGkVvzPN0RMv0O@li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/f96aca6d415b36d1f90db86c1a8cd7e2e9d7ab0e.1679731817.git.ojaswin@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/ext4/mballoc.c | 49 ++++++++++++++++++++++++++++++-----------------
 1 file changed, 31 insertions(+), 18 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 6c2d52b6e42c7..67a7aa3383655 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4296,6 +4296,7 @@ static void ext4_mb_use_inode_pa(struct ext4_allocation_context *ac,
 	BUG_ON(start < pa->pa_pstart);
 	BUG_ON(end > pa->pa_pstart + EXT4_C2B(sbi, pa->pa_len));
 	BUG_ON(pa->pa_free < len);
+	BUG_ON(ac->ac_b_ex.fe_len <= 0);
 	pa->pa_free -= len;
 
 	mb_debug(ac->ac_sb, "use %llu/%d from inode pa %p\n", start, len, pa);
@@ -4620,10 +4621,8 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
 	pa = ac->ac_pa;
 
 	if (ac->ac_b_ex.fe_len < ac->ac_g_ex.fe_len) {
-		int winl;
-		int wins;
-		int win;
-		int offs;
+		int new_bex_start;
+		int new_bex_end;
 
 		/* we can't allocate as much as normalizer wants.
 		 * so, found space must get proper lstart
@@ -4631,26 +4630,40 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
 		BUG_ON(ac->ac_g_ex.fe_logical > ac->ac_o_ex.fe_logical);
 		BUG_ON(ac->ac_g_ex.fe_len < ac->ac_o_ex.fe_len);
 
-		/* we're limited by original request in that
-		 * logical block must be covered any way
-		 * winl is window we can move our chunk within */
-		winl = ac->ac_o_ex.fe_logical - ac->ac_g_ex.fe_logical;
+		/*
+		 * Use the below logic for adjusting best extent as it keeps
+		 * fragmentation in check while ensuring logical range of best
+		 * extent doesn't overflow out of goal extent:
+		 *
+		 * 1. Check if best ex can be kept at end of goal and still
+		 *    cover original start
+		 * 2. Else, check if best ex can be kept at start of goal and
+		 *    still cover original start
+		 * 3. Else, keep the best ex at start of original request.
+		 */
+		new_bex_end = ac->ac_g_ex.fe_logical +
+			EXT4_C2B(sbi, ac->ac_g_ex.fe_len);
+		new_bex_start = new_bex_end - EXT4_C2B(sbi, ac->ac_b_ex.fe_len);
+		if (ac->ac_o_ex.fe_logical >= new_bex_start)
+			goto adjust_bex;
 
-		/* also, we should cover whole original request */
-		wins = EXT4_C2B(sbi, ac->ac_b_ex.fe_len - ac->ac_o_ex.fe_len);
+		new_bex_start = ac->ac_g_ex.fe_logical;
+		new_bex_end =
+			new_bex_start + EXT4_C2B(sbi, ac->ac_b_ex.fe_len);
+		if (ac->ac_o_ex.fe_logical < new_bex_end)
+			goto adjust_bex;
 
-		/* the smallest one defines real window */
-		win = min(winl, wins);
+		new_bex_start = ac->ac_o_ex.fe_logical;
+		new_bex_end =
+			new_bex_start + EXT4_C2B(sbi, ac->ac_b_ex.fe_len);
 
-		offs = ac->ac_o_ex.fe_logical %
-			EXT4_C2B(sbi, ac->ac_b_ex.fe_len);
-		if (offs && offs < win)
-			win = offs;
+adjust_bex:
+		ac->ac_b_ex.fe_logical = new_bex_start;
 
-		ac->ac_b_ex.fe_logical = ac->ac_o_ex.fe_logical -
-			EXT4_NUM_B2C(sbi, win);
 		BUG_ON(ac->ac_o_ex.fe_logical < ac->ac_b_ex.fe_logical);
 		BUG_ON(ac->ac_o_ex.fe_len > ac->ac_b_ex.fe_len);
+		BUG_ON(new_bex_end > (ac->ac_g_ex.fe_logical +
+				      EXT4_C2B(sbi, ac->ac_g_ex.fe_len)));
 	}
 
 	/* preallocation can change ac_b_ex, thus we store actually
-- 
2.39.2


  parent reply	other threads:[~2023-05-04 19:53 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-04 19:45 [PATCH AUTOSEL 6.1 01/49] wifi: ath: Silence memcpy run-time false positive warning Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 02/49] bpf: Annotate data races in bpf_local_storage Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 03/49] wifi: brcmfmac: pcie: Provide a buffer of random bytes to the device Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 04/49] wifi: brcmfmac: cfg80211: Pass the PMK in binary instead of hex Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 05/49] bpf, mips: Implement DADDI workarounds for JIT Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 06/49] ext2: Check block size validity during mount Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 07/49] scsi: lpfc: Prevent lpfc_debugfs_lockstat_write() buffer overflow Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 08/49] scsi: lpfc: Correct used_rpi count when devloss tmo fires with no recovery Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 09/49] wifi: brcmfmac: slab-out-of-bounds read in brcmf_get_assoc_ies() Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 10/49] bnxt: avoid overflow in bnxt_get_nvram_directory() Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 11/49] net: pasemi: Fix return type of pasemi_mac_start_tx() Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 12/49] net: Catch invalid index in XPS mapping Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 13/49] netdev: Enforce index cap in netdev_get_tx_queue Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 14/49] scsi: target: iscsit: Free cmds before session free Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 15/49] lib: cpu_rmap: Avoid use after free on rmap->obj array entries Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 16/49] scsi: message: mptlan: Fix use after free bug in mptlan_remove() due to race condition Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 17/49] gfs2: Fix inode height consistency check Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 18/49] scsi: ufs: ufs-pci: Add support for Intel Lunar Lake Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 19/49] ext4: set goal start correctly in ext4_mb_normalize_request Sasha Levin
2023-05-04 19:45 ` Sasha Levin [this message]
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 21/49] crypto: jitter - permanent and intermittent health errors Sasha Levin
2023-05-04 19:45 ` [PATCH AUTOSEL 6.1 22/49] f2fs: Fix system crash due to lack of free space in LFS Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 23/49] f2fs: fix to drop all dirty pages during umount() if cp_error is set Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 24/49] f2fs: fix to check readonly condition correctly Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 25/49] samples/bpf: Fix fout leak in hbm's run_bpf_prog Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 26/49] bpf: Add preempt_count_{sub,add} into btf id deny list Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 27/49] md: fix soft lockup in status_resync Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 28/49] wifi: iwlwifi: pcie: fix possible NULL pointer dereference Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 29/49] wifi: iwlwifi: add a new PCI device ID for BZ device Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 30/49] wifi: iwlwifi: pcie: Fix integer overflow in iwl_write_to_user_buf Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 31/49] wifi: iwlwifi: mvm: fix ptk_pn memory leak Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 32/49] block, bfq: Fix division by zero error on zero wsum Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 33/49] wifi: ath11k: Ignore frags from uninitialized peer in dp Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 34/49] wifi: iwlwifi: fix iwl_mvm_max_amsdu_size() for MLO Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 35/49] null_blk: Always check queue mode setting from configfs Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 36/49] wifi: iwlwifi: dvm: Fix memcpy: detected field-spanning write backtrace Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 37/49] wifi: ath11k: Fix SKB corruption in REO destination ring Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 38/49] nbd: fix incomplete validation of ioctl arg Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 39/49] ipvs: Update width of source for ip_vs_sync_conn_options Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 40/49] Bluetooth: btusb: Add new PID/VID 04ca:3801 for MT7663 Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 41/49] Bluetooth: Add new quirk for broken local ext features page 2 Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 42/49] Bluetooth: btrtl: add support for the RTL8723CS Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 43/49] Bluetooth: Improve support for Actions Semi ATS2851 based devices Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 44/49] Bluetooth: btrtl: check for NULL in btrtl_set_quirks() Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 45/49] Bluetooth: btintel: Add LE States quirk support Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 46/49] Bluetooth: hci_bcm: Fall back to getting bdaddr from EFI if not set Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 47/49] Bluetooth: Add new quirk for broken set random RPA timeout for ATS2851 Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 48/49] Bluetooth: L2CAP: fix "bad unlock balance" in l2cap_disconnect_rsp Sasha Levin
2023-05-04 19:46 ` [PATCH AUTOSEL 6.1 49/49] Bluetooth: btrtl: Add the support for RTL8851B Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230504194626.3807438-20-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=adilger.kernel@dilger.ca \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ojaswin@linux.ibm.com \
    --cc=ritesh.list@gmail.com \
    --cc=stable@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox