From: Zhang Yi <yi.zhang@huaweicloud.com>
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz,
	ritesh.list@gmail.com, yi.zhang@huawei.com,
	yi.zhang@huaweicloud.com, chengzhihao1@huawei.com,
	yukuai3@huawei.com
Subject: [PATCH v3 06/12] ext4: update delalloc data reserve space in ext4_es_insert_extent()
Date: Tue, 13 Aug 2024 20:34:46 +0800
Message-ID: <20240813123452.2824659-7-yi.zhang@huaweicloud.com>
In-Reply-To: <20240813123452.2824659-1-yi.zhang@huaweicloud.com>

From: Zhang Yi <yi.zhang@huawei.com>

Currently we update the reserved data space for delalloc after
allocating new blocks in ext4_{ind|ext}_map_blocks(), and if the
bigalloc feature is enabled we also need to query the extent status
tree to calculate the exact number of reserved clusters. This is
complicated, and it is better to do this job in
ext4_es_insert_extent(): __es_remove_extent() already counts the
delalloc blocks when removing delalloc extents, and __revise_pending()
returns the number of newly added pending reservations, so we can
update the reserved blocks easily in ext4_es_insert_extent().

We directly reduce the reserved cluster count when replacing a delalloc
extent. However, there are two special cases that affect quota claiming
when doing direct block allocation (e.g. from fallocate).

A) Fallocate a range that covers a delalloc extent but starts with
non-delayed allocated blocks, e.g. a hole.

  hhhhhhh+ddddddd+ddddddd
  ^^^^^^^^^^^^^^^^^^^^^^^  fallocate this range

Currently ext4_map_blocks() can't always trim the extent, since it may
release i_data_sem before calling ext4_map_create_blocks() and race
with another delayed allocation. Hence EXT4_GET_BLOCKS_DELALLOC_RESERVE
may not be set even when we are replacing a delalloc extent. Without
this flag set, the quota has already been claimed by
ext4_mb_new_blocks(), so we should release the quota reservations
instead of claiming them again.

B) The bigalloc feature is enabled and we fallocate a range that
contains non-delayed allocated blocks belonging to a delayed allocated
cluster.

  |<         one cluster       >|
  hhhhhhh+hhhhhhh+hhhhhhh+ddddddd
  ^^^^^^^  fallocate this range

This case is similar to the one above: the
EXT4_GET_BLOCKS_DELALLOC_RESERVE flag is not set either.
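
Case B can be illustrated with a small stand-alone model. This is only a
sketch under stated assumptions: the toy_* names and the flat block array
are hypothetical, and the code does not reproduce __revise_pending(),
which works on the extent status tree and reports only newly inserted
pending reservations. It answers the simpler question of whether the
edge clusters of an allocated range still hold delayed blocks outside
that range.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-block states for the toy model. */
enum toy_blk { TOY_HOLE, TOY_DELAYED, TOY_ALLOCATED };

/* True if cluster `clu` holds a delayed block outside [start, end]. */
static bool toy_clu_has_delayed_outside(const enum toy_blk *blks, int nblks,
					int ratio, int clu, int start, int end)
{
	int first = clu * ratio;
	int last = first + ratio - 1;

	for (int b = first; b <= last && b < nblks; b++) {
		if ((b < start || b > end) && blks[b] == TOY_DELAYED)
			return true;
	}
	return false;
}

/*
 * Count the edge clusters of the allocated range [lblk, lblk + len - 1]
 * that still hold delayed blocks outside the range.  Interior clusters
 * are fully covered by the range, so only the first and last cluster
 * need to be checked.
 */
static int toy_count_shared_clusters(const enum toy_blk *blks, int nblks,
				     int ratio, int lblk, int len)
{
	int end = lblk + len - 1;
	int first_clu = lblk / ratio;
	int last_clu = end / ratio;
	int n = 0;

	assert(len > 0 && ratio > 0 && end < nblks);
	if (toy_clu_has_delayed_outside(blks, nblks, ratio, first_clu,
					lblk, end))
		n++;
	if (last_clu != first_clu &&
	    toy_clu_has_delayed_outside(blks, nblks, ratio, last_clu,
					lblk, end))
		n++;
	return n;
}
```

For the layout in the diagram above (four blocks per cluster, three holes
followed by one delayed block), fallocating only the first block leaves
one shared cluster, whose reservation is then treated as used.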

Hence we should release the quota reservations if we replace a delalloc
extent without EXT4_GET_BLOCKS_DELALLOC_RESERVE set.
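
The claim-versus-release rule can be sketched as a toy model. The
struct toy_inode and both toy_* functions are hypothetical and count in
clusters only. They approximate what ext4_da_update_reserve_space() does
with its quota_claim argument, and how the tail of
ext4_es_insert_extent() combines the two counts after this patch. This
is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the per-inode accounting (hypothetical fields). */
struct toy_inode {
	int reserved_clusters;	/* delalloc reservation still held */
	int quota_reserved;	/* quota reserved but not yet charged */
	int quota_charged;	/* quota actually charged to the owner */
};

/*
 * Model of ext4_da_update_reserve_space(inode, used, quota_claim):
 * the reservation always shrinks; the reserved quota is either
 * converted into a charge (claim) or simply dropped (release).
 */
static void toy_update_reserve_space(struct toy_inode *ti, int used,
				     bool quota_claim)
{
	assert(used <= ti->reserved_clusters);
	ti->reserved_clusters -= used;
	ti->quota_reserved -= used;
	if (quota_claim)
		ti->quota_charged += used;
}

/*
 * Model of the accounting at the end of ext4_es_insert_extent():
 * removed delalloc clusters plus newly added pending reservations
 * are the reservations consumed by this insertion.
 */
static void toy_es_insert_tail(struct toy_inode *ti, int removed_delalloc,
			       int new_pending, bool delalloc_reserve)
{
	int resv_used = removed_delalloc + new_pending;

	if (resv_used)
		toy_update_reserve_space(ti, resv_used, delalloc_reserve);
}
```

In case A without bigalloc, the two delayed blocks hold two reservations
and the direct allocation has already charged quota for all three
blocks. Calling toy_es_insert_tail() with delalloc_reserve false drops
the two reservations and leaves the charge at three. Claiming instead
would charge the two delayed blocks a second time.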

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/extents.c        | 37 -------------------------------------
 fs/ext4/extents_status.c | 25 ++++++++++++++++++++++++-
 fs/ext4/indirect.c       |  7 -------
 3 files changed, 24 insertions(+), 45 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 671dacd7c873..db8f9d79477c 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4357,43 +4357,6 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 		goto out;
 	}
 
-	/*
-	 * Reduce the reserved cluster count to reflect successful deferred
-	 * allocation of delayed allocated clusters or direct allocation of
-	 * clusters discovered to be delayed allocated.  Once allocated, a
-	 * cluster is not included in the reserved count.
-	 */
-	if (test_opt(inode->i_sb, DELALLOC) && allocated_clusters) {
-		if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE) {
-			/*
-			 * When allocating delayed allocated clusters, simply
-			 * reduce the reserved cluster count and claim quota
-			 */
-			ext4_da_update_reserve_space(inode, allocated_clusters,
-							1);
-		} else {
-			ext4_lblk_t lblk, len;
-			unsigned int n;
-
-			/*
-			 * When allocating non-delayed allocated clusters
-			 * (from fallocate, filemap, DIO, or clusters
-			 * allocated when delalloc has been disabled by
-			 * ext4_nonda_switch), reduce the reserved cluster
-			 * count by the number of allocated clusters that
-			 * have previously been delayed allocated.  Quota
-			 * has been claimed by ext4_mb_new_blocks() above,
-			 * so release the quota reservations made for any
-			 * previously delayed allocated clusters.
-			 */
-			lblk = EXT4_LBLK_CMASK(sbi, map->m_lblk);
-			len = allocated_clusters << sbi->s_cluster_bits;
-			n = ext4_es_delayed_clu(inode, lblk, len);
-			if (n > 0)
-				ext4_da_update_reserve_space(inode, (int) n, 0);
-		}
-	}
-
 	/*
 	 * Cache the extent and update transaction to commit on fdatasync only
 	 * when it is _not_ an unwritten extent.
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 0580bc4bc762..41adf0d69959 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -853,6 +853,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 	struct extent_status newes;
 	ext4_lblk_t end = lblk + len - 1;
 	int err1 = 0, err2 = 0, err3 = 0;
+	int resv_used = 0, pending = 0;
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct extent_status *es1 = NULL;
 	struct extent_status *es2 = NULL;
@@ -891,7 +892,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 		pr = __alloc_pending(true);
 	write_lock(&EXT4_I(inode)->i_es_lock);
 
-	err1 = __es_remove_extent(inode, lblk, end, NULL, es1);
+	err1 = __es_remove_extent(inode, lblk, end, &resv_used, es1);
 	if (err1 != 0)
 		goto error;
 	/* Free preallocated extent if it didn't get used. */
@@ -921,9 +922,31 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 			__free_pending(pr);
 			pr = NULL;
 		}
+		pending = err3;
 	}
 error:
 	write_unlock(&EXT4_I(inode)->i_es_lock);
+	/*
+	 * Reduce the reserved cluster count to reflect successful deferred
+	 * allocation of delayed allocated clusters or direct allocation of
+	 * clusters discovered to be delayed allocated.  Once allocated, a
+	 * cluster is not included in the reserved count.
+	 *
+	 * When direct allocating (from fallocate, filemap, DIO, or clusters
+	 * allocated when delalloc has been disabled by ext4_nonda_switch())
+	 * an extent either 1) contains delayed blocks but starts with
+	 * non-delayed allocated blocks (e.g. hole) or 2) contains non-delayed
+	 * allocated blocks which belong to delayed allocated clusters when
+	 * bigalloc feature is enabled, quota has already been claimed by
+	 * ext4_mb_new_blocks(), so release the quota reservations made for
+	 * any previously delayed allocated clusters instead of claiming them
+	 * again.
+	 */
+	resv_used += pending;
+	if (resv_used)
+		ext4_da_update_reserve_space(inode, resv_used,
+				flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE);
+
 	if (err1 || err2 || err3 < 0)
 		goto retry;
 
diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
index d8ca7f64f952..7404f0935c90 100644
--- a/fs/ext4/indirect.c
+++ b/fs/ext4/indirect.c
@@ -652,13 +652,6 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
 	ext4_update_inode_fsync_trans(handle, inode, 1);
 	count = ar.len;
 
-	/*
-	 * Update reserved blocks/metadata blocks after successful block
-	 * allocation which had been deferred till now.
-	 */
-	if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
-		ext4_da_update_reserve_space(inode, count, 1);
-
 got_it:
 	map->m_flags |= EXT4_MAP_MAPPED;
 	map->m_pblk = le32_to_cpu(chain[depth-1].key);
-- 
2.39.2

