linux-ext4.vger.kernel.org archive mirror
* [RFC PATCH 0/6] ext4: make ext4_map_blocks() recognize delayed only extent
@ 2023-11-21  9:34 Zhang Yi
  2023-11-21  9:34 ` [RFC PATCH 1/6] ext4: introduce ext4_es_skip_hole_extent() to skip hole extents Zhang Yi
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Zhang Yi @ 2023-11-21  9:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: tytso, adilger.kernel, jack, ritesh.list, yi.zhang, yi.zhang,
	chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Hello, guys.

I've recently been working on switching ext4 buffered I/O from buffer_head
to iomap and enabling large folios for regular files; this patch set is one
of the preparations for that work. It first corrects the hole length
returned by ext4_map_blocks() when the caller queries the mapping type and
block range, then makes this function and ext4_set_iomap() able to
distinguish a delayed-allocated-only mapping from a hole, and finally cleans
up ext4_iomap_begin_report() along the way. This preparation patch set
changes the ext4 map -> iomap conversion logic in ext4_set_iomap(), so that
the later buffered I/O conversion can use it. This patch set has already
passed the 'kvm-xfstests -g auto' tests.

Thanks,
Yi.

Zhang Yi (6):
  ext4: introduce ext4_es_skip_hole_extent() to skip hole extents
  ext4: make ext4_es_lookup_extent() return the next extent if not found
  ext4: correct the hole length returned by ext4_map_blocks()
  ext4: add a hole extent entry in cache after punch
  ext4: make ext4_map_blocks() distinguish delayed only mapping
  ext4: make ext4_set_iomap() recognize IOMAP_DELALLOC mapping type

 fs/ext4/ext4.h              |  7 ++++-
 fs/ext4/extents.c           |  5 ++--
 fs/ext4/extents_status.c    | 53 ++++++++++++++++++++++++--------
 fs/ext4/extents_status.h    |  2 ++
 fs/ext4/inode.c             | 60 ++++++++++++++++++-------------------
 include/trace/events/ext4.h | 28 +++++++++++++++++
 6 files changed, 107 insertions(+), 48 deletions(-)

-- 
2.39.2



* [RFC PATCH 1/6] ext4: introduce ext4_es_skip_hole_extent() to skip hole extents
  2023-11-21  9:34 [RFC PATCH 0/6] ext4: make ext4_map_blocks() recognize delayed only extent Zhang Yi
@ 2023-11-21  9:34 ` Zhang Yi
  2023-11-21  9:34 ` [RFC PATCH 2/6] ext4: make ext4_es_lookup_extent() return the next extent if not found Zhang Yi
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Zhang Yi @ 2023-11-21  9:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: tytso, adilger.kernel, jack, ritesh.list, yi.zhang, yi.zhang,
	chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Introduce a new helper, ext4_es_skip_hole_extent(), that skips all hole
extents in a search range and returns the lblk of the next non-hole extent
entry. It's useful for estimating and limiting the length of a potential
hole returned when querying mapping status in ext4_map_blocks().
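
As a rough userspace illustration of what such a helper computes, the
sketch below walks a sorted array instead of the kernel's rb-tree. The
names es_model, es_kind and skip_hole_extents() are hypothetical stand-ins
and not ext4 code; the sketch also assumes lblk + len does not overflow and
clamps the result to lblk, which the posted helper does not do.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the ext4 definitions. */
typedef uint32_t lblk_t;

enum es_kind { ES_HOLE, ES_DELAYED, ES_UNWRITTEN, ES_WRITTEN };

struct es_model {
	lblk_t lblk;		/* first logical block covered */
	lblk_t len;		/* number of blocks covered */
	enum es_kind kind;	/* cached status of the range */
};

/*
 * Walk a sorted, non-overlapping array of cached extents and return the
 * first logical block in [lblk, lblk + len) that belongs to a non-hole
 * entry, or lblk + len if no such entry intersects the range.
 */
static lblk_t skip_hole_extents(const struct es_model *es, size_t n,
				lblk_t lblk, lblk_t len)
{
	lblk_t end = lblk + len;	/* exclusive end of the search range */
	size_t i;

	for (i = 0; i < n; i++) {
		/* Entries that end at or before lblk are irrelevant. */
		if (es[i].lblk + es[i].len <= lblk)
			continue;
		/* Entries are sorted, so nothing further can be in range. */
		if (es[i].lblk >= end)
			break;
		/* First non-hole entry: report where it enters the range. */
		if (es[i].kind != ES_HOLE)
			return es[i].lblk < lblk ? lblk : es[i].lblk;
	}
	return end;
}
```

With cached entries hole [10, 15), hole [15, 20) and delayed [20, 28), a
search over [0, 32) stops at block 20, while a search over [0, 18) finds
only holes and returns 18.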

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/extents_status.c    | 32 ++++++++++++++++++++++++++++++++
 fs/ext4/extents_status.h    |  2 ++
 include/trace/events/ext4.h | 28 ++++++++++++++++++++++++++++
 3 files changed, 62 insertions(+)

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 6f7de14c0fa8..1b1b1a8848a8 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -944,6 +944,38 @@ void ext4_es_cache_extent(struct inode *inode, ext4_lblk_t lblk,
 	write_unlock(&EXT4_I(inode)->i_es_lock);
 }
 
+/*
+ * ext4_es_skip_hole_extent() skips hole extents and looks up the next
+ * delayed/unwritten/mapped extent in the extent status tree from lblk
+ * to the end of the range.
+ */
+ext4_lblk_t ext4_es_skip_hole_extent(struct inode *inode, ext4_lblk_t lblk,
+				     ext4_lblk_t len)
+{
+	struct extent_status *es = NULL;
+	ext4_lblk_t next_lblk;
+	struct rb_node *node;
+
+	read_lock(&EXT4_I(inode)->i_es_lock);
+	es = __es_tree_search(&EXT4_I(inode)->i_es_tree.root, lblk);
+
+	while (es && es->es_lblk < lblk + len) {
+		if (!ext4_es_is_hole(es))
+			break;
+		node = rb_next(&es->rb_node);
+		es = rb_entry(node, struct extent_status, rb_node);
+	}
+	if (!es || es->es_lblk >= lblk + len)
+		next_lblk = lblk + len;
+	else
+		next_lblk = es->es_lblk;
+
+	trace_ext4_es_skip_hole_extent(inode, lblk, len, next_lblk);
+	read_unlock(&EXT4_I(inode)->i_es_lock);
+
+	return next_lblk;
+}
+
 /*
  * ext4_es_lookup_extent() looks up an extent in extent status tree.
  *
diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index d9847a4a25db..4f69322dd626 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -139,6 +139,8 @@ extern void ext4_es_find_extent_range(struct inode *inode,
 				      int (*match_fn)(struct extent_status *es),
 				      ext4_lblk_t lblk, ext4_lblk_t end,
 				      struct extent_status *es);
+ext4_lblk_t ext4_es_skip_hole_extent(struct inode *inode, ext4_lblk_t lblk,
+				     ext4_lblk_t len);
 extern int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
 				 ext4_lblk_t *next_lblk,
 				 struct extent_status *es);
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 65029dfb92fb..84421cecec0b 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -2291,6 +2291,34 @@ TRACE_EVENT(ext4_es_find_extent_range_exit,
 		  __entry->pblk, show_extent_status(__entry->status))
 );
 
+TRACE_EVENT(ext4_es_skip_hole_extent,
+	TP_PROTO(struct inode *inode, ext4_lblk_t lblk,
+		 ext4_lblk_t len, ext4_lblk_t next_lblk),
+
+	TP_ARGS(inode, lblk, len, next_lblk),
+
+	TP_STRUCT__entry(
+		__field(	dev_t,		dev		)
+		__field(	ino_t,		ino		)
+		__field(	ext4_lblk_t,	lblk		)
+		__field(	ext4_lblk_t,	len		)
+		__field(	ext4_lblk_t,	next		)
+	),
+
+	TP_fast_assign(
+		__entry->dev	= inode->i_sb->s_dev;
+		__entry->ino	= inode->i_ino;
+		__entry->lblk	= lblk;
+		__entry->len	= len;
+		__entry->next	= next_lblk;
+	),
+
+	TP_printk("dev %d,%d ino %lu [%u/%u) next_lblk %u",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  (unsigned long) __entry->ino, __entry->lblk,
+		  __entry->len, __entry->next)
+);
+
 TRACE_EVENT(ext4_es_lookup_extent_enter,
 	TP_PROTO(struct inode *inode, ext4_lblk_t lblk),
 
-- 
2.39.2



* [RFC PATCH 2/6] ext4: make ext4_es_lookup_extent() return the next extent if not found
  2023-11-21  9:34 [RFC PATCH 0/6] ext4: make ext4_map_blocks() recognize delayed only extent Zhang Yi
  2023-11-21  9:34 ` [RFC PATCH 1/6] ext4: introduce ext4_es_skip_hole_extent() to skip hole extents Zhang Yi
@ 2023-11-21  9:34 ` Zhang Yi
  2023-11-21  9:34 ` [RFC PATCH 3/6] ext4: correct the hole length returned by ext4_map_blocks() Zhang Yi
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Zhang Yi @ 2023-11-21  9:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: tytso, adilger.kernel, jack, ritesh.list, yi.zhang, yi.zhang,
	chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Make ext4_es_lookup_extent() return the next extent entry if we can't
find the extent that lblk belongs to. It's useful for estimating and
limiting the length of a potential hole in ext4_map_blocks().
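
The "containing extent, else the next one" behaviour can be sketched in
userspace as a lower-bound search over a sorted array. Everything here
(es_model, lookup_or_next(), the has_next output) is a hypothetical model
rather than the kernel interface; has_next in particular is this sketch's
own addition, to make the "nothing at or after lblk" case explicit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t lblk_t;

/* Hypothetical stand-in for a cached extent: [lblk, lblk + len). */
struct es_model {
	lblk_t lblk;
	lblk_t len;
};

/*
 * Look up lblk in a sorted, non-overlapping extent array.
 *
 * Returns true and copies the containing extent into *out when lblk is
 * covered. Otherwise returns false; if some extent starts after lblk, the
 * nearest one is copied into *out and *has_next is set to true. If there
 * is no such extent, *out is left untouched and *has_next is false.
 */
static bool lookup_or_next(const struct es_model *es, size_t n, lblk_t lblk,
			   struct es_model *out, bool *has_next)
{
	size_t lo = 0, hi = n;

	/* Binary search for the first extent whose end is past lblk. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (es[mid].lblk + es[mid].len <= lblk)
			lo = mid + 1;
		else
			hi = mid;
	}

	*has_next = false;
	if (lo == n)
		return false;		/* nothing at or after lblk */

	*out = es[lo];
	if (es[lo].lblk <= lblk)
		return true;		/* lblk lies inside es[lo] */

	*has_next = true;		/* es[lo] starts after lblk */
	return false;
}
```

In this model a caller has to check has_next before reading *out after a
miss, since *out is only written when an extent was actually located.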

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/extents_status.c | 21 ++++++++-------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 1b1b1a8848a8..19a0cc904cd8 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -1012,19 +1012,9 @@ int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
 		goto out;
 	}
 
-	node = tree->root.rb_node;
-	while (node) {
-		es1 = rb_entry(node, struct extent_status, rb_node);
-		if (lblk < es1->es_lblk)
-			node = node->rb_left;
-		else if (lblk > ext4_es_end(es1))
-			node = node->rb_right;
-		else {
-			found = 1;
-			break;
-		}
-	}
-
+	es1 = __es_tree_search(&tree->root, lblk);
+	if (es1 && in_range(lblk, es1->es_lblk, es1->es_len))
+		found = 1;
 out:
 	stats = &EXT4_SB(inode->i_sb)->s_es_stats;
 	if (found) {
@@ -1045,6 +1035,11 @@ int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
 				*next_lblk = 0;
 		}
 	} else {
+		if (es1) {
+			es->es_lblk = es1->es_lblk;
+			es->es_len = es1->es_len;
+			es->es_pblk = es1->es_pblk;
+		}
 		percpu_counter_inc(&stats->es_stats_cache_misses);
 	}
 
-- 
2.39.2



* [RFC PATCH 3/6] ext4: correct the hole length returned by ext4_map_blocks()
  2023-11-21  9:34 [RFC PATCH 0/6] ext4: make ext4_map_blocks() recognize delayed only extent Zhang Yi
  2023-11-21  9:34 ` [RFC PATCH 1/6] ext4: introduce ext4_es_skip_hole_extent() to skip hole extents Zhang Yi
  2023-11-21  9:34 ` [RFC PATCH 2/6] ext4: make ext4_es_lookup_extent() return the next extent if not found Zhang Yi
@ 2023-11-21  9:34 ` Zhang Yi
  2023-12-13 18:21   ` Jan Kara
  2023-11-21  9:34 ` [RFC PATCH 4/6] ext4: add a hole extent entry in cache after punch Zhang Yi
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Zhang Yi @ 2023-11-21  9:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: tytso, adilger.kernel, jack, ritesh.list, yi.zhang, yi.zhang,
	chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

In ext4_map_blocks(), if we can't find a range of mapping in the
extents cache, we call ext4_ext_map_blocks() to search the real
path. But if the tail of the querying range is overlapped by a delayed extent,
we can't find it on the real extent path, so the returned hole length
could be larger than it really is.

      |          querying map          |
      v                                v
      |----------{-------------}{------|----------------}-----...
      ^          ^             ^^                       ^
      | uncached | hole extent ||     delayed extent    |

We have to adjust the mapping length to the next non-hole extent's
lblk before searching the extent path.
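
The adjustment itself is simple arithmetic, shown here as a small userspace
sketch. trim_query_len() is a hypothetical name, and the sketch assumes
next_lblk is not smaller than m_lblk.

```c
#include <stdint.h>

typedef uint32_t lblk_t;

/*
 * Clamp a query of m_len blocks starting at m_lblk so that it stops at
 * next_lblk, the first block of the next non-hole cached extent.
 * Assumes next_lblk >= m_lblk.
 */
static lblk_t trim_query_len(lblk_t m_lblk, lblk_t m_len, lblk_t next_lblk)
{
	lblk_t gap = next_lblk - m_lblk;	/* blocks before that extent */

	return m_len > gap ? gap : m_len;
}
```

For a query of 32 blocks starting at block 0 with a delayed extent starting
at block 20, the length handed to the extent path lookup becomes 20; a
16-block query is left unchanged.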

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/inode.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 4ce35f1c8b0a..94e7b8500878 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -479,6 +479,7 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
 		    struct ext4_map_blocks *map, int flags)
 {
 	struct extent_status es;
+	ext4_lblk_t next;
 	int retval;
 	int ret = 0;
 #ifdef ES_AGGRESSIVE_TEST
@@ -502,8 +503,10 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
 		return -EFSCORRUPTED;
 
 	/* Lookup extent status tree firstly */
-	if (!(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) &&
-	    ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
+	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
+		goto uncached;
+
+	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
 		if (ext4_es_is_written(&es) || ext4_es_is_unwritten(&es)) {
 			map->m_pblk = ext4_es_pblock(&es) +
 					map->m_lblk - es.es_lblk;
@@ -532,6 +535,23 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
 #endif
 		goto found;
 	}
+	/*
+	 * Not found, so this may be a hole. Adjust the map length before
+	 * searching the real extent path; this prevents returning an
+	 * incorrect hole length when the following entries include
+	 * delayed-only ones.
+	 */
+	if (!(flags & EXT4_GET_BLOCKS_CREATE) && es.es_lblk > map->m_lblk) {
+		next = es.es_lblk;
+		if (ext4_es_is_hole(&es))
+			next = ext4_es_skip_hole_extent(inode, map->m_lblk,
+							map->m_len);
+		retval = next - map->m_lblk;
+		if (map->m_len > retval)
+			map->m_len = retval;
+	}
+
+uncached:
 	/*
 	 * In the query cache no-wait mode, nothing we can do more if we
 	 * cannot find extent in the cache.
-- 
2.39.2



* [RFC PATCH 4/6] ext4: add a hole extent entry in cache after punch
  2023-11-21  9:34 [RFC PATCH 0/6] ext4: make ext4_map_blocks() recognize delayed only extent Zhang Yi
                   ` (2 preceding siblings ...)
  2023-11-21  9:34 ` [RFC PATCH 3/6] ext4: correct the hole length returned by ext4_map_blocks() Zhang Yi
@ 2023-11-21  9:34 ` Zhang Yi
  2023-11-21  9:34 ` [RFC PATCH 5/6] ext4: make ext4_map_blocks() distinguish delayed only mapping Zhang Yi
  2023-11-21  9:34 ` [RFC PATCH 6/6] ext4: make ext4_set_iomap() recognize IOMAP_DELALLOC mapping type Zhang Yi
  5 siblings, 0 replies; 11+ messages in thread
From: Zhang Yi @ 2023-11-21  9:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: tytso, adilger.kernel, jack, ritesh.list, yi.zhang, yi.zhang,
	chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

In order to cache hole extents in the extent status tree and keep holes
as contiguous as possible, add a hole entry to the cache after punching a
hole. This reduces the uncached gaps between otherwise contiguous hole
extent entries.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/ext4.h    | 3 +++
 fs/ext4/extents.c | 5 ++---
 fs/ext4/inode.c   | 2 ++
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 9418359b1d9d..c2ca28c6ec38 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3681,6 +3681,9 @@ extern int ext4_convert_unwritten_io_end_vec(handle_t *handle,
 					     ext4_io_end_t *io_end);
 extern int ext4_map_blocks(handle_t *handle, struct inode *inode,
 			   struct ext4_map_blocks *map, int flags);
+extern void ext4_ext_put_gap_in_cache(struct inode *inode,
+				      ext4_lblk_t hole_start,
+				      ext4_lblk_t hole_len);
 extern int ext4_ext_calc_credits_for_single_extent(struct inode *inode,
 						   int num,
 						   struct ext4_ext_path *path);
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 202c76996b62..52bad225e3c8 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2275,9 +2275,8 @@ static ext4_lblk_t ext4_ext_determine_hole(struct inode *inode,
  * calculate boundaries of the gap that the requested block fits into
  * and cache this gap
  */
-static void
-ext4_ext_put_gap_in_cache(struct inode *inode, ext4_lblk_t hole_start,
-			  ext4_lblk_t hole_len)
+void ext4_ext_put_gap_in_cache(struct inode *inode, ext4_lblk_t hole_start,
+			       ext4_lblk_t hole_len)
 {
 	struct extent_status es;
 
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 94e7b8500878..3908ce7f6fb8 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4034,6 +4034,8 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length)
 			ret = ext4_ind_remove_space(handle, inode, first_block,
 						    stop_block);
 
+		ext4_ext_put_gap_in_cache(inode, first_block,
+					  stop_block - first_block);
 		up_write(&EXT4_I(inode)->i_data_sem);
 	}
 	ext4_fc_track_range(handle, inode, first_block, stop_block);
-- 
2.39.2



* [RFC PATCH 5/6] ext4: make ext4_map_blocks() distinguish delayed only mapping
  2023-11-21  9:34 [RFC PATCH 0/6] ext4: make ext4_map_blocks() recognize delayed only extent Zhang Yi
                   ` (3 preceding siblings ...)
  2023-11-21  9:34 ` [RFC PATCH 4/6] ext4: add a hole extent entry in cache after punch Zhang Yi
@ 2023-11-21  9:34 ` Zhang Yi
  2023-11-21  9:34 ` [RFC PATCH 6/6] ext4: make ext4_set_iomap() recognize IOMAP_DELALLOC mapping type Zhang Yi
  5 siblings, 0 replies; 11+ messages in thread
From: Zhang Yi @ 2023-11-21  9:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: tytso, adilger.kernel, jack, ritesh.list, yi.zhang, yi.zhang,
	chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Add a new map flag, EXT4_MAP_DELAYED, to indicate that the mapping range is
delayed-allocated only (not unwritten), and make ext4_map_blocks() able to
distinguish it instead of mixing it up with holes.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/ext4.h  | 4 +++-
 fs/ext4/inode.c | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index c2ca28c6ec38..b5026090ad6f 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -252,8 +252,10 @@ struct ext4_allocation_request {
 #define EXT4_MAP_MAPPED		BIT(BH_Mapped)
 #define EXT4_MAP_UNWRITTEN	BIT(BH_Unwritten)
 #define EXT4_MAP_BOUNDARY	BIT(BH_Boundary)
+#define EXT4_MAP_DELAYED	BIT(BH_Delay)
 #define EXT4_MAP_FLAGS		(EXT4_MAP_NEW | EXT4_MAP_MAPPED |\
-				 EXT4_MAP_UNWRITTEN | EXT4_MAP_BOUNDARY)
+				 EXT4_MAP_UNWRITTEN | EXT4_MAP_BOUNDARY |\
+				 EXT4_MAP_DELAYED)
 
 struct ext4_map_blocks {
 	ext4_fsblk_t m_pblk;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 3908ce7f6fb8..74b41566d31a 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -518,6 +518,8 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
 			map->m_len = retval;
 		} else if (ext4_es_is_delayed(&es) || ext4_es_is_hole(&es)) {
 			map->m_pblk = 0;
+			map->m_flags |= ext4_es_is_delayed(&es) ?
+					EXT4_MAP_DELAYED : 0;
 			retval = es.es_len - (map->m_lblk - es.es_lblk);
 			if (retval > map->m_len)
 				retval = map->m_len;
-- 
2.39.2



* [RFC PATCH 6/6] ext4: make ext4_set_iomap() recognize IOMAP_DELALLOC mapping type
  2023-11-21  9:34 [RFC PATCH 0/6] ext4: make ext4_map_blocks() recognize delayed only extent Zhang Yi
                   ` (4 preceding siblings ...)
  2023-11-21  9:34 ` [RFC PATCH 5/6] ext4: make ext4_map_blocks() distinguish delayed only mapping Zhang Yi
@ 2023-11-21  9:34 ` Zhang Yi
  5 siblings, 0 replies; 11+ messages in thread
From: Zhang Yi @ 2023-11-21  9:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: tytso, adilger.kernel, jack, ritesh.list, yi.zhang, yi.zhang,
	chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Since ext4_map_blocks() can now recognize a delayed-allocated-only extent,
make ext4_set_iomap() recognize it as well, and remove the now-useless
separate check in ext4_iomap_begin_report().
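
The effect on the consumer side can be sketched as a flag-to-type
classification in userspace. The MAP_* bit values, the map_kind enum and
classify_mapping() are hypothetical; in the kernel the flag bits come from
the BH_* buffer state bits, and this sketch only assumes that a mapping
with none of the flags set is reported as a hole.

```c
/* Hypothetical bit values; not the kernel's EXT4_MAP_* definitions. */
#define MAP_MAPPED	(1u << 0)
#define MAP_UNWRITTEN	(1u << 1)
#define MAP_DELAYED	(1u << 2)

enum map_kind { KIND_HOLE, KIND_DELAYED, KIND_UNWRITTEN, KIND_MAPPED };

/*
 * Classify a mapping from its flags. Without a delayed bit, a
 * delayed-only range carries no flags at all and cannot be told apart
 * from a hole; with the bit it gets its own kind.
 */
static enum map_kind classify_mapping(unsigned int m_flags)
{
	if (m_flags & MAP_UNWRITTEN)
		return KIND_UNWRITTEN;
	if (m_flags & MAP_MAPPED)
		return KIND_MAPPED;
	if (m_flags & MAP_DELAYED)
		return KIND_DELAYED;
	return KIND_HOLE;
}
```

Carrying the state in the flags means the caller no longer needs a second
lookup to find out whether a "hole" is really delayed allocation.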

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/inode.c | 32 +++-----------------------------
 1 file changed, 3 insertions(+), 29 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 74b41566d31a..17fe2bd83617 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3279,6 +3279,9 @@ static void ext4_set_iomap(struct inode *inode, struct iomap *iomap,
 		iomap->addr = (u64) map->m_pblk << blkbits;
 		if (flags & IOMAP_DAX)
 			iomap->addr += EXT4_SB(inode->i_sb)->s_dax_part_off;
+	} else if (map->m_flags & EXT4_MAP_DELAYED) {
+		iomap->type = IOMAP_DELALLOC;
+		iomap->addr = IOMAP_NULL_ADDR;
 	} else {
 		iomap->type = IOMAP_HOLE;
 		iomap->addr = IOMAP_NULL_ADDR;
@@ -3441,35 +3444,11 @@ const struct iomap_ops ext4_iomap_overwrite_ops = {
 	.iomap_end		= ext4_iomap_end,
 };
 
-static bool ext4_iomap_is_delalloc(struct inode *inode,
-				   struct ext4_map_blocks *map)
-{
-	struct extent_status es;
-	ext4_lblk_t offset = 0, end = map->m_lblk + map->m_len - 1;
-
-	ext4_es_find_extent_range(inode, &ext4_es_is_delayed,
-				  map->m_lblk, end, &es);
-
-	if (!es.es_len || es.es_lblk > end)
-		return false;
-
-	if (es.es_lblk > map->m_lblk) {
-		map->m_len = es.es_lblk - map->m_lblk;
-		return false;
-	}
-
-	offset = map->m_lblk - es.es_lblk;
-	map->m_len = es.es_len - offset;
-
-	return true;
-}
-
 static int ext4_iomap_begin_report(struct inode *inode, loff_t offset,
 				   loff_t length, unsigned int flags,
 				   struct iomap *iomap, struct iomap *srcmap)
 {
 	int ret;
-	bool delalloc = false;
 	struct ext4_map_blocks map;
 	u8 blkbits = inode->i_blkbits;
 
@@ -3510,13 +3489,8 @@ static int ext4_iomap_begin_report(struct inode *inode, loff_t offset,
 	ret = ext4_map_blocks(NULL, inode, &map, 0);
 	if (ret < 0)
 		return ret;
-	if (ret == 0)
-		delalloc = ext4_iomap_is_delalloc(inode, &map);
-
 set_iomap:
 	ext4_set_iomap(inode, iomap, &map, offset, length, flags);
-	if (delalloc && iomap->type == IOMAP_HOLE)
-		iomap->type = IOMAP_DELALLOC;
 
 	return 0;
 }
-- 
2.39.2



* Re: [RFC PATCH 3/6] ext4: correct the hole length returned by ext4_map_blocks()
  2023-11-21  9:34 ` [RFC PATCH 3/6] ext4: correct the hole length returned by ext4_map_blocks() Zhang Yi
@ 2023-12-13 18:21   ` Jan Kara
  2023-12-14  9:18     ` Zhang Yi
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Kara @ 2023-12-13 18:21 UTC (permalink / raw)
  To: Zhang Yi
  Cc: linux-ext4, tytso, adilger.kernel, jack, ritesh.list, yi.zhang,
	chengzhihao1, yukuai3

On Tue 21-11-23 17:34:26, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> In ext4_map_blocks(), if we can't find a range of mapping in the
> extents cache, we call ext4_ext_map_blocks() to search the real
> path. But if the tail of the querying range is overlapped by a delayed extent,
> we can't find it on the real extent path, so the returned hole length
> could be larger than it really is.
> 
>       |          querying map          |
>       v                                v
>       |----------{-------------}{------|----------------}-----...
>       ^          ^             ^^                       ^
>       | uncached | hole extent ||     delayed extent    |
> 
> We have to adjust the mapping length to the next non-hole extent's
> lblk before searching the extent path.
> 
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

So I agree the ext4_ext_determine_hole() does return a hole that does not
reflect possible delalloc extent (it doesn't even need to be straddling the
end of looked up range, does it?). But ext4_ext_put_gap_in_cache() does
actually properly trim the hole length in the status tree so I think the
problem rather is that the trimming should happen in
ext4_ext_determine_hole() instead of ext4_ext_put_gap_in_cache() and that
will also make ext4_map_blocks() return proper hole length? And then
there's no need for this special handling? Or am I missing something?

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [RFC PATCH 3/6] ext4: correct the hole length returned by ext4_map_blocks()
  2023-12-13 18:21   ` Jan Kara
@ 2023-12-14  9:18     ` Zhang Yi
  2023-12-14 14:31       ` Jan Kara
  0 siblings, 1 reply; 11+ messages in thread
From: Zhang Yi @ 2023-12-14  9:18 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-ext4, tytso, adilger.kernel, ritesh.list, yi.zhang,
	chengzhihao1, yukuai3

On 2023/12/14 2:21, Jan Kara wrote:
> On Tue 21-11-23 17:34:26, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@huawei.com>
>>
>> In ext4_map_blocks(), if we can't find a range of mapping in the
>> extents cache, we call ext4_ext_map_blocks() to search the real
>> path. But if the tail of the querying range is overlapped by a delayed extent,
>> we can't find it on the real extent path, so the returned hole length
>> could be larger than it really is.
>>
>>       |          querying map          |
>>       v                                v
>>       |----------{-------------}{------|----------------}-----...
>>       ^          ^             ^^                       ^
>>       | uncached | hole extent ||     delayed extent    |
>>
>> We have to adjust the mapping length to the next non-hole extent's
>> lblk before searching the extent path.
>>
>> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> 
> So I agree the ext4_ext_determine_hole() does return a hole that does not
> reflect possible delalloc extent (it doesn't even need to be straddling the
> end of looked up range, does it?). But ext4_ext_put_gap_in_cache() does

Yeah.

> actually properly trim the hole length in the status tree so I think the
> problem rather is that the trimming should happen in
> ext4_ext_determine_hole() instead of ext4_ext_put_gap_in_cache() and that
> will also make ext4_map_blocks() return proper hole length? And then
> there's no need for this special handling? Or am I missing something?
> 

Thanks for your suggestions. Yeah, we can trim the hole length in
ext4_ext_determine_hole(), but I'm a little uneasy about a race condition.
ext4_da_map_blocks() only holds the inode lock and the i_data_sem read lock
while inserting delayed extents, and not all query paths of ext4_map_blocks()
hold the inode lock. I guess the hole/delayed range could be raced by another
new delayed allocation and change after our first check in ext4_map_blocks();
the querying range could then be overlapped and become entirely or partially
delayed, so we would also need to recheck the map type here in case the first
block of the query has become delayed, right?

Thanks,
Yi.




* Re: [RFC PATCH 3/6] ext4: correct the hole length returned by ext4_map_blocks()
  2023-12-14  9:18     ` Zhang Yi
@ 2023-12-14 14:31       ` Jan Kara
  2023-12-15  4:36         ` Zhang Yi
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Kara @ 2023-12-14 14:31 UTC (permalink / raw)
  To: Zhang Yi
  Cc: Jan Kara, linux-ext4, tytso, adilger.kernel, ritesh.list,
	yi.zhang, chengzhihao1, yukuai3

On Thu 14-12-23 17:18:45, Zhang Yi wrote:
> On 2023/12/14 2:21, Jan Kara wrote:
> > On Tue 21-11-23 17:34:26, Zhang Yi wrote:
> >> From: Zhang Yi <yi.zhang@huawei.com>
> >>
> >> In ext4_map_blocks(), if we can't find a range of mapping in the
> >> extents cache, we call ext4_ext_map_blocks() to search the real
> >> path. But if the tail of the querying range is overlapped by a delayed extent,
> >> we can't find it on the real extent path, so the returned hole length
> >> could be larger than it really is.
> >>
> >>       |          querying map          |
> >>       v                                v
> >>       |----------{-------------}{------|----------------}-----...
> >>       ^          ^             ^^                       ^
> >>       | uncached | hole extent ||     delayed extent    |
> >>
> >> We have to adjust the mapping length to the next non-hole extent's
> >> lblk before searching the extent path.
> >>
> >> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> > 
> > So I agree the ext4_ext_determine_hole() does return a hole that does not
> > reflect possible delalloc extent (it doesn't even need to be straddling the
> > end of looked up range, does it?). But ext4_ext_put_gap_in_cache() does
> 
> Yeah.
> 
> > actually properly trim the hole length in the status tree so I think the
> > problem rather is that the trimming should happen in
> > ext4_ext_determine_hole() instead of ext4_ext_put_gap_in_cache() and that
> > will also make ext4_map_blocks() return proper hole length? And then
> > there's no need for this special handling? Or am I missing something?
> > 
> 
> Thanks for your suggestions. Yeah, we can trim the hole length in
> ext4_ext_determine_hole(), but I'm a little uneasy about a race condition.
> ext4_da_map_blocks() only holds the inode lock and the i_data_sem read lock
> while inserting delayed extents, and not all query paths of ext4_map_blocks()
> hold the inode lock.

That is a good point! I think something like following could happen already
now:

Suppose we have a file 8192 bytes large containing just a hole.

Task1					Task2
pread(f, buf, 4096, 0)			pwrite(f, buf, 4096, 4096)
  filemap_read()
    filemap_get_pages()
      filemap_create_folio()
        filemap_read_folio()
          ext4_mpage_readpages()
            ext4_map_blocks()
	      down_read(&EXT4_I(inode)->i_data_sem);
              ext4_ext_map_blocks()
		- finds hole 0..8192
	        ext4_ext_put_gap_in_cache()
		  ext4_es_find_extent_range()
		    - finds no delalloc extent
					  ext4_da_write_begin()
					    ext4_da_get_block_prep()
					      ext4_da_map_blocks()
					        down_read(&EXT4_I(inode)->i_data_sem);
					        ext4_ext_map_blocks()
						  - nothing found
						ext4_insert_delayed_block()
						  - inserts delalloc extent
						    to 4096-8192
		  ext4_es_insert_extent()
		    - inserts 0..8192 a hole overwriting delalloc extent

> I guess the hole/delayed range could be raced by another new delayed
> allocation and change after our first check in ext4_map_blocks(); the
> querying range could then be overlapped and become entirely or partially
> delayed, so we would also need to recheck the map type here in case the
> first block of the query has become delayed, right?

I don't think you can fix this just by rechecking. I think we need to
hold i_data_sem in exclusive mode when inserting delalloc extents. Because
that operation is in fact changing state of allocation tree (although not
on disk yet). And that will fix this race because holding i_data_sem shared
is then enough so that delalloc state cannot change.

Please do this as a separate patch because this will need to be backported
to stable tree. Thanks!

								Honza

> >> ---
> >>  fs/ext4/inode.c | 24 ++++++++++++++++++++++--
> >>  1 file changed, 22 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> >> index 4ce35f1c8b0a..94e7b8500878 100644
> >> --- a/fs/ext4/inode.c
> >> +++ b/fs/ext4/inode.c
> >> @@ -479,6 +479,7 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
> >>  		    struct ext4_map_blocks *map, int flags)
> >>  {
> >>  	struct extent_status es;
> >> +	ext4_lblk_t next;
> >>  	int retval;
> >>  	int ret = 0;
> >>  #ifdef ES_AGGRESSIVE_TEST
> >> @@ -502,8 +503,10 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
> >>  		return -EFSCORRUPTED;
> >>  
> >>  	/* Lookup extent status tree firstly */
> >> -	if (!(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) &&
> >> -	    ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
> >> +	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
> >> +		goto uncached;
> >> +
> >> +	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
> >>  		if (ext4_es_is_written(&es) || ext4_es_is_unwritten(&es)) {
> >>  			map->m_pblk = ext4_es_pblock(&es) +
> >>  					map->m_lblk - es.es_lblk;
> >> @@ -532,6 +535,23 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
> >>  #endif
> >>  		goto found;
> >>  	}
> >> +	/*
> >> +	 * Not found, maybe a hole, need to adjust the map length before
> >> +	 * searching the real extent path. It can prevent incorrect hole
> >> +	 * length returned if the following entries have delayed only
> >> +	 * ones.
> >> +	 */
> >> +	if (!(flags & EXT4_GET_BLOCKS_CREATE) && es.es_lblk > map->m_lblk) {
> >> +		next = es.es_lblk;
> >> +		if (ext4_es_is_hole(&es))
> >> +			next = ext4_es_skip_hole_extent(inode, map->m_lblk,
> >> +							map->m_len);
> >> +		retval = next - map->m_lblk;
> >> +		if (map->m_len > retval)
> >> +			map->m_len = retval;
> >> +	}
> >> +
> >> +uncached:
> >>  	/*
> >>  	 * In the query cache no-wait mode, nothing we can do more if we
> >>  	 * cannot find extent in the cache.
> >> -- 
> >> 2.39.2
> >>
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [RFC PATCH 3/6] ext4: correct the hole length returned by ext4_map_blocks()
  2023-12-14 14:31       ` Jan Kara
@ 2023-12-15  4:36         ` Zhang Yi
  0 siblings, 0 replies; 11+ messages in thread
From: Zhang Yi @ 2023-12-15  4:36 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-ext4, tytso, adilger.kernel, ritesh.list, yi.zhang,
	chengzhihao1, yukuai3

On 2023/12/14 22:31, Jan Kara wrote:
> On Thu 14-12-23 17:18:45, Zhang Yi wrote:
>> On 2023/12/14 2:21, Jan Kara wrote:
>>> On Tue 21-11-23 17:34:26, Zhang Yi wrote:
>>>> From: Zhang Yi <yi.zhang@huawei.com>
>>>>
>>>> In ext4_map_blocks(), if we can't find a range of mapping in the
>>>> extents cache, we are calling ext4_ext_map_blocks() to search the real
>>>> path. But if the querying range was tail overlapped by a delayed extent,
>>>> we can't find it on the real extent path, so the returned hole length
>>>> could be larger than it really is.
>>>>
>>>>       |          querying map          |
>>>>       v                                v
>>>>       |----------{-------------}{------|----------------}-----...
>>>>       ^          ^             ^^                       ^
>>>>       | uncached | hole extent ||     delayed extent    |
>>>>
>>>> We have to adjust the mapping length to the next not hole extent's
>>>> lblk before searching the extent path.
>>>>
>>>> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
>>>
>>> So I agree the ext4_ext_determine_hole() does return a hole that does not
>>> reflect possible delalloc extent (it doesn't even need to be straddling the
>>> end of looked up range, does it?). But ext4_ext_put_gap_in_cache() does
>>
>> Yeah.
>>
>>> actually properly trim the hole length in the status tree so I think the
>>> problem rather is that the trimming should happen in
>>> ext4_ext_determine_hole() instead of ext4_ext_put_gap_in_cache() and that
>>> will also make ext4_map_blocks() return proper hole length? And then
>>> there's no need for this special handling? Or am I missing something?
>>>
>>
>> Thanks for your suggestions. Yeah, we can trim the hole length in
>> ext4_ext_determine_hole(), but I'm a little uneasy about the race condition.
>> ext4_da_map_blocks() only hold inode lock and i_data_sem read lock while
>> inserting delay extents, and not all query path of ext4_map_blocks() hold
>> inode lock.
> 
> That is a good point! I think something like the following could already
> happen now:
> 
> Suppose we have a file 8192 bytes large containing just a hole.
> 
> Task1					Task2
> pread(f, buf, 4096, 0)			pwrite(f, buf, 4096, 4096)
>   filemap_read()
>     filemap_get_pages()
>       filemap_create_folio()
>         filemap_read_folio()
>           ext4_mpage_readpages()
>             ext4_map_blocks()
> 	      down_read(&EXT4_I(inode)->i_data_sem);
>               ext4_ext_map_blocks()
> 		- finds hole 0..8192
> 	        ext4_ext_put_gap_in_cache()
> 		  ext4_es_find_extent_range()
> 		    - finds no delalloc extent
> 					  ext4_da_write_begin()
> 					    ext4_da_get_block_prep()
> 					      ext4_da_map_blocks()
> 					        down_read(&EXT4_I(inode)->i_data_sem);
> 					        ext4_ext_map_blocks()
> 						  - nothing found
> 						ext4_insert_delayed_block()
> 						  - inserts delalloc extent
> 						    to 4096-8192
> 		  ext4_es_insert_extent()
> 		    - inserts 0..8192 a hole overwriting delalloc extent
> 
>> I guess the hole/delayed range could be raced by another new
>> delay allocation and changed after we first check in ext4_map_blocks(),
>> the querying range could be overlapped and become all or partially delayed,
>> so we also need to recheck the map type here if the start querying block
>> has become delayed, right?
> 
> I don't think you can fix this just by rechecking. I think we need to
> hold i_data_sem in exclusive mode when inserting delalloc extents. Because
> that operation is in fact changing state of allocation tree (although not
> on disk yet). And that will fix this race because holding i_data_sem shared
> is then enough so that delalloc state cannot change.
> 
> Please do this as a separate patch because this will need to be backported
> to stable tree. Thanks!
> 

Thanks for the insightful graph, I totally agree with you. For now, the missing
delayed extents could lead to inaccurate space reservation and perhaps some
other potential problems. I will send a separate patch to fix this
long-standing issue.

Thanks,
Yi.

> 
>>>> ---
>>>>  fs/ext4/inode.c | 24 ++++++++++++++++++++++--
>>>>  1 file changed, 22 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>>>> index 4ce35f1c8b0a..94e7b8500878 100644
>>>> --- a/fs/ext4/inode.c
>>>> +++ b/fs/ext4/inode.c
>>>> @@ -479,6 +479,7 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
>>>>  		    struct ext4_map_blocks *map, int flags)
>>>>  {
>>>>  	struct extent_status es;
>>>> +	ext4_lblk_t next;
>>>>  	int retval;
>>>>  	int ret = 0;
>>>>  #ifdef ES_AGGRESSIVE_TEST
>>>> @@ -502,8 +503,10 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
>>>>  		return -EFSCORRUPTED;
>>>>  
>>>>  	/* Lookup extent status tree firstly */
>>>> -	if (!(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) &&
>>>> -	    ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
>>>> +	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
>>>> +		goto uncached;
>>>> +
>>>> +	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
>>>>  		if (ext4_es_is_written(&es) || ext4_es_is_unwritten(&es)) {
>>>>  			map->m_pblk = ext4_es_pblock(&es) +
>>>>  					map->m_lblk - es.es_lblk;
>>>> @@ -532,6 +535,23 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
>>>>  #endif
>>>>  		goto found;
>>>>  	}
>>>> +	/*
>>>> +	 * Not found, maybe a hole, need to adjust the map length before
>>>> +	 * searching the real extent path. It can prevent incorrect hole
>>>> +	 * length returned if the following entries have delayed only
>>>> +	 * ones.
>>>> +	 */
>>>> +	if (!(flags & EXT4_GET_BLOCKS_CREATE) && es.es_lblk > map->m_lblk) {
>>>> +		next = es.es_lblk;
>>>> +		if (ext4_es_is_hole(&es))
>>>> +			next = ext4_es_skip_hole_extent(inode, map->m_lblk,
>>>> +							map->m_len);
>>>> +		retval = next - map->m_lblk;
>>>> +		if (map->m_len > retval)
>>>> +			map->m_len = retval;
>>>> +	}
>>>> +
>>>> +uncached:
>>>>  	/*
>>>>  	 * In the query cache no-wait mode, nothing we can do more if we
>>>>  	 * cannot find extent in the cache.
>>>> -- 
>>>> 2.39.2
>>>>
>>

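For reference, the length adjustment in the quoted hunk boils down to a clamp
against the start of the next non-hole entry. A minimal userspace sketch,
assuming plain 32-bit block numbers; trim_hole_len() and lblk_t are
hypothetical names used only for illustration, not ext4 symbols:

```c
#include <stdint.h>

typedef uint32_t lblk_t;	/* stand-in for ext4_lblk_t */

/*
 * Clamp a queried length so the range [m_lblk, m_lblk + m_len) stops at
 * 'next', the first logical block of the next non-hole entry. If 'next'
 * does not lie beyond m_lblk, the length is left unchanged, mirroring the
 * "es.es_lblk > map->m_lblk" condition in the hunk.
 */
static uint32_t trim_hole_len(lblk_t m_lblk, uint32_t m_len, lblk_t next)
{
	if (next > m_lblk) {
		uint32_t gap = next - m_lblk;

		if (m_len > gap)
			m_len = gap;
	}
	return m_len;
}
```

For example, a query for 20 blocks starting at block 10 with a delayed extent
starting at block 18 is trimmed to 8 blocks, while a 5-block query from the
same start is left alone.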


Thread overview: 11+ messages
2023-11-21  9:34 [RFC PATCH 0/6] ext4: make ext4_map_blocks() recognize delayed only extent Zhang Yi
2023-11-21  9:34 ` [RFC PATCH 1/6] ext4: introduce ext4_es_skip_hole_extent() to skip hole extents Zhang Yi
2023-11-21  9:34 ` [RFC PATCH 2/6] ext4: make ext4_es_lookup_extent() return the next extent if not found Zhang Yi
2023-11-21  9:34 ` [RFC PATCH 3/6] ext4: correct the hole length returned by ext4_map_blocks() Zhang Yi
2023-12-13 18:21   ` Jan Kara
2023-12-14  9:18     ` Zhang Yi
2023-12-14 14:31       ` Jan Kara
2023-12-15  4:36         ` Zhang Yi
2023-11-21  9:34 ` [RFC PATCH 4/6] ext4: add a hole extent entry in cache after punch Zhang Yi
2023-11-21  9:34 ` [RFC PATCH 5/6] ext4: make ext4_map_blocks() distinguish delayed only mapping Zhang Yi
2023-11-21  9:34 ` [RFC PATCH 6/6] ext4: make ext4_set_iomap() recognize IOMAP_DELALLOC mapping type Zhang Yi
