public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] btrfs: always return the largest hole possible for btrfs_get_extent()
@ 2025-11-23 23:32 Qu Wenruo
  2025-11-23 23:32 ` [PATCH 1/3] btrfs: return the largest hole between two file extent items Qu Wenruo
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Qu Wenruo @ 2025-11-23 23:32 UTC (permalink / raw)
  To: linux-btrfs

When looking at the call sites of btrfs_get_extent(), I didn't really
see the need for the @len parameter, as normally btrfs_get_extent()
just returns the hole or the regular file extent covering @start.

And it turns out that @len is not really used that much inside
btrfs_get_extent().

If we find a regular/inline/prealloc file extent, we just return the
full extent map from that file extent.

It's only for implied holes (aka, NO-HOLES feature where there is no
explicit file extent item for a hole) that @len makes a difference.

But in that case, we can simply return the largest hole possible (either
the range between two file extents, or for beyond EOF cases return hole
covering the largest possible file size).

Patch 1 removes a hole size truncation, which in theory can benefit
readahead on a large hole (no extra tree search for the same hole again
and again).

Patch 2 is a refactor to remove a weird code pattern.

Patch 3 makes the beyond-EOF cases return the largest hole possible
(covering the max file size), so that we no longer rely on @len for
hole extent map creation.

Qu Wenruo (3):
  btrfs: return the largest hole between two file extent items
  btrfs: refactor hole cases of btrfs_get_extent()
  btrfs: return the largest possible hole for EOF cases

 fs/btrfs/inode.c | 66 +++++++++++++++++++++++++++++++++---------------
 1 file changed, 45 insertions(+), 21 deletions(-)

-- 
2.52.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/3] btrfs: return the largest hole between two file extent items
  2025-11-23 23:32 [PATCH 0/3] btrfs: always return the largest hole possible for btrfs_get_extent() Qu Wenruo
@ 2025-11-23 23:32 ` Qu Wenruo
  2025-11-24 12:59   ` Filipe Manana
  2025-11-23 23:32 ` [PATCH 2/3] btrfs: refactor hole cases of btrfs_get_extent() Qu Wenruo
  2025-11-23 23:32 ` [PATCH 3/3] btrfs: return the largest possible hole for EOF cases Qu Wenruo
  2 siblings, 1 reply; 6+ messages in thread
From: Qu Wenruo @ 2025-11-23 23:32 UTC (permalink / raw)
  To: linux-btrfs

[CORNER CASE]
If we have the following file extents layout, btrfs_get_extent() can
return a smaller hole and cause unnecessary extra tree search:

	item 6 key (257 EXTENT_DATA 0) itemoff 15810 itemsize 53
		generation 9 type 1 (regular)
		extent data disk byte 13631488 nr 4096
		extent data offset 0 nr 4096 ram 4096
		extent compression 0 (none)

	item 7 key (257 EXTENT_DATA 32768) itemoff 15757 itemsize 53
		generation 9 type 1 (regular)
		extent data disk byte 13635584 nr 4096
		extent data offset 0 nr 4096 ram 4096
		extent compression 0 (none)

In the above case, ranges [0, 4K) and [32K, 36K) are regular extents,
there is a hole in range [4K, 32K), and the fs has the "no-holes"
feature, meaning the hole will not have a file extent item.

[INEFFICIENCY]
Assume the system has a 4K page size, and we're doing readahead for
range [4K, 32K), with no large folios yet.

 btrfs_readahead() for range [4K, 32K)
 |- btrfs_do_readpage() for folio 4K
 |  |- get_extent_map() for range [4K, 8K)
 |     |- btrfs_get_extent() for range [4K, 8K)
 |        We hit item 6, then move on to the next item 7.
 |        At this stage we know range [4K, 32K) is a hole.
 |        But our search range is only [4K, 8K), not reaching 32K, thus
 |        we go to the not_found: tag, returning a hole em for [4K, 8K).
 |
 |- btrfs_do_readpage() for folio 8K
 |  |- get_extent_map() for range [8K, 12K)
 |     |- btrfs_get_extent() for range [8K, 12K)
 |        We hit the same item 6, and then item 7.
 |        But we still go to the not_found: tag, inserting a new hole
 |        em, which will be merged with the previous one.
 |
 | [ Repeat the same btrfs_get_extent() calls until the end ]

So we're calling btrfs_get_extent() again and again, just for
different parts of the same hole range [4K, 32K).

[ENHANCEMENT]
The problem is inside the next: tag, which we reach when the next file
extent item we find starts beyond our search range start.

But there is no need to fall back to the not_found: tag there, since we
already know there is a larger hole covering [start, found_key.offset).

By removing the check of (start + len) against (found_key.offset), we
improve the above read loop to:

 btrfs_readahead() for range [4K, 32K)
 |- btrfs_do_readpage() for folio 4K
 |  |- get_extent_map() for range [4K, 8K)
 |     |- btrfs_get_extent() for range [4K, 8K)
 |        We hit item 6, then for the next item 7.
 |        At this stage we know range [4K, 32K) is a hole.
 |        So the hole em for range [4K, 32K) is returned.
 |
 |- btrfs_do_readpage() for folio 8K
 |  |- get_extent_map() for range [8K, 12K)
 |     The cached hole em range [4K, 32K) covers the range,
 |     and reuse that em.
 |
 | [ Repeat the same btrfs_get_extent() calls until the end ]

Now we only call btrfs_get_extent() once for the whole range [4K, 32K),
instead of once per 4K folio as before.

Although, again, I do not expect much difference in real world
performance.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/inode.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 3cf30abcdb08..3a76cea1d43d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7181,8 +7181,6 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 		if (found_key.objectid != objectid ||
 		    found_key.type != BTRFS_EXTENT_DATA_KEY)
 			goto not_found;
-		if (start + len <= found_key.offset)
-			goto not_found;
 		if (start > found_key.offset)
 			goto next;
 
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/3] btrfs: refactor hole cases of btrfs_get_extent()
  2025-11-23 23:32 [PATCH 0/3] btrfs: always return the largest hole possible for btrfs_get_extent() Qu Wenruo
  2025-11-23 23:32 ` [PATCH 1/3] btrfs: return the largest hole between two file extent items Qu Wenruo
@ 2025-11-23 23:32 ` Qu Wenruo
  2025-11-23 23:32 ` [PATCH 3/3] btrfs: return the largest possible hole for EOF cases Qu Wenruo
  2 siblings, 0 replies; 6+ messages in thread
From: Qu Wenruo @ 2025-11-23 23:32 UTC (permalink / raw)
  To: linux-btrfs

There are several cases where btrfs_get_extent() returns a hole:

- A special corner case where path->slots[0] == 0 and there is no exact
  match
  This is the special case where our key would be the lowest key of the
  whole tree, which means there is not even an inode item.

  But for btrfs_get_extent() it just means the whole inode range should
  be a hole.

- EOF cases
  Those are where there are no more items, or no more file extent items
  for the same inode.

  Thus the range must be a hole.

- Hole without a hole file extent item
  This is the NO-HOLES feature, thus the range between the previous file
  extent and the current file extent should be a hole.

Furthermore there is a very weird code block inside that function:

 	btrfs_extent_item_to_extent_map(...);
	if (...)
		goto insert;
	else if (...) {
		/* Do something. */
		goto insert;
	}
not_found:
	/* Do something */
insert:
	/* The remaining logic */

This makes the whole code flow after btrfs_extent_item_to_extent_map()
hard to follow.

Refactor the function btrfs_get_extent() by:

- Add comments about the above hole cases

- Introduce a helper, set_hole_em(), to set the hole extent map

- Remove not_found: tag

- Remove unnecessary "goto insert;" calls
  The "goto insert;" after we get a regular extent map exists just to
  skip setting the extent map to a hole.
  Since every explicit hole case now calls set_hole_em() and then
  inserts the hole, there is no need for the explicit "goto insert;".

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/inode.c | 49 +++++++++++++++++++++++++++++-------------------
 1 file changed, 30 insertions(+), 19 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 3a76cea1d43d..2e0dc82c4f17 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7045,6 +7045,13 @@ static int read_inline_extent(struct btrfs_path *path, struct folio *folio)
 	return 0;
 }
 
+static void set_hole_em(struct extent_map *em, u64 start, u64 len)
+{
+	em->start = start;
+	em->len = len;
+	em->disk_bytenr = EXTENT_MAP_HOLE;
+}
+
 /*
  * Lookup the first extent overlapping a range in a file.
  *
@@ -7123,8 +7130,16 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 	if (ret < 0) {
 		goto out;
 	} else if (ret > 0) {
-		if (path->slots[0] == 0)
-			goto not_found;
+		if (path->slots[0] == 0) {
+			/*
+			 * The rare case where we're already the first key
+			 * of the whole tree.
+			 * This means even no inode item for the inode.
+			 * Thus the whole range should be a hole.
+			 */
+			set_hole_em(em, start, len);
+			goto insert;
+		}
 		path->slots[0]--;
 		ret = 0;
 	}
@@ -7172,31 +7187,32 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 			ret = btrfs_next_leaf(root, path);
 			if (ret < 0)
 				goto out;
-			else if (ret > 0)
-				goto not_found;
+			if (ret > 0) {
+				/* EOF, thus a hole. */
+				set_hole_em(em, start, len);
+				goto insert;
+			}
 
 			leaf = path->nodes[0];
 		}
 		btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
 		if (found_key.objectid != objectid ||
-		    found_key.type != BTRFS_EXTENT_DATA_KEY)
-			goto not_found;
+		    found_key.type != BTRFS_EXTENT_DATA_KEY) {
+			/* EOF, thus a hole. */
+			set_hole_em(em, start, len);
+			goto insert;
+		}
 		if (start > found_key.offset)
 			goto next;
 
-		/* New extent overlaps with existing one */
-		em->start = start;
-		em->len = found_key.offset - start;
-		em->disk_bytenr = EXTENT_MAP_HOLE;
+		/* The range [start, found_key.offset) is a hole. */
+		set_hole_em(em, start, found_key.offset - start);
 		goto insert;
 	}
 
 	btrfs_extent_item_to_extent_map(inode, path, item, em);
 
-	if (extent_type == BTRFS_FILE_EXTENT_REG ||
-	    extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
-		goto insert;
-	} else if (extent_type == BTRFS_FILE_EXTENT_INLINE) {
+	if (extent_type == BTRFS_FILE_EXTENT_INLINE) {
 		/*
 		 * Inline extent can only exist at file offset 0. This is
 		 * ensured by tree-checker and inline extent creation path.
@@ -7217,12 +7233,7 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 		ret = read_inline_extent(path, folio);
 		if (ret < 0)
 			goto out;
-		goto insert;
 	}
-not_found:
-	em->start = start;
-	em->len = len;
-	em->disk_bytenr = EXTENT_MAP_HOLE;
 insert:
 	ret = 0;
 	btrfs_release_path(path);
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/3] btrfs: return the largest possible hole for EOF cases
  2025-11-23 23:32 [PATCH 0/3] btrfs: always return the largest hole possible for btrfs_get_extent() Qu Wenruo
  2025-11-23 23:32 ` [PATCH 1/3] btrfs: return the largest hole between two file extent items Qu Wenruo
  2025-11-23 23:32 ` [PATCH 2/3] btrfs: refactor hole cases of btrfs_get_extent() Qu Wenruo
@ 2025-11-23 23:32 ` Qu Wenruo
  2 siblings, 0 replies; 6+ messages in thread
From: Qu Wenruo @ 2025-11-23 23:32 UTC (permalink / raw)
  To: linux-btrfs

If @start is beyond EOF of the inode, btrfs_get_extent() will only
return a hole extent map that is exactly the size of @len passed in.

This keeps all callers of btrfs_get_extent() happy, as normally they
will finish the lookup there.

But if btrfs_get_extent() is called again and again on different
ranges beyond EOF, we will do the same tree search every time,
returning holes with different ranges, without really benefiting from
the cached extent maps.

Change the EOF handling by always returning a hole for the range
[@start, @btrfs_max_file_end), so future lookup will hit the cache
without searching the tree again.

The special value btrfs_max_file_end is calculated by rounding up
MAX_LFS_FILESIZE to the fs block size.

Currently MAX_LFS_FILESIZE is at most LLONG_MAX, but btrfs handles
internal sizes using u64, thus we should never hit a range that
overflows u64.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/inode.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 2e0dc82c4f17..39e22b8141e5 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7052,6 +7052,20 @@ static void set_hole_em(struct extent_map *em, u64 start, u64 len)
 	em->disk_bytenr = EXTENT_MAP_HOLE;
 }
 
+static u64 btrfs_max_file_end(struct btrfs_fs_info *fs_info)
+{
+	/*
+	 * MAX_LFS_FILESIZE is either LLONG_MAX or LONG_MAX << PAGE_SHIFT.
+	 * LLONG_MAX is not blocksize aligned, so here we have to round it
+	 * up to the fs block size.
+	 */
+	u64 result = round_up(MAX_LFS_FILESIZE, fs_info->sectorsize);
+
+	/* Make sure rounding up MAX_LFS_FILESIZE won't overflow. */
+	ASSERT(result > 0);
+	return result;
+}
+
 /*
  * Lookup the first extent overlapping a range in a file.
  *
@@ -7077,6 +7091,7 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 	u64 extent_start = 0;
 	u64 extent_end = 0;
 	u64 objectid = btrfs_ino(inode);
+	const u64 max_file_end = btrfs_max_file_end(fs_info);
 	int extent_type = -1;
 	struct btrfs_path *path = NULL;
 	struct btrfs_root *root = inode->root;
@@ -7137,7 +7152,7 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 			 * This means even no inode item for the inode.
 			 * Thus the whole range should be a hole.
 			 */
-			set_hole_em(em, start, len);
+			set_hole_em(em, start, max_file_end - start);
 			goto insert;
 		}
 		path->slots[0]--;
@@ -7189,7 +7204,7 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 				goto out;
 			if (ret > 0) {
 				/* EOF, thus a hole. */
-				set_hole_em(em, start, len);
+				set_hole_em(em, start, max_file_end - start);
 				goto insert;
 			}
 
@@ -7199,7 +7214,7 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 		if (found_key.objectid != objectid ||
 		    found_key.type != BTRFS_EXTENT_DATA_KEY) {
 			/* EOF, thus a hole. */
-			set_hole_em(em, start, len);
+			set_hole_em(em, start, max_file_end - start);
 			goto insert;
 		}
 		if (start > found_key.offset)
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/3] btrfs: return the largest hole between two file extent items
  2025-11-23 23:32 ` [PATCH 1/3] btrfs: return the largest hole between two file extent items Qu Wenruo
@ 2025-11-24 12:59   ` Filipe Manana
  2025-11-24 20:54     ` Qu Wenruo
  0 siblings, 1 reply; 6+ messages in thread
From: Filipe Manana @ 2025-11-24 12:59 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Sun, Nov 23, 2025 at 11:32 PM Qu Wenruo <wqu@suse.com> wrote:
>
> [CORNER CASE]
> If we have the following file extents layout, btrfs_get_extent() can
> return a smaller hole and cause unnecessary extra tree search:
>
>         item 6 key (257 EXTENT_DATA 0) itemoff 15810 itemsize 53
>                 generation 9 type 1 (regular)
>                 extent data disk byte 13631488 nr 4096
>                 extent data offset 0 nr 4096 ram 4096
>                 extent compression 0 (none)
>
>         item 7 key (257 EXTENT_DATA 32768) itemoff 15757 itemsize 53
>                 generation 9 type 1 (regular)
>                 extent data disk byte 13635584 nr 4096
>                 extent data offset 0 nr 4096 ram 4096
>                 extent compression 0 (none)
>
> In above case, range [0, 4K) and [32K, 36K) are regular extents, and
> there is a hole in range [4K, 32K), and the fs has "no-holes" feature,
> meaning the hole will not have a file extent item.
>
> [INEFFICIENCY]
> Assume the system has 4K page size, and we're doing readahead for range
> [4K, 32K), no large folio yet.
>
>  btrfs_readahead() for range [4K, 32K)
>  |- btrfs_do_readpage() for folio 4K
>  |  |- get_extent_map() for range [4K, 8K)
>  |     |- btrfs_get_extent() for range [4K, 8K)
>  |        We hit item 6, then for the next item 7.
>  |        At this stage we know range [4K, 32K) is a hole.
>  |        But our search range is only [4K, 8K), not reaching 32K, thus
>  |        we go into not_found: tag, returning a hole em for [4K, 8K).
>  |
>  |- btrfs_do_readpage() for folio 8K
>  |  |- get_extent_map() for range [8K, 12K)
>  |     |- btrfs_get_extent() for range [8K, 12K)
>  |        We hit the same item 6, and then item 7.
>  |        But still we goto not_found tag, inserting a new hole em,
>  |        which will be merged with previous one.
>  |
>  | [ Repeat the same btrfs_get_extent() calls until the end ]
>
> So we're calling btrfs_get_extent() again and again, just for a
> different part of the same hole range [4K, 32K).
>
> [ENHANCEMENT]
> The problem is inside the next: tag, where if we find the next file extent
> item and knows it's beyond our search range start.
>
> But there is no need to fallback to not_found: tag, if we know there is
> a larger hole for [start, found_key.offset).
>
> By removing the check for (start + len) against (found_key.offset), we can
> improve the above read loop by:
>
>  btrfs_readahead() for range [4K, 32K)
>  |- btrfs_do_readpage() for folio 4K
>  |  |- get_extent_map() for range [4K, 8K)
>  |     |- btrfs_get_extent() for range [4K, 8K)
>  |        We hit item 6, then for the next item 7.
>  |        At this stage we know range [4K, 32K) is a hole.
>  |        So the hole em for range [4K, 32K) is returned.
>  |
>  |- btrfs_do_readpage() for folio 8K
>  |  |- get_extent_map() for range [8K, 12K)
>  |     The cached hole em range [4K, 32K) covers the range,
>  |     and reuse that em.
>  |
>  | [ Repeat the same btrfs_get_extent() calls until the end ]
>
> Now we only call btrfs_get_extent() once for the whole range [4K, 32K),
> other than the old 8 times.
>
> Although again I do not expect much difference for the real world
> performance.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/inode.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 3cf30abcdb08..3a76cea1d43d 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -7181,8 +7181,6 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
>                 if (found_key.objectid != objectid ||
>                     found_key.type != BTRFS_EXTENT_DATA_KEY)
>                         goto not_found;
> -               if (start + len <= found_key.offset)
> -                       goto not_found;

Your point about the inefficiency for the readahead case is valid, but
this may be dangerous in other contexts.

If a caller of btrfs_get_extent() passes a specific length, it may
mean it has locked the range in the inode's io tree only for that
range.
For the readahead case, the caller has typically locked a larger range
in the io tree - except when attempting to read the last page/folio of
the readahead range while there's a hole that crosses that limit.

By allowing to return an extent map for a larger range:

1) We can now return a stale extent map.
     After the path is released in btrfs_get_extent(), another task
may insert a new file extent item (such as a direct IO task).

2) While another task adds the new file extent item, it will also trim
the hole extent map created by the task that has just finished calling
btrfs_get_extent().
    The trimming (typically done in btrfs_drop_extent_map_range())
means updating the extent map's length, start fields, etc, while the
task that just called btrfs_get_extent() is using it, causing a race
with unpredictable results.

We've had problems of this sort in the past.
It's a bad idea for a task to create extent maps beyond the range it
has locked in the inode's io tree.

Thanks.


>                 if (start > found_key.offset)
>                         goto next;
>
> --
> 2.52.0
>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/3] btrfs: return the largest hole between two file extent items
  2025-11-24 12:59   ` Filipe Manana
@ 2025-11-24 20:54     ` Qu Wenruo
  0 siblings, 0 replies; 6+ messages in thread
From: Qu Wenruo @ 2025-11-24 20:54 UTC (permalink / raw)
  To: Filipe Manana, Qu Wenruo; +Cc: linux-btrfs



On 2025/11/24 23:29, Filipe Manana wrote:
> On Sun, Nov 23, 2025 at 11:32 PM Qu Wenruo <wqu@suse.com> wrote:
[...]
>>
>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> index 3cf30abcdb08..3a76cea1d43d 100644
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -7181,8 +7181,6 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
>>                  if (found_key.objectid != objectid ||
>>                      found_key.type != BTRFS_EXTENT_DATA_KEY)
>>                          goto not_found;
>> -               if (start + len <= found_key.offset)
>> -                       goto not_found;
> 
> Your point about the inefficiency for the readahead case is valid, but
> this may be dangerous in other contexts.
> 
> If a caller of btrfs_get_extent() passes a specific length, it may
> mean it has locked the range in the inode's io tree only for that
> range.
> For the readahead case, the caller has typically locked a larger range
> in the io tree - except when attempting to read the last page/folio
> for the readahead range and there's hole that crosses that limit.
> 
> By allowing to return an extent map for a larger range:
> 
> 1) We can now return a stale extent map.
>       After the path is released in btrfs_get_extent(), another task
> may insert a new file extent item (such as a direct IO task).
> 
> 2) While another task adds the new file extent item, it will also trim
> the hole extent map created by the task that has just finished calling
> btrfs_get_extent().
>      The trimming (typically done in btrfs_drop_extent_map_range())
> means updating the extent map's length, start fields, etc, while the
> task that just called btrfs_get_extent() is using it, causing a race
> with unpredictable results.

I agree with the concerns. Although I failed to trigger any regression 
with several days of fstests runs, the limited benefit vs the potential 
for bugs makes this not really worth it.

Thus I'm fine with discarding this series.

> 
> We've had problems of this sort in the past.
> It's a bad idea for a task to create extent maps beyond the range it
> has locked in the inode's io tree.

However I also found several btrfs_get_extent() call sites without 
explicit extent locking.

Call sites like find_first_non_hole() and btrfs_zero_range() rely on 
the inode lock plus waiting for ordered extents.

And the buffered writeback path in extent_writepage_io() relies on the 
(sub)page range being locked, with the extent map also pinned.

I'm wondering whether we should make all extent map related call sites 
use proper extent locking, to be extra safe.

Thanks,
Qu

> 
> Thanks.
> 
> 
>>                  if (start > found_key.offset)
>>                          goto next;
>>
>> --
>> 2.52.0
>>
>>
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-11-24 20:54 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-11-23 23:32 [PATCH 0/3] btrfs: always return the largest hole possible for btrfs_get_extent() Qu Wenruo
2025-11-23 23:32 ` [PATCH 1/3] btrfs: return the largest hole between two file extent items Qu Wenruo
2025-11-24 12:59   ` Filipe Manana
2025-11-24 20:54     ` Qu Wenruo
2025-11-23 23:32 ` [PATCH 2/3] btrfs: refactor hole cases of btrfs_get_extent() Qu Wenruo
2025-11-23 23:32 ` [PATCH 3/3] btrfs: return the largest possible hole for EOF cases Qu Wenruo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox