public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH V3 0/2] Some bugfix for xfs fsmap
@ 2024-08-12  1:15 Zizhi Wo
  2024-08-12  1:15 ` [PATCH V3 1/2] xfs: Fix the owner setting issue for rmap query in " Zizhi Wo
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Zizhi Wo @ 2024-08-12  1:15 UTC (permalink / raw)
  To: chandan.babu, djwong, dchinner, osandov, john.g.garry
  Cc: linux-xfs, linux-kernel, wozizhi, yangerkun

Changes since V2[1]:
 - Split the original patch into two, each for a different problem.
 - The fix focuses solely on addressing the omission problem and does not
   involve the precision of intervals.

This patch set contains two patches to repair fsmap. Although they are both
problems of missing query intervals, the root causes of the two are
inconsistent, so two patches are proposed.

Patch 1: The fix addresses the interval omission issue caused by the
incorrect setting of "rm_owner" in the high_key during rmap queries. In
this scenario, fsmap finds the record on the rmapbt, but due to the
incorrect setting of the "rm_owner", the key of the record is larger than
the high_key, causing the query result to be incorrect. This issue is
resolved by fixing the "rm_owner" setup logic.

Patch 2: The fix addresses the interval omission issue caused by bit
shifting during gap queries in fsmap. In this scenario, fsmap does not
find the record on the rmapbt, so it needs to locate it by the gap of the
info->next_daddr and high_key address. However, due to the shift, the
two are reduced to 0, so the query error is caused. The issue is resolved
by introducing the "end_daddr" field in the xfs_getfsmap_info structure to
store the high_key at the sector granularity.

[1] https://lore.kernel.org/all/20240808144759.1330237-1-wozizhi@huawei.com/

Zizhi Wo (2):
  xfs: Fix the owner setting issue for rmap query in xfs fsmap
  xfs: Fix missing interval for missing_owner in xfs fsmap

 fs/xfs/xfs_fsmap.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH V3 1/2] xfs: Fix the owner setting issue for rmap query in xfs fsmap
  2024-08-12  1:15 [PATCH V3 0/2] Some bugfix for xfs fsmap Zizhi Wo
@ 2024-08-12  1:15 ` Zizhi Wo
  2024-08-15 16:36   ` Darrick J. Wong
  2024-08-12  1:15 ` [PATCH V3 2/2] xfs: Fix missing interval for missing_owner " Zizhi Wo
  2024-08-15 10:20 ` [PATCH V3 0/2] Some bugfix for " Zizhi Wo
  2 siblings, 1 reply; 8+ messages in thread
From: Zizhi Wo @ 2024-08-12  1:15 UTC (permalink / raw)
  To: chandan.babu, djwong, dchinner, osandov, john.g.garry
  Cc: linux-xfs, linux-kernel, wozizhi, yangerkun

I notice a rmap query bug in xfs_io fsmap:
[root@fedora ~]# xfs_io -c 'fsmap -vvvv' /mnt
 EXT: DEV    BLOCK-RANGE           OWNER              FILE-OFFSET      AG AG-OFFSET             TOTAL
   0: 253:16 [0..7]:               static fs metadata                  0  (0..7)                    8
   1: 253:16 [8..23]:              per-AG metadata                     0  (8..23)                  16
   2: 253:16 [24..39]:             inode btree                         0  (24..39)                 16
   3: 253:16 [40..47]:             per-AG metadata                     0  (40..47)                  8
   4: 253:16 [48..55]:             refcount btree                      0  (48..55)                  8
   5: 253:16 [56..103]:            per-AG metadata                     0  (56..103)                48
   6: 253:16 [104..127]:           free space                          0  (104..127)               24
   ......

Bug:
[root@fedora ~]# xfs_io -c 'fsmap -vvvv -d 0 3' /mnt
[root@fedora ~]#
Normally, we should be able to get one record, but we got nothing.

The root cause of this problem lies in the incorrect setting of rm_owner in
the rmap query. In the case of the initial query where the owner is not
set, __xfs_getfsmap_datadev() first sets info->high.rm_owner to ULLONG_MAX.
This is done to prevent any omissions when comparing rmap items. However,
if the current ag is detected to be the last one, the function sets info's
high_irec based on the provided key. If high->rm_owner is not specified, it
should continue to be set to ULLONG_MAX; otherwise, there will be issues
with interval omissions. For example, consider "start" and "end" within the
same block. If high->rm_owner == 0, it will be smaller than the founded
record in rmapbt, resulting in a query with no records. The main call stack
is as follows:

xfs_ioc_getfsmap
  xfs_getfsmap
    xfs_getfsmap_datadev_rmapbt
      __xfs_getfsmap_datadev
        info->high.rm_owner = ULLONG_MAX
        if (pag->pag_agno == end_ag)
	  xfs_fsmap_owner_to_rmap
	    // set info->high.rm_owner = 0 because fmr_owner == 0
	    dest->rm_owner = 0
	// get nothing
	xfs_getfsmap_datadev_rmapbt_query

The problem can be resolved by setting the rm_owner of high to ULLONG_MAX
again under certain conditions.

After applying this patch, the above problem have been solved:
[root@fedora ~]# xfs_io -c 'fsmap -vvvv -d 0 3' /mnt
 EXT: DEV    BLOCK-RANGE      OWNER              FILE-OFFSET      AG AG-OFFSET        TOTAL
   0: 253:16 [0..7]:          static fs metadata                  0  (0..7)               8

Fixes: e89c041338ed ("xfs: implement the GETFSMAP ioctl")
Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
---
 fs/xfs/xfs_fsmap.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
index 85dbb46452ca..d346acff7725 100644
--- a/fs/xfs/xfs_fsmap.c
+++ b/fs/xfs/xfs_fsmap.c
@@ -655,6 +655,13 @@ __xfs_getfsmap_datadev(
 			error = xfs_fsmap_owner_to_rmap(&info->high, &keys[1]);
 			if (error)
 				break;
+			/*
+			 * Set the owner of high_key to the maximum again to
+			 * prevent missing intervals during the query.
+			 */
+			if (info->high.rm_owner == 0 &&
+			    info->missing_owner == XFS_FMR_OWN_FREE)
+			    info->high.rm_owner = ULLONG_MAX;
 			xfs_getfsmap_set_irec_flags(&info->high, &keys[1]);
 		}
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH V3 2/2] xfs: Fix missing interval for missing_owner in xfs fsmap
  2024-08-12  1:15 [PATCH V3 0/2] Some bugfix for xfs fsmap Zizhi Wo
  2024-08-12  1:15 ` [PATCH V3 1/2] xfs: Fix the owner setting issue for rmap query in " Zizhi Wo
@ 2024-08-12  1:15 ` Zizhi Wo
  2024-08-15 17:42   ` Darrick J. Wong
  2024-08-15 10:20 ` [PATCH V3 0/2] Some bugfix for " Zizhi Wo
  2 siblings, 1 reply; 8+ messages in thread
From: Zizhi Wo @ 2024-08-12  1:15 UTC (permalink / raw)
  To: chandan.babu, djwong, dchinner, osandov, john.g.garry
  Cc: linux-xfs, linux-kernel, wozizhi, yangerkun

In the fsmap query of xfs, there is an interval missing problem:
[root@fedora ~]# xfs_io -c 'fsmap -vvvv' /mnt
 EXT: DEV    BLOCK-RANGE           OWNER              FILE-OFFSET      AG AG-OFFSET             TOTAL
   0: 253:16 [0..7]:               static fs metadata                  0  (0..7)                    8
   1: 253:16 [8..23]:              per-AG metadata                     0  (8..23)                  16
   2: 253:16 [24..39]:             inode btree                         0  (24..39)                 16
   3: 253:16 [40..47]:             per-AG metadata                     0  (40..47)                  8
   4: 253:16 [48..55]:             refcount btree                      0  (48..55)                  8
   5: 253:16 [56..103]:            per-AG metadata                     0  (56..103)                48
   6: 253:16 [104..127]:           free space                          0  (104..127)               24
   ......

BUG:
[root@fedora ~]# xfs_io -c 'fsmap -vvvv -d 104 107' /mnt
[root@fedora ~]#
Normally, we should be able to get [104, 107), but we got nothing.

The problem is caused by shifting. The query for the problem-triggered
scenario is for the missing_owner interval (e.g. freespace in rmapbt/
unknown space in bnobt), which is obtained by subtraction (gap). For this
scenario, the interval is obtained by info->last. However, rec_daddr is
calculated based on the start_block recorded in key[1], which is converted
by calling XFS_BB_TO_FSBT. Then if rec_daddr does not exceed
info->next_daddr, which means keys[1].fmr_physical >> (mp)->m_blkbb_log
<= info->next_daddr, no records will be displayed. In the above example,
104 >> (mp)->m_blkbb_log = 12 and 107 >> (mp)->m_blkbb_log = 12, so the two
are reduced to 0 and the gap is ignored:

 before calculate ----------------> after shifting
 104(st)  107(ed)		      12(st/ed)
  |---------|				  |
  sector size			      block size

Resolve this issue by introducing the "end_daddr" field in
xfs_getfsmap_info. This records |key[1].fmr_physical + key[1].length| at
the granularity of sector. If the current query is the last, the rec_daddr
is end_daddr to prevent missing interval problems caused by shifting. We
only need to focus on the last query, because xfs disks are internally
aligned with disk blocksize that are powers of two and minimum 512, so
there is no problem with shifting in previous queries.

After applying this patch, the above problem have been solved:
[root@fedora ~]# xfs_io -c 'fsmap -vvvv -d 104 107' /mnt
 EXT: DEV    BLOCK-RANGE      OWNER            FILE-OFFSET      AG AG-OFFSET        TOTAL
   0: 253:16 [104..106]:      free space                        0  (104..106)           3

Fixes: e89c041338ed ("xfs: implement the GETFSMAP ioctl")
Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
---
 fs/xfs/xfs_fsmap.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
index d346acff7725..4ae273b54129 100644
--- a/fs/xfs/xfs_fsmap.c
+++ b/fs/xfs/xfs_fsmap.c
@@ -162,6 +162,7 @@ struct xfs_getfsmap_info {
 	xfs_daddr_t		next_daddr;	/* next daddr we expect */
 	/* daddr of low fsmap key when we're using the rtbitmap */
 	xfs_daddr_t		low_daddr;
+	xfs_daddr_t		end_daddr;	/* daddr of high fsmap key */
 	u64			missing_owner;	/* owner of holes */
 	u32			dev;		/* device id */
 	/*
@@ -294,6 +295,13 @@ xfs_getfsmap_helper(
 		return 0;
 	}
 
+	/*
+	 * To prevent missing intervals in the last query, consider using
+	 * sectors as the granularity.
+	 */
+	if (info->last && info->end_daddr)
+		rec_daddr = info->end_daddr;
+
 	/* Are we just counting mappings? */
 	if (info->head->fmh_count == 0) {
 		if (info->head->fmh_entries == UINT_MAX)
@@ -973,8 +981,10 @@ xfs_getfsmap(
 		 * low key, zero out the low key so that we get
 		 * everything from the beginning.
 		 */
-		if (handlers[i].dev == head->fmh_keys[1].fmr_device)
+		if (handlers[i].dev == head->fmh_keys[1].fmr_device) {
 			dkeys[1] = head->fmh_keys[1];
+			info.end_daddr = dkeys[1].fmr_physical + dkeys[1].fmr_length;
+		}
 		if (handlers[i].dev > head->fmh_keys[0].fmr_device)
 			memset(&dkeys[0], 0, sizeof(struct xfs_fsmap));
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH V3 0/2] Some bugfix for xfs fsmap
  2024-08-12  1:15 [PATCH V3 0/2] Some bugfix for xfs fsmap Zizhi Wo
  2024-08-12  1:15 ` [PATCH V3 1/2] xfs: Fix the owner setting issue for rmap query in " Zizhi Wo
  2024-08-12  1:15 ` [PATCH V3 2/2] xfs: Fix missing interval for missing_owner " Zizhi Wo
@ 2024-08-15 10:20 ` Zizhi Wo
  2 siblings, 0 replies; 8+ messages in thread
From: Zizhi Wo @ 2024-08-15 10:20 UTC (permalink / raw)
  To: chandan.babu, djwong, dchinner, osandov, john.g.garry
  Cc: linux-xfs, linux-kernel, yangerkun

friendly ping


在 2024/8/12 9:15, Zizhi Wo 写道:
> Changes since V2[1]:
>   - Split the original patch into two, each for a different problem.
>   - The fix focuses solely on addressing the omission problem and does not
>     involve the precision of intervals.
> 
> This patch set contains two patches to repair fsmap. Although they are both
> problems of missing query intervals, the root causes of the two are
> inconsistent, so two patches are proposed.
> 
> Patch 1: The fix addresses the interval omission issue caused by the
> incorrect setting of "rm_owner" in the high_key during rmap queries. In
> this scenario, fsmap finds the record on the rmapbt, but due to the
> incorrect setting of the "rm_owner", the key of the record is larger than
> the high_key, causing the query result to be incorrect. This issue is
> resolved by fixing the "rm_owner" setup logic.
> 
> Patch 2: The fix addresses the interval omission issue caused by bit
> shifting during gap queries in fsmap. In this scenario, fsmap does not
> find the record on the rmapbt, so it needs to locate it by the gap of the
> info->next_daddr and high_key address. However, due to the shift, the
> two are reduced to 0, so the query error is caused. The issue is resolved
> by introducing the "end_daddr" field in the xfs_getfsmap_info structure to
> store the high_key at the sector granularity.
> 
> [1] https://lore.kernel.org/all/20240808144759.1330237-1-wozizhi@huawei.com/
> 
> Zizhi Wo (2):
>    xfs: Fix the owner setting issue for rmap query in xfs fsmap
>    xfs: Fix missing interval for missing_owner in xfs fsmap
> 
>   fs/xfs/xfs_fsmap.c | 19 ++++++++++++++++++-
>   1 file changed, 18 insertions(+), 1 deletion(-)
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH V3 1/2] xfs: Fix the owner setting issue for rmap query in xfs fsmap
  2024-08-12  1:15 ` [PATCH V3 1/2] xfs: Fix the owner setting issue for rmap query in " Zizhi Wo
@ 2024-08-15 16:36   ` Darrick J. Wong
  2024-08-16  1:02     ` Zizhi Wo
  0 siblings, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2024-08-15 16:36 UTC (permalink / raw)
  To: Zizhi Wo
  Cc: chandan.babu, dchinner, osandov, john.g.garry, linux-xfs,
	linux-kernel, yangerkun

On Mon, Aug 12, 2024 at 09:15:04AM +0800, Zizhi Wo wrote:
> I notice a rmap query bug in xfs_io fsmap:
> [root@fedora ~]# xfs_io -c 'fsmap -vvvv' /mnt
>  EXT: DEV    BLOCK-RANGE           OWNER              FILE-OFFSET      AG AG-OFFSET             TOTAL
>    0: 253:16 [0..7]:               static fs metadata                  0  (0..7)                    8
>    1: 253:16 [8..23]:              per-AG metadata                     0  (8..23)                  16
>    2: 253:16 [24..39]:             inode btree                         0  (24..39)                 16
>    3: 253:16 [40..47]:             per-AG metadata                     0  (40..47)                  8
>    4: 253:16 [48..55]:             refcount btree                      0  (48..55)                  8
>    5: 253:16 [56..103]:            per-AG metadata                     0  (56..103)                48
>    6: 253:16 [104..127]:           free space                          0  (104..127)               24
>    ......
> 
> Bug:
> [root@fedora ~]# xfs_io -c 'fsmap -vvvv -d 0 3' /mnt
> [root@fedora ~]#
> Normally, we should be able to get one record, but we got nothing.
> 
> The root cause of this problem lies in the incorrect setting of rm_owner in
> the rmap query. In the case of the initial query where the owner is not
> set, __xfs_getfsmap_datadev() first sets info->high.rm_owner to ULLONG_MAX.
> This is done to prevent any omissions when comparing rmap items. However,
> if the current ag is detected to be the last one, the function sets info's
> high_irec based on the provided key. If high->rm_owner is not specified, it
> should continue to be set to ULLONG_MAX; otherwise, there will be issues
> with interval omissions. For example, consider "start" and "end" within the
> same block. If high->rm_owner == 0, it will be smaller than the founded
> record in rmapbt, resulting in a query with no records. The main call stack
> is as follows:
> 
> xfs_ioc_getfsmap
>   xfs_getfsmap
>     xfs_getfsmap_datadev_rmapbt
>       __xfs_getfsmap_datadev
>         info->high.rm_owner = ULLONG_MAX
>         if (pag->pag_agno == end_ag)
> 	  xfs_fsmap_owner_to_rmap
> 	    // set info->high.rm_owner = 0 because fmr_owner == 0
> 	    dest->rm_owner = 0
> 	// get nothing
> 	xfs_getfsmap_datadev_rmapbt_query
> 
> The problem can be resolved by setting the rm_owner of high to ULLONG_MAX
> again under certain conditions.
> 
> After applying this patch, the above problem have been solved:
> [root@fedora ~]# xfs_io -c 'fsmap -vvvv -d 0 3' /mnt
>  EXT: DEV    BLOCK-RANGE      OWNER              FILE-OFFSET      AG AG-OFFSET        TOTAL
>    0: 253:16 [0..7]:          static fs metadata                  0  (0..7)               8
> 
> Fixes: e89c041338ed ("xfs: implement the GETFSMAP ioctl")
> Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
> ---
>  fs/xfs/xfs_fsmap.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
> index 85dbb46452ca..d346acff7725 100644
> --- a/fs/xfs/xfs_fsmap.c
> +++ b/fs/xfs/xfs_fsmap.c
> @@ -655,6 +655,13 @@ __xfs_getfsmap_datadev(
>  			error = xfs_fsmap_owner_to_rmap(&info->high, &keys[1]);
>  			if (error)
>  				break;
> +			/*
> +			 * Set the owner of high_key to the maximum again to
> +			 * prevent missing intervals during the query.
> +			 */
> +			if (info->high.rm_owner == 0 &&
> +			    info->missing_owner == XFS_FMR_OWN_FREE)
> +			    info->high.rm_owner = ULLONG_MAX;

Shouldn't this be in xfs_fsmap_owner_to_rmap?

And, looking at that function, isn't this the solution:

	switch (src->fmr_owner) {
	case 0:			/* "lowest owner id possible" */
	case -1ULL:		/* "highest owner id possible" */
		dest->rm_owner = src->fmr_owner;
		break;

instead of this special-casing outside the setter function?

--D

>  			xfs_getfsmap_set_irec_flags(&info->high, &keys[1]);
>  		}
>  
> -- 
> 2.39.2
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH V3 2/2] xfs: Fix missing interval for missing_owner in xfs fsmap
  2024-08-12  1:15 ` [PATCH V3 2/2] xfs: Fix missing interval for missing_owner " Zizhi Wo
@ 2024-08-15 17:42   ` Darrick J. Wong
  2024-08-16  1:37     ` Zizhi Wo
  0 siblings, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2024-08-15 17:42 UTC (permalink / raw)
  To: Zizhi Wo
  Cc: chandan.babu, dchinner, osandov, john.g.garry, linux-xfs,
	linux-kernel, yangerkun

On Mon, Aug 12, 2024 at 09:15:05AM +0800, Zizhi Wo wrote:
> In the fsmap query of xfs, there is an interval missing problem:
> [root@fedora ~]# xfs_io -c 'fsmap -vvvv' /mnt
>  EXT: DEV    BLOCK-RANGE           OWNER              FILE-OFFSET      AG AG-OFFSET             TOTAL
>    0: 253:16 [0..7]:               static fs metadata                  0  (0..7)                    8
>    1: 253:16 [8..23]:              per-AG metadata                     0  (8..23)                  16
>    2: 253:16 [24..39]:             inode btree                         0  (24..39)                 16
>    3: 253:16 [40..47]:             per-AG metadata                     0  (40..47)                  8
>    4: 253:16 [48..55]:             refcount btree                      0  (48..55)                  8
>    5: 253:16 [56..103]:            per-AG metadata                     0  (56..103)                48
>    6: 253:16 [104..127]:           free space                          0  (104..127)               24
>    ......
> 
> BUG:
> [root@fedora ~]# xfs_io -c 'fsmap -vvvv -d 104 107' /mnt
> [root@fedora ~]#
> Normally, we should be able to get [104, 107), but we got nothing.
> 
> The problem is caused by shifting. The query for the problem-triggered
> scenario is for the missing_owner interval (e.g. freespace in rmapbt/
> unknown space in bnobt), which is obtained by subtraction (gap). For this
> scenario, the interval is obtained by info->last. However, rec_daddr is
> calculated based on the start_block recorded in key[1], which is converted
> by calling XFS_BB_TO_FSBT. Then if rec_daddr does not exceed
> info->next_daddr, which means keys[1].fmr_physical >> (mp)->m_blkbb_log
> <= info->next_daddr, no records will be displayed. In the above example,
> 104 >> (mp)->m_blkbb_log = 12 and 107 >> (mp)->m_blkbb_log = 12, so the two
> are reduced to 0 and the gap is ignored:
> 
>  before calculate ----------------> after shifting
>  104(st)  107(ed)		      12(st/ed)
>   |---------|				  |
>   sector size			      block size
> 
> Resolve this issue by introducing the "end_daddr" field in
> xfs_getfsmap_info. This records |key[1].fmr_physical + key[1].length| at
> the granularity of sector. If the current query is the last, the rec_daddr
> is end_daddr to prevent missing interval problems caused by shifting. We
> only need to focus on the last query, because xfs disks are internally
> aligned with disk blocksize that are powers of two and minimum 512, so
> there is no problem with shifting in previous queries.
> 
> After applying this patch, the above problem have been solved:
> [root@fedora ~]# xfs_io -c 'fsmap -vvvv -d 104 107' /mnt
>  EXT: DEV    BLOCK-RANGE      OWNER            FILE-OFFSET      AG AG-OFFSET        TOTAL
>    0: 253:16 [104..106]:      free space                        0  (104..106)           3
> 
> Fixes: e89c041338ed ("xfs: implement the GETFSMAP ioctl")
> Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
> ---
>  fs/xfs/xfs_fsmap.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
> index d346acff7725..4ae273b54129 100644
> --- a/fs/xfs/xfs_fsmap.c
> +++ b/fs/xfs/xfs_fsmap.c
> @@ -162,6 +162,7 @@ struct xfs_getfsmap_info {
>  	xfs_daddr_t		next_daddr;	/* next daddr we expect */
>  	/* daddr of low fsmap key when we're using the rtbitmap */
>  	xfs_daddr_t		low_daddr;
> +	xfs_daddr_t		end_daddr;	/* daddr of high fsmap key */

This ought to be initialized to an obviously impossible value (e.g.
-1ULL) in xfs_getfsmap before we start walking btrees.

>  	u64			missing_owner;	/* owner of holes */
>  	u32			dev;		/* device id */
>  	/*
> @@ -294,6 +295,13 @@ xfs_getfsmap_helper(
>  		return 0;
>  	}
>  
> +	/*
> +	 * To prevent missing intervals in the last query, consider using
> +	 * sectors as the granularity.
> +	 */
> +	if (info->last && info->end_daddr)
> +		rec_daddr = info->end_daddr;

I think this needs a better comment.  How about:

	/*
	 * For an info->last query, we're looking for a gap between the
	 * last mapping emitted and the high key specified by userspace.
	 * If the user's query spans less than 1 fsblock, then
	 * info->high and info->low will have the same rm_startblock,
	 * which causes rec_daddr and next_daddr to be the same.
	 * Therefore, use the end_daddr that we calculated from
	 * userspace's high key to synthesize the record.  Note that if
	 * the btree query found a mapping, there won't be a gap.
	 */

> +
>  	/* Are we just counting mappings? */
>  	if (info->head->fmh_count == 0) {
>  		if (info->head->fmh_entries == UINT_MAX)
> @@ -973,8 +981,10 @@ xfs_getfsmap(
>  		 * low key, zero out the low key so that we get
>  		 * everything from the beginning.
>  		 */
> -		if (handlers[i].dev == head->fmh_keys[1].fmr_device)
> +		if (handlers[i].dev == head->fmh_keys[1].fmr_device) {
>  			dkeys[1] = head->fmh_keys[1];
> +			info.end_daddr = dkeys[1].fmr_physical + dkeys[1].fmr_length;

dkeys[1].fmr_length is never used by anything in the fsmap code --
__xfs_getfsmap_datadev sets end_fsb using only dkeys[1].fmr_physical.
You shouldn't add it to end_daddr here because then they won't be
describing the same thing.

Anyway I'll figure out a reproducer for fstests and send the whole pile
back to the mailing list once it passes QA.  Thanks for finding the bug
and attempting a fix. :)

--D

> +		}
>  		if (handlers[i].dev > head->fmh_keys[0].fmr_device)
>  			memset(&dkeys[0], 0, sizeof(struct xfs_fsmap));
>  
> -- 
> 2.39.2
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH V3 1/2] xfs: Fix the owner setting issue for rmap query in xfs fsmap
  2024-08-15 16:36   ` Darrick J. Wong
@ 2024-08-16  1:02     ` Zizhi Wo
  0 siblings, 0 replies; 8+ messages in thread
From: Zizhi Wo @ 2024-08-16  1:02 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: chandan.babu, dchinner, osandov, john.g.garry, linux-xfs,
	linux-kernel, yangerkun



在 2024/8/16 0:36, Darrick J. Wong 写道:
> On Mon, Aug 12, 2024 at 09:15:04AM +0800, Zizhi Wo wrote:
>> I notice a rmap query bug in xfs_io fsmap:
>> [root@fedora ~]# xfs_io -c 'fsmap -vvvv' /mnt
>>   EXT: DEV    BLOCK-RANGE           OWNER              FILE-OFFSET      AG AG-OFFSET             TOTAL
>>     0: 253:16 [0..7]:               static fs metadata                  0  (0..7)                    8
>>     1: 253:16 [8..23]:              per-AG metadata                     0  (8..23)                  16
>>     2: 253:16 [24..39]:             inode btree                         0  (24..39)                 16
>>     3: 253:16 [40..47]:             per-AG metadata                     0  (40..47)                  8
>>     4: 253:16 [48..55]:             refcount btree                      0  (48..55)                  8
>>     5: 253:16 [56..103]:            per-AG metadata                     0  (56..103)                48
>>     6: 253:16 [104..127]:           free space                          0  (104..127)               24
>>     ......
>>
>> Bug:
>> [root@fedora ~]# xfs_io -c 'fsmap -vvvv -d 0 3' /mnt
>> [root@fedora ~]#
>> Normally, we should be able to get one record, but we got nothing.
>>
>> The root cause of this problem lies in the incorrect setting of rm_owner in
>> the rmap query. In the case of the initial query where the owner is not
>> set, __xfs_getfsmap_datadev() first sets info->high.rm_owner to ULLONG_MAX.
>> This is done to prevent any omissions when comparing rmap items. However,
>> if the current ag is detected to be the last one, the function sets info's
>> high_irec based on the provided key. If high->rm_owner is not specified, it
>> should continue to be set to ULLONG_MAX; otherwise, there will be issues
>> with interval omissions. For example, consider "start" and "end" within the
>> same block. If high->rm_owner == 0, it will be smaller than the founded
>> record in rmapbt, resulting in a query with no records. The main call stack
>> is as follows:
>>
>> xfs_ioc_getfsmap
>>    xfs_getfsmap
>>      xfs_getfsmap_datadev_rmapbt
>>        __xfs_getfsmap_datadev
>>          info->high.rm_owner = ULLONG_MAX
>>          if (pag->pag_agno == end_ag)
>> 	  xfs_fsmap_owner_to_rmap
>> 	    // set info->high.rm_owner = 0 because fmr_owner == 0
>> 	    dest->rm_owner = 0
>> 	// get nothing
>> 	xfs_getfsmap_datadev_rmapbt_query
>>
>> The problem can be resolved by setting the rm_owner of high to ULLONG_MAX
>> again under certain conditions.
>>
>> After applying this patch, the above problem have been solved:
>> [root@fedora ~]# xfs_io -c 'fsmap -vvvv -d 0 3' /mnt
>>   EXT: DEV    BLOCK-RANGE      OWNER              FILE-OFFSET      AG AG-OFFSET        TOTAL
>>     0: 253:16 [0..7]:          static fs metadata                  0  (0..7)               8
>>
>> Fixes: e89c041338ed ("xfs: implement the GETFSMAP ioctl")
>> Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
>> ---
>>   fs/xfs/xfs_fsmap.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
>> index 85dbb46452ca..d346acff7725 100644
>> --- a/fs/xfs/xfs_fsmap.c
>> +++ b/fs/xfs/xfs_fsmap.c
>> @@ -655,6 +655,13 @@ __xfs_getfsmap_datadev(
>>   			error = xfs_fsmap_owner_to_rmap(&info->high, &keys[1]);
>>   			if (error)
>>   				break;
>> +			/*
>> +			 * Set the owner of high_key to the maximum again to
>> +			 * prevent missing intervals during the query.
>> +			 */
>> +			if (info->high.rm_owner == 0 &&
>> +			    info->missing_owner == XFS_FMR_OWN_FREE)
>> +			    info->high.rm_owner = ULLONG_MAX;
> 
> Shouldn't this be in xfs_fsmap_owner_to_rmap?
> 
> And, looking at that function, isn't this the solution:
> 
> 	switch (src->fmr_owner) {
> 	case 0:			/* "lowest owner id possible" */
> 	case -1ULL:		/* "highest owner id possible" */
> 		dest->rm_owner = src->fmr_owner;
> 		break;
> 

Yes, the simple modification logic in the xfs_fsmap_owner_to_rmap
function makes more sense.

Thanks,
Zizhi Wo

> instead of this special-casing outside the setter function?
> 
> --D
> 
>>   			xfs_getfsmap_set_irec_flags(&info->high, &keys[1]);
>>   		}
>>   
>> -- 
>> 2.39.2
>>
>>
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH V3 2/2] xfs: Fix missing interval for missing_owner in xfs fsmap
  2024-08-15 17:42   ` Darrick J. Wong
@ 2024-08-16  1:37     ` Zizhi Wo
  0 siblings, 0 replies; 8+ messages in thread
From: Zizhi Wo @ 2024-08-16  1:37 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: chandan.babu, dchinner, osandov, john.g.garry, linux-xfs,
	linux-kernel, yangerkun



在 2024/8/16 1:42, Darrick J. Wong 写道:
> On Mon, Aug 12, 2024 at 09:15:05AM +0800, Zizhi Wo wrote:
>> In the fsmap query of xfs, there is an interval missing problem:
>> [root@fedora ~]# xfs_io -c 'fsmap -vvvv' /mnt
>>   EXT: DEV    BLOCK-RANGE           OWNER              FILE-OFFSET      AG AG-OFFSET             TOTAL
>>     0: 253:16 [0..7]:               static fs metadata                  0  (0..7)                    8
>>     1: 253:16 [8..23]:              per-AG metadata                     0  (8..23)                  16
>>     2: 253:16 [24..39]:             inode btree                         0  (24..39)                 16
>>     3: 253:16 [40..47]:             per-AG metadata                     0  (40..47)                  8
>>     4: 253:16 [48..55]:             refcount btree                      0  (48..55)                  8
>>     5: 253:16 [56..103]:            per-AG metadata                     0  (56..103)                48
>>     6: 253:16 [104..127]:           free space                          0  (104..127)               24
>>     ......
>>
>> BUG:
>> [root@fedora ~]# xfs_io -c 'fsmap -vvvv -d 104 107' /mnt
>> [root@fedora ~]#
>> Normally, we should be able to get [104, 107), but we got nothing.
>>
>> The problem is caused by shifting. The query for the problem-triggered
>> scenario is for the missing_owner interval (e.g. freespace in rmapbt/
>> unknown space in bnobt), which is obtained by subtraction (gap). For this
>> scenario, the interval is obtained by info->last. However, rec_daddr is
>> calculated based on the start_block recorded in key[1], which is converted
>> by calling XFS_BB_TO_FSBT. Then if rec_daddr does not exceed
>> info->next_daddr, which means keys[1].fmr_physical >> (mp)->m_blkbb_log
>> <= info->next_daddr, no records will be displayed. In the above example,
>> 104 >> (mp)->m_blkbb_log = 12 and 107 >> (mp)->m_blkbb_log = 12, so the two
>> are reduced to 0 and the gap is ignored:
>>
>>   before calculate ----------------> after shifting
>>   104(st)  107(ed)		      12(st/ed)
>>    |---------|				  |
>>    sector size			      block size
>>
>> Resolve this issue by introducing the "end_daddr" field in
>> xfs_getfsmap_info. This records |key[1].fmr_physical + key[1].length| at
>> the granularity of sector. If the current query is the last, the rec_daddr
>> is end_daddr to prevent missing interval problems caused by shifting. We
>> only need to focus on the last query, because xfs disks are internally
>> aligned with disk blocksize that are powers of two and minimum 512, so
>> there is no problem with shifting in previous queries.
>>
>> After applying this patch, the above problem have been solved:
>> [root@fedora ~]# xfs_io -c 'fsmap -vvvv -d 104 107' /mnt
>>   EXT: DEV    BLOCK-RANGE      OWNER            FILE-OFFSET      AG AG-OFFSET        TOTAL
>>     0: 253:16 [104..106]:      free space                        0  (104..106)           3
>>
>> Fixes: e89c041338ed ("xfs: implement the GETFSMAP ioctl")
>> Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
>> ---
>>   fs/xfs/xfs_fsmap.c | 12 +++++++++++-
>>   1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
>> index d346acff7725..4ae273b54129 100644
>> --- a/fs/xfs/xfs_fsmap.c
>> +++ b/fs/xfs/xfs_fsmap.c
>> @@ -162,6 +162,7 @@ struct xfs_getfsmap_info {
>>   	xfs_daddr_t		next_daddr;	/* next daddr we expect */
>>   	/* daddr of low fsmap key when we're using the rtbitmap */
>>   	xfs_daddr_t		low_daddr;
>> +	xfs_daddr_t		end_daddr;	/* daddr of high fsmap key */
> 
> This ought to be initialized to an obviously impossible value (e.g.
> -1ULL) in xfs_getfsmap before we start walking btrees.
> 

This really makes the semantics clearer. So the assignment condition for
rec_daddr = info->end_daddr also needs to be changed.

>>   	u64			missing_owner;	/* owner of holes */
>>   	u32			dev;		/* device id */
>>   	/*
>> @@ -294,6 +295,13 @@ xfs_getfsmap_helper(
>>   		return 0;
>>   	}
>>   
>> +	/*
>> +	 * To prevent missing intervals in the last query, consider using
>> +	 * sectors as the granularity.
>> +	 */
>> +	if (info->last && info->end_daddr)
>> +		rec_daddr = info->end_daddr;
> 
> I think this needs a better comment.  How about:
> 
> 	/*
> 	 * For an info->last query, we're looking for a gap between the
> 	 * last mapping emitted and the high key specified by userspace.
> 	 * If the user's query spans less than 1 fsblock, then
> 	 * info->high and info->low will have the same rm_startblock,
> 	 * which causes rec_daddr and next_daddr to be the same.
> 	 * Therefore, use the end_daddr that we calculated from
> 	 * userspace's high key to synthesize the record.  Note that if
> 	 * the btree query found a mapping, there won't be a gap.
> 	 */
> 

A more detailed explanation, indeed. As for my previous description, I
simply explained that it would be missed, but did not say the specific
reason. I will adopt it in the next patch, thanks.

>> +
>>   	/* Are we just counting mappings? */
>>   	if (info->head->fmh_count == 0) {
>>   		if (info->head->fmh_entries == UINT_MAX)
>> @@ -973,8 +981,10 @@ xfs_getfsmap(
>>   		 * low key, zero out the low key so that we get
>>   		 * everything from the beginning.
>>   		 */
>> -		if (handlers[i].dev == head->fmh_keys[1].fmr_device)
>> +		if (handlers[i].dev == head->fmh_keys[1].fmr_device) {
>>   			dkeys[1] = head->fmh_keys[1];
>> +			info.end_daddr = dkeys[1].fmr_physical + dkeys[1].fmr_length;
> 
> dkeys[1].fmr_length is never used by anything in the fsmap code --
> __xfs_getfsmap_datadev sets end_fsb using only dkeys[1].fmr_physical.
> You shouldn't add it to end_daddr here because then they won't be
> describing the same thing.
> 

I actually have a question here. We set info->low.rm_blockcount with
keys[0].fmr_length to indicate if this is the first fsmap query. But I'm
not sure what keys[1].fmr_length does. When fsmap is invoked in user
mode, keys[1].fmr_physical is only set. In view of this, I hope to get
your answer.

Anyway, I will send the next patch as soon as possible, thanks for your
comments.

Thanks,
Zizhi Wo

> Anyway I'll figure out a reproducer for fstests and send the whole pile
> back to the mailing list once it passes QA.  Thanks for finding the bug
> and attempting a fix. :)
> 
> --D
> 
>> +		}
>>   		if (handlers[i].dev > head->fmh_keys[0].fmr_device)
>>   			memset(&dkeys[0], 0, sizeof(struct xfs_fsmap));
>>   
>> -- 
>> 2.39.2
>>
>>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-08-16  1:37 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-12  1:15 [PATCH V3 0/2] Some bugfix for xfs fsmap Zizhi Wo
2024-08-12  1:15 ` [PATCH V3 1/2] xfs: Fix the owner setting issue for rmap query in " Zizhi Wo
2024-08-15 16:36   ` Darrick J. Wong
2024-08-16  1:02     ` Zizhi Wo
2024-08-12  1:15 ` [PATCH V3 2/2] xfs: Fix missing interval for missing_owner " Zizhi Wo
2024-08-15 17:42   ` Darrick J. Wong
2024-08-16  1:37     ` Zizhi Wo
2024-08-15 10:20 ` [PATCH V3 0/2] Some bugfix for " Zizhi Wo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox