public inbox for linux-xfs@vger.kernel.org
* [PATCH v2] xfs_repair: handling a block with bad crc, bad uuid, and bad magic number needs fixing
@ 2025-03-21 14:28 bodonnel
  2025-03-21 15:27 ` Darrick J. Wong
  2025-03-21 20:49 ` Eric Sandeen
  0 siblings, 2 replies; 7+ messages in thread
From: bodonnel @ 2025-03-21 14:28 UTC (permalink / raw)
  To: linux-xfs; +Cc: djwong, sandeen, hch, Bill O'Donnell

From: Bill O'Donnell <bodonnel@redhat.com>

In certain cases, if a block is so messed up that crc, uuid and magic
number are all bad, we need to not only detect it in phase3 but also fix
it properly in phase6. The current code doesn't do that, because it only
pays attention to one of those parameters (the magic number).

Note: in this case, the inode link count (nlink) drops to 1, but
re-running xfs_repair fixes it back to 2. This is a side effect that
should probably be handled in update_inode_nlinks() with a separate patch.
Regardless, running xfs_repair twice fixes the issue. Also, this patch
fixes the issue with v5, but not v4 xfs.

Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>

v2: remove superfluous wantmagic logic

---
 repair/phase6.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/repair/phase6.c b/repair/phase6.c
index 4064a84b2450..9cffbb1f4510 100644
--- a/repair/phase6.c
+++ b/repair/phase6.c
@@ -2364,7 +2364,6 @@ longform_dir2_entry_check(
 	     da_bno = (xfs_dablk_t)next_da_bno) {
 		const struct xfs_buf_ops *ops;
 		int			 error;
-		struct xfs_dir2_data_hdr *d;
 
 		next_da_bno = da_bno + mp->m_dir_geo->fsbcount - 1;
 		if (bmap_next_offset(ip, &next_da_bno)) {
@@ -2404,9 +2403,7 @@ longform_dir2_entry_check(
 		}
 
 		/* check v5 metadata */
-		d = bp->b_addr;
-		if (be32_to_cpu(d->magic) == XFS_DIR3_BLOCK_MAGIC ||
-		    be32_to_cpu(d->magic) == XFS_DIR3_DATA_MAGIC) {
+		if (xfs_has_crc(mp)) {
 			error = check_dir3_header(mp, bp, ino);
 			if (error) {
 				fixit++;
-- 
2.48.1



* Re: [PATCH v2] xfs_repair: handling a block with bad crc, bad uuid, and bad magic number needs fixing
  2025-03-21 14:28 [PATCH v2] xfs_repair: handling a block with bad crc, bad uuid, and bad magic number needs fixing bodonnel
@ 2025-03-21 15:27 ` Darrick J. Wong
  2025-03-21 20:36   ` Bill O'Donnell
  2025-03-21 20:49 ` Eric Sandeen
  1 sibling, 1 reply; 7+ messages in thread
From: Darrick J. Wong @ 2025-03-21 15:27 UTC (permalink / raw)
  To: bodonnel; +Cc: linux-xfs, sandeen, hch

On Fri, Mar 21, 2025 at 09:28:49AM -0500, bodonnel@redhat.com wrote:
> From: Bill O'Donnell <bodonnel@redhat.com>
> 
> In certain cases, if a block is so messed up that crc, uuid and magic
> number are all bad, we need to not only detect in phase3 but fix it
> properly in phase6. In the current code, the mechanism doesn't work
> in that it only pays attention to one of the parameters.
> 
> Note: in this case, the nlink inode link count drops to 1, but
> re-running xfs_repair fixes it back to 2. This is a side effect that
> should probably be handled in update_inode_nlinks() with separate patch.
> Regardless, running xfs_repair twice fixes the issue. Also, this patch
> fixes the issue with v5, but not v4 xfs.
> 
> Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>

That makes sense.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

Bonus question: does longform_dir2_check_leaf need a similar correction
for:

	if (leafhdr.magic == XFS_DIR3_LEAF1_MAGIC) {
		error = check_da3_header(mp, bp, ip->i_ino);
		if (error) {
			libxfs_buf_relse(bp);
			return error;
		}
	}

--D



* Re: [PATCH v2] xfs_repair: handling a block with bad crc, bad uuid, and bad magic number needs fixing
  2025-03-21 15:27 ` Darrick J. Wong
@ 2025-03-21 20:36   ` Bill O'Donnell
  2025-03-21 20:39     ` Darrick J. Wong
  0 siblings, 1 reply; 7+ messages in thread
From: Bill O'Donnell @ 2025-03-21 20:36 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, sandeen, hch

On Fri, Mar 21, 2025 at 08:27:25AM -0700, Darrick J. Wong wrote:
> On Fri, Mar 21, 2025 at 09:28:49AM -0500, bodonnel@redhat.com wrote:
> > From: Bill O'Donnell <bodonnel@redhat.com>
> > 
> > In certain cases, if a block is so messed up that crc, uuid and magic
> > number are all bad, we need to not only detect in phase3 but fix it
> > properly in phase6. In the current code, the mechanism doesn't work
> > in that it only pays attention to one of the parameters.
> > 
> > Note: in this case, the nlink inode link count drops to 1, but
> > re-running xfs_repair fixes it back to 2. This is a side effect that
> > should probably be handled in update_inode_nlinks() with separate patch.
> > Regardless, running xfs_repair twice fixes the issue. Also, this patch
> > fixes the issue with v5, but not v4 xfs.
> > 
> > Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
> 
> That makes sense.
> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
> 
> Bonus question: does longform_dir2_check_leaf need a similar correction
> for:
> 
> 	if (leafhdr.magic == XFS_DIR3_LEAF1_MAGIC) {
> 		error = check_da3_header(mp, bp, ip->i_ino);
> 		if (error) {
> 			libxfs_buf_relse(bp);
> 			return error;
> 		}
> 	}
> --D
> 

I believe so, yes. Basing the v4/v5 decisions on an assumed correct
magic number is not so good. I'll fix it in a new version or separate
patch if preferred.

Thanks-
Bill





* Re: [PATCH v2] xfs_repair: handling a block with bad crc, bad uuid, and bad magic number needs fixing
  2025-03-21 20:36   ` Bill O'Donnell
@ 2025-03-21 20:39     ` Darrick J. Wong
  2025-03-21 23:57       ` Bill O'Donnell
  0 siblings, 1 reply; 7+ messages in thread
From: Darrick J. Wong @ 2025-03-21 20:39 UTC (permalink / raw)
  To: Bill O'Donnell; +Cc: linux-xfs, sandeen, hch

On Fri, Mar 21, 2025 at 03:36:39PM -0500, Bill O'Donnell wrote:
> On Fri, Mar 21, 2025 at 08:27:25AM -0700, Darrick J. Wong wrote:
> > On Fri, Mar 21, 2025 at 09:28:49AM -0500, bodonnel@redhat.com wrote:
> > > From: Bill O'Donnell <bodonnel@redhat.com>
> > > 
> > > In certain cases, if a block is so messed up that crc, uuid and magic
> > > number are all bad, we need to not only detect in phase3 but fix it
> > > properly in phase6. In the current code, the mechanism doesn't work
> > > in that it only pays attention to one of the parameters.
> > > 
> > > Note: in this case, the nlink inode link count drops to 1, but
> > > re-running xfs_repair fixes it back to 2. This is a side effect that
> > > should probably be handled in update_inode_nlinks() with separate patch.
> > > Regardless, running xfs_repair twice fixes the issue. Also, this patch
> > > fixes the issue with v5, but not v4 xfs.
> > > 
> > > Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
> > 
> > That makes sense.
> > Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
> > 
> > Bonus question: does longform_dir2_check_leaf need a similar correction
> > for:
> > 
> > 	if (leafhdr.magic == XFS_DIR3_LEAF1_MAGIC) {
> > 		error = check_da3_header(mp, bp, ip->i_ino);
> > 		if (error) {
> > 			libxfs_buf_relse(bp);
> > 			return error;
> > 		}
> > 	}
> > --D
> > 
> 
> I believe so, yes. Basing the v4/v5 decisions on an assumed correct
> magic number is not so good. I'll fix it in a new version or separate
> patch if preferred.

It's up to you, but since this fix has already earned its review, how
about a separate patch? :)

--D



* Re: [PATCH v2] xfs_repair: handling a block with bad crc, bad uuid, and bad magic number needs fixing
  2025-03-21 14:28 [PATCH v2] xfs_repair: handling a block with bad crc, bad uuid, and bad magic number needs fixing bodonnel
  2025-03-21 15:27 ` Darrick J. Wong
@ 2025-03-21 20:49 ` Eric Sandeen
  2025-03-21 21:57   ` Bill O'Donnell
  1 sibling, 1 reply; 7+ messages in thread
From: Eric Sandeen @ 2025-03-21 20:49 UTC (permalink / raw)
  To: bodonnel, linux-xfs; +Cc: djwong, hch

On 3/21/25 9:28 AM, bodonnel@redhat.com wrote:
> From: Bill O'Donnell <bodonnel@redhat.com>
> 
> In certain cases, if a block is so messed up that crc, uuid and magic
> number are all bad, we need to not only detect in phase3 but fix it
> properly in phase6. In the current code, the mechanism doesn't work
> in that it only pays attention to one of the parameters.
> 
> Note: in this case, the nlink inode link count drops to 1, but
> re-running xfs_repair fixes it back to 2. This is a side effect that
> should probably be handled in update_inode_nlinks() with separate patch.
> Regardless, running xfs_repair twice fixes the issue. Also, this patch
> fixes the issue with v5, but not v4 xfs.

Nitpick: IIRC V4 filesystems do not have UUIDs in metadata blocks,
so I think this problem is unique to corrupted V5 filesystems.

-Eric



* Re: [PATCH v2] xfs_repair: handling a block with bad crc, bad uuid, and bad magic number needs fixing
  2025-03-21 20:49 ` Eric Sandeen
@ 2025-03-21 21:57   ` Bill O'Donnell
  0 siblings, 0 replies; 7+ messages in thread
From: Bill O'Donnell @ 2025-03-21 21:57 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-xfs, djwong, hch

On Fri, Mar 21, 2025 at 03:49:59PM -0500, Eric Sandeen wrote:
> On 3/21/25 9:28 AM, bodonnel@redhat.com wrote:
> > From: Bill O'Donnell <bodonnel@redhat.com>
> > 
> > In certain cases, if a block is so messed up that crc, uuid and magic
> > number are all bad, we need to not only detect in phase3 but fix it
> > properly in phase6. In the current code, the mechanism doesn't work
> > in that it only pays attention to one of the parameters.
> > 
> > Note: in this case, the nlink inode link count drops to 1, but
> > re-running xfs_repair fixes it back to 2. This is a side effect that
> > should probably be handled in update_inode_nlinks() with separate patch.
> > Regardless, running xfs_repair twice fixes the issue. Also, this patch
> > fixes the issue with v5, but not v4 xfs.
> 
> Nitpick: IIRC V4 filesystems do not have UUIDs in metadata blocks,
> so I think this problem is unique to corrupted V5 filesystems.

Right. I'll send a patch version 3, just to clarify the message.

Thanks!
-Bill




* Re: [PATCH v2] xfs_repair: handling a block with bad crc, bad uuid, and bad magic number needs fixing
  2025-03-21 20:39     ` Darrick J. Wong
@ 2025-03-21 23:57       ` Bill O'Donnell
  0 siblings, 0 replies; 7+ messages in thread
From: Bill O'Donnell @ 2025-03-21 23:57 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, sandeen, hch

On Fri, Mar 21, 2025 at 01:39:14PM -0700, Darrick J. Wong wrote:
> On Fri, Mar 21, 2025 at 03:36:39PM -0500, Bill O'Donnell wrote:
> > On Fri, Mar 21, 2025 at 08:27:25AM -0700, Darrick J. Wong wrote:
> > > On Fri, Mar 21, 2025 at 09:28:49AM -0500, bodonnel@redhat.com wrote:
> > > > From: Bill O'Donnell <bodonnel@redhat.com>
> > > > 
> > > > In certain cases, if a block is so messed up that crc, uuid and magic
> > > > number are all bad, we need to not only detect in phase3 but fix it
> > > > properly in phase6. In the current code, the mechanism doesn't work
> > > > in that it only pays attention to one of the parameters.
> > > > 
> > > > Note: in this case, the nlink inode link count drops to 1, but
> > > > re-running xfs_repair fixes it back to 2. This is a side effect that
> > > > should probably be handled in update_inode_nlinks() with separate patch.
> > > > Regardless, running xfs_repair twice fixes the issue. Also, this patch
> > > > fixes the issue with v5, but not v4 xfs.
> > > > 
> > > > Signed-off-by: Bill O'Donnell <bodonnel@redhat.com>
> > > 
> > > That makes sense.
> > > Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
> > > 
> > > Bonus question: does longform_dir2_check_leaf need a similar correction
> > > for:
> > > 
> > > 	if (leafhdr.magic == XFS_DIR3_LEAF1_MAGIC) {
> > > 		error = check_da3_header(mp, bp, ip->i_ino);
> > > 		if (error) {
> > > 			libxfs_buf_relse(bp);
> > > 			return error;
> > > 		}
> > > 	}
> > > --D
> > > 
> > 
> > I believe so, yes. Basing the v4/v5 decisions on an assumed correct
> > magic number is not so good. I'll fix it in a new version or separate
> > patch if preferred.
> 
> It's up to you, but since this fix has already earned its review, how
> about a separate patch? :)

That's what I'll do. Thanks again for the review :)
-Bill





end of thread, other threads:[~2025-03-21 23:57 UTC | newest]

Thread overview: 7+ messages
2025-03-21 14:28 [PATCH v2] xfs_repair: handling a block with bad crc, bad uuid, and bad magic number needs fixing bodonnel
2025-03-21 15:27 ` Darrick J. Wong
2025-03-21 20:36   ` Bill O'Donnell
2025-03-21 20:39     ` Darrick J. Wong
2025-03-21 23:57       ` Bill O'Donnell
2025-03-21 20:49 ` Eric Sandeen
2025-03-21 21:57   ` Bill O'Donnell
