public inbox for linux-xfs@vger.kernel.org
* xfs_repair after data corruption (not caused by xfs, but by failing nvme drive)
@ 2025-01-20 15:15 Christian Brauner
  2025-01-21 21:32 ` Dave Chinner
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Christian Brauner @ 2025-01-20 15:15 UTC (permalink / raw)
  To: Dave Chinner, djwong, hch; +Cc: linux-xfs, linux-fsdevel

Hey,

so last week I got a nice surprise when my (relatively new) nvme drive
decided to tell me to gf myself. I managed to recover by now and get
pull requests out and am back in a working state.

I had to reboot and it turned out that my LUKS encrypted xfs filesystem
got corrupted. I booted a live image and did a ddrescue to an external
drive in the hopes of recovering the things that hadn't been backed up
and also I didn't want to have to go and set up my laptop again.
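For reference, a ddrescue salvage of this kind typically looks like the
commented invocation below (the device and image paths are hypothetical,
not taken from this thread); the self-contained dd demo under it only
illustrates the keep-copying-past-read-errors idea on a scratch file:

```shell
# Hypothetical ddrescue salvage (do NOT point at real devices blindly):
#   ddrescue -d /dev/nvme0n1p4 /mnt/external/rescue.img /mnt/external/rescue.map
# Self-contained demo of the underlying sector copy with plain dd:
src=$(mktemp); dst=$(mktemp)
head -c 4096 /dev/urandom > "$src"           # 8 sectors of scratch data
dd if="$src" of="$dst" bs=512 conv=noerror,sync 2>/dev/null
cmp -s "$src" "$dst" && echo copy-ok         # identical when no read errors
rm -f "$src" "$dst"
```

ddrescue's mapfile lets interrupted runs resume and retry bad regions;
plain dd with conv=noerror,sync merely zero-pads blocks it cannot read.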

The xfs filesystem was mountable with:

mount -t xfs -o norecovery,ro /dev/mapper/dm4 /mnt

and I was able to copy out everything without a problem.

However, I was curious whether xfs_repair would get me anything and so I
tried it (with and without the -L option and with and without the -o
force_geometry option).

What was surprising to me is that xfs_repair failed at the first step
finding a usable superblock:

> sudo xfs_repair /dev/mapper/dm-sdd4
Phase 1 - find and verify superblock...
couldn't verify primary superblock - not enough secondary superblocks with matching geometry !!!

attempting to find secondary superblock...
..found candidate secondary superblock...
unable to verify superblock, continuing...
....found candidate secondary superblock...
unable to verify superblock, continuing...

I let it run across the whole drive and it did find a lot more candidate
superblocks, but it told me it couldn't verify any of them.

I was surprised that I was able to recover all my data and mount the
filesystem without issues, while xfs_repair failed to validate any
superblocks.

I honestly am just curious why xfs_repair fails to validate any
superblocks.

This is the splat I get when mounting without norecovery,ro:

[88222.149672] XFS (dm-4): Mounting V5 Filesystem 80526d30-90c7-4347-9d9e-333db3f5353b
[88222.632954] XFS (dm-4): Starting recovery (logdev: internal)
[88224.056721] XFS (dm-4): Metadata CRC error detected at xfs_agfl_read_verify+0xa5/0x120 [xfs], xfs_agfl block 0xeb6f603
[88224.057319] XFS (dm-4): Unmount and run xfs_repair
[88224.057328] XFS (dm-4): First 128 bytes of corrupted metadata buffer:
[88224.057338] 00000000: f1 80 cf 13 6c 73 aa 39 55 20 29 5c 2a ca ee 9a  ....ls.9U )\*...
[88224.057346] 00000010: 5a 0f 56 de ff da 93 5a 95 f2 01 ff 9f e7 6f 86  Z.V....Z......o.
[88224.057353] 00000020: dc 90 f4 ad 8b 7c 6d 47 87 1d b6 47 80 25 d0 d5  .....|mG...G.%..
[88224.057359] 00000030: da 36 1c f4 ee 22 e0 f4 b4 19 9a 74 bf d2 7d 49  .6...".....t..}I
[88224.057366] 00000040: 2e 1c 0d 62 a9 93 7b c0 53 b5 52 b7 eb 58 d3 52  ...b..{.S.R..X.R
[88224.057371] 00000050: fc 4b 13 cc 42 c7 36 88 1d 52 28 ef c7 20 cb 39  .K..B.6..R(.. .9
[88224.057377] 00000060: f7 db 9a 83 2c eb 23 52 b3 1a 85 bb d6 5e ff 4b  ....,.#R.....^.K
[88224.057383] 00000070: c3 3d 88 a6 dd bf ab 2a 94 1d 2d 19 6c b5 d1 e5  .=.....*..-.l...
[88224.057473] XFS (dm-4): metadata I/O error in "xfs_alloc_read_agfl+0x9b/0x100 [xfs]" at daddr 0xeb6f603 len 1 error 74
[88224.058055] XFS (dm-4): Metadata I/O Error (0x1) detected at xfs_trans_read_buf_map+0x159/0x300 [xfs] (fs/xfs/xfs_trans_buf.c:296).  Shutting down filesystem.
[88224.058638] XFS (dm-4): Please unmount the filesystem and rectify the problem(s)
[88224.058654] 00000000: 40 12 01 00 01 00 00 00 e0 da a2 29 81 88 ff ff  @..........)....
[88224.058662] XFS (dm-4): Internal error xfs_rmap_recover_work at line 544 of file fs/xfs/xfs_rmap_item.c.  Caller xfs_defer_finish_recovery+0x21/0x90 [xfs]
[88224.059283] CPU: 0 UID: 0 PID: 164438 Comm: mount Not tainted 6.12.9-amd64 #1  Debian 6.12.9-1
[88224.059298] Hardware name: LENOVO 20KHCTO1WW/20KHCTO1WW, BIOS N23ET88W (1.63 ) 02/28/2024
[88224.059305] Call Trace:
[88224.059314]  <TASK>
[88224.059325]  dump_stack_lvl+0x5d/0x80
[88224.059347]  xfs_corruption_error+0x92/0xa0 [xfs]
[88224.059942]  ? xfs_defer_finish_recovery+0x21/0x90 [xfs]
[88224.060615]  xfs_rmap_recover_work+0x38d/0x3b0 [xfs]
[88224.061148]  ? xfs_defer_finish_recovery+0x21/0x90 [xfs]
[88224.061704]  xfs_defer_finish_recovery+0x21/0x90 [xfs]
[88224.062240]  xlog_recover_process_intents+0x75/0x210 [xfs]
[88224.062768]  ? xfs_read_agf+0x95/0x150 [xfs]
[88224.063024]  ? lock_timer_base+0x76/0xa0
[88224.063030]  xlog_recover_finish+0x4a/0x310 [xfs]
[88224.063215]  xfs_log_mount_finish+0x115/0x170 [xfs]
[88224.063396]  xfs_mountfs+0x58d/0x990 [xfs]
[88224.063578]  xfs_fs_fill_super+0x5a3/0x9b0 [xfs]
[88224.063760]  ? __pfx_xfs_fs_fill_super+0x10/0x10 [xfs]
[88224.063936]  get_tree_bdev_flags+0x131/0x1d0
[88224.063942]  vfs_get_tree+0x26/0xd0
[88224.063946]  vfs_cmd_create+0x59/0xe0
[88224.063951]  __do_sys_fsconfig+0x4e3/0x6b0
[88224.063956]  do_syscall_64+0x82/0x190
[88224.063963]  ? generic_permission+0x39/0x220
[88224.063967]  ? mntput_no_expire+0x4a/0x260
[88224.063971]  ? do_faccessat+0x1e1/0x2e0
[88224.063975]  ? syscall_exit_to_user_mode+0x4d/0x210
[88224.063981]  ? do_syscall_64+0x8e/0x190
[88224.063986]  ? exc_page_fault+0x7e/0x180
[88224.063990]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[88224.063996] RIP: 0033:0x7f0e424557da
[88224.064027] Code: 73 01 c3 48 8b 0d 46 56 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 49 89 ca b8 af 01 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 16 56 0d 00 f7 d8 64 89 01 48
[88224.064031] RSP: 002b:00007ffd930e3098 EFLAGS: 00000246 ORIG_RAX: 00000000000001af
[88224.064036] RAX: ffffffffffffffda RBX: 0000564cee560a60 RCX: 00007f0e424557da
[88224.064039] RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000003
[88224.064041] RBP: 0000564cee561b40 R08: 0000000000000000 R09: 00007f0e4252bb20
[88224.064044] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[88224.064046] R13: 00007f0e425dc5e0 R14: 00007f0e425de244 R15: 00007f0e425c57bf
[88224.064051]  </TASK>
[88224.064059] XFS (dm-4): Corruption detected. Unmount and run xfs_repair
[88224.064066] XFS (dm-4): Failed to recover intents
[88224.064069] XFS (dm-4): Ending recovery (logdev: internal)
[88224.064169] XFS (dm-4): log mount finish failed

With xfs_metadump I get:

> sudo xfs_metadump /dev/mapper/dm-sdd4 xfs_corrupt.metadump                                    

Superblock has bad magic number 0xa604f4c6. Not an XFS filesystem?
Metadata CRC error detected at 0x55d6d5e1c553, xfs_agfl block 0xeb6f603/0x200

I can generate a metadump image if that's helpful and there's interest
in looking into this. But as I said, I've recovered so I don't want to
waste your time.

Thanks!
Christian

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: xfs_repair after data corruption (not caused by xfs, but by failing nvme drive)
  2025-01-20 15:15 xfs_repair after data corruption (not caused by xfs, but by failing nvme drive) Christian Brauner
@ 2025-01-21 21:32 ` Dave Chinner
  2025-01-22  6:04 ` Christoph Hellwig
  2025-01-22 21:58 ` Dave Chinner
  2 siblings, 0 replies; 5+ messages in thread
From: Dave Chinner @ 2025-01-21 21:32 UTC (permalink / raw)
  To: Christian Brauner; +Cc: djwong, hch, linux-xfs, linux-fsdevel

On Mon, Jan 20, 2025 at 04:15:00PM +0100, Christian Brauner wrote:
> Hey,
> 
> so last week I got a nice surprise when my (relatively new) nvme drive
> decided to tell me to gf myself. I managed to recover by now and get
> pull requests out and am back in a working state.
....

> I honestly am just curious why xfs_repair fails to validate any
> superblocks.

Ditto. It should be doing the same checks as the runtime validation
code...

> This is the splat I get when mounting without norecovery,ro:
> 
> [88222.149672] XFS (dm-4): Mounting V5 Filesystem 80526d30-90c7-4347-9d9e-333db3f5353b
> [88222.632954] XFS (dm-4): Starting recovery (logdev: internal)
> [88224.056721] XFS (dm-4): Metadata CRC error detected at xfs_agfl_read_verify+0xa5/0x120 [xfs], xfs_agfl block 0xeb6f603
> [88224.057319] XFS (dm-4): Unmount and run xfs_repair
> [88224.057328] XFS (dm-4): First 128 bytes of corrupted metadata buffer:
> [88224.057338] 00000000: f1 80 cf 13 6c 73 aa 39 55 20 29 5c 2a ca ee 9a  ....ls.9U )\*...
> [88224.057346] 00000010: 5a 0f 56 de ff da 93 5a 95 f2 01 ff 9f e7 6f 86  Z.V....Z......o.
> [88224.057353] 00000020: dc 90 f4 ad 8b 7c 6d 47 87 1d b6 47 80 25 d0 d5  .....|mG...G.%..
> [88224.057359] 00000030: da 36 1c f4 ee 22 e0 f4 b4 19 9a 74 bf d2 7d 49  .6...".....t..}I
> [88224.057366] 00000040: 2e 1c 0d 62 a9 93 7b c0 53 b5 52 b7 eb 58 d3 52  ...b..{.S.R..X.R
> [88224.057371] 00000050: fc 4b 13 cc 42 c7 36 88 1d 52 28 ef c7 20 cb 39  .K..B.6..R(.. .9
> [88224.057377] 00000060: f7 db 9a 83 2c eb 23 52 b3 1a 85 bb d6 5e ff 4b  ....,.#R.....^.K
> [88224.057383] 00000070: c3 3d 88 a6 dd bf ab 2a 94 1d 2d 19 6c b5 d1 e5  .=.....*..-.l...

Yeah, that's garbage.

> With xfs_metadump I get:
> 
> > sudo xfs_metadump /dev/mapper/dm-sdd4 xfs_corrupt.metadump                                    
> 
> Superblock has bad magic number 0xa604f4c6. Not an XFS filesystem?
> Metadata CRC error detected at 0x55d6d5e1c553, xfs_agfl block 0xeb6f603/0x200

That might be complaining about a secondary superblock, not the
primary. That would explain why it mounts...

> I can generate a metadump image if that's helpful and there's interest
> in looking into this. But as I said, I've recovered so I don't want to
> waste your time.

I'd like to have a look at the metadump image of the broken fs if
you've still got it.

-Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs_repair after data corruption (not caused by xfs, but by failing nvme drive)
  2025-01-20 15:15 xfs_repair after data corruption (not caused by xfs, but by failing nvme drive) Christian Brauner
  2025-01-21 21:32 ` Dave Chinner
@ 2025-01-22  6:04 ` Christoph Hellwig
  2025-01-22 21:58 ` Dave Chinner
  2 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2025-01-22  6:04 UTC (permalink / raw)
  To: Christian Brauner; +Cc: Dave Chinner, djwong, hch, linux-xfs, linux-fsdevel

As Dave said, this looks like the secondary superblocks got messed up.
repair really should not fail because of that; we can look into it.

If you still have the fs around for a metadump image, that would help,
but otherwise it should be possible to craft a reproducer.



* Re: xfs_repair after data corruption (not caused by xfs, but by failing nvme drive)
  2025-01-20 15:15 xfs_repair after data corruption (not caused by xfs, but by failing nvme drive) Christian Brauner
  2025-01-21 21:32 ` Dave Chinner
  2025-01-22  6:04 ` Christoph Hellwig
@ 2025-01-22 21:58 ` Dave Chinner
  2025-01-24  8:53   ` Christian Brauner
  2 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2025-01-22 21:58 UTC (permalink / raw)
  To: Christian Brauner; +Cc: djwong, hch, linux-xfs, linux-fsdevel

On Mon, Jan 20, 2025 at 04:15:00PM +0100, Christian Brauner wrote:
> Hey,
> 
> so last week I got a nice surprise when my (relatively new) nvme drive
> decided to tell me to gf myself. I managed to recover by now and get
> pull requests out and am back in a working state.
> 
> I had to reboot and it turned out that my LUKS encrypted xfs filesystem
> got corrupted. I booted a live image and did a ddrescue to an external
> drive in the hopes of recovering the things that hadn't been backed up
> and also I didn't want to have to go and set up my laptop again.
> 
> The xfs filesystem was mountable with:
> 
> mount -t xfs -o norecovery,ro /dev/mapper/dm4 /mnt
> 
> and I was able to copy out everything without a problem.
> 
> However, I was curious whether xfs_repair would get me anything and so I
> tried it (with and without the -L option and with and without the -o
> force_geometry option).
> 
> What was surprising to me is that xfs_repair failed at the first step
> finding a usable superblock:
> 
> > sudo xfs_repair /dev/mapper/dm-sdd4
> Phase 1 - find and verify superblock...
> couldn't verify primary superblock - not enough secondary superblocks with matching geometry !!!
> 
> attempting to find secondary superblock...
> ..found candidate secondary superblock...
> unable to verify superblock, continuing...
> ....found candidate secondary superblock...
> unable to verify superblock, continuing...

Yeah, so it's a 4 AG filesystem, so it has 1 primary superblock and 3
secondary superblocks. Two of the 3 secondary superblocks are trash,
and repair needs 2 of the secondary superblocks to match the primary
for it to validate the primary as a good superblock.

xfs_repair considers this situation as "too far gone to reliably
repair" and so aborts.

I did notice a pattern to the corruption, though. While sb 1 is
trashed, the adjacent sector (agf 1) is perfectly fine, as is agi 1.
But agfl 1 is trash, while the first filesystem block after these (a
free space btree block) is intact. In the case of sb 3, it's just a
single sector that is gone.

To find if there were any other metadata corruptions, I copied the
primary superblock over the corrupted one in AG 1:

xfs_db> sb 1
Superblock has bad magic number 0xa604f4c6. Not an XFS filesystem?
xfs_db> daddr
datadev daddr is 246871552
xfs_db> q
$ dd if=t.img of=t.img oseek=246871552 bs=512 count=1 conv=notrunc
...

and then ran repair on it again. This time repair ran (after zeroing
the log) and there were no corruptions other than what I'd expect
from zeroing the log (e.g. unlinked inode lists were populated,
some free space mismatches, etc).
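That sector-copy can be sketched on a throwaway file with no XFS
involved (the 4-sector image and pattern below are made up for the
demo; `seek=` is the portable GNU dd spelling of the `oseek=` used
above):

```shell
img=$(mktemp)
# Build a 2048-byte (4-sector) image; sector 0 carries a recognizable pattern.
{ printf 'XFSB-sb-pattern'; head -c 497 /dev/zero; } > "$img"   # 512 bytes
head -c 1536 /dev/zero >> "$img"                                # sectors 1-3
# Overwrite sector 3 with sector 0 (skip= defaults to 0; conv=notrunc
# keeps dd from truncating the file at the write offset).
dd if="$img" of="$img" seek=3 bs=512 count=1 conv=notrunc 2>/dev/null
cmp -s <(dd if="$img" bs=512 count=1 2>/dev/null) \
       <(dd if="$img" bs=512 skip=3 count=1 2>/dev/null) && echo sectors-match
rm -f "$img"
```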

Hence there don't appear to be any other metadata corruptions
outside of the 3 bad sectors already identified. Two of those
sectors were considered critical by repair, hence its failure.

What I suspect happened is that the drive lost the first flash page
that data was ever written to - mkfs lays down the AG headers first,
so there is every chance that the FTL put them in the same physical
page. The primary superblock and all the AGI, AGF and AGFL headers
get rewritten all the time, so the current versions of them will
have been moved to some other page immediately. Hence if the original
page is lost, the contents of those sectors will still be valid.
However, the secondary superblocks never get rewritten, so only they
get lost.

Journal recovery failed on the AGFL sector in AG 1 that was also
corrupted - that had been rewritten many times, so it's possible
that the drive lost multiple flash pages. It is also possible that
garbage collection had recently relocated the secondary superblocks
and that AGFL into the same page and that was lost. This is only
speculation, though.

That said, Christian, I wouldn't trust any of the recovered data to
be perfectly intact - there's every chance random files have random
data corruption in them. Even though the filesystem was recovered,
it is worth checking the validity of the data as much as you can...
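One way to do that validity check, assuming some known-good copy (a
backup, a git checkout, distro package checksums) still exists to
compare against, is a checksum manifest; the trees and filenames below
are fabricated for the demo:

```shell
backup=$(mktemp -d); recovered=$(mktemp -d); sums=$(mktemp)
echo "good data" | tee "$backup/a.txt" > "$recovered/a.txt"   # intact file
echo "original"  > "$backup/b.txt"
echo "corrupted" > "$recovered/b.txt"                         # silent damage
# Hash the known-good tree, then verify the recovered tree against it.
(cd "$backup" && find . -type f -exec sha256sum {} +) > "$sums"
(cd "$recovered" && sha256sum -c "$sums" 2>/dev/null | grep FAILED)
```

Anything reported FAILED needs to be restored from a good copy rather
than trusted.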

-Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs_repair after data corruption (not caused by xfs, but by failing nvme drive)
  2025-01-22 21:58 ` Dave Chinner
@ 2025-01-24  8:53   ` Christian Brauner
  0 siblings, 0 replies; 5+ messages in thread
From: Christian Brauner @ 2025-01-24  8:53 UTC (permalink / raw)
  To: Dave Chinner; +Cc: djwong, hch, linux-xfs, linux-fsdevel

On Thu, Jan 23, 2025 at 08:58:26AM +1100, Dave Chinner wrote:
> On Mon, Jan 20, 2025 at 04:15:00PM +0100, Christian Brauner wrote:
> > Hey,
> > 
> > so last week I got a nice surprise when my (relatively new) nvme drive
> > decided to tell me to gf myself. I managed to recover by now and get
> > pull requests out and am back in a working state.
> > 
> > I had to reboot and it turned out that my LUKS encrypted xfs filesystem
> > got corrupted. I booted a live image and did a ddrescue to an external
> > drive in the hopes of recovering the things that hadn't been backed up
> > and also I didn't want to have to go and set up my laptop again.
> > 
> > The xfs filesystem was mountable with:
> > 
> > mount -t xfs -o norecovery,ro /dev/mapper/dm4 /mnt
> > 
> > and I was able to copy out everything without a problem.
> > 
> > However, I was curious whether xfs_repair would get me anything and so I
> > tried it (with and without the -L option and with and without the -o
> > force_geometry option).
> > 
> > What was surprising to me is that xfs_repair failed at the first step
> > finding a usable superblock:
> > 
> > > sudo xfs_repair /dev/mapper/dm-sdd4
> > Phase 1 - find and verify superblock...
> > couldn't verify primary superblock - not enough secondary superblocks with matching geometry !!!
> > 
> > attempting to find secondary superblock...
> > ..found candidate secondary superblock...
> > unable to verify superblock, continuing...
> > ....found candidate secondary superblock...
> > unable to verify superblock, continuing...
> 
> Yeah, so it's a 4 AG filesystem, so it has 1 primary superblock and 3
> secondary superblocks. Two of the 3 secondary superblocks are trash,
> and repair needs 2 of the secondary superblocks to match the primary
> for it to validate the primary as a good superblock.
> 
> xfs_repair considers this situation as "too far gone to reliably
> repair" and so aborts.
> 
> I did notice a pattern to the corruption, though. While sb 1 is
> trashed, the adjacent sector (agf 1) is perfectly fine, as is agi 1.
> But agfl 1 is trash, while the first filesystem block after these (a
> free space btree block) is intact. In the case of sb 3, it's just a
> single sector that is gone.
> 
> To find if there were any other metadata corruptions, I copied the
> primary superblock over the corrupted one in AG 1:
> 
> xfs_db> sb 1
> Superblock has bad magic number 0xa604f4c6. Not an XFS filesystem?
> xfs_db> daddr
> datadev daddr is 246871552
> xfs_db> q
> $ dd if=t.img of=t.img oseek=246871552 bs=512 count=1 conv=notrunc
> ...
> 
> and then ran repair on it again. This time repair ran (after zeroing
> the log) and there were no corruptions other than what I'd expect
> from zeroing the log (e.g. unlinked inode lists were populated,
> some free space mismatches, etc).
> 
> Hence there don't appear to be any other metadata corruptions
> outside of the 3 bad sectors already identified. Two of those
> sectors were considered critical by repair, hence its failure.
> 
> What I suspect happened is that the drive lost the first flash page
> that data was ever written to - mkfs lays down the AG headers first,
> so there is every chance that the FTL put them in the same physical
> page. The primary superblock and all the AGI, AGF and AGFL headers
> get rewritten all the time, so the current versions of them will
> have been moved to some other page immediately. Hence if the original
> page is lost, the contents of those sectors will still be valid.
> However, the secondary superblocks never get rewritten, so only they
> get lost.
> 
> Journal recovery failed on the AGFL sector in AG 1 that was also
> corrupted - that had been rewritten many times, so it's possible
> that the drive lost multiple flash pages. It is also possible that
> garbage collection had recently relocated the secondary superblocks
> and that AGFL into the same page and that was lost. This is only
> speculation, though.

Thanks for taking the time to look into this!

> 
> That said, Christian, I wouldn't trust any of the recovered data to
> be perfectly intact - there's every chance random files have random

Yes, I think I'm fine with that risk. The data I recovered is strictly
from /home/ so at least I won't have to worry about some system library
being corrupted.

> data corruption in them. Even though the filesystem was recovered,
> it is worth checking the validity of the data as much as you can...

Fwiw, xfs did a great job here. I was very happy with how it behaved
even though that drive was shot to hell!

