Linux XFS filesystem development
* xfs_metadump segmentation fault on large fs - xfsprogs 6.1
@ 2025-07-25 11:27 ` hubert .
  2025-07-26  3:51   ` Carlos Maiolino
  0 siblings, 1 reply; 14+ messages in thread
From: hubert . @ 2025-07-25 11:27 UTC (permalink / raw)
  To: linux-xfs@vger.kernel.org

Hi,

A few months ago we had a serious crash in our monster RAID60 (~590TB) when one of the subvolume's disks failed and then the rebuild process triggered failures in other drives (you guessed it, no backup).
The hardware issues were plentiful, to the point where we don't rule out problems in the Areca controller either, compounded by some probably poor decisions on my part.
The rebuild took weeks to complete and we left it in a degraded state so as not to make things worse.
The first attempt to mount it read-only of course failed. From journalctl:

kernel: XFS (sdb1): Mounting V5 Filesystem
kernel: XFS (sdb1): Starting recovery (logdev: internal)
kernel: XFS (sdb1): Metadata CRC error detected at xfs_agf_read_verify+0x70/0x120 [xfs], xfs_agf block 0xa7fffff59
kernel: XFS (sdb1): Unmount and run xfs_repair
kernel: XFS (sdb1): First 64 bytes of corrupted metadata buffer:
kernel: ffff89b444a94400: 74 4e 5a cc ae eb a0 6d 6c 08 95 5e ed 6b a4 ff  tNZ....ml..^.k..
kernel: ffff89b444a94410: be d2 05 24 09 f2 0a d2 66 f3 be 3a 7b 97 9a 84  ...$....f..:{...
kernel: ffff89b444a94420: a4 95 78 72 58 08 ca ec 10 a7 c3 20 1a a3 a6 08  ..xrX...... ....
kernel: ffff89b444a94430: b0 43 0f d6 80 fd 12 25 70 de 7f 28 78 26 3d 94  .C.....%p..(x&=.
kernel: XFS (sdb1): metadata I/O error: block 0xa7fffff59 ("xfs_trans_read_buf_map") error 74 numblks 1

Following the advice on the list, I attempted to run an xfs_metadump (xfsprogs 4.5.0), but after copying 30 out of 590 AGs, it segfaulted:
/usr/sbin/xfs_metadump: line 33:  3139 Segmentation fault      (core dumped) xfs_db$DBOPTS -i -p xfs_metadump -c "metadump$OPTS $2" $1

-journalctl:
xfs_db[3139]: segfault at 1015390b1 ip 0000000000407906 sp 00007ffcaef2c2c0 error 4 in xfs_db[400000+8a000]

Now, the host machine is rather critical and old, running CentOS 7 with a 3.10 kernel on a Xeon X5650. Not trusting the hardware, I used ddrescue to clone the partition to another, luckily available, system.
The copy went ok(?), but it did encounter read errors at the end, which confirmed my suspicion that the rebuild was not as successful as it appeared. About 10MB could not be retrieved.
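
For reference, the clone was done roughly like this (reconstructed from memory, not the exact command); the map file is what let ddrescue track and retry the unreadable areas:

# first pass: copy everything readable, keeping a map of the bad areas
ddrescue /dev/sdb1 /storage/image/sdb1.img /storage/image/sdb1.map
# extra passes: retry only the areas recorded as bad in the map
ddrescue -r3 /dev/sdb1 /storage/image/sdb1.img /storage/image/sdb1.map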

I attempted a metadump on the copy too, now on a machine with AMD EPYC 7302, 128GB RAM, a 6.1 kernel and xfsprogs v6.1.0.

# xfs_metadump -aogfw  /storage/image/sdb1.img   /storage/metadump/sdb1.metadump 2>&1 | tee mddump2.log

It again creates a 280MB dump and segfaults at 30 AGs:

Jul24 14:47] xfs_db[42584]: segfault at 557051a1d2b0 ip 0000556f19f1e090 sp 00007ffe431a7be0 error 4 in xfs_db[556f19f04000+64000] likely on CPU 21 (core 9, socket 0)
[  +0.000025] Code: 00 00 00 83 f8 0a 0f 84 90 07 00 00 c6 44 24 53 00 48 63 f1 49 89 ff 48 c1 e6 04 48 8d 54 37 f0 48 bf ff ff ff ff ff ff 3f 00 <48> 8b 02 48 8b 52 08 48 0f c8 48 c1 e8 09 48 0f ca 81 e2 ff ff 1f

This is the log https://pastebin.com/jsSFeCr6, which looks similar to the first one. The machine does not seem loaded at all and further tries result in the same code. 

My next step would be trying a later xfsprogs version, or maybe xfs_repair -n on a compatible CPU machine, as non-destructive options, but I feel I'm kidding myself as to whether I can recover anything at all from such a humongous disaster.

Thanks in advance for any input
Hub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs_metadump segmentation fault on large fs - xfsprogs 6.1
  2025-07-25 11:27 ` xfs_metadump segmentation fault on large fs - xfsprogs 6.1 hubert .
@ 2025-07-26  3:51   ` Carlos Maiolino
  2025-08-01 13:51     ` hubert .
  0 siblings, 1 reply; 14+ messages in thread
From: Carlos Maiolino @ 2025-07-26  3:51 UTC (permalink / raw)
  To: hubert .; +Cc: linux-xfs@vger.kernel.org

On Fri, Jul 25, 2025 at 11:27:40AM +0000, hubert . wrote:
> Hi,
> 
> A few months ago we had a serious crash in our monster RAID60 (~590TB) when one of the subvolume's disks failed and then the rebuild process triggered failures in other drives (you guessed it, no backup).
> The hardware issues were plentiful, to the point where we don't rule out problems in the Areca controller either, compounded by some probably poor decisions on my part.
> The rebuild took weeks to complete and we left it in a degraded state so as not to make things worse.
> The first attempt to mount it read-only of course failed. From journalctl:
> 
> kernel: XFS (sdb1): Mounting V5 Filesystem
> kernel: XFS (sdb1): Starting recovery (logdev: internal)
> kernel: XFS (sdb1): Metadata CRC error detected at xfs_agf_read_verify+0x70/0x120 [xfs], xfs_agf block 0xa7fffff59
> kernel: XFS (sdb1): Unmount and run xfs_repair
> kernel: XFS (sdb1): First 64 bytes of corrupted metadata buffer:
> kernel: ffff89b444a94400: 74 4e 5a cc ae eb a0 6d 6c 08 95 5e ed 6b a4 ff  tNZ....ml..^.k..
> kernel: ffff89b444a94410: be d2 05 24 09 f2 0a d2 66 f3 be 3a 7b 97 9a 84  ...$....f..:{...
> kernel: ffff89b444a94420: a4 95 78 72 58 08 ca ec 10 a7 c3 20 1a a3 a6 08  ..xrX...... ....
> kernel: ffff89b444a94430: b0 43 0f d6 80 fd 12 25 70 de 7f 28 78 26 3d 94  .C.....%p..(x&=.
> kernel: XFS (sdb1): metadata I/O error: block 0xa7fffff59 ("xfs_trans_read_buf_map") error 74 numblks 1
> 
> Following the advice on the list, I attempted to run an xfs_metadump (xfsprogs 4.5.0), but after copying 30 out of 590 AGs, it segfaulted:
> /usr/sbin/xfs_metadump: line 33:  3139 Segmentation fault      (core dumped) xfs_db$DBOPTS -i -p xfs_metadump -c "metadump$OPTS $2" $1

I'm not sure what you expect from a metadump; this is usually used for
post-mortem analysis, but you already know what went wrong and why.

> 
> -journalctl:
> xfs_db[3139]: segfault at 1015390b1 ip 0000000000407906 sp 00007ffcaef2c2c0 error 4 in xfs_db[400000+8a000]
> 
> Now, the host machine is rather critical and old, running CentOS 7 with a 3.10 kernel on a Xeon X5650. Not trusting the hardware, I used ddrescue to clone the partition to another, luckily available, system.
> The copy went ok(?), but it did encounter read errors at the end, which confirmed my suspicion that the rebuild was not as successful as it appeared. About 10MB could not be retrieved.
> 
> I attempted a metadump on the copy too, now on a machine with AMD EPYC 7302, 128GB RAM, a 6.1 kernel and xfsprogs v6.1.0.
> 
> # xfs_metadump -aogfw  /storage/image/sdb1.img   /storage/metadump/sdb1.metadump 2>&1 | tee mddump2.log
> 
> It again creates a 280MB dump and segfaults at 30 AGs:
> 
> Jul24 14:47] xfs_db[42584]: segfault at 557051a1d2b0 ip 0000556f19f1e090 sp 00007ffe431a7be0 error 4 in xfs_db[556f19f04000+64000] likely on CPU 21 (core 9, socket 0)
> [  +0.000025] Code: 00 00 00 83 f8 0a 0f 84 90 07 00 00 c6 44 24 53 00 48 63 f1 49 89 ff 48 c1 e6 04 48 8d 54 37 f0 48 bf ff ff ff ff ff ff 3f 00 <48> 8b 02 48 8b 52 08 48 0f c8 48 c1 e8 09 48 0f ca 81 e2 ff ff 1f
> 
> This is the log https://pastebin.com/jsSFeCr6, which looks similar to the first one. The machine does not seem loaded at all and further tries result in the same code.
> 
> My next step would be trying a later xfsprogs version, or maybe xfs_repair -n on a compatible CPU machine, as non-destructive options, but I feel I'm kidding myself as to whether I can recover anything at all from such a humongous disaster.

Yes, that's probably the best approach now: to run the latest xfsprogs
available.

Also, xfs_repair does not need to be executed on the same architecture
as the FS was running on. Apart from log replay (which is done by the Linux
kernel), xfs_repair is capable of converting the filesystem data
structures back and forth to the current machine's endianness.


> 
> Thanks in advance for any input
> Hub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs_metadump segmentation fault on large fs - xfsprogs 6.1
  2025-07-26  3:51   ` Carlos Maiolino
@ 2025-08-01 13:51     ` hubert .
  2025-08-18 15:56       ` hubert .
  0 siblings, 1 reply; 14+ messages in thread
From: hubert . @ 2025-08-01 13:51 UTC (permalink / raw)
  To: Carlos Maiolino, linux-xfs@vger.kernel.org

On 26.07.25 at 00:52, Carlos Maiolino wrote:
>  
> On Fri, Jul 25, 2025 at 11:27:40AM +0000, hubert . wrote:
> > Hi,
> >
> > A few months ago we had a serious crash in our monster RAID60 (~590TB) when one of the subvolume's disks failed and then the rebuild process triggered failures in other drives (you guessed it, no backup).
> > The hardware issues were plentiful, to the point where we don't rule out problems in the Areca controller either, compounded by some probably poor decisions on my part.
> > The rebuild took weeks to complete and we left it in a degraded state so as not to make things worse.
> > The first attempt to mount it read-only of course failed. From journalctl:
> >
> > kernel: XFS (sdb1): Mounting V5 Filesystem
> > kernel: XFS (sdb1): Starting recovery (logdev: internal)
> > kernel: XFS (sdb1): Metadata CRC error detected at xfs_agf_read_verify+0x70/0x120 [xfs], xfs_agf block 0xa7fffff59
> > kernel: XFS (sdb1): Unmount and run xfs_repair
> > kernel: XFS (sdb1): First 64 bytes of corrupted metadata buffer:
> > kernel: ffff89b444a94400: 74 4e 5a cc ae eb a0 6d 6c 08 95 5e ed 6b a4 ff  tNZ....ml..^.k..
> > kernel: ffff89b444a94410: be d2 05 24 09 f2 0a d2 66 f3 be 3a 7b 97 9a 84  ...$....f..:{...
> > kernel: ffff89b444a94420: a4 95 78 72 58 08 ca ec 10 a7 c3 20 1a a3 a6 08  ..xrX...... ....
> > kernel: ffff89b444a94430: b0 43 0f d6 80 fd 12 25 70 de 7f 28 78 26 3d 94  .C.....%p..(x&=.
> > kernel: XFS (sdb1): metadata I/O error: block 0xa7fffff59 ("xfs_trans_read_buf_map") error 74 numblks 1
> >
> > Following the advice on the list, I attempted to run an xfs_metadump (xfsprogs 4.5.0), but after copying 30 out of 590 AGs, it segfaulted:
> > /usr/sbin/xfs_metadump: line 33:  3139 Segmentation fault      (core dumped) xfs_db$DBOPTS -i -p xfs_metadump -c "metadump$OPTS $2" $1
>
> I'm not sure what you expect from a metadump; this is usually used for
> post-mortem analysis, but you already know what went wrong and why.

I was hoping to have a restored metadata file I could try things on
without risking the copy, since it's not possible to have a second one
with this inordinate amount of data.
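
Something along these lines (just a sketch, with made-up file names):

# dump only the metadata from the damaged fs
xfs_metadump -g /dev/sdb1 sdb1.metadump
# restore it into a (sparse) image file and experiment on that instead
xfs_mdrestore sdb1.metadump restored.img
xfs_repair -n restored.img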

> >
> > -journalctl:
> > xfs_db[3139]: segfault at 1015390b1 ip 0000000000407906 sp 00007ffcaef2c2c0 error 4 in xfs_db[400000+8a000]
> >
> > Now, the host machine is rather critical and old, running CentOS 7 with a 3.10 kernel on a Xeon X5650. Not trusting the hardware, I used ddrescue to clone the partition to another, luckily available, system.
> > The copy went ok(?), but it did encounter read errors at the end, which confirmed my suspicion that the rebuild was not as successful as it appeared. About 10MB could not be retrieved.
> >
> > I attempted a metadump on the copy too, now on a machine with AMD EPYC 7302, 128GB RAM, a 6.1 kernel and xfsprogs v6.1.0.
> >
> > # xfs_metadump -aogfw  /storage/image/sdb1.img   /storage/metadump/sdb1.metadump 2>&1 | tee mddump2.log
> >
> > It again creates a 280MB dump and segfaults at 30 AGs:
> >
> > Jul24 14:47] xfs_db[42584]: segfault at 557051a1d2b0 ip 0000556f19f1e090 sp 00007ffe431a7be0 error 4 in xfs_db[556f19f04000+64000] likely on CPU 21 (core 9, socket 0)
> > [  +0.000025] Code: 00 00 00 83 f8 0a 0f 84 90 07 00 00 c6 44 24 53 00 48 63 f1 49 89 ff 48 c1 e6 04 48 8d 54 37 f0 48 bf ff ff ff ff ff ff 3f 00 <48> 8b 02 48 8b 52 08 48 0f c8 48 c1 e8 09 48 0f ca 81 e2 ff ff 1f
> >
> > This is the log https://pastebin.com/jsSFeCr6, which looks similar to the first one. The machine does not seem loaded at all and further tries result in the same code.
> >
> > My next step would be trying a later xfsprogs version, or maybe xfs_repair -n on a compatible CPU machine, as non-destructive options, but I feel I'm kidding myself as to whether I can recover anything at all from such a humongous disaster.
>
> Yes, that's probably the best approach now: to run the latest xfsprogs
> available.

Ok, so I ran into some unrelated issues, but I could finally install xfsprogs 6.15.0:

root@serv:~# xfs_metadump -aogfw /storage/image/sdb1.img  /storage/metadump/sdb1.metadump
xfs_metadump: read failed: Invalid argument
xfs_metadump: data size check failed
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot init perag data (22). Continuing anyway.
xfs_metadump: read failed: Invalid argument
empty log check failed
xlog_is_dirty: cannot find log head/tail (xlog_find_tail=-22)

xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read superblock for ag 0
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read agf block for ag 0
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read agi block for ag 0
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read agfl block for ag 0
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read superblock for ag 1
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read agf block for ag 1
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read agi block for ag 1
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read agfl block for ag 1
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read superblock for ag 2
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read agf block for ag 2
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read agi block for ag 2
...
...
...
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read agfl block for ag 588
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read superblock for ag 589
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read agf block for ag 589
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read agi block for ag 589
xfs_metadump: read failed: Invalid argument
xfs_metadump: cannot read agfl block for ag 589
Copying log                                                
root@serv:~#

It did create a 2.1GB dump which of course restores to an empty file.

I thought I had messed up with some of the dependency libs, so then I 
tried with xfsprogs 6.13 in Debian testing, same result.

I'm not exactly sure why it now fails to read the image; nothing has
changed about it. I could not find much more info in the documentation.
What am I missing..?

Thanks
>
> Also, xfs_repair does not need to be executed on the same architecture
> as the FS was running on. Apart from log replay (which is done by the Linux
> kernel), xfs_repair is capable of converting the filesystem data
> structures back and forth to the current machine's endianness.
>
>
> >
> > Thanks in advance for any input
> > Hub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs_metadump segmentation fault on large fs - xfsprogs 6.1
  2025-08-01 13:51     ` hubert .
@ 2025-08-18 15:56       ` hubert .
  2025-08-25  7:51         ` Carlos Maiolino
  0 siblings, 1 reply; 14+ messages in thread
From: hubert . @ 2025-08-18 15:56 UTC (permalink / raw)
  To: Carlos Maiolino, linux-xfs@vger.kernel.org

On 18.08.25 at 17:14, hubert . wrote:
>
> On 26.07.25 at 00:52, Carlos Maiolino wrote:
>>
>> On Fri, Jul 25, 2025 at 11:27:40AM +0000, hubert . wrote:
>>> Hi,
>>>
>>> A few months ago we had a serious crash in our monster RAID60 (~590TB) when one of the subvolume's disks failed and then the rebuild process triggered failures in other drives (you guessed it, no backup).
>>> The hardware issues were plentiful, to the point where we don't rule out problems in the Areca controller either, compounded by some probably poor decisions on my part.
>>> The rebuild took weeks to complete and we left it in a degraded state so as not to make things worse.
>>> The first attempt to mount it read-only of course failed. From journalctl:
>>>
>>> kernel: XFS (sdb1): Mounting V5 Filesystem
>>> kernel: XFS (sdb1): Starting recovery (logdev: internal)
>>> kernel: XFS (sdb1): Metadata CRC error detected at xfs_agf_read_verify+0x70/0x120 [xfs], xfs_agf block 0xa7fffff59
>>> kernel: XFS (sdb1): Unmount and run xfs_repair
>>> kernel: XFS (sdb1): First 64 bytes of corrupted metadata buffer:
>>> kernel: ffff89b444a94400: 74 4e 5a cc ae eb a0 6d 6c 08 95 5e ed 6b a4 ff  tNZ....ml..^.k..
>>> kernel: ffff89b444a94410: be d2 05 24 09 f2 0a d2 66 f3 be 3a 7b 97 9a 84  ...$....f..:{...
>>> kernel: ffff89b444a94420: a4 95 78 72 58 08 ca ec 10 a7 c3 20 1a a3 a6 08  ..xrX...... ....
>>> kernel: ffff89b444a94430: b0 43 0f d6 80 fd 12 25 70 de 7f 28 78 26 3d 94  .C.....%p..(x&=.
>>> kernel: XFS (sdb1): metadata I/O error: block 0xa7fffff59 ("xfs_trans_read_buf_map") error 74 numblks 1
>>>
>>> Following the advice on the list, I attempted to run an xfs_metadump (xfsprogs 4.5.0), but after copying 30 out of 590 AGs, it segfaulted:
>>> /usr/sbin/xfs_metadump: line 33:  3139 Segmentation fault      (core dumped) xfs_db$DBOPTS -i -p xfs_metadump -c "metadump$OPTS $2" $1
>>
>> I'm not sure what you expect from a metadump; this is usually used for
>> post-mortem analysis, but you already know what went wrong and why.
>
> I was hoping to have a restored metadata file I could try things on
> without risking the copy, since it's not possible to have a second one
> with this inordinate amount of data.
>
>>>
>>> -journalctl:
>>> xfs_db[3139]: segfault at 1015390b1 ip 0000000000407906 sp 00007ffcaef2c2c0 error 4 in xfs_db[400000+8a000]
>>>
>>> Now, the host machine is rather critical and old, running CentOS 7 with a 3.10 kernel on a Xeon X5650. Not trusting the hardware, I used ddrescue to clone the partition to another, luckily available, system.
>>> The copy went ok(?), but it did encounter read errors at the end, which confirmed my suspicion that the rebuild was not as successful as it appeared. About 10MB could not be retrieved.
>>>
>>> I attempted a metadump on the copy too, now on a machine with AMD EPYC 7302, 128GB RAM, a 6.1 kernel and xfsprogs v6.1.0.
>>>
>>> # xfs_metadump -aogfw  /storage/image/sdb1.img   /storage/metadump/sdb1.metadump 2>&1 | tee mddump2.log
>>>
>>> It again creates a 280MB dump and segfaults at 30 AGs:
>>>
>>> Jul24 14:47] xfs_db[42584]: segfault at 557051a1d2b0 ip 0000556f19f1e090 sp 00007ffe431a7be0 error 4 in xfs_db[556f19f04000+64000] likely on CPU 21 (core 9, socket 0)
>>> [  +0.000025] Code: 00 00 00 83 f8 0a 0f 84 90 07 00 00 c6 44 24 53 00 48 63 f1 49 89 ff 48 c1 e6 04 48 8d 54 37 f0 48 bf ff ff ff ff ff ff 3f 00 <48> 8b 02 48 8b 52 08 48 0f c8 48 c1 e8 09 48 0f ca 81 e2 ff ff 1f
>>>
>>> This is the log https://pastebin.com/jsSFeCr6, which looks similar to the first one. The machine does not seem loaded at all and further tries result in the same code.
>>>
>>> My next step would be trying a later xfsprogs version, or maybe xfs_repair -n on a compatible CPU machine, as non-destructive options, but I feel I'm kidding myself as to whether I can recover anything at all from such a humongous disaster.
>>
>> Yes, that's probably the best approach now: to run the latest xfsprogs
>> available.
>
> Ok, so I ran into some unrelated issues, but I could finally install xfsprogs 6.15.0:
>
> root@serv:~# xfs_metadump -aogfw /storage/image/sdb1.img  /storage/metadump/sdb1.metadump
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: data size check failed
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot init perag data (22). Continuing anyway.
> xfs_metadump: read failed: Invalid argument
> empty log check failed
> xlog_is_dirty: cannot find log head/tail (xlog_find_tail=-22)
>
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read superblock for ag 0
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read agf block for ag 0
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read agi block for ag 0
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read agfl block for ag 0
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read superblock for ag 1
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read agf block for ag 1
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read agi block for ag 1
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read agfl block for ag 1
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read superblock for ag 2
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read agf block for ag 2
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read agi block for ag 2
> ...
> ...
> ...
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read agfl block for ag 588
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read superblock for ag 589
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read agf block for ag 589
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read agi block for ag 589
> xfs_metadump: read failed: Invalid argument
> xfs_metadump: cannot read agfl block for ag 589
> Copying log
> root@serv:~#
>
> It did create a 2.1GB dump which of course restores to an empty file.
>
> I thought I had messed up with some of the dependency libs, so then I
> tried with xfsprogs 6.13 in Debian testing, same result.
>
> I'm not exactly sure why it now fails to read the image; nothing has
> changed about it. I could not find much more info in the documentation.
> What am I missing..?

I tried a few more things on the img, as I realized it was probably not 
the best idea to dd it to a file instead of a device, but I got nowhere.
After some team deliberations, we decided to connect the original block 
device to the new machine (Debian 13, 16 AMD cores, 128GB RAM, new 
controller, plenty of swap, xfsprogs 6.13) and see if the dump was possible then.

It had the same behavior as with xfsprogs 6.1 and segfaulted after 
30 AGs. journalctl and dmesg don't really add any more info, so I tried 
to debug a bit, though I'm afraid it's all quite foreign to me:

root@ap:/metadump# gdb xfs_metadump core.12816 
GNU gdb (Debian 16.3-1) 16.3
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
...
Type "apropos word" to search for commands related to "word"...
"/usr/sbin/xfs_metadump": not in executable format: file format not recognized
[New LWP 12816]
Reading symbols from /usr/sbin/xfs_db...
(No debugging symbols found in /usr/sbin/xfs_db)
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/xfs_db -i -p xfs_metadump -c metadump /dev/sda1'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000556f127d6857 in ?? ()
(gdb) bt full
#0  0x0000556f127d6857 in ?? ()
No symbol table info available.
#1  0x0000556f127dbdc4 in ?? ()
No symbol table info available.
#2  0x0000556f127d5546 in ?? ()
No symbol table info available.
#3  0x0000556f127db350 in ?? ()
No symbol table info available.
#4  0x0000556f127d5546 in ?? ()
No symbol table info available.
#5  0x0000556f127d99aa in ?? ()
No symbol table info available.
#6  0x0000556f127b9764 in ?? ()
No symbol table info available.
#7  0x00007eff29058ca8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#8  0x00007eff29058d65 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#9  0x0000556f127ba8c1 in ?? ()
No symbol table info available.

And this:

root@ap:/PETA/metadump# coredumpctl info
           PID: 13103 (xfs_db)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Mon 2025-08-18 19:03:19 CEST (1min 12s ago)
  Command Line: xfs_db -i -p xfs_metadump -c metadump -a -o -g -w $' /metadump/metadata.img' /dev/sda1
    Executable: /usr/sbin/xfs_db
 Control Group: /user.slice/user-0.slice/session-8.scope
          Unit: session-8.scope
         Slice: user-0.slice
       Session: 8
     Owner UID: 0 (root)
       Boot ID: c090e507272647838c77bcdefd67e79c
    Machine ID: 83edcebe83994c67ac4f88e2a3c185e3
      Hostname: ap
       Storage: /var/lib/systemd/coredump/core.xfs_db.0.c090e507272647838c77bcdefd67e79c.13103.1755536599000000.zst (present)
  Size on Disk: 26.2M
       Message: Process 13103 (xfs_db) of user 0 dumped core.
                
                Module libuuid.so.1 from deb util-linux-2.41-5.amd64
                Stack trace of thread 13103:
                #0  0x000055b961d29857 n/a (/usr/sbin/xfs_db + 0x32857)
                #1  0x000055b961d2edc4 n/a (/usr/sbin/xfs_db + 0x37dc4)
                #2  0x000055b961d28546 n/a (/usr/sbin/xfs_db + 0x31546)
                #3  0x000055b961d2e350 n/a (/usr/sbin/xfs_db + 0x37350)
                #4  0x000055b961d28546 n/a (/usr/sbin/xfs_db + 0x31546)
                #5  0x000055b961d2c9aa n/a (/usr/sbin/xfs_db + 0x359aa)
                #6  0x000055b961d0c764 n/a (/usr/sbin/xfs_db + 0x15764)
                #7  0x00007fc870455ca8 n/a (libc.so.6 + 0x29ca8)
                #8  0x00007fc870455d65 __libc_start_main (libc.so.6 + 0x29d65)
                #9  0x000055b961d0d8c1 n/a (/usr/sbin/xfs_db + 0x168c1)
                ELF object binary architecture: AMD x86-64
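
I also wondered about pulling in symbols instead of staring at raw offsets. If I understand the Debian tooling correctly (untried, and I'm not sure the xfsprogs debug info is actually served), gdb can fetch it on demand from Debian's debuginfod server:

# let gdb download debug info for Debian packages on demand
export DEBUGINFOD_URLS="https://debuginfod.debian.net"
gdb /usr/sbin/xfs_db core.12816
(gdb) set debuginfod enabled on
(gdb) bt full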

I guess my questions are: can the fs be so corrupted that it causes 
xfs_metadump (or xfs_db) to segfault? Are there too many AGs / is the fs too 
large?
Shall I assume that xfs_repair could fail similarly then?

I'll appreciate any ideas. Also, if you think the core dump or other logs 
could be useful, I can upload them somewhere.

Thanks again

>
>
> Thanks
>>
>> Also, xfs_repair does not need to be executed on the same architecture
>> as the FS was running on. Apart from log replay (which is done by the Linux
>> kernel), xfs_repair is capable of converting the filesystem data
>> structures back and forth to the current machine's endianness.
>>
>>
>>>
>>> Thanks in advance for any input
>>> Hub
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs_metadump segmentation fault on large fs - xfsprogs 6.1
  2025-08-18 15:56       ` hubert .
@ 2025-08-25  7:51         ` Carlos Maiolino
  2025-08-27 10:51           ` hubert .
  0 siblings, 1 reply; 14+ messages in thread
From: Carlos Maiolino @ 2025-08-25  7:51 UTC (permalink / raw)
  To: hubert .; +Cc: linux-xfs@vger.kernel.org

Hello Hubert, my apologies for the delay.


On Mon, Aug 18, 2025 at 03:56:53PM +0000, hubert . wrote:
> On 18.08.25 at 17:14, hubert . wrote:
> >
> > On 26.07.25 at 00:52, Carlos Maiolino wrote:
> >>
> >> On Fri, Jul 25, 2025 at 11:27:40AM +0000, hubert . wrote:
> >>> Hi,
> >>>
> >>> A few months ago we had a serious crash in our monster RAID60 (~590TB) when one of the subvolume's disks failed and then the rebuild process triggered failures in other drives (you guessed it, no backup).
> >>> The hardware issues were plentiful, to the point where we don't rule out problems in the Areca controller either, compounded by some probably poor decisions on my part.
> >>> The rebuild took weeks to complete and we left it in a degraded state so as not to make things worse.
> >>> The first attempt to mount it read-only of course failed. From journalctl:
> >>>
> >>> kernel: XFS (sdb1): Mounting V5 Filesystem
> >>> kernel: XFS (sdb1): Starting recovery (logdev: internal)
> >>> kernel: XFS (sdb1): Metadata CRC error detected at xfs_agf_read_verify+0x70/0x120 [xfs], xfs_agf block 0xa7fffff59
> >>> kernel: XFS (sdb1): Unmount and run xfs_repair
> >>> kernel: XFS (sdb1): First 64 bytes of corrupted metadata buffer:
> >>> kernel: ffff89b444a94400: 74 4e 5a cc ae eb a0 6d 6c 08 95 5e ed 6b a4 ff  tNZ....ml..^.k..
> >>> kernel: ffff89b444a94410: be d2 05 24 09 f2 0a d2 66 f3 be 3a 7b 97 9a 84  ...$....f..:{...
> >>> kernel: ffff89b444a94420: a4 95 78 72 58 08 ca ec 10 a7 c3 20 1a a3 a6 08  ..xrX...... ....
> >>> kernel: ffff89b444a94430: b0 43 0f d6 80 fd 12 25 70 de 7f 28 78 26 3d 94  .C.....%p..(x&=.
> >>> kernel: XFS (sdb1): metadata I/O error: block 0xa7fffff59 ("xfs_trans_read_buf_map") error 74 numblks 1
> >>>
> >>> Following the advice on the list, I attempted to run an xfs_metadump (xfsprogs 4.5.0), but after copying 30 out of 590 AGs, it segfaulted:
> >>> /usr/sbin/xfs_metadump: line 33:  3139 Segmentation fault      (core dumped) xfs_db$DBOPTS -i -p xfs_metadump -c "metadump$OPTS $2" $1
> >>
> >> I'm not sure what you expect from a metadump; this is usually used for
> >> post-mortem analysis, but you already know what went wrong and why.
> >
> > I was hoping to have a restored metadata file I could try things on
> > without risking the copy, since it's not possible to have a second one
> > with this inordinate amount of data.
> >
> >>>
> >>> -journalctl:
> >>> xfs_db[3139]: segfault at 1015390b1 ip 0000000000407906 sp 00007ffcaef2c2c0 error 4 in xfs_db[400000+8a000]
> >>>
> >>> Now, the host machine is rather critical and old, running CentOS 7 with a 3.10 kernel on a Xeon X5650. Not trusting the hardware, I used ddrescue to clone the partition to another, luckily available, system.
> >>> The copy went ok(?), but it did encounter read errors at the end, which confirmed my suspicion that the rebuild was not as successful as it appeared. About 10MB could not be retrieved.
> >>>
> >>> I attempted a metadump on the copy too, now on a machine with AMD EPYC 7302, 128GB RAM, a 6.1 kernel and xfsprogs v6.1.0.
> >>>
> >>> # xfs_metadump -aogfw  /storage/image/sdb1.img   /storage/metadump/sdb1.metadump 2>&1 | tee mddump2.log
> >>>
> >>> It again creates a 280MB dump and segfaults at 30 AGs:
> >>>
> >>> Jul24 14:47] xfs_db[42584]: segfault at 557051a1d2b0 ip 0000556f19f1e090 sp 00007ffe431a7be0 error 4 in xfs_db[556f19f04000+64000] likely on CPU 21 (core 9, socket 0)
> >>> [  +0.000025] Code: 00 00 00 83 f8 0a 0f 84 90 07 00 00 c6 44 24 53 00 48 63 f1 49 89 ff 48 c1 e6 04 48 8d 54 37 f0 48 bf ff ff ff ff ff ff 3f 00 <48> 8b 02 48 8b 52 08 48 0f c8 48 c1 e8 09 48 0f ca 81 e2 ff ff 1f
> >>>
> >>> This is the log https://pastebin.com/jsSFeCr6, which looks similar to the first one. The machine does not seem loaded at all and further tries result in the same code.
> >>>
> >>> My next step would be trying a later xfsprogs version, or maybe xfs_repair -n on a compatible CPU machine, as non-destructive options, but I feel I'm kidding myself as to whether I can recover anything at all from such a humongous disaster.
> >>
> >> Yes, that's probably the best approach now: to run the latest xfsprogs
> >> available.
> >
> > Ok, so I ran into some unrelated issues, but I could finally install xfsprogs 6.15.0:
> >
> > root@serv:~# xfs_metadump -aogfw /storage/image/sdb1.img  /storage/metadump/sdb1.metadump
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: data size check failed
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot init perag data (22). Continuing anyway.
> > xfs_metadump: read failed: Invalid argument
> > empty log check failed
> > xlog_is_dirty: cannot find log head/tail (xlog_find_tail=-22)
> >
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read superblock for ag 0
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read agf block for ag 0
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read agi block for ag 0
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read agfl block for ag 0
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read superblock for ag 1
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read agf block for ag 1
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read agi block for ag 1
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read agfl block for ag 1
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read superblock for ag 2
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read agf block for ag 2
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read agi block for ag 2
> > ...
> > ...
> > ...
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read agfl block for ag 588
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read superblock for ag 589
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read agf block for ag 589
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read agi block for ag 589
> > xfs_metadump: read failed: Invalid argument
> > xfs_metadump: cannot read agfl block for ag 589
> > Copying log
> > root@serv:~#
> >
> > It did create a 2.1GB dump which of course restores to an empty file.
> >
> > I thought I had messed up with some of the dependency libs, so then I
> > tried with xfsprogs 6.13 in Debian testing, same result.
> >
> > I'm not exactly sure why it now fails to read the image; nothing has
> > changed about it. I could not find much more info in the documentation.
> > What am I missing..?
> 
> I tried a few more things on the img, as I realized it was probably not
> the best idea to dd it to a file instead of a device, but I got nowhere.
> After some team deliberations, we decided to connect the original block
> device to the new machine (Debian 13, 16 AMD cores, 128GB RAM, new
> controller, plenty of swap, xfsprogs 6.13) and see if the dump was possible then.
> 
> It had the same behavior as with xfsprogs 6.1 and segfaulted after
> 30 AGs. journalctl and dmesg don't really add any more info, so I tried
> to debug a bit, though I'm afraid it's all quite foreign to me:
> 
> root@ap:/metadump# gdb xfs_metadump core.12816
> GNU gdb (Debian 16.3-1) 16.3
> Copyright (C) 2024 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> ...
> Type "apropos word" to search for commands related to "word"...
> "/usr/sbin/xfs_metadump": not in executable format: file format not recognized
> [New LWP 12816]
> Reading symbols from /usr/sbin/xfs_db...
> (No debugging symbols found in /usr/sbin/xfs_db)
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `/usr/sbin/xfs_db -i -p xfs_metadump -c metadump /dev/sda1'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x0000556f127d6857 in ?? ()
> (gdb) bt full
> #0  0x0000556f127d6857 in ?? ()
> No symbol table info available.
> #1  0x0000556f127dbdc4 in ?? ()
> No symbol table info available.
> #2  0x0000556f127d5546 in ?? ()
> No symbol table info available.
> #3  0x0000556f127db350 in ?? ()
> No symbol table info available.
> #4  0x0000556f127d5546 in ?? ()
> No symbol table info available.
> #5  0x0000556f127d99aa in ?? ()
> No symbol table info available.
> #6  0x0000556f127b9764 in ?? ()
> No symbol table info available.
> #7  0x00007eff29058ca8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> No symbol table info available.
> #8  0x00007eff29058d65 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
> No symbol table info available.
> #9  0x0000556f127ba8c1 in ?? ()
> No symbol table info available.
> 
> And this:
> 
> root@ap:/PETA/metadump# coredumpctl info
>            PID: 13103 (xfs_db)
>            UID: 0 (root)
>            GID: 0 (root)
>         Signal: 11 (SEGV)
>      Timestamp: Mon 2025-08-18 19:03:19 CEST (1min 12s ago)
>   Command Line: xfs_db -i -p xfs_metadump -c metadump -a -o -g -w $' /metadump/metadata.img' /dev/sda1
>     Executable: /usr/sbin/xfs_db
>  Control Group: /user.slice/user-0.slice/session-8.scope
>           Unit: session-8.scope
>          Slice: user-0.slice
>        Session: 8
>      Owner UID: 0 (root)
>        Boot ID: c090e507272647838c77bcdefd67e79c
>     Machine ID: 83edcebe83994c67ac4f88e2a3c185e3
>       Hostname: ap
>        Storage: /var/lib/systemd/coredump/core.xfs_db.0.c090e507272647838c77bcdefd67e79c.13103.1755536599000000.zst (present)
>   Size on Disk: 26.2M
>        Message: Process 13103 (xfs_db) of user 0 dumped core.
> 
>                 Module libuuid.so.1 from deb util-linux-2.41-5.amd64
>                 Stack trace of thread 13103:
>                 #0  0x000055b961d29857 n/a (/usr/sbin/xfs_db + 0x32857)
>                 #1  0x000055b961d2edc4 n/a (/usr/sbin/xfs_db + 0x37dc4)
>                 #2  0x000055b961d28546 n/a (/usr/sbin/xfs_db + 0x31546)
>                 #3  0x000055b961d2e350 n/a (/usr/sbin/xfs_db + 0x37350)
>                 #4  0x000055b961d28546 n/a (/usr/sbin/xfs_db + 0x31546)
>                 #5  0x000055b961d2c9aa n/a (/usr/sbin/xfs_db + 0x359aa)
>                 #6  0x000055b961d0c764 n/a (/usr/sbin/xfs_db + 0x15764)
>                 #7  0x00007fc870455ca8 n/a (libc.so.6 + 0x29ca8)
>                 #8  0x00007fc870455d65 __libc_start_main (libc.so.6 + 0x29d65)
>                 #9  0x000055b961d0d8c1 n/a (/usr/sbin/xfs_db + 0x168c1)
>                 ELF object binary architecture: AMD x86-64

Without the debug symbols it gets virtually impossible to know what
was going on =/

> 
> I guess my questions are: can the fs be so corrupted that it causes
> xfs_metadump (or xfs_db) to segfault? Are there too many AGs / is the fs too
> large?
> Shall I assume that xfs_repair could fail similarly then?

In a nutshell, xfs_metadump shouldn't segfault even if the fs is
corrupted.
As for xfs_repair, it depends; there is some code shared between both,
but xfs_repair is much more resilient.

> 
> I'll appreciate any ideas. Also, if you think the core dump or other logs
> could be useful, I can upload them somewhere.

I'd start by running xfs_repair in no-modify mode, i.e. `xfs_repair -n`
and check what it finds.
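
For example, capturing the output, since on a filesystem this size it
can get very long:

xfs_repair -n /dev/sda1 2>&1 | tee repair-n.log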

Regarding the xfs_metadump segfault, yes, a core might be useful to
investigate where the segfault is triggered, but you'll need to be
running xfsprogs from the upstream tree (preferably the latest code), so
we can actually match the core information to the code.
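
Roughly like this (a sketch, assuming the usual build dependencies are
installed; plain `make` should run autoconf/configure itself, otherwise
run ./configure first). Running xfs_db straight from the build tree keeps
the debug symbols around:

git clone https://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git
cd xfsprogs-dev
make -j$(nproc)
# reproduce the crash with the freshly built, unstripped xfs_db
./db/xfs_db -i -p xfs_metadump -c "metadump -a -o -g -w /metadump/metadata.img" /dev/sda1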

Cheers,
Carlos.

> 
> Thanks again
> 
> >
> >
> > Thanks
> >>
> >> Also, xfs_repair does not need to be executed on the same architecture
> >> as the FS was running on. Apart from log replay (which is done by the Linux
> >> kernel), xfs_repair is capable of converting the filesystem data
> >> structures back and forth to the current machine's endianness.
> >>
> >>
> >>>
> >>> Thanks in advance for any input
> >>> Hub
> >

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs_metadump segmentation fault on large fs - xfsprogs 6.1
  2025-08-25  7:51         ` Carlos Maiolino
@ 2025-08-27 10:51           ` hubert .
  2025-09-26  9:04             ` hubert .
  0 siblings, 1 reply; 14+ messages in thread
From: hubert . @ 2025-08-27 10:51 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: linux-xfs@vger.kernel.org

 ________________________________________
> From: Carlos Maiolino <cem@kernel.org>
> Sent: Monday, August 25, 2025 09:51
> To: hubert .
> Cc: linux-xfs@vger.kernel.org
> Subject: Re: xfs_metadump segmentation fault on large fs - xfsprogs 6.1
>
> Hello Hubert, my apologies for the delay.
No problem, Carlos, I'm also juggling several things, thanks for the follow-up

> On Mon, Aug 18, 2025 at 03:56:53PM +0000, hubert . wrote:
>> On 18.08.25 at 17:14, hubert . wrote:
>>> On 26.07.25 at 00:52, Carlos Maiolino wrote:
>>>> On Fri, Jul 25, 2025 at 11:27:40AM +0000, hubert . wrote:
>>>>> Hi,
>>>>>
>>>>> A few months ago we had a serious crash in our monster RAID60 (~590TB) when one of the subvolume's disks failed and then the rebuild process triggered failures in other drives (you guessed it, no backup).
>>>>> The hardware issues were plentiful, to the point where we don't rule out problems in the Areca controller either, compounded by some probably poor decisions on my part.
>>>>> The rebuild took weeks to complete and we left it in a degraded state so as not to make things worse.
>>>>> The first attempt to mount it read-only of course failed. From journalctl:
>>>>>
>>>>> kernel: XFS (sdb1): Mounting V5 Filesystem
>>>>> kernel: XFS (sdb1): Starting recovery (logdev: internal)
>>>>> kernel: XFS (sdb1): Metadata CRC error detected at xfs_agf_read_verify+0x70/0x120 [xfs], xfs_agf block 0xa7fffff59
>>>>> kernel: XFS (sdb1): Unmount and run xfs_repair
>>>>> kernel: XFS (sdb1): First 64 bytes of corrupted metadata buffer:
>>>>> kernel: ffff89b444a94400: 74 4e 5a cc ae eb a0 6d 6c 08 95 5e ed 6b a4 ff  tNZ....ml..^.k..
>>>>> kernel: ffff89b444a94410: be d2 05 24 09 f2 0a d2 66 f3 be 3a 7b 97 9a 84  ...$....f..:{...
>>>>> kernel: ffff89b444a94420: a4 95 78 72 58 08 ca ec 10 a7 c3 20 1a a3 a6 08  ..xrX...... ....
>>>>> kernel: ffff89b444a94430: b0 43 0f d6 80 fd 12 25 70 de 7f 28 78 26 3d 94  .C.....%p..(x&=.
>>>>> kernel: XFS (sdb1): metadata I/O error: block 0xa7fffff59 ("xfs_trans_read_buf_map") error 74 numblks 1
>>>>>
>>>>> Following the advice on the list, I attempted to run an xfs_metadump (xfsprogs 4.5.0), but after copying 30 out of 590 AGs, it segfaulted:
>>>>> /usr/sbin/xfs_metadump: line 33:  3139 Segmentation fault      (core dumped) xfs_db$DBOPTS -i -p xfs_metadump -c "metadump$OPTS $2" $1
>>>> I'm not sure what you expect from a metadump; this is usually used for
>>>> post-mortem analysis, but you already know what went wrong and why.
>>> I was hoping to have a restored metadata file I could try things on
>>> without risking the copy, since it's not possible to have a second one
>>> with this inordinate amount of data.
>>>
>>>>> -journalctl:
>>>>> xfs_db[3139]: segfault at 1015390b1 ip 0000000000407906 sp 00007ffcaef2c2c0 error 4 in xfs_db[400000+8a000]
>>>>>
>>>>> Now, the host machine is rather critical and old, running CentOS 7 with a 3.10 kernel on a Xeon X5650. Not trusting the hardware, I used ddrescue to clone the partition to another, luckily available, system.
>>>>> The copy went ok(?), but it did encounter read errors at the end, which confirmed my suspicion that the rebuild was not as successful as it appeared. About 10MB could not be retrieved.
>>>>>
>>>>> I attempted a metadump on the copy too, now on a machine with AMD EPYC 7302, 128GB RAM, a 6.1 kernel and xfsprogs v6.1.0.
>>>>>
>>>>> # xfs_metadump -aogfw  /storage/image/sdb1.img   /storage/metadump/sdb1.metadump 2>&1 | tee mddump2.log
>>>>>
>>>>> It again creates a 280MB dump and segfaults at 30 AGs:
>>>>>
>>>>> Jul24 14:47] xfs_db[42584]: segfault at 557051a1d2b0 ip 0000556f19f1e090 sp 00007ffe431a7be0 error 4 in xfs_db[556f19f04000+64000] likely on CPU 21 (core 9, socket 0)
>>>>> [  +0.000025] Code: 00 00 00 83 f8 0a 0f 84 90 07 00 00 c6 44 24 53 00 48 63 f1 49 89 ff 48 c1 e6 04 48 8d 54 37 f0 48 bf ff ff ff ff ff ff 3f 00 <48> 8b 02 48 8b 52 08 48 0f c8 48 c1 e8 09 48 0f ca 81 e2 ff ff 1f
>>>>>
>>>>> This is the log https://pastebin.com/jsSFeCr6, which looks similar to the first one. The machine does not seem loaded at all and further tries result in the same code.
>>>>>
>>>>> My next step would be trying a later xfsprogs version, or maybe xfs_repair -n on a compatible CPU machine, as non-destructive options, but I feel I'm kidding myself as to whether I can recover anything at all from such a humongous disaster.
>>>> Yes, that's probably the best approach now: to run the latest xfsprogs
>>>> available.
>>> Ok, so I ran into some unrelated issues, but I could finally install xfsprogs 6.15.0:
>>>
>>> root@serv:~# xfs_metadump -aogfw /storage/image/sdb1.img  /storage/metadump/sdb1.metadump
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: data size check failed
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot init perag data (22). Continuing anyway.
>>> xfs_metadump: read failed: Invalid argument
>>> empty log check failed
>>> xlog_is_dirty: cannot find log head/tail (xlog_find_tail=-22)
>>>
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read superblock for ag 0
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read agf block for ag 0
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read agi block for ag 0
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read agfl block for ag 0
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read superblock for ag 1
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read agf block for ag 1
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read agi block for ag 1
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read agfl block for ag 1
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read superblock for ag 2
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read agf block for ag 2
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read agi block for ag 2
>>> ...
>>> ...
>>> ...
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read agfl block for ag 588
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read superblock for ag 589
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read agf block for ag 589
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read agi block for ag 589
>>> xfs_metadump: read failed: Invalid argument
>>> xfs_metadump: cannot read agfl block for ag 589
>>> Copying log
>>> root@serv:~#
>>>
>>> It did create a 2.1GB dump which of course restores to an empty file.
>>>
>>> I thought I had messed up with some of the dependency libs, so then I
>>> tried with xfsprogs 6.13 in Debian testing, same result.
>>>
>>> I'm not exactly sure why it now fails to read the image; nothing has
>>> changed about it. I could not find much more info in the documentation.
>>> What am I missing..?
>> I tried a few more things on the img, as I realized it was probably not
>> the best idea to dd it to a file instead of a device, but I got nowhere.
>> After some team deliberations, we decided to connect the original block
>> device to the new machine (Debian 13, 16 AMD cores, 128GB RAM, new
>> controller, plenty of swap, xfsprogs 6.13) and see if the dump was possible then.
>>
>> It had the same behavior as with xfsprogs 6.1 and segfaulted after
>> 30 AGs. journalctl and dmesg don't really add any more info, so I tried
>> to debug a bit, though I'm afraid it's all quite foreign to me:
>>
>> root@ap:/metadump# gdb xfs_metadump core.12816
>> GNU gdb (Debian 16.3-1) 16.3
>> Copyright (C) 2024 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.
>> ...
>> Type "apropos word" to search for commands related to "word"...
>> "/usr/sbin/xfs_metadump": not in executable format: file format not recognized
>> [New LWP 12816]
>> Reading symbols from /usr/sbin/xfs_db...
>> (No debugging symbols found in /usr/sbin/xfs_db)
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>> Core was generated by `/usr/sbin/xfs_db -i -p xfs_metadump -c metadump /dev/sda1'.
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0  0x0000556f127d6857 in ?? ()
>> (gdb) bt full
>> #0  0x0000556f127d6857 in ?? ()
>> No symbol table info available.
>> #1  0x0000556f127dbdc4 in ?? ()
>> No symbol table info available.
>> #2  0x0000556f127d5546 in ?? ()
>> No symbol table info available.
>> #3  0x0000556f127db350 in ?? ()
>> No symbol table info available.
>> #4  0x0000556f127d5546 in ?? ()
>> No symbol table info available.
>> #5  0x0000556f127d99aa in ?? ()
>> No symbol table info available.
>> #6  0x0000556f127b9764 in ?? ()
>> No symbol table info available.
>> #7  0x00007eff29058ca8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
>> No symbol table info available.
>> #8  0x00007eff29058d65 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
>> No symbol table info available.
>> #9  0x0000556f127ba8c1 in ?? ()
>> No symbol table info available.
>>
>> And this:
>>
>> root@ap:/PETA/metadump# coredumpctl info
>>             PID: 13103 (xfs_db)
>>             UID: 0 (root)
>>             GID: 0 (root)
>>          Signal: 11 (SEGV)
>>       Timestamp: Mon 2025-08-18 19:03:19 CEST (1min 12s ago)
>>    Command Line: xfs_db -i -p xfs_metadump -c metadump -a -o -g -w $' /metadump/metadata.img' /dev/sda1
>>      Executable: /usr/sbin/xfs_db
>>   Control Group: /user.slice/user-0.slice/session-8.scope
>>            Unit: session-8.scope
>>           Slice: user-0.slice
>>         Session: 8
>>       Owner UID: 0 (root)
>>         Boot ID: c090e507272647838c77bcdefd67e79c
>>      Machine ID: 83edcebe83994c67ac4f88e2a3c185e3
>>        Hostname: ap
>>         Storage: /var/lib/systemd/coredump/core.xfs_db.0.c090e507272647838c77bcdefd67e79c.13103.1755536599000000.zst (present)
>>    Size on Disk: 26.2M
>>         Message: Process 13103 (xfs_db) of user 0 dumped core.
>>
>>                  Module libuuid.so.1 from deb util-linux-2.41-5.amd64
>>                  Stack trace of thread 13103:
>>                  #0  0x000055b961d29857 n/a (/usr/sbin/xfs_db + 0x32857)
>>                  #1  0x000055b961d2edc4 n/a (/usr/sbin/xfs_db + 0x37dc4)
>>                  #2  0x000055b961d28546 n/a (/usr/sbin/xfs_db + 0x31546)
>>                  #3  0x000055b961d2e350 n/a (/usr/sbin/xfs_db + 0x37350)
>>                  #4  0x000055b961d28546 n/a (/usr/sbin/xfs_db + 0x31546)
>>                  #5  0x000055b961d2c9aa n/a (/usr/sbin/xfs_db + 0x359aa)
>>                  #6  0x000055b961d0c764 n/a (/usr/sbin/xfs_db + 0x15764)
>>                  #7  0x00007fc870455ca8 n/a (libc.so.6 + 0x29ca8)
>>                  #8  0x00007fc870455d65 __libc_start_main (libc.so.6 + 0x29d65)
>>                  #9  0x000055b961d0d8c1 n/a (/usr/sbin/xfs_db + 0x168c1)
>>                  ELF object binary architecture: AMD x86-64
> Without the debug symbols it gets virtually impossible to know what
> was going on =/
>> I guess my questions are: can the fs be so corrupted that it causes
>> xfs_metadump (or xfs_db) to segfault? Are there too many AGs / is the fs too
>> large?
>> Shall I assume that xfs_repair could fail similarly then?
> In a nutshell, xfs_metadump shouldn't segfault even if the fs is
> corrupted.
> As for xfs_repair, it depends; there is some code shared between both,
> but xfs_repair is much more resilient.
>
>> I'll appreciate any ideas. Also, if you think the core dump or other logs
>> could be useful, I can upload them somewhere.
> I'd start by running xfs_repair in no-modify mode, i.e. `xfs_repair -n`
> and check what it finds.
>
> Regarding the xfs_metadump segfault, yes, a core might be useful to
> investigate where the segfault is triggered, but you'll need to be
> running xfsprogs from the upstream tree (preferably the latest code), so
> we can actually match the core information to the code.

I figured that wasn't all the needed info, thanks for clarifying.

For now we've had to put away the original HDDs, as we cannot afford
another failed drive and time is pressing, and we are dd'ing the image to a
real partition to try xfs_repair on it directly (it takes days, of course,
but we're lucky we got the storage).
I will try the metadump and do further debugging if it segfaults again.

Regarding the "invalid argument" when attempting the metadump with the
image... could it be related to a mismatch with the block/sector size of
the host fs?
I thought about attaching the img to a loop device, but I wasn't sure if
xfs_metadump tries that already. Also at this point I don't trust myself
to try anything without a 2nd copy.
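
What I had in mind was something like this (untried; -r and --sector-size
are per losetup(8)), so the image is presented as a read-only block device
with an explicit sector size:

# attach the image read-only as a loop device
# 512 is a guess; match the original device's sector size
losetup -f --show -r --sector-size 512 /storage/image/sdb1.img
# then point xfs_metadump at the /dev/loopN it prints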

I'll let you know how that goes, thanks a lot again.
H.

> Cheers,
> Carlos.
>
>> Thanks again
>>
>>>
>>> Thanks
>>>> Also, xfs_repair does not need to be executed on the same architecture
>>>> as the FS was running on. Apart from log replay (which is done by the Linux
>>>> kernel), xfs_repair is capable of converting the filesystem data
>>>> structures back and forth to the current machine's endianness.
>>>>
>>>>
>>>>> Thanks in advance for any input
>>>>> Hub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs_metadump segmentation fault on large fs - xfsprogs 6.1
  2025-08-27 10:51           ` hubert .
@ 2025-09-26  9:04             ` hubert .
  2025-09-26  9:39               ` Dave Chinner
  0 siblings, 1 reply; 14+ messages in thread
From: hubert . @ 2025-09-26  9:04 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: linux-xfs@vger.kernel.org


>> Hello Hubert, my apologies for the delay.
> No problem, Carlos, I'm also juggling several things, thanks for the follow-up
>
>> On Mon, Aug 18, 2025 at 03:56:53PM +0000, hubert . wrote:
>>> On 18.08.25 at 17:14, hubert . wrote:
>>>> On 26.07.25 at 00:52, Carlos Maiolino wrote:
>>>>> On Fri, Jul 25, 2025 at 11:27:40AM +0000, hubert . wrote:
>>>>>> Hi,
>>>>>>
>>>>>> A few months ago we had a serious crash in our monster RAID60 (~590TB) when one of the subvolume's disks failed and then the rebuild process triggered failures in other drives (you guessed it, no backup).
>>>>>> The hardware issues were plentiful, to the point where we don't rule out problems in the Areca controller either, compounded by some probably poor decisions on my part.
>>>>>> The rebuild took weeks to complete and we left it in a degraded state so as not to make things worse.
>>>>>> The first attempt to mount it read-only of course failed. From journalctl:
>>>>>>
>>>>>> kernel: XFS (sdb1): Mounting V5 Filesystem
>>>>>> kernel: XFS (sdb1): Starting recovery (logdev: internal)
>>>>>> kernel: XFS (sdb1): Metadata CRC error detected at xfs_agf_read_verify+0x70/0x120 [xfs], xfs_agf block 0xa7fffff59
>>>>>> kernel: XFS (sdb1): Unmount and run xfs_repair
>>>>>> kernel: XFS (sdb1): First 64 bytes of corrupted metadata buffer:
>>>>>> kernel: ffff89b444a94400: 74 4e 5a cc ae eb a0 6d 6c 08 95 5e ed 6b a4 ff  tNZ....ml..^.k..
>>>>>> kernel: ffff89b444a94410: be d2 05 24 09 f2 0a d2 66 f3 be 3a 7b 97 9a 84  ...$....f..:{...
>>>>>> kernel: ffff89b444a94420: a4 95 78 72 58 08 ca ec 10 a7 c3 20 1a a3 a6 08  ..xrX...... ....
>>>>>> kernel: ffff89b444a94430: b0 43 0f d6 80 fd 12 25 70 de 7f 28 78 26 3d 94  .C.....%p..(x&=.
>>>>>> kernel: XFS (sdb1): metadata I/O error: block 0xa7fffff59 ("xfs_trans_read_buf_map") error 74 numblks 1
>>>>>>
>>>>>> Following the advice on the list, I attempted to run an xfs_metadump (xfsprogs 4.5.0), but after copying 30 out of 590 AGs, it segfaulted:
>>>>>> /usr/sbin/xfs_metadump: line 33:  3139 Segmentation fault      (core dumped) xfs_db$DBOPTS -i -p xfs_metadump -c "metadump$OPTS $2" $1
>>>>> I'm not sure what you expect from a metadump; this is usually used for
>>>>> post-mortem analysis, but you already know what went wrong and why.
>>>> I was hoping to have a restored metadata file I could try things on
>>>> without risking the copy, since it's not possible to have a second one
>>>> with this inordinate amount of data.
>>>>
>>>>>> -journalctl:
>>>>>> xfs_db[3139]: segfault at 1015390b1 ip 0000000000407906 sp 00007ffcaef2c2c0 error 4 in xfs_db[400000+8a000]
>>>>>>
>>>>>> Now, the host machine is rather critical and old, running CentOS 7 with a 3.10 kernel on a Xeon X5650. Not trusting the hardware, I used ddrescue to clone the partition to another, luckily available, system.
>>>>>> The copy went ok(?), but it did encounter read errors at the end, which confirmed my suspicion that the rebuild was not as successful as it appeared. About 10MB could not be retrieved.
>>>>>>
>>>>>> I attempted a metadump on the copy too, now on a machine with AMD EPYC 7302, 128GB RAM, a 6.1 kernel and xfsprogs v6.1.0.
>>>>>>
>>>>>> # xfs_metadump -aogfw  /storage/image/sdb1.img   /storage/metadump/sdb1.metadump 2>&1 | tee mddump2.log
>>>>>>
>>>>>> It again creates a 280MB dump and segfaults at 30 AGs:
>>>>>>
>>>>>> Jul24 14:47] xfs_db[42584]: segfault at 557051a1d2b0 ip 0000556f19f1e090 sp 00007ffe431a7be0 error 4 in xfs_db[556f19f04000+64000] likely on CPU 21 (core 9, socket 0)
>>>>>> [  +0.000025] Code: 00 00 00 83 f8 0a 0f 84 90 07 00 00 c6 44 24 53 00 48 63 f1 49 89 ff 48 c1 e6 04 48 8d 54 37 f0 48 bf ff ff ff ff ff ff 3f 00 <48> 8b 02 48 8b 52 08 48 0f c8 48 c1 e8 09 48 0f ca 81 e2 ff ff 1f
>>>>>>
>>>>>> This is the log https://pastebin.com/jsSFeCr6, which looks similar to the first one. The machine does not seem loaded at all and further tries result in the same code.
>>>>>>
>>>>>> My next step would be trying a later xfsprogs version, or maybe xfs_repair -n on a compatible CPU machine as non-destructive options, but I feel I'm kidding myself as to what I can try to recover anything at all from such humongous disaster.
>>>>> Yes, that's probably the best approach now: to run the latest xfsprogs
>>>>> available.
>>>> Ok, so I ran into some unrelated issues, but I could finally install xfsprogs 6.15.0:
>>>>
>>>> root@serv:~# xfs_metadump -aogfw /storage/image/sdb1.img  /storage/metadump/sdb1.metadump
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: data size check failed
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot init perag data (22). Continuing anyway.
>>>> xfs_metadump: read failed: Invalid argument
>>>> empty log check failed
>>>> xlog_is_dirty: cannot find log head/tail (xlog_find_tail=-22)
>>>>
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read superblock for ag 0
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read agf block for ag 0
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read agi block for ag 0
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read agfl block for ag 0
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read superblock for ag 1
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read agf block for ag 1
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read agi block for ag 1
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read agfl block for ag 1
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read superblock for ag 2
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read agf block for ag 2
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read agi block for ag 2
>>>> ...
>>>> ...
>>>> ...
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read agfl block for ag 588
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read superblock for ag 589
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read agf block for ag 589
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read agi block for ag 589
>>>> xfs_metadump: read failed: Invalid argument
>>>> xfs_metadump: cannot read agfl block for ag 589
>>>> Copying log
>>>> root@serv:~#
>>>>
>>>> It did create a 2.1GB dump which of course restores to an empty file.
>>>>
>>>> I thought I had messed up some of the dependency libs, so I then
>>>> tried the xfsprogs 6.13 in Debian testing, with the same result.
>>>>
>>>> I'm not exactly sure why it now fails to read the image; nothing has
>>>> changed about it. I could not find much more info in the documentation.
>>>> What am I missing?
>>> I tried a few more things on the img, as I realized it was probably not
>>> the best idea to dd it to a file instead of a device, but I got nowhere.
>>> After some team deliberations, we decided to connect the original block
>>> device to the new machine (Debian 13, 16 AMD cores, 128GB RAM, new
>>> controller, plenty of swap, xfsprogs 6.13) and see if the dump was possible then.
>>>
>>> It had the same behavior as with xfsprogs 6.1 and segfaulted after
>>> 30 AGs. journalctl and dmesg don't really add any more info, so I tried
>>> to debug a bit, though I'm afraid it's all quite foreign to me:
>>>
>>> root@ap:/metadump# gdb xfs_metadump core.12816
>>> GNU gdb (Debian 16.3-1) 16.3
>>> Copyright (C) 2024 Free Software Foundation, Inc.
>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>> This is free software: you are free to change and redistribute it.
>>> There is NO WARRANTY, to the extent permitted by law.
>>> ...
>>> Type "apropos word" to search for commands related to "word"...
>>> "/usr/sbin/xfs_metadump": not in executable format: file format not recognized
>>> [New LWP 12816]
>>> Reading symbols from /usr/sbin/xfs_db...
>>> (No debugging symbols found in /usr/sbin/xfs_db)
>>> [Thread debugging using libthread_db enabled]
>>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>>> Core was generated by `/usr/sbin/xfs_db -i -p xfs_metadump -c metadump /dev/sda1'.
>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>> #0  0x0000556f127d6857 in ?? ()
>>> (gdb) bt full
>>> #0  0x0000556f127d6857 in ?? ()
>>> No symbol table info available.
>>> #1  0x0000556f127dbdc4 in ?? ()
>>> No symbol table info available.
>>> #2  0x0000556f127d5546 in ?? ()
>>> No symbol table info available.
>>> #3  0x0000556f127db350 in ?? ()
>>> No symbol table info available.
>>> #4  0x0000556f127d5546 in ?? ()
>>> No symbol table info available.
>>> #5  0x0000556f127d99aa in ?? ()
>>> No symbol table info available.
>>> #6  0x0000556f127b9764 in ?? ()
>>> No symbol table info available.
>>> #7  0x00007eff29058ca8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
>>> No symbol table info available.
>>> #8  0x00007eff29058d65 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
>>> No symbol table info available.
>>> #9  0x0000556f127ba8c1 in ?? ()
>>> No symbol table info available.
>>>
>>> And this:
>>>
>>> root@ap:/PETA/metadump# coredumpctl info
>>>             PID: 13103 (xfs_db)
>>>             UID: 0 (root)
>>>             GID: 0 (root)
>>>          Signal: 11 (SEGV)
>>>       Timestamp: Mon 2025-08-18 19:03:19 CEST (1min 12s ago)
>>>    Command Line: xfs_db -i -p xfs_metadump -c metadump -a -o -g -w $' /metadump/metadata.img' /dev/sda1
>>>      Executable: /usr/sbin/xfs_db
>>>   Control Group: /user.slice/user-0.slice/session-8.scope
>>>            Unit: session-8.scope
>>>           Slice: user-0.slice
>>>         Session: 8
>>>       Owner UID: 0 (root)
>>>         Boot ID: c090e507272647838c77bcdefd67e79c
>>>      Machine ID: 83edcebe83994c67ac4f88e2a3c185e3
>>>        Hostname: ap
>>>         Storage: /var/lib/systemd/coredump/core.xfs_db.0.c090e507272647838c77bcdefd67e79c.13103.1755536599000000.zst (present)
>>>    Size on Disk: 26.2M
>>>         Message: Process 13103 (xfs_db) of user 0 dumped core.
>>>
>>>                  Module libuuid.so.1 from deb util-linux-2.41-5.amd64
>>>                  Stack trace of thread 13103:
>>>                  #0  0x000055b961d29857 n/a (/usr/sbin/xfs_db + 0x32857)
>>>                  #1  0x000055b961d2edc4 n/a (/usr/sbin/xfs_db + 0x37dc4)
>>>                  #2  0x000055b961d28546 n/a (/usr/sbin/xfs_db + 0x31546)
>>>                  #3  0x000055b961d2e350 n/a (/usr/sbin/xfs_db + 0x37350)
>>>                  #4  0x000055b961d28546 n/a (/usr/sbin/xfs_db + 0x31546)
>>>                  #5  0x000055b961d2c9aa n/a (/usr/sbin/xfs_db + 0x359aa)
>>>                  #6  0x000055b961d0c764 n/a (/usr/sbin/xfs_db + 0x15764)
>>>                  #7  0x00007fc870455ca8 n/a (libc.so.6 + 0x29ca8)
>>>                  #8  0x00007fc870455d65 __libc_start_main (libc.so.6 + 0x29d65)
>>>                  #9  0x000055b961d0d8c1 n/a (/usr/sbin/xfs_db + 0x168c1)
>>>                  ELF object binary architecture: AMD x86-64
>> Without the debug symbols it gets virtually impossible to know what
>> was going on =/
>>> I guess my questions are: can the fs be so corrupted that it causes
>>> xfs_metadump (or xfs_db) to segfault? Are there too many AGs / fs too
>>> large?
>>> Shall I assume that xfs_repair could fail similarly then?
>> In a nutshell xfs_metadump shouldn't segfault even if the fs is
>> corrupted.
>> About xfs_repair, it depends; there is some code shared between both,
>> but xfs_repair is much more resilient.
>>
>>> I'll appreciate any ideas. Also, if you think the core dump or other logs
>>> could be useful, I can upload them somewhere.
>> I'd start by running xfs_repair in no-modify mode, i.e. `xfs_repair -n`
>> and check what it finds.
>>
>> Regarding the xfs_metadump segfault, yes, a core might be useful to
>> investigate where the segfault is triggered, but you'll need to be
>> running xfsprogs from the upstream tree (preferably the latest code), so
>> we can actually match the core information to the code.
>
> I figured it was not all the needed info, thanks for clarifying.
>
> Right now we had to put away the original hdds, as we cannot afford
> another failed drive and time is pressing, and are dd'ing the image to a
> real partition to try xfs_repair on it directly (takes days, of course,
> but we're lucky we got the storage).
> I will try the metadump and do further debugging if it segfaults again.

So I'm back now with a real partition. 
First, I ran "xfs_repair -vn" and it did complete, reporting - as expected - a
bunch of entries it would junk, and skipping the last phases with "Inode allocation
btrees are too corrupted, skipping phases 6 and 7".
It created a 270MB log, I can upload it somewhere if it could be of interest.

Since clearly xfs_repair will throw away a lot of stuff, I pulled xfsprogs 
from git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git (hope that's the right 
one), and tried xfs_metadump again on the partition.
It segfaulted as last time, but this time I hope to have more useful info:

root@ap:/XFS/repair# coredumpctl debug 70665
           PID: 70665 (xfs_db)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Thu 2025-09-25 19:10:54 CEST (16h ago)
  Command Line: xfs_db -i -p xfs_metadump -c metadump -a -o -g -w $' /XFS/sda1.metadump' /dev/sda1
    Executable: /usr/sbin/xfs_db
 Control Group: /user.slice/user-0.slice/session-52.scope
          Unit: session-52.scope
         Slice: user-0.slice
       Session: 52
     Owner UID: 0 (root)
       Boot ID: 7b209b8c777947ef9f286a69376f109f
    Machine ID: 83edcebe83994c67ac4f88e2a3c185e3
      Hostname: ap
       Storage: /var/lib/systemd/coredump/core.xfs_db.0.7b209b8c777947ef9f286a69376f109f.70665.1758820254000000.zst (present)
  Size on Disk: 24.3M
       Message: Process 70665 (xfs_db) of user 0 dumped core.
                
                Module libuuid.so.1 from deb util-linux-2.41-5.amd64
                Stack trace of thread 70665:
                #0  0x000055c36404aca3 libxfs_bmbt_disk_get_all (/usr/sbin/xfs_db + 0x34ca3)
                #1  0x000055c36404e042 process_exinode (/usr/sbin/xfs_db + 0x38042)
                #2  0x000055c36404a182 scan_btree (/usr/sbin/xfs_db + 0x34182)
                #3  0x000055c36404da9b scanfunc_ino (/usr/sbin/xfs_db + 0x37a9b)
                #4  0x000055c36404a182 scan_btree (/usr/sbin/xfs_db + 0x34182)
                #5  0x000055c36404d2d8 copy_inodes (/usr/sbin/xfs_db + 0x372d8)
                #6  0x000055c36402ca64 main (/usr/sbin/xfs_db + 0x16a64)
                #7  0x00007fe91bce2ca8 n/a (libc.so.6 + 0x29ca8)
                #8  0x00007fe91bce2d65 __libc_start_main (libc.so.6 + 0x29d65)
                #9  0x000055c36402dba1 _start (/usr/sbin/xfs_db + 0x17ba1)
                ELF object binary architecture: AMD x86-64

GNU gdb (Debian 16.3-1) 16.3
...
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/xfs_db...
Reading symbols from /usr/lib/debug/.build-id/3d/2cfd2face51733516278556a9024e640d64678.debug...
[New LWP 70665]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/xfs_db -i -p xfs_metadump -c metadump /dev/sda1'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  libxfs_bmbt_disk_get_all (rec=0x55c47aec3eb0, irec=<synthetic pointer>) at ../include/libxfs.h:226

warning: 226	../include/libxfs.h: No such file or directory
(gdb) bt full
#0  libxfs_bmbt_disk_get_all (rec=0x55c47aec3eb0, irec=<synthetic pointer>) at ../include/libxfs.h:226
        l0 = <optimized out>
        l1 = <optimized out>
        l0 = <optimized out>
        l1 = <optimized out>
#1  convert_extent (rp=0x55c47aec3eb0, op=<synthetic pointer>, sp=<synthetic pointer>, cp=<synthetic pointer>, fp=<synthetic pointer>) at /build/reproducible-path/xfsprogs-6.16.0/db/bmap.c:320
        irec = <optimized out>
        irec = <optimized out>
#2  process_bmbt_reclist (dip=dip@entry=0x55c37aec3e00, whichfork=whichfork@entry=0, rp=0x55c37aec3eb0, numrecs=numrecs@entry=268435457) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:2181
        is_meta = false
        btype = <optimized out>
        i = <optimized out>
        o = <optimized out>
        op = <optimized out>
        s = <optimized out>
        c = <optimized out>
        cp = <optimized out>
        f = <optimized out>
        last = <optimized out>
        agno = <optimized out>
        agbno = <optimized out>
        rval = <optimized out>
#3  0x000055c36404e042 in process_exinode (dip=0x55c37aec3e00, whichfork=0) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:2421
        max_nex = <optimized out>
        nex = 268435457
        used = <optimized out>
        max_nex = <optimized out>
        nex = <optimized out>
        used = <optimized out>
#4  process_inode_data (dip=0x55c37aec3e00) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:2589
No locals.
#5  process_inode (agno=<optimized out>, agino=<optimized out>, dip=0x55c37aec3e00, free_inode=<optimized out>) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:2678
        rval = 1
        crc_was_ok = <optimized out>
        need_new_crc = false
        rval = <optimized out>
        crc_was_ok = <optimized out>
        need_new_crc = <optimized out>
        done = <optimized out>
#6  copy_inode_chunk (agno=<optimized out>, rp=<optimized out>) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:2821
        dip = 0x55c37aec3e00
        agbno = <optimized out>
        rval = 0
        blks_per_buf = <optimized out>
        agino = <optimized out>
        off = <optimized out>
        inodes_per_buf = <optimized out>
        end_agbno = <optimized out>
        i = 17
        ioff = <optimized out>
        igeo = <optimized out>
        agino = <optimized out>
        off = <optimized out>
        agbno = <optimized out>
        end_agbno = <optimized out>
        i = <optimized out>
        rval = <optimized out>
        blks_per_buf = <optimized out>
        inodes_per_buf = <optimized out>
        ioff = <optimized out>
        igeo = <optimized out>
        next_bp = <optimized out>
        pop_out = <optimized out>
        dip = <optimized out>
#7  scanfunc_ino (block=<optimized out>, agno=<optimized out>, agbno=<optimized out>, level=<optimized out>, btype=<optimized out>, arg=<optimized out>) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:2882
        rp = <optimized out>
        pp = <optimized out>
        i = 0
        numrecs = <optimized out>
--Type <RET> for more, q to quit, c to continue without paging--
        finobt = <optimized out>
        igeo = <optimized out>
#8  0x000055c36404a182 in scan_btree (agno=agno@entry=30, agbno=3, level=level@entry=1, btype=btype@entry=TYP_INOBT, arg=arg@entry=0x7ffece539510, func=func@entry=0x55c36404d830 <scanfunc_ino>) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:395
        rval = 0
#9  0x000055c36404da9b in scanfunc_ino (block=<optimized out>, agno=30, agbno=22371408, level=1, btype=<optimized out>, arg=0x7ffece539510) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:2905
        rp = <optimized out>
        pp = <optimized out>
        i = <optimized out>
        numrecs = <optimized out>
        finobt = <optimized out>
        igeo = <optimized out>
#10 0x000055c36404a182 in scan_btree (agno=agno@entry=30, agbno=22371408, level=2, btype=btype@entry=TYP_INOBT, arg=arg@entry=0x7ffece539510, func=func@entry=0x55c36404d830 <scanfunc_ino>) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:395
        rval = 0
#11 0x000055c36404d2d8 in copy_inodes (agno=30, agi=0x55c37a93cc00) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:2938
        root = <optimized out>
        levels = <optimized out>
        finobt = 0
        root = <optimized out>
        levels = <optimized out>
        finobt = <optimized out>
#12 scan_ag (agno=30) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:3077
        agi = 0x55c37a93cc00
        agf = <optimized out>
        stack_count = 4
        rval = 0
        agf = <optimized out>
        agi = <optimized out>
        stack_count = <optimized out>
        rval = <optimized out>
        pop_out = <optimized out>
        sb = <optimized out>
        i = <optimized out>
        agfl_bno = <optimized out>
#13 metadump_f (argc=<optimized out>, argv=<optimized out>) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:3574
        agno = 30
        c = <optimized out>
        start_iocur_sp = <optimized out>
        outfd = <optimized out>
        ret = <optimized out>
        p = 0x0
        version_opt_set = <optimized out>
        out = <optimized out>
#14 0x000055c36402ca64 in main (argc=<optimized out>, argv=<optimized out>) at /build/reproducible-path/xfsprogs-6.16.0/db/init.c:189
        c = 6
        i = <optimized out>
        done = 0
        input = <optimized out>
        v = 0x55c37a939340
        start_iocur_sp = 0
        close_devices = <optimized out>
(gdb) 

That was when using "-a". Running without it leads to an earlier segfault (agno=20).
dmesg and journalctl don't add any other info.
Let me know if there is any further debugging that I should try.

Thanks

>
> Regarding the "invalid argument" when attempting the metadump with the
> image... could it be related to a mismatch with the block/sector size of
> the host fs?
> I thought about attaching the img to a loop device, but I wasn't sure if
> xfs_metadump tries that already. Also at this point I don't trust myself
> to try anything without a 2nd copy.
>
> I'll let you know how that goes, thanks a lot again.
> H.
>
>> Cheers,
>> Carlos.
>>
>>> Thanks again
>>>
>>>>
>>>> Thanks
>>>>> Also, xfs_repair does not need to be executed on the same architecture
>>>>> as the FS was running on. Aside from log replay (which is done by the Linux
>>>>> kernel), xfs_repair is capable of converting the filesystem data
>>>>> structures back and forth to the current machine's endianness.
>>>>>
>>>>>
>>>>>> Thanks in advance for any input
>>>>>> Hub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs_metadump segmentation fault on large fs - xfsprogs 6.1
  2025-09-26  9:04             ` hubert .
@ 2025-09-26  9:39               ` Dave Chinner
  2025-09-26 13:45                 ` Carlos Maiolino
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2025-09-26  9:39 UTC (permalink / raw)
  To: hubert .; +Cc: Carlos Maiolino, linux-xfs@vger.kernel.org

On Fri, Sep 26, 2025 at 09:04:12AM +0000, hubert . wrote:
> >> Regarding the xfs_metadump segfault, yes, a core might be useful to
> >> investigate where the segfault is triggered, but you'll need to be
> >> running xfsprogs from the upstream tree (preferably the latest code), so
> >> we can actually match the core information to the code.
> >
> > I figured it was not all the needed info, thanks for clarifying.
> >
> > Right now we had to put away the original hdds, as we cannot afford
> > another failed drive and time is pressing, and are dd'ing the image to a
> > real partition to try xfs_repair on it directly (takes days, of course,
> > but we're lucky we got the storage).
> > I will try the metadump and do further debugging if it segfaults again.
> 
> So I'm back now with a real partition. 
> First, I ran "xfs_repair -vn" and it did complete, reporting - as expected - a 
> bunch of entries to junk, skipping the last phases with "Inode allocation 
> btrees are too corrupted, skipping phases 6 and 7".
> It created a 270MB log, I can upload it somewhere if it could be of interest.

No need, but thanks for the offer.

> Core was generated by `/usr/sbin/xfs_db -i -p xfs_metadump -c metadump /dev/sda1'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  libxfs_bmbt_disk_get_all (rec=0x55c47aec3eb0, irec=<synthetic pointer>) at ../include/libxfs.h:226
> 
> warning: 226	../include/libxfs.h: No such file or directory
> (gdb) bt full
> #0  libxfs_bmbt_disk_get_all (rec=0x55c47aec3eb0, irec=<synthetic pointer>) at ../include/libxfs.h:226
>         l0 = <optimized out>
>         l1 = <optimized out>
>         l0 = <optimized out>
>         l1 = <optimized out>

Ok, so it's faulted when trying to read a BMBT record from an
in-memory buffer...

Remember the addr of the rec (0x55c47aec3eb0) now....

> #1  convert_extent (rp=0x55c47aec3eb0, op=<synthetic pointer>, sp=<synthetic pointer>, cp=<synthetic pointer>, fp=<synthetic pointer>) at /build/reproducible-path/xfsprogs-6.16.0/db/bmap.c:320
>         irec = <optimized out>
>         irec = <optimized out>
> #2  process_bmbt_reclist (dip=dip@entry=0x55c37aec3e00, whichfork=whichfork@entry=0, rp=0x55c37aec3eb0, numrecs=numrecs@entry=268435457) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:2181

Smoking gun:

numrecs=numrecs@entry=268435457

268435457 = 2^28 + 1

>         is_meta = false
>         btype = <optimized out>
>         i = <optimized out>
>         o = <optimized out>
>         op = <optimized out>
>         s = <optimized out>
>         c = <optimized out>
>         cp = <optimized out>
>         f = <optimized out>
>         last = <optimized out>
>         agno = <optimized out>
>         agbno = <optimized out>
>         rval = <optimized out>
> #3  0x000055c36404e042 in process_exinode (dip=0x55c37aec3e00, whichfork=0) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:2421
>         max_nex = <optimized out>
>         nex = 268435457

Yup, that's the problem.

The inode is in extent format, which means the extent records are in
the inode data fork area, which is about 300 bytes max for a 512
byte inode. IOWs, it can hold about 12 BMBT records. The BMBT
records are in the on-disk inode buffer, as is the disk inode @dip.

Look at the address of dip:  0x55c37aec3e00
The address of the BMBT rec: 0x55c47aec3eb0

Now look at what BMBT record convert_extent() is trying to access:

process_bmbt_reclist()
{
.....
	convert_extent(&rp[numrecs - 1], &o, &s, &c, &f);
.....

That's &rp[2^28], i.e. 2^28 * 16 bytes = 4GiB beyond rp, which is exactly
the distance from rp (0x55c37aec3eb0) to the faulting record address
(0x55c47aec3eb0). Yeah, that inode buffer isn't 268 million bmbt records
long....

So there must be a bounds checking bug in process_exinode():

static int
process_exinode(
        struct xfs_dinode       *dip,
        int                     whichfork)
{
        xfs_extnum_t            max_nex = xfs_iext_max_nextents(
                        xfs_dinode_has_large_extent_counts(dip), whichfork);
        xfs_extnum_t            nex = xfs_dfork_nextents(dip, whichfork);
        int                     used = nex * sizeof(struct xfs_bmbt_rec);

        if (nex > max_nex || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
                if (metadump.show_warnings)
                        print_warning("bad number of extents %llu in inode %lld",
                                (unsigned long long)nex,
                                (long long)metadump.cur_ino);
                return 1;
        }

Can you spot it?

Hint: ((2^28 + 1) * 2^4) - 1 as an int is?
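
To make the hint concrete, here's a minimal standalone sketch of that
calculation (only the 16 byte size of struct xfs_bmbt_rec is taken from
the code above; on the usual two's complement targets the uint64_t to
int conversion simply keeps the low 32 bits):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint64_t	nex = (1ULL << 28) + 1;	/* the corrupt extent count */
	int		used = nex * 16;	/* sizeof(struct xfs_bmbt_rec) */

	/* (2^28 + 1) * 16 = 2^32 + 16, which truncates to 16 in an int */
	printf("nex = %llu, used = %d\n", (unsigned long long)nex, used);
	return 0;
}

used comes back as 16, comfortably below XFS_DFORK_SIZE(), so the size
check never fires and process_bmbt_reclist() walks 2^28 records straight
off the end of the inode buffer.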

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs_metadump segmentation fault on large fs - xfsprogs 6.1
  2025-09-26  9:39               ` Dave Chinner
@ 2025-09-26 13:45                 ` Carlos Maiolino
  2025-09-27 23:22                   ` Dave Chinner
  0 siblings, 1 reply; 14+ messages in thread
From: Carlos Maiolino @ 2025-09-26 13:45 UTC (permalink / raw)
  To: Dave Chinner; +Cc: hubert ., linux-xfs@vger.kernel.org

On Fri, Sep 26, 2025 at 07:39:18PM +1000, Dave Chinner wrote:
> On Fri, Sep 26, 2025 at 09:04:12AM +0000, hubert . wrote:
> > >> Regarding the xfs_metadump segfault, yes, a core might be useful to
> > >> investigate where the segfault is triggered, but you'll need to be
> > >> running xfsprogs from the upstream tree (preferably the latest code), so
> > >> we can actually match the core information to the code.
> > >
> > > I figured it was not all the needed info, thanks for clarifying.
> > >
> > > Right now we had to put away the original hdds, as we cannot afford
> > > another failed drive and time is pressing, and are dd'ing the image to a
> > > real partition to try xfs_repair on it directly (takes days, of course,
> > > but we're lucky we got the storage).
> > > I will try the metadump and do further debugging if it segfaults again.
> >
> > So I'm back now with a real partition.
> > First, I ran "xfs_repair -vn" and it did complete, reporting - as expected - a
> > bunch of entries to junk, skipping the last phases with "Inode allocation
> > btrees are too corrupted, skipping phases 6 and 7".
> > It created a 270MB log, I can upload it somewhere if it could be of interest.
> 
> No need, but thanks for the offer.
> 
> > Core was generated by `/usr/sbin/xfs_db -i -p xfs_metadump -c metadump /dev/sda1'.
> > Program terminated with signal SIGSEGV, Segmentation fault.
> > #0  libxfs_bmbt_disk_get_all (rec=0x55c47aec3eb0, irec=<synthetic pointer>) at ../include/libxfs.h:226
> >
> > warning: 226	../include/libxfs.h: No such file or directory
> > (gdb) bt full
> > #0  libxfs_bmbt_disk_get_all (rec=0x55c47aec3eb0, irec=<synthetic pointer>) at ../include/libxfs.h:226
> >         l0 = <optimized out>
> >         l1 = <optimized out>
> >         l0 = <optimized out>
> >         l1 = <optimized out>
> 
> Ok, so it's faulted when trying to read a BMBT record from an
> in-memory buffer...
> 
> Remember the addr of the rec (0x55c47aec3eb0) now....
> 
> > #1  convert_extent (rp=0x55c47aec3eb0, op=<synthetic pointer>, sp=<synthetic pointer>, cp=<synthetic pointer>, fp=<synthetic pointer>) at /build/reproducible-path/xfsprogs-6.16.0/db/bmap.c:320
> >         irec = <optimized out>
> >         irec = <optimized out>
> > #2  process_bmbt_reclist (dip=dip@entry=0x55c37aec3e00, whichfork=whichfork@entry=0, rp=0x55c37aec3eb0, numrecs=numrecs@entry=268435457) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:2181
> 
> Smoking gun:
> 
> numrecs=numrecs@entry=268435457
> 
> 268435457 = 2^28 + 1
> 
> >         is_meta = false
> >         btype = <optimized out>
> >         i = <optimized out>
> >         o = <optimized out>
> >         op = <optimized out>
> >         s = <optimized out>
> >         c = <optimized out>
> >         cp = <optimized out>
> >         f = <optimized out>
> >         last = <optimized out>
> >         agno = <optimized out>
> >         agbno = <optimized out>
> >         rval = <optimized out>
> > #3  0x000055c36404e042 in process_exinode (dip=0x55c37aec3e00, whichfork=0) at /build/reproducible-path/xfsprogs-6.16.0/db/metadump.c:2421
> >         max_nex = <optimized out>
> >         nex = 268435457
> 
> Yup, that's the problem.
> 
> The inode is in extent format, which means the extent records are in
> the inode data fork area, which is about 300 bytes max for a 512
> byte inode. IOWs, it can hold about 12 BMBT records. The BMBT
> records are in the on-disk inode buffer, as is the disk inode @dip.
> 
> Look at the address of dip:  0x55c37aec3e00
> The address of the BMBT rec: 0x55c47aec3eb0
> 
> Now look at what BMBT record convert_extent() is trying to access:
> 
> process_bmbt_reclist()
> {
> .....
> 	convert_extent(&rp[numrecs - 1], &o, &s, &c, &f);
> .....
> 
> Yeah, that inode buffer isn't 268 million bmbt records long....
> 
> So there must be a bounds checking bug in process_exinode():
> 
> static int
> process_exinode(
>         struct xfs_dinode       *dip,
>         int                     whichfork)
> {
>         xfs_extnum_t            max_nex = xfs_iext_max_nextents(
>                         xfs_dinode_has_large_extent_counts(dip), whichfork);
>         xfs_extnum_t            nex = xfs_dfork_nextents(dip, whichfork);
>         int                     used = nex * sizeof(struct xfs_bmbt_rec);
> 
>         if (nex > max_nex || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
>                 if (metadump.show_warnings)
>                         print_warning("bad number of extents %llu in inode %lld",
>                                 (unsigned long long)nex,
>                                 (long long)metadump.cur_ino);
>                 return 1;
>         }
> 
> Can you spot it?
> 
> Hint: ((2^28 + 1) * 2^4) - 1 as an int is?

Perhaps the patch below will suffice?


diff --git a/db/metadump.c b/db/metadump.c
index 34f2d61700fe..1dd38ab84ade 100644
--- a/db/metadump.c
+++ b/db/metadump.c
@@ -2395,7 +2395,7 @@ process_btinode(
 
 static int
 process_exinode(
-	struct xfs_dinode 	*dip,
+	struct xfs_dinode	*dip,
 	int			whichfork)
 {
 	xfs_extnum_t		max_nex = xfs_iext_max_nextents(
@@ -2403,7 +2403,13 @@ process_exinode(
 	xfs_extnum_t		nex = xfs_dfork_nextents(dip, whichfork);
 	int			used = nex * sizeof(struct xfs_bmbt_rec);
 
-	if (nex > max_nex || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
+	/*
+	 * We need to check for overflow of used counter.
+	 * If the inode extent count is corrupted, we risk having a
+	 * big enough number of extents to overflow it.
+	 */
+	if (used < nex || nex > max_nex ||
+	    used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
 		if (metadump.show_warnings)
 			print_warning("bad number of extents %llu in inode %lld",
 				(unsigned long long)nex,
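
Worked through with the count from the core dump: used wraps to
((2^28 + 1) * 16) mod 2^32 = 16, and 16 < 2^28 + 1, so the new
"used < nex" clause fires. A genuine product of 16-byte records is
always >= nex, so well-formed inodes still pass that clause.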


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: xfs_metadump segmentation fault on large fs - xfsprogs 6.1
  2025-09-26 13:45                 ` Carlos Maiolino
@ 2025-09-27 23:22                   ` Dave Chinner
  2025-09-28  6:11                     ` Carlos Maiolino
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2025-09-27 23:22 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: hubert ., linux-xfs@vger.kernel.org

On Fri, Sep 26, 2025 at 03:45:17PM +0200, Carlos Maiolino wrote:
> On Fri, Sep 26, 2025 at 07:39:18PM +1000, Dave Chinner wrote:
> > So there must be a bounds checking bug in process_exinode():
> > 
> > static int
> > process_exinode(
> >         struct xfs_dinode       *dip,
> >         int                     whichfork)
> > {
> >         xfs_extnum_t            max_nex = xfs_iext_max_nextents(
> >                         xfs_dinode_has_large_extent_counts(dip), whichfork);
> >         xfs_extnum_t            nex = xfs_dfork_nextents(dip, whichfork);
> >         int                     used = nex * sizeof(struct xfs_bmbt_rec);
> > 
> >         if (nex > max_nex || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
> >                 if (metadump.show_warnings)
> >                         print_warning("bad number of extents %llu in inode %lld",
> >                                 (unsigned long long)nex,
> >                                 (long long)metadump.cur_ino);
> >                 return 1;
> >         }
> > 
> > Can you spot it?
> > 
> > Hint: ((2^28 + 1) * 2^4) - 1 as an int is?
> 
> Perhaps the patch below will suffice?
> 
> diff --git a/db/metadump.c b/db/metadump.c
> index 34f2d61700fe..1dd38ab84ade 100644
> --- a/db/metadump.c
> +++ b/db/metadump.c
> @@ -2395,7 +2395,7 @@ process_btinode(
>  
>  static int
>  process_exinode(
> -	struct xfs_dinode 	*dip,
> +	struct xfs_dinode	*dip,
>  	int			whichfork)
>  {
>  	xfs_extnum_t		max_nex = xfs_iext_max_nextents(
> @@ -2403,7 +2403,13 @@ process_exinode(
>  	xfs_extnum_t		nex = xfs_dfork_nextents(dip, whichfork);
>  	int			used = nex * sizeof(struct xfs_bmbt_rec);
>  
> -	if (nex > max_nex || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
> +	/*
> +	 * We need to check for overflow of used counter.
> +	 * If the inode extent count is corrupted, we risk having a
> +	 * big enough number of extents to overflow it.
> +	 */
> +	if (used < nex || nex > max_nex ||
> +	    used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
>  		if (metadump.show_warnings)
>  			print_warning("bad number of extents %llu in inode %lld",
>  				(unsigned long long)nex,
> 

That fixes this specific problem, but now it will reject valid
inodes with valid but large extent counts.

What type does XFS_SB_FEAT_INCOMPAT_NREXT64 require for extent
count calculations?  i.e. what's the size of xfs_extnum_t?

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs_metadump segmentation fault on large fs - xfsprogs 6.1
  2025-09-27 23:22                   ` Dave Chinner
@ 2025-09-28  6:11                     ` Carlos Maiolino
  2025-09-30  9:22                       ` Dave Chinner
  0 siblings, 1 reply; 14+ messages in thread
From: Carlos Maiolino @ 2025-09-28  6:11 UTC (permalink / raw)
  To: Dave Chinner; +Cc: hubert ., linux-xfs@vger.kernel.org

On Sun, Sep 28, 2025 at 09:22:57AM +1000, Dave Chinner wrote:
> On Fri, Sep 26, 2025 at 03:45:17PM +0200, Carlos Maiolino wrote:
> > On Fri, Sep 26, 2025 at 07:39:18PM +1000, Dave Chinner wrote:
> > > So there must be a bounds checking bug in process_exinode():
> > >
> > > static int
> > > process_exinode(
> > >         struct xfs_dinode       *dip,
> > >         int                     whichfork)
> > > {
> > >         xfs_extnum_t            max_nex = xfs_iext_max_nextents(
> > >                         xfs_dinode_has_large_extent_counts(dip), whichfork);
> > >         xfs_extnum_t            nex = xfs_dfork_nextents(dip, whichfork);
> > >         int                     used = nex * sizeof(struct xfs_bmbt_rec);
> > >
> > >         if (nex > max_nex || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
> > >                 if (metadump.show_warnings)
> > >                         print_warning("bad number of extents %llu in inode %lld",
> > >                                 (unsigned long long)nex,
> > >                                 (long long)metadump.cur_ino);
> > >                 return 1;
> > >         }
> > >
> > > Can you spot it?
> > >
> > > Hint: ((2^28 + 1) * 2^4) - 1 as an int is?
> >
> > Perhaps the patch below will suffice?
> >
> > diff --git a/db/metadump.c b/db/metadump.c
> > index 34f2d61700fe..1dd38ab84ade 100644
> > --- a/db/metadump.c
> > +++ b/db/metadump.c
> > @@ -2395,7 +2395,7 @@ process_btinode(
> >
> >  static int
> >  process_exinode(
> > -	struct xfs_dinode 	*dip,
> > +	struct xfs_dinode	*dip,
> >  	int			whichfork)
> >  {
> >  	xfs_extnum_t		max_nex = xfs_iext_max_nextents(
> > @@ -2403,7 +2403,13 @@ process_exinode(
> >  	xfs_extnum_t		nex = xfs_dfork_nextents(dip, whichfork);
> >  	int			used = nex * sizeof(struct xfs_bmbt_rec);
> >
> > -	if (nex > max_nex || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
> > +	/*
> > +	 * We need to check for overflow of used counter.
> > +	 * If the inode extent count is corrupted, we risk having a
> > +	 * big enough number of extents to overflow it.
> > +	 */
> > +	if (used < nex || nex > max_nex ||
> > +	    used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
> >  		if (metadump.show_warnings)
> >  			print_warning("bad number of extents %llu in inode %lld",
> >  				(unsigned long long)nex,
> >
> 
> That fixes this specific problem, but now it will reject valid
> inodes with valid but large extent counts.
> 
> What type does XFS_SB_FEAT_INCOMPAT_NREXT64 require for extent
> count calculations?  i.e. what's the size of xfs_extnum_t?

I thought about extending it to 64bit, but honestly thought it was not
necessary here as I thought the number of extents in an inode before it
was converted to btree format wouldn't exceed a 32-bit counter. That's a
trivial change for the patch, but still I think the overflow check
should still be there as even for a 64bit counter we could have enough
garbage to overflow it. Does it make sense to you?

-Carlos

> 
> -Dave.
> --
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs_metadump segmentation fault on large fs - xfsprogs 6.1
  2025-09-28  6:11                     ` Carlos Maiolino
@ 2025-09-30  9:22                       ` Dave Chinner
  2025-11-03  9:23                         ` hubert .
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2025-09-30  9:22 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: hubert ., linux-xfs@vger.kernel.org

On Sun, Sep 28, 2025 at 08:11:05AM +0200, Carlos Maiolino wrote:
> On Sun, Sep 28, 2025 at 09:22:57AM +1000, Dave Chinner wrote:
> > On Fri, Sep 26, 2025 at 03:45:17PM +0200, Carlos Maiolino wrote:
> > > On Fri, Sep 26, 2025 at 07:39:18PM +1000, Dave Chinner wrote:
> > > > So there must be a bounds checking bug in process_exinode():
> > > >
> > > > static int
> > > > process_exinode(
> > > >         struct xfs_dinode       *dip,
> > > >         int                     whichfork)
> > > > {
> > > >         xfs_extnum_t            max_nex = xfs_iext_max_nextents(
> > > >                         xfs_dinode_has_large_extent_counts(dip), whichfork);
> > > >         xfs_extnum_t            nex = xfs_dfork_nextents(dip, whichfork);
> > > >         int                     used = nex * sizeof(struct xfs_bmbt_rec);
> > > >
> > > >         if (nex > max_nex || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
> > > >                 if (metadump.show_warnings)
> > > >                         print_warning("bad number of extents %llu in inode %lld",
> > > >                                 (unsigned long long)nex,
> > > >                                 (long long)metadump.cur_ino);
> > > >                 return 1;
> > > >         }
> > > >
> > > > Can you spot it?
> > > >
> > > > Hint: ((2^28 + 1) * 2^4) - 1 as an int is?
> > >
> > > Perhaps the patch below will suffice?
> > >
> > > diff --git a/db/metadump.c b/db/metadump.c
> > > index 34f2d61700fe..1dd38ab84ade 100644
> > > --- a/db/metadump.c
> > > +++ b/db/metadump.c
> > > @@ -2395,7 +2395,7 @@ process_btinode(
> > >
> > >  static int
> > >  process_exinode(
> > > -	struct xfs_dinode 	*dip,
> > > +	struct xfs_dinode	*dip,
> > >  	int			whichfork)
> > >  {
> > >  	xfs_extnum_t		max_nex = xfs_iext_max_nextents(
> > > @@ -2403,7 +2403,13 @@ process_exinode(
> > >  	xfs_extnum_t		nex = xfs_dfork_nextents(dip, whichfork);
> > >  	int			used = nex * sizeof(struct xfs_bmbt_rec);
> > >
> > > -	if (nex > max_nex || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
> > > +	/*
> > > +	 * We need to check for overflow of used counter.
> > > +	 * If the inode extent count is corrupted, we risk having a
> > > +	 * big enough number of extents to overflow it.
> > > +	 */
> > > +	if (used < nex || nex > max_nex ||
> > > +	    used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
> > >  		if (metadump.show_warnings)
> > >  			print_warning("bad number of extents %llu in inode %lld",
> > >  				(unsigned long long)nex,
> > >
> > 
> > That fixes this specific problem, but now it will reject valid
> > inodes with valid but large extent counts.
> > 
> > What type does XFS_SB_FEAT_INCOMPAT_NREXT64 require for extent
> > count calculations?  i.e. what's the size of xfs_extnum_t?
> 
> I thought about extending it to 64bit, but honestly thought it was not
> necessary here as I thought the number of extents in an inode before it
> was converted to btree format wouldn't exceed a 32-bit counter.

The filesystem is corrupt so the normal rules of sanity don't apply.
The extent count could be anything, and we can't assume that it fits
in a 32 bit value, nor that any unchecked calculation based on the
value fits in 32 bits.

Mixing integer types like this always leads to bugs. It's bad
practice because everyone who looks at the code has to think about
type conversion rules (which no-one ever remembers or gets right) to
determine if the code is correct or not. Nobody catches stuff
like this during review and the compiler is no help, either.
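
As a two-line illustration of the kind of conversion rule that bites
here (the values are made up, nothing is taken from the patch):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	int		used = -1;	/* a hypothetical wrapped-negative value */
	uint64_t	nex = 42;

	/* the usual arithmetic conversions turn used into 2^64 - 1,
	 * so this prints "no": a negative int is never "less than"
	 * an unsigned 64 bit value */
	printf("%s\n", used < nex ? "yes" : "no");
	return 0;
}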

> That's a
> trivial change for the patch, but still I think the overflow check
> should still be there as even for a 64bit counter we could have enough
> garbage to overflow it. Does it make sense to you?

Yes, we need to check for overflow, but IMO, the best way to do
these checks is to use the same type (and hence unsigned 64 bit
math) throughout. This requires much less mental gymnastics to
determine that it is obviously correct:

....
	xfs_extnum_t		used = nex * sizeof(struct xfs_bmbt_rec);

	// number of extents clearly bad
	if (nex > max_nex)
		goto warn;

	// catch extent array size overflow
	if (used < nex)
		goto warn;

	// extent array should fit in the inode fork
	if (used > XFS_DFORK_SIZE(dip, mp, whichfork))
		goto warn;
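
And, as a self-contained sketch, those checks wired into a tiny harness
(the max_nex and fork size values are made-up stand-ins for
xfs_iext_max_nextents() and XFS_DFORK_SIZE(), and xfs_extnum_t is
assumed to be a 64 bit type):

#include <stdio.h>
#include <stdint.h>

typedef uint64_t xfs_extnum_t;

static int
check_exinode(xfs_extnum_t nex, xfs_extnum_t max_nex, xfs_extnum_t fork_size)
{
	/* unsigned 64 bit math throughout; 16 = sizeof(struct xfs_bmbt_rec) */
	xfs_extnum_t	used = nex * 16;

	/* number of extents clearly bad */
	if (nex > max_nex)
		goto warn;
	/* catch extent array size overflow */
	if (used < nex)
		goto warn;
	/* extent array should fit in the inode fork */
	if (used > fork_size)
		goto warn;
	return 0;
warn:
	printf("bad number of extents %llu\n", (unsigned long long)nex);
	return 1;
}

int main(void)
{
	check_exinode((1ULL << 28) + 1, 21, 336);	/* the corrupt count: warned */
	check_exinode(12, 21, 336);			/* a sane count: passes */
	return 0;
}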

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs_metadump segmentation fault on large fs - xfsprogs 6.1
  2025-09-30  9:22                       ` Dave Chinner
@ 2025-11-03  9:23                         ` hubert .
  2025-11-03 10:18                           ` Carlos Maiolino
  0 siblings, 1 reply; 14+ messages in thread
From: hubert . @ 2025-11-03  9:23 UTC (permalink / raw)
  To: Dave Chinner, Carlos Maiolino; +Cc: linux-xfs@vger.kernel.org

>>>> On Fri, Sep 26, 2025 at 07:39:18PM +1000, Dave Chinner wrote:
>>>>> So there must be a bounds checking bug in process_exinode():
>>>>>
>>>>> static int
>>>>> process_exinode(
>>>>>         struct xfs_dinode       *dip,
>>>>>         int                     whichfork)
>>>>> {
>>>>>         xfs_extnum_t            max_nex = xfs_iext_max_nextents(
>>>>>                         xfs_dinode_has_large_extent_counts(dip), whichfork);
>>>>>         xfs_extnum_t            nex = xfs_dfork_nextents(dip, whichfork);
>>>>>         int                     used = nex * sizeof(struct xfs_bmbt_rec);
>>>>>
>>>>>         if (nex > max_nex || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
>>>>>                 if (metadump.show_warnings)
>>>>>                         print_warning("bad number of extents %llu in inode %lld",
>>>>>                                 (unsigned long long)nex,
>>>>>                                 (long long)metadump.cur_ino);
>>>>>                 return 1;
>>>>>         }
>>>>>
>>>>> Can you spot it?
>>>>>
>>>>> Hint: ((2^28 + 1) * 2^4) - 1 as an int is?
>>>>
>>>> Perhaps the patch below will suffice?
>>>>
>>>> diff --git a/db/metadump.c b/db/metadump.c
>>>> index 34f2d61700fe..1dd38ab84ade 100644
>>>> --- a/db/metadump.c
>>>> +++ b/db/metadump.c
>>>> @@ -2395,7 +2395,7 @@ process_btinode(
>>>>
>>>>  static int
>>>>  process_exinode(
>>>> - struct xfs_dinode       *dip,
>>>> + struct xfs_dinode       *dip,
>>>>   int                     whichfork)
>>>>  {
>>>>   xfs_extnum_t            max_nex = xfs_iext_max_nextents(
>>>> @@ -2403,7 +2403,13 @@ process_exinode(
>>>>   xfs_extnum_t            nex = xfs_dfork_nextents(dip, whichfork);
>>>>   int                     used = nex * sizeof(struct xfs_bmbt_rec);
>>>>
>>>> - if (nex > max_nex || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
>>>> + /*
>>>> +  * We need to check for overflow of used counter.
>>>> +  * If the inode extent count is corrupted, we risk having a
>>>> +  * big enough number of extents to overflow it.
>>>> +  */
>>>> + if (used < nex || nex > max_nex ||
>>>> +     used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
>>>>           if (metadump.show_warnings)
>>>>                   print_warning("bad number of extents %llu in inode %lld",
>>>>                           (unsigned long long)nex,
>>>>
>>>
>>> That fixes this specific problem, but now it will reject valid
>>> inodes with valid but large extent counts.
>>>
>>> What type does XFS_SB_FEAT_INCOMPAT_NREXT64 require for extent
>>> count calculations?  i.e. what's the size of xfs_extnum_t?
>>
>> I thought about extending it to 64bit, but honestly thought it was not
>> necessary here as I thought the number of extents in an inode before it
>> was converted to btree format wouldn't exceed a 32-bit counter.
>
> The filesystem is corrupt so the normal rules of sanity don't apply.
> The extent count could be anything, and we can't assume that it fits
> in a 32 bit value, nor that any unchecked calculation based on the
> value fits in 32 bits.
>
> Mixing integer types like this always leads to bugs. It's bad
> practice because everyone who looks at the code has to think about
> type conversion rules (which no-one ever remembers or gets right) to
> determine if the code is correct or not. Nobody catches stuff
> like this during review and the compiler is no help, either.
>
>> That's a
>> trivial change for the patch, but still I think the overflow check
>> should still be there as even for a 64bit counter we could have enough
>> garbage to overflow it. Does it make sense to you?
>
> Yes, we need to check for overflow, but IMO, the best way to do
> these checks is to use the same type (and hence unsigned 64 bit
> math) throughout. This requires much less mental gymnastics to
> determine that it is obviously correct:
>
> ....
>         xfs_extnum_t            used = nex * sizeof(struct xfs_bmbt_rec);
>
>         // number of extents clearly bad
>         if (nex > max_nex)
>                 goto warn;
>
>         // catch extent array size overflow
>         if (used < nex)
>                 goto warn;
>
>         // extent array should fit in the inode fork
>         if (used > XFS_DFORK_SIZE(dip, mp, whichfork))
>                 goto warn;

Dear Carlos, dear Dave,

Sorry for the late reply and thank you so much for looking into this.
Not sure if there is something else you would change here, but the patch
Carlos proposed worked for me and the metadump completed with no
issues.
Things got really busy since my last message, but I still wanted to
belatedly thank you both for your time and expert help.

Best,
Hub
>
>
> -Dave.
> --
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: xfs_metadump segmentation fault on large fs - xfsprogs 6.1
  2025-11-03  9:23                         ` hubert .
@ 2025-11-03 10:18                           ` Carlos Maiolino
  0 siblings, 0 replies; 14+ messages in thread
From: Carlos Maiolino @ 2025-11-03 10:18 UTC (permalink / raw)
  To: hubert .; +Cc: Dave Chinner, linux-xfs@vger.kernel.org

On Mon, Nov 03, 2025 at 09:23:08AM +0000, hubert . wrote:
> >>>> On Fri, Sep 26, 2025 at 07:39:18PM +1000, Dave Chinner wrote:
> >>>>> So there must be a bounds checking bug in process_exinode():
> >>>>>
> >>>>> static int
> >>>>> process_exinode(
> >>>>>         struct xfs_dinode       *dip,
> >>>>>         int                     whichfork)
> >>>>> {
> >>>>>         xfs_extnum_t            max_nex = xfs_iext_max_nextents(
> >>>>>                         xfs_dinode_has_large_extent_counts(dip), whichfork);
> >>>>>         xfs_extnum_t            nex = xfs_dfork_nextents(dip, whichfork);
> >>>>>         int                     used = nex * sizeof(struct xfs_bmbt_rec);
> >>>>>
> >>>>>         if (nex > max_nex || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
> >>>>>                 if (metadump.show_warnings)
> >>>>>                         print_warning("bad number of extents %llu in inode %lld",
> >>>>>                                 (unsigned long long)nex,
> >>>>>                                 (long long)metadump.cur_ino);
> >>>>>                 return 1;
> >>>>>         }
> >>>>>
> >>>>> Can you spot it?
> >>>>>
> >>>>> Hint: ((2^28 + 1) * 2^4) - 1 as an int is?
> >>>>
> >>>> Perhaps the patch below will suffice?
> >>>>
> >>>> diff --git a/db/metadump.c b/db/metadump.c
> >>>> index 34f2d61700fe..1dd38ab84ade 100644
> >>>> --- a/db/metadump.c
> >>>> +++ b/db/metadump.c
> >>>> @@ -2395,7 +2395,7 @@ process_btinode(
> >>>>
> >>>>  static int
> >>>>  process_exinode(
> >>>> - struct xfs_dinode       *dip,
> >>>> + struct xfs_dinode       *dip,
> >>>>   int                     whichfork)
> >>>>  {
> >>>>   xfs_extnum_t            max_nex = xfs_iext_max_nextents(
> >>>> @@ -2403,7 +2403,13 @@ process_exinode(
> >>>>   xfs_extnum_t            nex = xfs_dfork_nextents(dip, whichfork);
> >>>>   int                     used = nex * sizeof(struct xfs_bmbt_rec);
> >>>>
> >>>> - if (nex > max_nex || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
> >>>> + /*
> >>>> +  * We need to check for overflow of used counter.
> >>>> +  * If the inode extent count is corrupted, we risk having a
> >>>> +  * big enough number of extents to overflow it.
> >>>> +  */
> >>>> + if (used < nex || nex > max_nex ||
> >>>> +     used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
> >>>>           if (metadump.show_warnings)
> >>>>                   print_warning("bad number of extents %llu in inode %lld",
> >>>>                           (unsigned long long)nex,
> >>>>
> >>>
> >>> That fixes this specific problem, but now it will reject valid
> >>> inodes with valid but large extent counts.
> >>>
> >>> What type does XFS_SB_FEAT_INCOMPAT_NREXT64 require for extent
> >>> count calculations?  i.e. what's the size of xfs_extnum_t?
> >>
> >> I thought about extending it to 64bit, but honestly thought it was not
> >> necessary here as I thought the number of extents in an inode before it
> >> was converted to btree format wouldn't exceed a 32-bit counter.
> >
> > The filesystem is corrupt so the normal rules of sanity don't apply.
> > The extent count could be anything, and we can't assume that it fits
> > in a 32 bit value, nor that any unchecked calculation based on the
> > value fits in 32 bits.
> >
> > Mixing integer types like this always leads to bugs. It's bad
> > practice because everyone who looks at the code has to think about
> > type conversion rules (which no-one ever remembers or gets right) to
> > determine if the code is correct or not. Nobody catches stuff
> > like this during review and the compiler is no help, either.
> >
> >> That's a
> >> trivial change for the patch, but still I think the overflow check
> >> should still be there as even for a 64bit counter we could have enough
> >> garbage to overflow it. Does it make sense to you?
> >
> > Yes, we need to check for overflow, but IMO, the best way to do
> > these checks is to use the same type (and hence unsigned 64 bit
> > math) throughout. This requires much less mental gymnastics to
> > determine that it is obviously correct:
> >
> > ....
> >         xfs_extnum_t            used = nex * sizeof(struct xfs_bmbt_rec);
> >
> >         // number of extents clearly bad
> >         if (nex > max_nex)
> >                 goto warn;
> >
> >         // catch extent array size overflow
> >         if (used < nex)
> >                 goto warn;
> >
> >         // extent array should fit in the inode fork
> >         if (used > XFS_DFORK_SIZE(dip, mp, whichfork))
> >                 goto warn;
> 
> Dear Carlos, dear Dave,
> 
> Sorry for the late reply and thank you so much for looking into this.
> Not sure if there is something else you would change here, but the patch
> Carlos proposed worked for me and the metadump completed with no
> issues.
> Things got really busy since my last message, but I still wanted to
> belatedly thank you both for your time and expert help.

I'll write a formal patch for this soon. Just couldn't get to it yet on
my TODO list.

Cheers.

> 
> Best,
> Hub
> >
> >
> > -Dave.
> > --
> > Dave Chinner
> > david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-11-03 10:18 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <f9Etb2La9b1KOT-5VdCdf6cd10olyT-FsRb8AZh8HNI1D4Czb610tw4BE15cNrEhY5OiXDGS7xR6R1trRyn1LA==@protonmail.internalid>
2025-07-25 11:27 ` xfs_metadump segmentation fault on large fs - xfsprogs 6.1 hubert .
2025-07-26  3:51   ` Carlos Maiolino
2025-08-01 13:51     ` hubert .
2025-08-18 15:56       ` hubert .
2025-08-25  7:51         ` Carlos Maiolino
2025-08-27 10:51           ` hubert .
2025-09-26  9:04             ` hubert .
2025-09-26  9:39               ` Dave Chinner
2025-09-26 13:45                 ` Carlos Maiolino
2025-09-27 23:22                   ` Dave Chinner
2025-09-28  6:11                     ` Carlos Maiolino
2025-09-30  9:22                       ` Dave Chinner
2025-11-03  9:23                         ` hubert .
2025-11-03 10:18                           ` Carlos Maiolino

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox