* "This is a bug."
@ 2015-09-10  9:18 Tapani Tarvainen
  2015-09-10 10:31 ` Tapani Tarvainen
  2015-09-10 11:48 ` Emmanuel Florac
  0 siblings, 2 replies; 21+ messages in thread

From: Tapani Tarvainen @ 2015-09-10  9:18 UTC (permalink / raw)
To: xfs

Hi,

After a rather spectacular crash we've got an xfs filesystem in an
unrepairable state: mount fails with "Structure needs cleaning",
xfs_repair without options refuses to run (it asks to mount first),
and xfs_repair -L stopped with:

  corrupt dinode 2151195170, extent total = 1, nblocks = 0. This is a bug.
  Please capture the filesystem metadata with xfs_metadump and
  report it to xfs@oss.sgi.com.

I tried to dump the metadata and that failed, too:

  # xfs_metadump /dev/sdata1/data1 /data2/tmp/data1_metadump
  xfs_metadump: cannot init perag data (117)
  *** glibc detected *** xfs_db: double free or corruption (!prev): 0x0000000003361000 ***
  ======= Backtrace: =========
  [...]
  Aborted

At this point I'm going to give up trying to recover the data
(recreate the filesystem and restore from backup), but if you want to
analyze it to find the bug, I have enough spare disk space to keep a
copy for a while (I took an LVM snapshot of it before the recovery
attempt and could dd it to another location).

The machine is running Debian Wheezy (7.8),
kernel 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u1 x86_64 GNU/Linux,
xfsprogs version 3.1.7+b1.
The filesystem is 6TB in size.

If you want to take a look, please let me know what I can do to help.

Thank you,

-- 
Tapani Tarvainen

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: "This is a bug."
  2015-09-10  9:18 "This is a bug." Tapani Tarvainen
@ 2015-09-10 10:31 ` Tapani Tarvainen
  2015-09-10 11:53   ` Emmanuel Florac
  2015-09-10 11:48 ` Emmanuel Florac
  1 sibling, 1 reply; 21+ messages in thread

From: Tapani Tarvainen @ 2015-09-10 10:31 UTC (permalink / raw)
To: xfs

A perhaps interesting addition: xfs_metadump succeeds with the -o
option (without it, it fails consistently).

If you think the dump would be useful to you, I can probably send it
unobfuscated (I'd have to check, but I don't think there's anything
sensitive in the filenames or attributes).

-- 
Tapani Tarvainen

On 10 Sep 12:18, Tapani Tarvainen (tapani.j.tarvainen@jyu.fi) wrote:
> 
> Hi,
> 
> After a rather spectacular crash we've got an xfs filesystem
> in unrepairable state: mount fails with "Structure needs cleaning",
> xfs_repair without options refuses to work (asks to mount first),
> and xfs_repair -L stopped with
> 
>   corrupt dinode 2151195170, extent total = 1, nblocks = 0. This is a bug.
>   Please capture the filesystem metadata with xfs_metadump and
>   report it to xfs@oss.sgi.com.
> 
> I tried to dump the metadata and it failed, too:
> 
>   # xfs_metadump /dev/sdata1/data1 /data2/tmp/data1_metadump
>   xfs_metadump: cannot init perag data (117)
>   *** glibc detected *** xfs_db: double free or corruption (!prev): 0x0000000003361000 ***
>   ======= Backtrace: =========
>   [...]
>   Aborted
> 
> At this point I'm going to give up trying to recover the data
> (recreate filesystem and restore from backup), but if you want to
> analyze it to find the bug, I have enough spare disk space to keep a
> copy for a while (took lvm snapshot of it before recovery attempt,
> could dd it to another location).
> 
> The machine is running Debian Wheezy (7.8),
> kernel 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u1 x86_64 GNU/Linux,
> xfsprogs version 3.1.7+b1.
> And the filesystem is 6TB in size.
> 
> If you want to take a look, please let me know what I can do to help.
> 
> Thank you,
> 
> -- 
> Tapani Tarvainen
* Re: "This is a bug."
  2015-09-10 10:31 ` Tapani Tarvainen
@ 2015-09-10 11:53   ` Emmanuel Florac
  2015-09-10 12:05     ` Tapani Tarvainen
  0 siblings, 1 reply; 21+ messages in thread

From: Emmanuel Florac @ 2015-09-10 11:53 UTC (permalink / raw)
To: Tapani Tarvainen; +Cc: xfs

On Thu, 10 Sep 2015 13:31:42 +0300
Tapani Tarvainen <tapani.j.tarvainen@jyu.fi> wrote:

> > Hi,
> > 
> > After a rather spectacular crash we've got an xfs filesystem
> > in unrepairable state: mount fails with "Structure needs cleaning",
> > xfs_repair without options refuses to work (asks to mount first),
> > and xfs_repair -L stopped with
> > 
> >   corrupt dinode 2151195170, extent total = 1, nblocks = 0. This is
> >   a bug. Please capture the filesystem metadata with xfs_metadump and
> >   report it to xfs@oss.sgi.com.

If you're not afraid of binaries from the web, I've just compiled
xfs_repair 4.2.0 on a Wheezy machine:

http://update.intellique.com/pub/xfs_repair-4.2.0.gz

-- 
------------------------------------------------------------------------
Emmanuel Florac  |  Direction technique
                 |  Intellique
                 |  <eflorac@intellique.com>
                 |  +33 1 78 94 84 02
------------------------------------------------------------------------
* Re: "This is a bug."
  2015-09-10 11:53 ` Emmanuel Florac
@ 2015-09-10 12:05   ` Tapani Tarvainen
  0 siblings, 0 replies; 21+ messages in thread

From: Tapani Tarvainen @ 2015-09-10 12:05 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: xfs

On 10 Sep 13:53, Emmanuel Florac (eflorac@intellique.com) wrote:

> If you're not afraid of binaries from the web, I've just compiled
> xfs_repair 4.2.0 on a Wheezy machine:

Thanks, but I'd just compiled it myself... waiting to see what it does.

-- 
Tapani Tarvainen
* Re: "This is a bug."
  2015-09-10  9:18 "This is a bug." Tapani Tarvainen
  2015-09-10 10:31 ` Tapani Tarvainen
@ 2015-09-10 11:48 ` Emmanuel Florac
  2015-09-10 11:55   ` Tapani Tarvainen
  1 sibling, 1 reply; 21+ messages in thread

From: Emmanuel Florac @ 2015-09-10 11:48 UTC (permalink / raw)
To: Tapani Tarvainen; +Cc: xfs

On Thu, 10 Sep 2015 12:18:34 +0300
Tapani Tarvainen <tapani.j.tarvainen@jyu.fi> wrote:

> xfsprogs version 3.1.7+b1.

Don't use that, it's much too old. Use at least a 3.2.x version, or if
possible the very latest version of xfs_repair. You can copy over
xfs_repair from another machine as it's a static binary.

-- 
------------------------------------------------------------------------
Emmanuel Florac  |  Direction technique
                 |  Intellique
                 |  <eflorac@intellique.com>
                 |  +33 1 78 94 84 02
------------------------------------------------------------------------
* Re: "This is a bug."
  2015-09-10 11:48 ` Emmanuel Florac
@ 2015-09-10 11:55   ` Tapani Tarvainen
  2015-09-10 12:30     ` Tapani Tarvainen
  0 siblings, 1 reply; 21+ messages in thread

From: Tapani Tarvainen @ 2015-09-10 11:55 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: xfs

On 10 Sep 13:48, Emmanuel Florac (eflorac@intellique.com) wrote:

> > xfsprogs version 3.1.7+b1.
> 
> Don't use that, it's much too old. Use at least a 3.2.x version, or if
> possible the very latest version of xfs_repair. You can copy over
> xfs_repair from another machine as it's a static binary.

Seems it isn't static in Debian (copied over from a Jessie box):

  # /home/tt/xfs_repair -v /dev/sdata1/data1
  /home/tt/xfs_repair: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.14' not found (required by /home/tt/xfs_repair)

But if a new version is really likely to help I can build it from
source. Thank you for the suggestion.

-- 
Tapani Tarvainen
* Re: "This is a bug."
  2015-09-10 11:55 ` Tapani Tarvainen
@ 2015-09-10 12:30   ` Tapani Tarvainen
  2015-09-10 12:36     ` Brian Foster
  0 siblings, 1 reply; 21+ messages in thread

From: Tapani Tarvainen @ 2015-09-10 12:30 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: xfs

On 10 Sep 14:55, Tapani Tarvainen (tapani.j.tarvainen@jyu.fi) wrote:
> On 10 Sep 13:48, Emmanuel Florac (eflorac@intellique.com) wrote:
> 
> > > xfsprogs version 3.1.7+b1.
> > 
> > Don't use that, it's much too old. Use at least a 3.2.x version, or if
> > possible the very latest version of xfs_repair.

With the (self-compiled) 4.2.0 version, xfs_repair without -L no longer
refuses to run, but after a while it failed with:

  [...]
  correcting nextents for inode 2152363147
  xfs_repair: dinode.c:1961: process_inode_data_fork: Assertion `err == 0' failed.
  Aborted

With -L ... same result.

-- 
Tapani Tarvainen
* Re: "This is a bug."
  2015-09-10 12:30 ` Tapani Tarvainen
@ 2015-09-10 12:36   ` Brian Foster
  2015-09-10 12:54     ` Tapani Tarvainen
  0 siblings, 1 reply; 21+ messages in thread

From: Brian Foster @ 2015-09-10 12:36 UTC (permalink / raw)
To: Tapani Tarvainen; +Cc: xfs

On Thu, Sep 10, 2015 at 03:30:30PM +0300, Tapani Tarvainen wrote:
> On 10 Sep 14:55, Tapani Tarvainen (tapani.j.tarvainen@jyu.fi) wrote:
> > On 10 Sep 13:48, Emmanuel Florac (eflorac@intellique.com) wrote:
> > 
> > > > xfsprogs version 3.1.7+b1.
> > > 
> > > Don't use that, it's much too old. Use at least a 3.2.x version, or if
> > > possible the very latest version of xfs_repair.
> 
> With (self-compiled) 4.2.0 version xfs_repair without -L no longer
> refuses to run but after a while failed with
> 
>   [...]
>   correcting nextents for inode 2152363147
>   xfs_repair: dinode.c:1961: process_inode_data_fork: Assertion `err == 0' failed.
>   Aborted
> 
> With -L ... same result.
> 

Care to post the metadump?

Brian

> -- 
> Tapani Tarvainen
* Re: "This is a bug."
  2015-09-10 12:36 ` Brian Foster
@ 2015-09-10 12:54   ` Tapani Tarvainen
  2015-09-10 13:01     ` Brian Foster
  0 siblings, 1 reply; 21+ messages in thread

From: Tapani Tarvainen @ 2015-09-10 12:54 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs

On 10 Sep 08:36, Brian Foster (bfoster@redhat.com) wrote:

> Care to post the metadump?

It is 2.5GB so not really nice to mail... but if you want to take a
look, here it is:

https://huom.it.jyu.fi/tmp/data1.metadump

-- 
Tapani Tarvainen
* Re: "This is a bug."
  2015-09-10 12:54 ` Tapani Tarvainen
@ 2015-09-10 13:01   ` Brian Foster
  2015-09-10 13:05     ` Tapani Tarvainen
  0 siblings, 1 reply; 21+ messages in thread

From: Brian Foster @ 2015-09-10 13:01 UTC (permalink / raw)
To: Tapani Tarvainen; +Cc: xfs

On Thu, Sep 10, 2015 at 03:54:41PM +0300, Tapani Tarvainen wrote:
> On 10 Sep 08:36, Brian Foster (bfoster@redhat.com) wrote:
> 
> > Care to post the metadump?
> 
> It is 2.5GB so not really nice to mail... but if you want
> to take a look, here it is:
> 
> https://huom.it.jyu.fi/tmp/data1.metadump
> 

Can you compress it?

Brian

> -- 
> Tapani Tarvainen
* Re: "This is a bug."
  2015-09-10 13:01 ` Brian Foster
@ 2015-09-10 13:05   ` Tapani Tarvainen
  2015-09-10 14:51     ` Brian Foster
  0 siblings, 1 reply; 21+ messages in thread

From: Tapani Tarvainen @ 2015-09-10 13:05 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs

On 10 Sep 09:01, Brian Foster (bfoster@redhat.com) wrote:

> > It is 2.5GB so not really nice to mail...
> 
> Can you compress it?

Ah. Of course, I should've done it in the first place.
Still 250MB though:

https://huom.it.jyu.fi/tmp/data1.metadump.gz

-- 
Tapani Tarvainen
* Re: "This is a bug."
  2015-09-10 13:05 ` Tapani Tarvainen
@ 2015-09-10 14:51   ` Brian Foster
  2015-09-10 15:05     ` Brian Foster
  2015-09-10 17:31     ` Tapani Tarvainen
  0 siblings, 2 replies; 21+ messages in thread

From: Brian Foster @ 2015-09-10 14:51 UTC (permalink / raw)
To: Tapani Tarvainen; +Cc: xfs

On Thu, Sep 10, 2015 at 04:05:30PM +0300, Tapani Tarvainen wrote:
> On 10 Sep 09:01, Brian Foster (bfoster@redhat.com) wrote:
> 
> > > It is 2.5GB so not really nice to mail...
> > 
> > Can you compress it?
> 
> Ah. Of course, should've done it in the first place.
> Still 250MB though:
> 
> https://huom.it.jyu.fi/tmp/data1.metadump.gz
> 

First off, I see ~60MB of corruption output before I even get to the
reported repair failure, so this appears to be an extremely severe
corruption and I wouldn't be surprised if ultimately beyond repair (not
that it matters for you, since you are restoring from backups).

The failure itself is an assert failure against an error return value
that appears to have a fallback path, so I'm not really sure why it's
there. I tried just removing it to see what happens. It ran to
completion, but there was a ton of output, write verifier errors, etc.,
so I'm not totally sure how coherent the result is yet. I'll run another
repair pass and do some directory traversals and whatnot and see if it
explodes...

I suspect what's more interesting at this point is what happened to
cause this level of corruption? What kind of event lead to this? Was it
a pure filesystem crash or some kind of hardware/raid failure?

Also, do you happen to know the geometry (xfs_info) of the original fs?
Repair was showing agno's up in the 20k's and now that I've mounted the
repaired image, xfs_info shows the following:

  meta-data=/dev/loop0             isize=256    agcount=24576, agsize=65536 blks
           =                       sectsz=4096  attr=2, projid32bit=0
           =                       crc=0        finobt=0 spinodes=0
  data     =                       bsize=4096   blocks=1610612736, imaxpct=25
           =                       sunit=0      swidth=0 blks
  naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
  log      =internal               bsize=4096   blocks=2560, version=2
           =                       sectsz=4096  sunit=1 blks, lazy-count=1
  realtime =none                   extsz=4096   blocks=0, rtextents=0

So that's a 6TB fs with over 24000 allocation groups of size 256MB, as
opposed to the mkfs default of 6 allocation groups of 1TB each. Is that
intentional?

Brian

> -- 
> Tapani Tarvainen
* Re: "This is a bug." 2015-09-10 14:51 ` Brian Foster @ 2015-09-10 15:05 ` Brian Foster 2015-09-10 17:52 ` Tapani Tarvainen 2015-09-10 17:31 ` Tapani Tarvainen 1 sibling, 1 reply; 21+ messages in thread From: Brian Foster @ 2015-09-10 15:05 UTC (permalink / raw) To: Tapani Tarvainen; +Cc: xfs On Thu, Sep 10, 2015 at 10:51:54AM -0400, Brian Foster wrote: > On Thu, Sep 10, 2015 at 04:05:30PM +0300, Tapani Tarvainen wrote: > > On 10 Sep 09:01, Brian Foster (bfoster@redhat.com) wrote: > > > > > > It is 2.5GB so not really nice to mail... > > > > > Can you compress it? > > > > Ah. Of course, should've done it in the first place. > > Still 250MB though: > > > > https://huom.it.jyu.fi/tmp/data1.metadump.gz > > > > First off, I see ~60MB of corruption output before I even get to the > reported repair failure, so this appears to be an extremely severe > corruption and I wouldn't be surprised if ultimately beyond repair (not > that it matters for you, since you are restoring from backups). > > The failure itself is an assert failure against an error return value > that appears to have a fallback path, so I'm not really sure why it's > there. I tried just removing it to see what happens. It ran to > completion, but there was a ton of output, write verifier errors, etc., > so I'm not totally sure how coherent the result is yet. I'll run another > repair pass and do some directory traversals and whatnot and see if it > explodes... > FWIW, the follow up repair did come up clean so it appears (so far) to have put the fs back together from a metadata standpoint. That said, >570k files end up in lost+found and who knows whether the files themselves would have contained the expected data once all of the bmaps are fixed up and whatnot. Brian > I suspect what's more interesting at this point is what happened to > cause this level of corruption? What kind of event lead to this? Was it > a pure filesystem crash or some kind of hardware/raid failure? 
> > Also, do you happen to know the geometry (xfs_info) of the original fs? > Repair was showing agno's up in the 20k's and now that I've mounted the > repaired image, xfs_info shows the following: > > meta-data=/dev/loop0 isize=256 agcount=24576, agsize=65536 blks > = sectsz=4096 attr=2, projid32bit=0 > = crc=0 finobt=0 spinodes=0 > data = bsize=4096 blocks=1610612736, imaxpct=25 > = sunit=0 swidth=0 blks > naming =version 2 bsize=4096 ascii-ci=0 ftype=0 > log =internal bsize=4096 blocks=2560, version=2 > = sectsz=4096 sunit=1 blks, lazy-count=1 > realtime =none extsz=4096 blocks=0, rtextents=0 > > So that's a 6TB fs with over 24000 allocation groups of size 256MB, as > opposed to the mkfs default of 6 allocation groups of 1TB each. Is that > intentional? > > Brian > > > -- > > Tapani Tarvainen > > > > _______________________________________________ > > xfs mailing list > > xfs@oss.sgi.com > > http://oss.sgi.com/mailman/listinfo/xfs > > _______________________________________________ > xfs mailing list > xfs@oss.sgi.com > http://oss.sgi.com/mailman/listinfo/xfs _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: "This is a bug." 2015-09-10 15:05 ` Brian Foster @ 2015-09-10 17:52 ` Tapani Tarvainen 2015-09-10 18:01 ` Tapani Tarvainen 0 siblings, 1 reply; 21+ messages in thread From: Tapani Tarvainen @ 2015-09-10 17:52 UTC (permalink / raw) To: Brian Foster; +Cc: xfs On Thu, Sep 10, 2015 at 11:05:25AM -0400, Brian Foster (bfoster@redhat.com) wrote: > FWIW, the follow up repair did come up clean so it appears (so far) to > have put the fs back together from a metadata standpoint. Indeed, now I can mount it. Not that I expect to find much useful there. Also, it turns out I have another similar case: another filesystem (in the same RAID set) failed also, even though it'd initially mounted without problems. Now it shows the same symptoms, neither mount nor repair works (have't tried repair -L yet). -- Tapani Tarvainen _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: "This is a bug."
  2015-09-10 17:52 ` Tapani Tarvainen
@ 2015-09-10 18:01   ` Tapani Tarvainen
  0 siblings, 0 replies; 21+ messages in thread

From: Tapani Tarvainen @ 2015-09-10 18:01 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs

On Thu, Sep 10, 2015 at 08:52:20PM +0300, Tapani Tarvainen (tapani@tapanitarvainen.fi) wrote:

> Also, it turns out I have another similar case: another filesystem
> (in the same RAID set) failed also, even though it'd initially
> mounted without problems. Now it shows the same symptoms,
> neither mount nor repair works (haven't tried repair -L yet).

Apparently it only failed after someone tried to write on it.
Potentially interesting stuff from dmesg:

  [31604.130052] XFS (dm-4): xfs_da_do_buf: bno 8388608 dir: inode 964
  [31604.130080] XFS (dm-4): [00] br_startoff 8388608 br_startblock -2 br_blockcount 1 br_state 0
  [31604.130126] XFS (dm-4): Internal error xfs_da_do_buf(1) at line 2011 of file /build/linux-4wkEzn/linux-3.2.68/fs/xfs/xfs_da_btree.c.  Caller 0xffffffffa0371b67
  [31604.130127]
  [31604.130213] Pid: 18082, comm: du Not tainted 3.2.0-4-amd64 #1 Debian 3.2.68-1+deb7u1
  [...]

-- 
Tapani Tarvainen
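[Editorial note: the "bno 8388608" in the dmesg trace above is a logical block offset inside the directory's own address space, not a disk address. Assuming the conventional XFS layout, where directory leaf blocks start at the fixed 32 GiB offset (XFS_DIR2_LEAF_OFFSET), that number lands exactly on the leaf-segment boundary with the 4 KiB block size reported by xfs_info, i.e. the leaf section of directory inode 964 mapped to a hole. A quick arithmetic sketch, under that assumption:]

```python
# Decode the logical block number from the dmesg trace above.
# Assumption (hedged): directory leaf blocks live at the fixed 32 GiB
# logical offset in an XFS directory's address space.

BLOCK_SIZE = 4096   # bsize from the xfs_info output earlier in the thread
BNO = 8388608       # "bno" from the dmesg trace

byte_offset = BNO * BLOCK_SIZE        # logical byte offset in the directory
leaf_segment_start = 32 * 2**30       # 32 GiB boundary

print(byte_offset == leaf_segment_start)  # True
```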
* Re: "This is a bug."
  2015-09-10 14:51 ` Brian Foster
  2015-09-10 15:05   ` Brian Foster
@ 2015-09-10 17:31   ` Tapani Tarvainen
  2015-09-10 17:55     ` Brian Foster
  1 sibling, 1 reply; 21+ messages in thread

From: Tapani Tarvainen @ 2015-09-10 17:31 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs

On Thu, Sep 10, 2015 at 10:51:54AM -0400, Brian Foster (bfoster@redhat.com) wrote:

> First off, I see ~60MB of corruption output before I even get to the
> reported repair failure, so this appears to be an extremely severe
> corruption and I wouldn't be surprised if ultimately beyond repair

I assumed as much already.

> I suspect what's more interesting at this point is what happened to
> cause this level of corruption? What kind of event lead to this? Was it
> a pure filesystem crash or some kind of hardware/raid failure?

Hardware failure. Details are still a bit unclear but apparently the
raid controller went haywire, offlining the array in the middle of
heavy filesystem use.

> Also, do you happen to know the geometry (xfs_info) of the original fs?

No (and xfs_info doesn't work on the copy made after the crash, as it
can't be mounted).

> Repair was showing agno's up in the 20k's and now that I've mounted the
> repaired image, xfs_info shows the following:
[...]
> So that's a 6TB fs with over 24000 allocation groups of size 256MB, as
> opposed to the mkfs default of 6 allocation groups of 1TB each. Is that
> intentional?

Not to my knowledge. Unless I'm mistaken, the filesystem was created
while the machine was running Debian Squeeze, using whatever defaults
were back then.

-- 
Tapani Tarvainen
* Re: "This is a bug."
  2015-09-10 17:31 ` Tapani Tarvainen
@ 2015-09-10 17:55   ` Brian Foster
  2015-09-10 18:03     ` Tapani Tarvainen
  0 siblings, 1 reply; 21+ messages in thread

From: Brian Foster @ 2015-09-10 17:55 UTC (permalink / raw)
To: Tapani Tarvainen; +Cc: xfs

On Thu, Sep 10, 2015 at 08:31:38PM +0300, Tapani Tarvainen wrote:
> On Thu, Sep 10, 2015 at 10:51:54AM -0400, Brian Foster (bfoster@redhat.com) wrote:
> 
> > Also, do you happen to know the geometry (xfs_info) of the original fs?
> 
> No (and xfs_info doesn't work on the copy made after crash as it
> can't be mounted).
> 
> > So that's a 6TB fs with over 24000 allocation groups of size 256MB, as
> > opposed to the mkfs default of 6 allocation groups of 1TB each. Is that
> > intentional?
> 
> Not to my knowledge. Unless I'm mistaken, the filesystem was created
> while the machine was running Debian Squeeze, using whatever defaults
> were back then.
> 

Strange... was the filesystem created small and then grown to a much
larger size via xfs_growfs? I just formatted a 1GB fs that started with
4 allocation groups and ends with 24576 (same as above) AGs when grown
to 6TB.

Brian

> -- 
> Tapani Tarvainen
* Re: "This is a bug."
  2015-09-10 17:55 ` Brian Foster
@ 2015-09-10 18:03   ` Tapani Tarvainen
  2015-09-10 18:33     ` Brian Foster
  2015-09-11  0:12     ` Eric Sandeen
  0 siblings, 2 replies; 21+ messages in thread

From: Tapani Tarvainen @ 2015-09-10 18:03 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs

On Thu, Sep 10, 2015 at 01:55:58PM -0400, Brian Foster (bfoster@redhat.com) wrote:

> > > So that's a 6TB fs with over 24000 allocation groups of size 256MB, as
> > > opposed to the mkfs default of 6 allocation groups of 1TB each. Is that
> > > intentional?
> > 
> > Not to my knowledge. Unless I'm mistaken, the filesystem was created
> > while the machine was running Debian Squeeze, using whatever defaults
> > were back then.
> 
> Strange... was the filesystem created small and then grown to a much
> larger size via xfs_growfs?

Almost certainly yes, although how small it initially was I'm not sure.

-- 
Tapani Tarvainen
* Re: "This is a bug."
  2015-09-10 18:03 ` Tapani Tarvainen
@ 2015-09-10 18:33   ` Brian Foster
  2015-09-11  6:19     ` Tapani Tarvainen
  0 siblings, 1 reply; 21+ messages in thread

From: Brian Foster @ 2015-09-10 18:33 UTC (permalink / raw)
To: Tapani Tarvainen; +Cc: xfs

On Thu, Sep 10, 2015 at 09:03:39PM +0300, Tapani Tarvainen wrote:
> On Thu, Sep 10, 2015 at 01:55:58PM -0400, Brian Foster (bfoster@redhat.com) wrote:
> 
> > Strange... was the filesystem created small and then grown to a much
> > larger size via xfs_growfs?
> 
> Almost certainly yes, although how small it initially was I'm not
> sure.
> 

That probably explains that then. While growfs is obviously supported,
it's not usually a great idea to grow from something really small to
really large like this precisely because you end up with this kind of
weird geometry.

mkfs tries to format the fs to an ideal default geometry based on the
current size of the device, but the allocation group size cannot be
modified once the filesystem is created. Therefore, growfs can only add
more AGs of the original size. As a result, you end up with a 6TB
filesystem with >24k allocation groups, whereas mkfs will format a 6TB
device with 6 allocation groups by default (though I think specifying a
stripe unit can tweak this). My understanding is that this could be
increased sanely on large cpu count systems and such, but we're
probably talking about going to something on the order of 32 or 64
allocation groups as opposed to thousands.

I'd expect such a large filesystem with such small allocation groups to
probably introduce overhead in terms of metadata usage (24k agi's,
agf's, 2x free space btrees and 1x inode btree per AG), spending more
time in AG selection algorithms for allocations and whatnot, increased
fragmentation due to capping the maximum contiguous extent size,
creating more work for userspace tools such as repair, etc., and
probably to have other weird or non-obvious side effects that I'm not
familiar with.

Brian

> -- 
> Tapani Tarvainen
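[Editorial note: the growfs behaviour Brian describes can be checked with plain arithmetic against the xfs_info output earlier in the thread. agsize is fixed at mkfs time, so growing only appends more AGs of the original size; a sketch of that calculation:]

```python
# Reproduce the agcount from the xfs_info output: agsize is fixed when
# the filesystem is created, so growfs just adds AGs of that same size.

import math

BLOCK_SIZE = 4096          # bsize from xfs_info
AGSIZE_BLOCKS = 65536      # agsize from xfs_info, fixed at mkfs time
FS_BLOCKS = 1610612736     # total data blocks of the grown 6TB fs

agcount = math.ceil(FS_BLOCKS / AGSIZE_BLOCKS)
ag_mib = AGSIZE_BLOCKS * BLOCK_SIZE // 2**20

print(agcount)   # 24576 allocation groups, matching xfs_info
print(ag_mib)    # each one only 256 MiB
```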
* Re: "This is a bug."
  2015-09-10 18:33 ` Brian Foster
@ 2015-09-11  6:19   ` Tapani Tarvainen
  0 siblings, 0 replies; 21+ messages in thread

From: Tapani Tarvainen @ 2015-09-11  6:19 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs

On 10 Sep 14:33, Brian Foster (bfoster@redhat.com) wrote:

> > > ... was the filesystem created small and then grown to a much
> > > larger size via xfs_growfs?
> > 
> > Almost certainly yes, although how small it initially was I'm not
> > sure.

It has actually been grown several times over the years - the system is
rather old. Indeed, all its disks have been replaced with bigger ones
without reinstallation, so the filesystem could not have been initially
created as big as it is now.

> That probably explains that then. While growfs is obviously supported,
> it's not usually a great idea to grow from something really small to
> really large like this

That's good to know - but sometimes you just can't plan ahead far
enough.

> I'd expect such a large filesystem with such small allocation groups to
> probably introduce overhead in terms of metadata usage (24k agi's,
> agf's, 2x free space btrees and 1x inode btree per AG), spending more
> time in AG selection algorithms for allocations and whatnot, increased
> fragmentation due to capping the maximum contiguous extent size,
> creating more work for userspace tools such as repair, etc., and
> probably to have other weird or non-obvious side effects that I'm not
> familiar with.

So it's likely to also make it more fragile and harder to repair in
case of a disaster like this.

So, my take from this is that:

(1) The bug was real, but it was just in the old version of xfs_repair
in Debian Wheezy, and even when the machine is updated to Jessie (due
soon) it's better to install the latest (4.2.0) xfsprogs from source
rather than Jessie's packaged 3.2.0; and

(2) When a filesystem grows a lot, it is better to recreate it (at
least every now and then if the growth is incremental) rather than
keep growing it forever.

If there's anything you'd like to add, and especially if there is
something you'd still like to debug where I could help, please let me
know.

Thank you for your help,

-- 
Tapani Tarvainen
* Re: "This is a bug."
  2015-09-10 18:03 ` Tapani Tarvainen
  2015-09-10 18:33   ` Brian Foster
@ 2015-09-11  0:12   ` Eric Sandeen
  1 sibling, 0 replies; 21+ messages in thread

From: Eric Sandeen @ 2015-09-11  0:12 UTC (permalink / raw)
To: xfs

On 9/10/15 1:03 PM, Tapani Tarvainen wrote:
> On Thu, Sep 10, 2015 at 01:55:58PM -0400, Brian Foster (bfoster@redhat.com) wrote:
> 
> > > > So that's a 6TB fs with over 24000 allocation groups of size 256MB, as
> > > > opposed to the mkfs default of 6 allocation groups of 1TB each. Is that
> > > > intentional?
> > > 
> > > Not to my knowledge. Unless I'm mistaken, the filesystem was created
> > > while the machine was running Debian Squeeze, using whatever defaults
> > > were back then.
> 
> > Strange... was the filesystem created small and then grown to a much
> > larger size via xfs_growfs?
> 
> Almost certainly yes, although how small it initially was I'm not
> sure.

Oof; with a default of 4 AGs that means that this filesystem was likely
grown from 1G to 6T.

Like Brian says, that is definitely not recommended. ;)

-Eric
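[Editorial note: Eric's 1G estimate follows directly from the thread's numbers. Assuming, as he does, a historical mkfs.xfs default of 4 allocation groups for small filesystems, the fixed 256 MiB agsize pins down the likely original size:]

```python
# Back-calculate the probable original filesystem size from the fixed
# agsize. Assumption (from Eric's message): mkfs defaulted to 4 AGs
# for a small filesystem.

AGSIZE_BYTES = 65536 * 4096    # 256 MiB per AG, from xfs_info
ORIGINAL_AGS = 4               # assumed mkfs default at creation time
FINAL_SIZE = 6 * 2**40         # the 6 TiB it was eventually grown to

original_size = ORIGINAL_AGS * AGSIZE_BYTES
print(original_size // 2**30)        # 1 (GiB) - matching Eric's estimate
print(FINAL_SIZE // original_size)   # grown by a factor of 6144
```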