* XFS corruption after power surge/outage
From: Jorge Garcia @ 2024-02-09 18:39 UTC
To: linux-xfs

Hello,

We have a server with a very large (300+ TB) XFS filesystem that we use to provide downloads to the world. Last week's storms in California damaged our machine room, causing unexpected power surges and outages even in our UPS- and generator-backed data center. One of the end results was some data corruption on our server (running CentOS 8). After looking around the internet for solutions to our issues, the general consensus seemed to be to run xfs_repair on the filesystem to get it to recover. We tried that (xfs_repair V 5.0) and it seemed to report lots of issues before eventually failing during "Phase 6" with an error like:

Metadata corruption detected at 0x46d6c4, inode 0x8700657ff8 dinode

fatal error -- couldn't map inode 579827236856, err = 117

After another set of internet searches, we found some postings that suggested this could be a bug that may have been fixed in later versions, so we built xfs_repair V 6.5 and tried the repair again. The results were the same. We even tried "xfs_repair -L", and no joy. So now we're desperate. Is the data all lost? We can't mount the filesystem. We tried using xfs_metadump (another suggestion from our searches) and it reports lots of metadata corruption, ending with:

Metadata corruption detected at 0x4382f0, xfs_cntbt block 0x1300023518/0x1000
Metadata corruption detected at 0x4382f0, xfs_cntbt block 0x1300296bf8/0x1000
Metadata corruption detected at 0x4382f0, xfs_bnobt block 0x137fffb258/0x1000
Metadata corruption detected at 0x4382f0, xfs_bnobt block 0x138009ebd8/0x1000
Metadata corruption detected at 0x467858, xfs_inobt block 0x138067f550/0x1000
Metadata corruption detected at 0x467858, xfs_inobt block 0x13834b39e0/0x1000
xfs_metadump: bad starting inode offset 5

Not sure what to try next. Any help would be greatly appreciated. Thanks!

Jorge
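As a point of reference for the metadump discussion that follows, a dump for developer analysis is usually generated along these lines. The output path is a placeholder, and the -g/-o/-w flags (progress, no name obfuscation, print warnings) should be checked against the xfs_metadump man page for the xfsprogs version in use:

# xfs_metadump -g -o -w /dev/sda1 /some/other/disk/sda1.metadump
# xfs_mdrestore /some/other/disk/sda1.metadump /some/other/disk/sda1.img

xfs_mdrestore turns the metadata-only dump back into a sparse image that xfs_repair or xfs_db can be pointed at without touching the original device.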
* Re: XFS corruption after power surge/outage
From: Eric Sandeen @ 2024-02-11 20:39 UTC
To: Jorge Garcia, linux-xfs

On 2/9/24 12:39 PM, Jorge Garcia wrote:
> Hello,
>
> We have a server with a very large (300+ TB) XFS filesystem that we use to provide downloads to the world. Last week's storms in California damaged our machine room, causing unexpected power surges and outages even in our UPS- and generator-backed data center. One of the end results was some data corruption on our server (running CentOS 8). After looking around the internet for solutions to our issues, the general consensus seemed to be to run xfs_repair on the filesystem to get it to recover. We tried that (xfs_repair V 5.0) and it seemed to report lots of issues before eventually failing during "Phase 6" with an error like:
>
> Metadata corruption detected at 0x46d6c4, inode 0x8700657ff8 dinode
>
> fatal error -- couldn't map inode 579827236856, err = 117
>
> After another set of internet searches, we found some postings that suggested this could be a bug that may have been fixed in later versions, so we built xfs_repair V 6.5 and tried the repair again. The results were the same. We even tried "xfs_repair -L", and no joy. So now we're desperate. Is the data all lost? We can't mount the filesystem. We tried using xfs_metadump (another suggestion from our searches) and it reports lots of metadata corruption, ending with:

I was going to suggest creating an xfs_metadump image for analysis. Was that created with xfsprogs v6.5.0 as well?

> Metadata corruption detected at 0x4382f0, xfs_cntbt block 0x1300023518/0x1000
> Metadata corruption detected at 0x4382f0, xfs_cntbt block 0x1300296bf8/0x1000
> Metadata corruption detected at 0x4382f0, xfs_bnobt block 0x137fffb258/0x1000
> Metadata corruption detected at 0x4382f0, xfs_bnobt block 0x138009ebd8/0x1000
> Metadata corruption detected at 0x467858, xfs_inobt block 0x138067f550/0x1000
> Metadata corruption detected at 0x467858, xfs_inobt block 0x13834b39e0/0x1000
> xfs_metadump: bad starting inode offset 5

So the metadump did not complete?

Does the filesystem mount? Can you mount it -o ro or -o ro,norecovery to see how much you can read off of it?

If mount fails, what is in the kernel log when it fails?

> Not sure what to try next. Any help would be greatly appreciated. Thanks!

Power losses really should not cause corruption; it's a metadata journaling filesystem which should maintain consistency even with a power loss.

What kind of storage do you have, though? Corruption after a power loss often stems from a filesystem on a RAID with a write cache that does not honor data integrity commands and/or does not have its own battery backup.

-Eric

> Jorge
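A minimal sketch of the read-only mount attempts Eric is asking about, using the device and mount point that appear later in the thread; norecovery skips log replay, so it is only suitable for salvage, not for normal use:

# mount -o ro /dev/sda1 /data
# mount -o ro,norecovery /dev/sda1 /data
# dmesg | tail -50          # capture the XFS messages if either mount fails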
* Re: XFS corruption after power surge/outage
From: Jorge Garcia @ 2024-02-12 18:07 UTC
To: Eric Sandeen
Cc: linux-xfs

On Sun, Feb 11, 2024 at 12:39 PM Eric Sandeen <sandeen@sandeen.net> wrote:

> I was going to suggest creating an xfs_metadump image for analysis.
> Was that created with xfsprogs v6.5.0 as well?
>
> So the metadump did not complete?

I actually tried running xfs_metadump with both v5.0 and v6.5.0. They both gave many error messages, but they created files. Not sure what I can do with those files.

> Does the filesystem mount? Can you mount it -o ro or -o ro,norecovery
> to see how much you can read off of it?

The file system doesn't mount. The message when I try to mount it is:

mount: /data: wrong fs type, bad option, bad superblock on /dev/sda1, missing codepage or helper program, or other error.

and

Feb 12 10:06:02 hgdownload1 kernel: XFS (sda1): Superblock has unknown incompatible features (0x10) enabled.
Feb 12 10:06:02 hgdownload1 kernel: XFS (sda1): Filesystem cannot be safely mounted by this kernel.
Feb 12 10:06:02 hgdownload1 kernel: XFS (sda1): SB validate failed with error -22.

I wonder if that is because I tried an xfs_repair with a newer version...

> If mount fails, what is in the kernel log when it fails?
>
> Power losses really should not cause corruption; it's a metadata journaling
> filesystem which should maintain consistency even with a power loss.
>
> What kind of storage do you have, though? Corruption after a power loss often
> stems from a filesystem on a RAID with a write cache that does not honor
> data integrity commands and/or does not have its own battery backup.

We have a RAID 6 card with a BBU:

Product Name     : AVAGO MegaRAID SAS 9361-8i
Serial No        : SK00485396
FW Package Build : 24.21.0-0017

I agree that power issues should not cause corruption, but here we are.

Somewhere on one of the discussion threads I saw somebody mention ufsexplorer, and when I downloaded the trial version, it seemed to see most of the files on the device. I guess if I can't find a way to recover the current filesystem, I will try to use that to recover the data.
* Re: XFS corruption after power surge/outage
From: Dave Chinner @ 2024-02-12 21:06 UTC
To: Jorge Garcia
Cc: Eric Sandeen, linux-xfs

On Mon, Feb 12, 2024 at 10:07:33AM -0800, Jorge Garcia wrote:
> On Sun, Feb 11, 2024 at 12:39 PM Eric Sandeen <sandeen@sandeen.net> wrote:
>
> > I was going to suggest creating an xfs_metadump image for analysis.
> > Was that created with xfsprogs v6.5.0 as well?
> >
> > So the metadump did not complete?
>
> I actually tried running xfs_metadump with both v5.0 and v6.5.0. They
> both gave many error messages, but they created files. Not sure what I
> can do with those files.

Nothing - they are incomplete, as metadump aborted when it got that error.

> > Does the filesystem mount? Can you mount it -o ro or -o ro,norecovery
> > to see how much you can read off of it?
>
> The file system doesn't mount. The message when I try to mount it is:
>
> mount: /data: wrong fs type, bad option, bad superblock on /dev/sda1,
> missing codepage or helper program, or other error.
>
> and
>
> Feb 12 10:06:02 hgdownload1 kernel: XFS (sda1): Superblock has unknown
> incompatible features (0x10) enabled.
> Feb 12 10:06:02 hgdownload1 kernel: XFS (sda1): Filesystem cannot be
> safely mounted by this kernel.
> Feb 12 10:06:02 hgdownload1 kernel: XFS (sda1): SB validate failed
> with error -22.

That has the XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR bit set...

> I wonder if that is because I tried an xfs_repair with a newer version...

.... which is a result of xfs_repair 6.5.0 crashing midway through repair of the filesystem. Your kernel is too old to recognise the NEEDSREPAIR bit. You can clear it with xfs_db like this:

Run this to get the current field value:

# xfs_db -c "sb 0" -c "p features_incompat" <dev>

Then subtract 0x10 from the value returned and run:

# xfs_db -c "sb 0" -c "write features_incompat <val>" <dev>

But that won't get you too far - the filesystem is still corrupt and inconsistent. By blowing away the log with xfs_repair before actually determining if the problem was caused by a RAID array issue, you've essentially forced yourself into a filesystem recovery situation.

> > If mount fails, what is in the kernel log when it fails?
>
> > Power losses really should not cause corruption; it's a metadata journaling
> > filesystem which should maintain consistency even with a power loss.
> >
> > What kind of storage do you have, though? Corruption after a power loss often
> > stems from a filesystem on a RAID with a write cache that does not honor
> > data integrity commands and/or does not have its own battery backup.
>
> We have a RAID 6 card with a BBU:
>
> Product Name     : AVAGO MegaRAID SAS 9361-8i
> Serial No        : SK00485396
> FW Package Build : 24.21.0-0017

Ok, so they don't actually have a BBU on board - it's an option to add via a module, but the basic RAID controller doesn't have any power failure protection. These cards are also pretty old tech now - how old is this card, and when was the last time the cache protection module was tested?

Indeed, how long was the power out for? The BBU on most RAID controllers is only guaranteed to hold the state for 72 hours (when new) and I've personally seen them last for only a few minutes before dying when the RAID controller had been in continuous service for ~5 years. So the duration of the power failure may be important here.

Also, how are the back end disks configured? Do they have their volatile write caches turned off? What cache mode was the RAID controller operating in - write-back or write-through?

What's the rest of your storage stack? Do you have MD, LVM, etc. between the storage hardware and the filesystem?

> I agree that power issues should not cause corruption, but here we
> are.

Yup. Keep in mind that we do occasionally see these old LSI hardware RAID cards corrupt storage on power failure, so we're not necessarily even looking for filesystem problems at this point in time. We need to rule that out first before doing any more damage to the filesystem than you've already done trying to recover it so far...

> Somewhere on one of the discussion threads I saw somebody mention
> ufsexplorer, and when I downloaded the trial version, it seemed to see
> most of the files on the device. I guess if I can't find a way to
> recover the current filesystem, I will try to use that to recover the
> data.

Well, that's a last resort. But if your RAID controller is unhealthy or the volume has been corrupted by the RAID controller, ufsexplorer won't help you get your data back, either....

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
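A worked example of the xfs_db sequence Dave gives above, using a purely hypothetical starting value; depending on the xfsprogs version the field may be printed in decimal rather than hex, and the write may require xfs_db's expert mode (-x), so treat this strictly as a sketch:

# xfs_db -c "sb 0" -c "p features_incompat" /dev/sda1
features_incompat = 17                                  (hypothetical output: 0x11, i.e. NEEDSREPAIR 0x10 plus one other bit)
# xfs_db -x -c "sb 0" -c "write features_incompat 1" /dev/sda1      (17 - 16 = 1)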
* Re: XFS corruption after power surge/outage
From: Jorge Garcia @ 2024-02-12 21:46 UTC
To: Dave Chinner
Cc: Eric Sandeen, linux-xfs

On Mon, Feb 12, 2024 at 1:06 PM Dave Chinner <david@fromorbit.com> wrote:

> Ok, so they don't actually have a BBU on board - it's an option to
> add via a module, but the basic RAID controller doesn't have any
> power failure protection. These cards are also pretty old tech now -
> how old is this card, and when was the last time the cache
> protection module was tested?

The card reports a Mfg. Date of 01/26/20, which is not too old. The last time the cache protection was tested? No idea. The BBU status for the card reports battery state optimal.

> Indeed, how long was the power out for?

I'm not exactly sure how long the power was out, but probably less than an hour, and more likely just a few minutes. The data center is supposed to have UPS power and generator power, but a breaker tripped and we lost power.

> The BBU on most RAID controllers is only guaranteed to hold the
> state for 72 hours (when new) and I've personally seen them last for
> only a few minutes before dying when the RAID controller had been in
> continuous service for ~5 years. So the duration of the power
> failure may be important here.
>
> Also, how are the back end disks configured? Do they have their
> volatile write caches turned off? What cache mode was the RAID
> controller operating in - write-back or write-through?
>
> What's the rest of your storage stack? Do you have MD, LVM, etc.
> between the storage hardware and the filesystem?

You may be asking questions I'm not sure how to answer. Most of the settings are default settings. The RAID controller was operating in WB (write-back) mode. There is no MD or LVM, just 24 disks in a MegaRAID RAID-6 configuration, seen by the OS as one device, which was formatted as XFS.

> > Somewhere on one of the discussion threads I saw somebody mention
> > ufsexplorer, and when I downloaded the trial version, it seemed to see
> > most of the files on the device. I guess if I can't find a way to
> > recover the current filesystem, I will try to use that to recover the
> > data.
>
> Well, that's a last resort. But if your RAID controller is unhealthy
> or the volume has been corrupted by the RAID controller, ufsexplorer
> won't help you get your data back, either....

The controller is reporting everything as working, all disks are Online and Spun Up, and no errors are reported as far as I can tell.

I did get ufsexplorer, and it seems to be recovering data, but it will take days or weeks to recover all of it.

I would still like to know more about what happened, how to prevent it from happening in the future, and what the correct sequence of steps would have been when encountering a problem like this. Thanks for all your help!
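A sketch of how the controller-side questions above might be checked, assuming the storcli management utility that ships with these MegaRAID cards is installed and the controller is /c0; exact syntax varies between storcli releases, so verify against its documentation before running anything that changes settings:

# storcli64 /c0 show all             # controller summary, including cache and BBU/CacheVault status
# storcli64 /c0/bbu show all         # battery state and learn-cycle history
# storcli64 /c0/vall show all        # per-virtual-drive cache policies (write-back vs write-through, disk cache)
# storcli64 /c0/vall set wrcache=wt  # switch to write-through if the cache protection cannot be trusted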
* Re: XFS corruption after power surge/outage
From: Eric Sandeen @ 2024-02-12 22:39 UTC
To: Dave Chinner, Jorge Garcia
Cc: linux-xfs

On 2/12/24 3:06 PM, Dave Chinner wrote:

> That has the XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR bit set...
>
>> I wonder if that is because I tried an xfs_repair with a newer version...
>
> .... which is a result of xfs_repair 6.5.0 crashing midway through
> repair of the filesystem. Your kernel is too old to recognise the
> NEEDSREPAIR bit. You can clear it with xfs_db like this:
>
> Run this to get the current field value:
>
> # xfs_db -c "sb 0" -c "p features_incompat" <dev>
>
> Then subtract 0x10 from the value returned and run:
>
> # xfs_db -c "sb 0" -c "write features_incompat <val>" <dev>
>
> But that won't get you too far - the filesystem is still corrupt and
> inconsistent. By blowing away the log with xfs_repair before
> actually determining if the problem was caused by a RAID array
> issue, you've essentially forced yourself into a filesystem recovery
> situation.

Everything Dave said, yes.

Depending on how bad the corruption is, you *might* be able to do a readonly or readonly/norecovery mount and scrape some data out.

Ideally the first thing to do would be to make a 1:1 dd image of the device as a safe backup, but I understand it's 300T ...

-Eric
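A minimal sketch of the 1:1 image Eric mentions, assuming a destination with enough free space exists (the hard part at 300+ TB); GNU ddrescue is often preferred over plain dd here because it tolerates read errors and keeps a map of what it could not read:

# dd if=/dev/sda1 of=/backup/sda1.img bs=64M conv=noerror,sync status=progress
or
# ddrescue -d /dev/sda1 /backup/sda1.img /backup/sda1.map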