From: Brian Foster <bfoster@redhat.com>
To: Ingard - <ingard1@gmail.com>
Cc: Bill O'Donnell <billodo@redhat.com>, linux-xfs@vger.kernel.org
Subject: Re: corrupt xfs log
Date: Wed, 30 Aug 2017 10:58:09 -0400
Message-ID: <20170830145808.GB16641@bfoster.bfoster>
In-Reply-To: <CAHeWggPzVSXXGK+35jEf9FD2gM-QcL_3iebaKzjOK7DbRmWDiA@mail.gmail.com>
On Mon, Aug 21, 2017 at 10:24:32PM +0200, Ingard - wrote:
> On Mon, Aug 21, 2017 at 5:51 PM, Brian Foster <bfoster@redhat.com> wrote:
> > On Mon, Aug 21, 2017 at 02:08:43PM +0200, Ingard - wrote:
> >> On Fri, Aug 18, 2017 at 2:17 PM, Brian Foster <bfoster@redhat.com> wrote:
> >> > On Fri, Aug 18, 2017 at 07:02:24AM -0500, Bill O'Donnell wrote:
> >> >> On Fri, Aug 18, 2017 at 01:56:31PM +0200, Ingard - wrote:
> >> >> > After a server crash we've encountered a corrupt xfs filesystem. When
> >> >> > trying to mount said filesystem normally the system hangs.
> >> >> > This was initially on an Ubuntu Trusty server with a 3.13 kernel and
> >> >> > xfsprogs 3.1.9.
> >> >> >
> >> >> > We've installed a newer kernel (4.4.0-92) and compiled xfsprogs
> >> >> > v4.12.0 from source. We're still not able to mount the filesystem (and
> >> >> > replay the log) normally.
> >> >> > We are able to mount it -o ro,norecovery, but we're reluctant to do
> >> >> > xfs_repair -L without trying everything we can first. The filesystem
> >> >> > is browsable, except for a few paths which give the error "Structure
> >> >> > needs cleaning".
> >> >> >
> >> >> > Does anyone have any advice as to how we might recover/repair the
> >> >> > corrupt log so we can replay it? Or is xfs_repair -L the only way
> >> >> > forward?
> >> >>
> >> >> Can you try xfs_repair -n (only scans the fs and reports what repairs
> >> >> would be made)?
> >> >>
> >> >
> >> > An xfs_metadump of the fs might be useful as well. Then we can see if we
> >> > can reproduce the mount hang on latest kernels and if so, potentially
> >> > try and root cause it.
> >> >
> >> > Brian
> >>
> >> Here is a link for the metadump :
> >> https://www.jottacloud.com/p/ingardme/95ec2e45ba80431d962345981d38bdff
> >
> > This points to a 29GB image file, apparently uncompressed..? Could you
> > upload a compressed file? Thanks.
>
> Hi. Sorry about that. Didn't realize the output would be compressible.
> Here is a link to the compressed tgz (6G)
> https://www.jottacloud.com/p/ingardme/cac6939649e14b98b928647f5222a2ae
>

I finally played around with this image a bit. Note that mount does not
hang on the latest kernels. Instead, log recovery emits a torn write
message due to a bad CRC at the head of the log and then ultimately
fails due to a bad CRC at the tail of the log. I ran a couple of
experiments to skip the bad CRC records and/or to completely ignore all
bad CRCs, and both still either fail to mount (due to other corruption)
or continue to show corruption in the recovered fs.

It's not clear to me what would have caused this corruption or log
state. Have you encountered any corruption before? If not, is this kind
of crash or unclean shutdown of the server an uncommon event?

That aside, I think the best course of action is to run 'xfs_repair -L'
on the fs. I ran a v4.12 version against the metadump image and it
successfully repaired the fs. I've attached the repair output for
reference, but I would recommend that you first restore your metadump
to a temporary location, attempt to repair that and examine the results
before repairing the original fs. Note that the metadump will not have
any file content, but will represent which files might be cleared, moved
to lost+found, etc.
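The rehearsal described above could be scripted roughly as follows. This
is only a sketch under assumptions: the function names and every path
are hypothetical placeholders, xfs_mdrestore is assumed to be the tool
used to restore the metadump, and the loop mount is one possible way to
examine the result. It prints the commands by default (DRY_RUN=1)
rather than running them, so nothing is touched until the command list
has been reviewed.

```sh
#!/bin/sh
# Sketch: rehearse 'xfs_repair -L' on a restored metadump image.
# All paths are hypothetical placeholders, not taken from this thread.

# Print each command by default (DRY_RUN=1); set DRY_RUN=0 to execute.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# rehearse_repair METADUMP IMAGE MOUNTPOINT
rehearse_repair() {
    metadump=$1
    image=$2
    mnt=$3

    # Restore the metadata-only dump into a scratch image file.
    run xfs_mdrestore "$metadump" "$image" || return 1

    # Zero the log and repair the image (-f: target is a regular file).
    run xfs_repair -L -f "$image" || return 1

    # Mount the repaired image read-only and look at what was moved.
    run mount -o loop,ro "$image" "$mnt" || return 1
    run ls -l "$mnt/lost+found"
    run umount "$mnt"
}
```

For example, after sourcing the file, 'rehearse_repair
/scratch/fs.metadump /scratch/fs.img /mnt/scratch' lists the five
commands in order. With DRY_RUN=0 it runs them (as root) against the
scratch image only, never against the original device.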

Brian
> >
> > Brian
> >
> >> And the repair -n output :
> >> https://www.jottacloud.com/p/ingardme/0205c6ca6f7e495ebcda5f255b96f63d
> >>
> >> kind regards
> >> ingard
> >>
> >> >
> >> >> Thanks-
> >> >> Bill
> >> >>
> >> >>
> >> >> >
> >> >> >
> >> >> > Excerpt from kern.log:
> >> >> > 2017-08-17T13:40:41.122121+02:00 dn-238 kernel: [ 294.300347] XFS
> >> >> > (sdd1): Mounting V4 filesystem in no-recovery mode. Filesystem will be
> >> >> > inconsistent.
> >> >> >
> >> >> > 2017-08-17T17:04:54.794194+02:00 dn-238 kernel: [12548.400260] XFS
> >> >> > (sdd1): Metadata corruption detected at xfs_inode_buf_verify+0x6f/0xd0
> >> >> > [xfs], xfs_inode block 0x81c9c210
> >> >> > 2017-08-17T17:04:54.794216+02:00 dn-238 kernel: [12548.400342] XFS
> >> >> > (sdd1): Unmount and run xfs_repair
> >> >> > 2017-08-17T17:04:54.794218+02:00 dn-238 kernel: [12548.400374] XFS
> >> >> > (sdd1): First 64 bytes of corrupted metadata buffer:
> >> >> > 2017-08-17T17:04:54.794220+02:00 dn-238 kernel: [12548.400418]
> >> >> > ffff880171fff000: 3f 1a 33 54 5b 55 85 0b 7c f5 c6 d5 cf 51 47 41
> >> >> > ?.3T[U..|....QGA
> >> >> > 2017-08-17T17:04:54.794222+02:00 dn-238 kernel: [12548.400473]
> >> >> > ffff880171fff010: 97 ba ba 03 5c e4 02 7a e6 bc fb 5d f1 72 db c1
> >> >> > ....\..z...].r..
> >> >> > 2017-08-17T17:04:54.794223+02:00 dn-238 kernel: [12548.400527]
> >> >> > ffff880171fff020: c8 ad 3a 76 c7 e4 20 92 88 a2 35 0c 1f 36 cf b5
> >> >> > ..:v.. ...5..6..
> >> >> > 2017-08-17T17:04:54.794226+02:00 dn-238 kernel: [12548.400581]
> >> >> > ffff880171fff030: 8a bc 42 75 86 50 a0 a2 be 2c 2d 99 96 2d e1 ee
> >> >> > ..Bu.P...,-..-..
> >> >> >
> >> >> > kind regards
> >> >> > ingard
> >> >> > --
> >> >> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> >> >> > the body of a message to majordomo@vger.kernel.org
> >> >> > More majordomo info at http://vger.kernel.org/majordomo-info.html
[-- Attachment #2: xfs_repair.out.gz --]
[-- Type: application/gzip, Size: 4457 bytes --]