linux-xfs.vger.kernel.org archive mirror
From: Brian Foster <bfoster@redhat.com>
To: Emmanuel Florac <eflorac@intellique.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>,
	Dave Chinner <david@fromorbit.com>,
	"'linux-xfs@vger.kernel.org'" <linux-xfs@vger.kernel.org>
Subject: Re: Weird xfs_repair error
Date: Mon, 24 Jul 2017 10:51:25 -0400
Message-ID: <20170724145125.GA12097@bfoster.bfoster>
In-Reply-To: <20170724162728.2a77797a@harpe.intellique.com>

On Mon, Jul 24, 2017 at 04:27:28PM +0200, Emmanuel Florac wrote:
> On Mon, 17 Jul 2017 13:11:29 -0400
> Brian Foster <bfoster@redhat.com> wrote:
> 
> > On Tue, Jul 11, 2017 at 03:23:52PM +0200, Emmanuel Florac wrote:
> > > On Fri, 7 Jul 2017 08:36:33 -0700
> > > "Darrick J. Wong" <darrick.wong@oracle.com> wrote:
> > >   
> > > > > fatal error -- name create failed in lost+found (28), filesystem
> > > > > may be out of space    
> > > > 
> > > > Would be helpful to have a metadump of this goobered-up lost+found
> > > > fs...
> > > >   
> > > 
> > > The metadump is here for anyone who would like to have a look:
> > > 
> > > http://update2.intellique.com/pub/bign.metadump.xz
> > > 
> > > The filesystem is about 115 TiB.
> > >   
> > 
> > Thanks for posting this. The first thing to note is that this
> > filesystem is severely corrupted.
> 
> This I have determined myself through the fact that many runs of
> xfs_repair (and different versions of it, v4.7, 4.9, 4.11...) can't get
> it into a stable (i.e. that won't crash while trying to access it)
> state.
> 
> > Nonetheless, I've been playing
> > around with trying to get the latest for-next xfs_repair to run
> > through this fs (via gdb) and have definitely hit a few issues:
> > 
> > - xfs_sb_verify() was changed to use bp->b_maps[0].bm_bn rather than
> >   bp->b_bn in libxfs commit 85428dd23f ("xfs: fix superblock
> >   inprogress check"). b_maps isn't allocated if the buffer was
> >   initialized with libxfs_initbuf() (rather than libxfs_initbuf_map()).
> >   This causes a sigsegv here, though only if I disable -O2 optimization
> >   for some reason that I haven't dug into yet.
> > - libxfs commit 0268fdc3fe ("xfs: remove xfs_trans_get_block_res")
> >   replaced the use of xfs_trans_get_block_res() in
> >   xfs_bmbt_alloc_block() which causes the -ENOSPC error. The previous
> >   function was hardcoded to return 1 such that this would never occur.
> > - The recently added directory sf format verifier (xfs_iformat_fork()
> >   -> xfs_dir2_sf_verify()) seems to cause a premature repair failure in
> >   at least one case.
> > 
> > I was able to eventually get repair to complete with some quick hacks
> > to bypass those issues. I did have to run repair two or three times
> > to get the fs to a clean state. The fs mounts and otherwise appears
> > clean to xfs_repair, but it's not clear to me how usable the
> > resulting fs really is (repair is for fs consistency after all, not
> > necessarily data recovery). Note that lost+found appears to be loaded
> > with 18T of data across almost 2 million inodes. :/
> 
> Thank you for your efforts, the loaded lost+found matches my own
> results, however some of the files there have been present for possibly
> years. In fact this filesystem has crashed several times in the past
> years but always went back online at some point, until... now.
> 
> So what could I do, at least to be able to mount it and copy everything
> elsewhere before mkfs'ing it all again? Do you have an xfs_repair
> binary at hand that I could use, or should I dig into the latest
> source?
> 

There are several fixes in-flight for the issues uncovered by this
metadump. I think you'll want to include the following 3 patches to
xfsprogs:

http://marc.info/?l=linux-xfs&m=150047977108174&w=2
http://marc.info/?l=linux-xfs&m=150040481220074&w=2
http://marc.info/?l=linux-xfs&m=150040481820076&w=2

Note that the last 2 patches are probably going to be reworked into a
different implementation. The idea here is ultimately to avoid running
the verifier in a case where it disrupts xfs_repair, so using this
intermediate patch series should be good enough to build a custom binary
that allows xfs_repair to eventually piece the fs back together. You
could alternatively just hack xfs_dir2_sf_verify() to return 0.

Note that I would highly recommend testing whatever you build against
your metadump before running it on the original fs.
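
For what it's worth, the test cycle against the metadump could be
scripted roughly as below. This is only a sketch under assumptions: the
file names, the ./repair/xfs_repair path to the custom build, and the
helper names metadump_test_commands() and run_all() are all made up for
illustration. It only relies on xz, xfs_mdrestore and "xfs_repair -f"
(repair a filesystem image stored in a regular file).

```python
import shlex
import subprocess
from pathlib import Path


def metadump_test_commands(metadump_xz, image, repair_bin="xfs_repair"):
    """Return the commands that restore a metadump and repair the copy.

    All arguments are illustrative; nothing here touches the original fs.
    """
    metadump_xz = Path(metadump_xz)
    # "bign.metadump.xz" -> "bign.metadump" (the file xz leaves behind)
    metadump = metadump_xz.with_suffix("")
    return [
        # Decompress, keeping the compressed original around.
        ["xz", "--decompress", "--keep", str(metadump_xz)],
        # Rebuild a filesystem image from the metadata dump.
        ["xfs_mdrestore", str(metadump), str(image)],
        # -f: the repair target is a regular file, not a block device.
        [str(repair_bin), "-f", str(image)],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling run_all(metadump_test_commands("bign.metadump.xz", "bign.img",
repair_bin="./repair/xfs_repair")) just prints the three commands; pass
dry_run=False to actually run them. Keep in mind that the restored image
carries metadata only and should be a sparse file with the apparent size
of the original fs, so it needs to live on a filesystem that can hold a
file that large.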

Brian

> -- 
> ------------------------------------------------------------------------
> Emmanuel Florac     |   Direction technique
>                     |   Intellique
>                     |   <eflorac@intellique.com>
>                     |   +33 1 78 94 84 02
> ------------------------------------------------------------------------


