From: Christoph Hellwig <hch@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@infradead.org>,
Stefan Pfetzing <stefan.pfetzing@1und1.de>,
xfs@oss.sgi.com
Subject: Re: [PATCH] repair: do not walk the unlinked inode list
Date: Mon, 14 Nov 2011 13:55:59 -0500
Message-ID: <20111114185559.GA23715@infradead.org>
In-Reply-To: <20111109231133.GS5534@dastard>
On Thu, Nov 10, 2011 at 10:11:33AM +1100, Dave Chinner wrote:
> You're making the assumption that log recovery has done the correct
> thing and only replayed entire unlink transactions and hence the
> filesystem is otherwise consistent (i.e. that there are no other
> references). I think that's a bad assumption - there's no guarantee
> that the unlinked list only contains unreferenced inodes if there's
> been corruption and/or log replay was not able to be run.
We add inodes to the uncertain list if any of the following applies:
a) they are found in an inode btree record reachable from the root in
phase2, but are suspect based on certain factors - otherwise
we add them to the inode tree directly.
b) they are found on the unlinked inode list in phase3
c) a directory found in a reachable inode btree record points to
them in phase3
So any inode that has either a link pointing to it or an inode
allocation btree record pointing to it will still be added to the
uncertain inode list if it isn't in the actual inode btree yet.
Then later in phase3 we move all uncertain inodes that appear fine
back into the main inode record tree.
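
To make the three cases concrete, here is a minimal sketch in C. All
names in it (struct ino_evidence, classify_inode, walk_unlinked) are
made up for illustration and are not the actual xfs_repair code or data
structures. It only shows that dropping case b) changes the outcome
solely for inodes whose only reference is the unlinked list.

```c
#include <stdbool.h>

/* Hypothetical per-inode evidence; not xfs_repair's real types. */
struct ino_evidence {
	bool in_inobt;    /* in an inode btree record reachable from the root */
	bool suspect;     /* that record looked suspect in phase2 */
	bool on_unlinked; /* found on the unlinked inode list in phase3 */
	bool dir_ref;     /* a directory in a reachable inobt record points to it */
};

enum ino_home { INO_UNKNOWN, INO_TREE, INO_UNCERTAIN };

/*
 * walk_unlinked == true models the old behaviour (case b applies),
 * walk_unlinked == false models the behaviour with the patch.
 */
static enum ino_home
classify_inode(const struct ino_evidence *ev, bool walk_unlinked)
{
	/* case a), non-suspect branch: straight into the inode tree */
	if (ev->in_inobt && !ev->suspect)
		return INO_TREE;
	/* case a), suspect record */
	if (ev->in_inobt)
		return INO_UNCERTAIN;
	/* case c), directory entry pointing at the inode */
	if (ev->dir_ref)
		return INO_UNCERTAIN;
	/* case b), only considered if we still walk the unlinked list */
	if (walk_unlinked && ev->on_unlinked)
		return INO_UNCERTAIN;
	return INO_UNKNOWN;
}
```

In this model an inode with in_inobt or dir_ref set is classified the
same way whether or not the unlinked list is walked.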
> I also think there's more to it than that. The walk of the inode list
> also marks all the blocks in the block map as containing inodes, and
> all the blocks still used by those inodes as data/bmap/attr types.
> This change removes that, so we're going to potentially lose that
> state if all the inodes in a block are on the unlinked list.
We still do that walk if we have any genuine reference to the inode.
If we don't have any reference but the unlinked list they can be
considered free - we'd free all resources associated with them
on log recovery anyway.
> Hence we'll end up with blocks containing inodes that are still
> marked as used in the AGINO btree, but are marked as free space in
> the block map.
They aren't. We completely rebuild both the inode allocation and
space allocation bitmaps from the information we gather in the earlier
repair phases, and they will be in sync.
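
A toy sketch in C of why the two stay in sync when both are derived from
the same in-memory state. Everything here is hypothetical (the tiny AG
geometry, struct rebuilt_ag, rebuild_ag) and much simpler than what
xfs_repair really does; it only illustrates deriving both structures
from one source.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_BLOCKS           8 /* hypothetical tiny AG */
#define TOY_INODES_PER_BLOCK 4 /* hypothetical geometry */

enum blk_state { BLK_FREE, BLK_INODE };

struct rebuilt_ag {
	bool inobt_has_block[TOY_BLOCKS];     /* stand-in for the rebuilt inode btree */
	enum blk_state block_map[TOY_BLOCKS]; /* stand-in for the rebuilt space map */
};

/*
 * known[i] is true if the earlier phases left inode i in the in-memory
 * inode tree.  Both outputs are computed from that one array, so a
 * block is an inode block in one exactly when it is in the other.
 */
static void
rebuild_ag(const bool known[TOY_BLOCKS * TOY_INODES_PER_BLOCK],
	   struct rebuilt_ag *out)
{
	for (size_t b = 0; b < TOY_BLOCKS; b++) {
		bool any = false;

		for (size_t i = 0; i < TOY_INODES_PER_BLOCK; i++)
			any = any || known[b * TOY_INODES_PER_BLOCK + i];

		out->inobt_has_block[b] = any;
		out->block_map[b] = any ? BLK_INODE : BLK_FREE;
	}
}
```

An inode that was never entered into the in-memory tree shows up in
neither output in this model.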
> the AG or filesystem (e.g. agi->agi_count - agi->agi_free). Yes, it will
> still spin for some time on this sort of corruption, but it won't
> get stuck, and it won't add new holes into our block/inode usage
> tracking...
This would basically take forever with things like Arek's filesystem
with almost 11 million inodes in each AG.
Thread overview: 3+ messages
2011-11-09 8:37 [PATCH] repair: do not walk the unlinked inode list Christoph Hellwig
2011-11-09 23:11 ` Dave Chinner
2011-11-14 18:55 ` Christoph Hellwig [this message]