From: Theodore Tso <tytso@MIT.EDU>
To: Andi Kleen <andi@firstfloor.org>
Cc: Alexey Zaytsev <alexey.zaytsev@gmail.com>,
linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Rik van Riel <riel@surriel.com>
Subject: Re: Mentor for a GSoC application wanted (Online ext2/3 filesystem checker)
Date: Sun, 20 Apr 2008 22:33:42 -0400
Message-ID: <20080421023342.GC9700@mit.edu>
In-Reply-To: <87ej9085dq.fsf@basil.nowhere.org>
On Mon, Apr 21, 2008 at 01:37:37AM +0200, Andi Kleen wrote:
> Are you sure about all data? I think he would just need some lookup table from
> metadata block numbers to inode numbers and then when a hit occurs on a block
> in the table somehow invalidate all data related to that inode
> and restart that part. And the same thing for bitmap blocks. That lookup
> table should be much smaller than the full metadata.
Yeah, unfortunately it's close to all of the metadata. Consider that
e2fsck also has to deal with changes in the directory, and there can
be multiple hard links in a directory, so it's not just a simple
lookup table. You could try to condense the directory into a list of
inode numbers and the number of times they were counted in a
directory, but then any time the directory changed, you'd have to
rescan the *entire* directory.
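
A minimal sketch of that condensed form, assuming a directory can be
modeled as a list of (name, inode number) pairs; the function name and
the sample entries are hypothetical, not anything taken from e2fsck:

```python
from collections import Counter

def condense_directory(entries):
    """Collapse (name, inode) directory entries into inode -> link count."""
    return Counter(ino for _name, ino in entries)

# Hypothetical directory with two hard links to inode 12.
before = [("a", 12), ("b", 12), ("c", 15)]
counts = condense_directory(before)

# The condensed table no longer remembers names, so once the directory
# changes the only way to refresh it is to rescan every entry again.
after = [("a", 12), ("c", 15)]
recount = condense_directory(after)
```

The table is small, but it has thrown away exactly the information
needed to update it incrementally, which is why any change forces a
full rescan of that directory.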
Also, consider that the lookup table might not be enough if the
filesystem is actually corrupted and there are blocks claimed by
more than one inode. How you would "invalidate all data" in that
case becomes less obvious.
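
To illustrate, here is a rough sketch of the proposed lookup table,
reading the corruption case as a block that more than one inode claims.
The helper names and block numbers are made up for the example:

```python
from collections import defaultdict

def build_block_owners(inode_blocks):
    """Map each metadata block number to the set of inodes that claim it."""
    owners = defaultdict(set)
    for ino, blocks in inode_blocks.items():
        for blk in blocks:
            owners[blk].add(ino)
    return owners

def inodes_to_invalidate(owners, changed_block):
    """Return every inode whose cached state depends on the changed block."""
    return set(owners.get(changed_block, ()))

# Hypothetical corrupted filesystem: block 101 is claimed by inodes 12 and 15.
owners = build_block_owners({12: [100, 101], 15: [101, 102]})
hit = inodes_to_invalidate(owners, 101)
```

On a healthy filesystem each lookup yields one inode. With a shared
block, a single write invalidates several inodes at once, and the table
alone does not say which of them is the legitimate owner.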
It would be possible to condense the metadata somewhat by
omitting unused inodes, and storing the indirect blocks as extents.
But there would still be a huge amount of metadata that would have to
be stored in memory. If you're willing to completely rewrite e2fsck
(which the on-line resize would need anyway, because the updated data
could invalidate the previously done work at any point anywhere in the
e2fsck processing), maybe the extra cached data structures won't be
completely additive on top of the other intermediate data kept by
e2fsck, but it once again points out it would be insane for a student
to try to do this in 3 months.
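
As a sketch of the "indirect blocks as extents" idea mentioned above,
assuming the block list of a file is available as a plain list of block
numbers (the function name and the numbers are illustrative only):

```python
def blocks_to_extents(blocks):
    """Collapse a list of block numbers into (start, length) extents."""
    extents = []
    for blk in blocks:
        if extents and extents[-1][0] + extents[-1][1] == blk:
            # Block directly follows the last extent: extend it by one.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # Gap or first block: start a new extent.
            extents.append((blk, 1))
    return extents

# Hypothetical file with three contiguous runs.
extents = blocks_to_extents([100, 101, 102, 200, 201, 300])
```

Contiguous files shrink to a handful of pairs this way, but a badly
fragmented file still needs roughly one extent per block, so the saving
depends entirely on layout.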
> Anyways my favourite fsck wish list feature would be a way to record the
> changes a read-only fsck would want to do and then some quick way
> to apply them to a writable version of the file system without
> doing a full rescan. Then you could regularly do a background check
> and if it finds something wrong just remount and apply the changes
> quickly.
This is a read-only fsck while the filesystem is changing out from
underneath it, and the hope is that you can take the instructions
gathered from the read-only fsck (presumably run on a snapshot) and
then apply them to a filesystem that has been modified since the
snapshot was taken. Even if it has been remounted read-only at this
point, this gets really dicey. Consider that with certain types of
corruption, if the filesystem continues to get modified, the
corruption can get worse.
> Or perhaps just tell the kernel which objects are suspicious and
> should be EIOed.
Yeah; you could do that, as long as it's not a guarantee that all of
the objects which were suspicious were found. It would also be
possible to isolate the objects, perhaps with some potential inode and
block leakage that would get fixed at the next off-line fsck. Still,
it would be a lot of work. Let me know if someone is willing to pay
for this, and I could probably work with someone like Val to execute
this. But otherwise, it probably falls in the "we'd all like a pony"
sort of wishlist.....
- Ted