From: Sam Vilain <sam@vilain.net>
To: Oleg Drokin <green@namesys.com>
Cc: reiserfs-list@namesys.com, vitaly@namesys.com
Subject: Re: reiserfsck --blame-it-on-the-hardware-yeah-yeah
Date: Tue, 11 Feb 2003 05:24:58 +1300
Message-ID: <200302110525.06890.sam@vilain.net>
In-Reply-To: <20030209130116.A32206@namesys.com>
On Sun, 09 Feb 2003 23:01, Oleg Drokin wrote:
> > Therefore, your reiserfsck has a bug. The whole point of a fsck is
> Well, currently the logic is "If we cannot read some block, that
> usually means this is a badblock".
> And so it prints the message. Of course, a check for whether the
> block is beyond the partition boundary should probably be added.
The block is not bad, it's EINVAL :-). The block *number* is bad; you
*could* add to your is_block_shagged() function a test for whether the
block is out of bounds, but the point is that if it gets as far as that
function, chances are that it is too late.
In reiserfsck, you need to do the bounds check when the referring
block/data structure is checked.
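
To illustrate what I mean by checking at the point of reference, here is a
minimal sketch in Python rather than reiserfsck's own C. The names
block_in_range and find_bad_references are hypothetical and do not exist in
reiserfsprogs; the only assumption is that each referring structure can be
reduced to a (description, block number) pair before any read is attempted.

```python
from typing import Iterable, List, Tuple


def block_in_range(block_no: int, block_count: int) -> bool:
    """Return True if block_no addresses a block inside the filesystem."""
    return 0 <= block_no < block_count


def find_bad_references(
    refs: Iterable[Tuple[str, int]], block_count: int
) -> List[Tuple[str, int]]:
    """Collect references whose block number is out of bounds.

    Each reference is (description of the referring structure, block number).
    This runs while the referring structure is being checked, so a bad
    block *number* is reported before anything tries to read the block.
    """
    return [
        (owner, block_no)
        for owner, block_no in refs
        if not block_in_range(block_no, block_count)
    ]
```

With a check like this, the out-of-range number is attributed to the
structure that contains it, instead of surfacing later as a failed read
that looks like a bad block.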
> > that any data, anywhere, can be corrupted - and reiserfsck should not
> > fall over because of it. So, what you should do is carefully go
> Sure, unfortunately the interactive part of reiserfsck is not very mature.
> And what do you think it should have done? Shrink the size of FS
> to fit changed (maybe because of corruption) partition size?
> Enlarge the partition? What else?
Hmm. It wasn't a changed partition size. It was just a junk record in the
journal. It almost certainly got there by virtue of a hard hang that I
experienced just before all this happened.
> > through your filesystem data structure, insert garbage in at each
> > unique structural location, and run `reiserfsck' on it to see if it
> > handles the problem correctly. Then I'd suggest following that up
> > with some randomly corrupted filesystems.
> Yup, we are running such tests. But thanks for suggestion.
Good to hear.
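
For anyone following along, the kind of test I suggested can be sketched
roughly as below. This is only an illustration in Python, not a description
of the tests Namesys actually runs. The helper names structural_variants
and random_variants are made up, the list of structural offsets is assumed
to come from knowledge of the on-disk format, and running reiserfsck over
each corrupted image is left to the caller.

```python
import random
from typing import Iterator, Sequence, Tuple


def structural_variants(
    image: bytes, offsets: Sequence[int], garbage: bytes = b"\xff"
) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, corrupted copy) with garbage written at each offset.

    The offsets are assumed to be the start of each unique structural
    location in the image; the image length is never changed.
    """
    for off in offsets:
        end = off + len(garbage)
        if off < 0 or end > len(image):
            raise ValueError(f"offset {off} does not fit in the image")
        yield off, image[:off] + garbage + image[end:]


def random_variants(
    image: bytes, count: int, seed: int = 0, max_bytes: int = 4
) -> Iterator[Tuple[int, bytes]]:
    """Yield `count` copies with a short random run of bytes overwritten.

    A fixed seed keeps the corruption reproducible, so a failing case can
    be regenerated and attached to a bug report.
    """
    rng = random.Random(seed)
    limit = min(max_bytes, len(image))
    for _ in range(count):
        n = rng.randint(1, limit)
        off = rng.randrange(0, len(image) - n + 1)
        garbage = bytes(rng.randrange(256) for _ in range(n))
        yield off, image[:off] + garbage + image[off + n:]
```

Each corrupted copy would then be written to a scratch file and checked,
with any crash or hang of the checker counted as a failure.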
> > Looking at the source code, I now see why the --no-journal-available
> > switch does not do anything if a `standard' journal is used rather
> > than an off-device journal. However, I would suggest that this test
> > is superfluous, and the tool has more benefit to the system
> > administrator if the test for a `standard' journal with
> > fsck_skip_journal is removed, or perhaps replaced with a warning or
> > another prompt.
> We will think about it. Thanks for the idea.
From my experiment, removing the test is not quite all that's required :-).
Here's a brief log of what I did, hopefully you can get an idea of the sort
of changes that will be required to reiserfsck when ignoring the journal
on a `standard' journal filesystem:
- first, I removed the test in main.c from 3.x.1b and latest pre-release
reiserfsck and re-compiled
- Latest pre-release still refused to ignore journal contents,
complaining about invalid block offset.
- 3.x.1b reiserfsck, however, completes successfully. Ran
--rebuild-tree.
- mounted filesystem, however mount complains that a superblock is in the
log area (uh-oh). Force mount with nolog, see filesystem in
semi-consistent state. Great! It looks good. I look around the FS a
little bit. Oops. Panic. Reboot.
- Now that the journal is gone, all of the other reiserfsck modes of
operation seem to work. I ran (the latest) reiserfsck --rebuild-sb,
and then reiserfsck --rebuild-tree
- filesystem now mounts, however about the first 2 levels of directories,
and many recently written files, have had their directory entries
lost - lost+found contains roughly 11,000 entries (of 150,000 or
so).
- thankfully, I can locate the several hundred megabytes of .debs to save
myself spending days re-downloading it all over 56k :-). Mission
successful.
If reiserfsck was built with --no-journal-available in mind (that is,
ignoring the data present in an in-partition journal with that switch),
then I'm fairly sure that I wouldn't have suffered the last problem.
After the first scan, the journal would have been written back to an empty
state.
> > I'm going to try removing that test in the 3.x.1b version and see if
> > the fsck completes.
> Well, 3.x.1b should not be actually used, lots of bugs were fixed since
> then.
> Vitaly: We need a check that journal target block is in range of
> filesystem. Please add this test.
That is not all you must do!
You need to do one, preferably both, of the following:
a) allow reiserfsck to ignore the in-partition journal, without producing
an insane result (where the filesystem header says there is a journal,
but the space where the journal is has filesystem data in it).
b) make reiserfsck validate the journal as well as the filesystem,
probably playing them back itself rather than relying on a mount
option that just does the playback for it. In theory you could decide
whether to use the on-disk or the in-journal data structure, depending
on which was more consistent!
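
As a rough sketch of the validation part of (b), again in Python and not
taken from reiserfsprogs: each journal transaction is modelled as nothing
more than a list of target block numbers, and split_replayable is a
hypothetical name. Stopping at the first invalid transaction is my own
assumption, on the grounds that later transactions may depend on earlier
ones. The real journal format carries more than this and would need more
checks.

```python
from typing import List, Sequence, Tuple


def split_replayable(
    transactions: Sequence[Sequence[int]],
    block_count: int,
    journal_start: int,
    journal_len: int,
) -> Tuple[List[List[int]], List[List[int]]]:
    """Split journal transactions into (replayable, rejected).

    A target block is accepted only if it lies inside the filesystem and
    outside the journal area itself. Replay order matters, so everything
    from the first invalid transaction onwards is rejected.
    """
    journal_end = journal_start + journal_len

    def target_ok(block_no: int) -> bool:
        inside_fs = 0 <= block_no < block_count
        inside_journal = journal_start <= block_no < journal_end
        return inside_fs and not inside_journal

    replayable: List[List[int]] = []
    for index, targets in enumerate(transactions):
        if not all(target_ok(block_no) for block_no in targets):
            rejected = [list(t) for t in transactions[index:]]
            return replayable, rejected
        replayable.append(list(targets))
    return replayable, []
```

A junk record like the one I hit would end up in the rejected list, and
fsck could then decide what to do with it rather than failing on the read.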
--
Sam Vilain, sam@vilain.net
All work and no play make Jack a dull boy and Jill a wealthy widow.
- anon.