reiserfsck --blame-it-on-the-hardware-yeah-yeah

* reiserfsck --blame-it-on-the-hardware-yeah-yeah
@ 2003-02-08  0:50 Sam
  2003-02-08 10:20 ` Oleg Drokin
  0 siblings, 1 reply; 6+ messages in thread
From: Sam @ 2003-02-08  0:50 UTC
  To: reiserfs-list

Hi there,

I have done as requested and used your latest pre-release of
reiserfsck to check my filesystem.  However, it reports an erroneous
error.

(none):~# reiserfsck --no-journal-available /dev/hda1

<-------------reiserfsck, 2002------------->
reiserfsprogs 3.6.5-pre1

  *************************************************************
  ** If you are using the latest reiserfsprogs and  it fails **
  ** please  email bug reports to reiserfs-list@namesys.com, **
  ** providing  as  much  information  as  possible --  your **
  ** hardware,  kernel,  patches,  settings,  all  reiserfsk **
  ** messages  (including version),  the reiserfsck logfile, **
  ** check  the  syslog file  for  any  related information. **
  ** If you would like advice on using this program, support **
  ** is available  for $25 at  www.namesys.com/support.html. **
  *************************************************************

Will read-only check consistency of the filesystem on /dev/hda1
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
Filesystem with standard journal found, --no-journal-available is ignored
###########
reiserfsck --check started at Sun Feb  9 14:43:07 2003
###########

The problem has occurred looks like a hardware problem.
Check your hard drive for badblocks.

bread: Cannot read the block (524111).

Aborted
(none):~# dd if=/dev/hda of=/tmp/foo skip=524100 count=100
100+0 records in
100+0 records out
(none):~# od -x /tmp/foo
0000000 6974 6e6f 7720 6c69 206c 6562 6920 636e
0000020 756c 6564 2064 6e69 7420 6568 6e20 7865
[... lots of very valid looking data snipped ...]
(none):~# time badblocks -c 2048 -n -s -v /dev/hda1
Initializing random test data
Checking for bad blocks in non-destructive read-write mode
From block 0 to 2000061
Checking for bad blocks (non-destructive read-write test):   2000061/  2000061
Pass completed, 0 bad blocks found.

real    15m44.304s
user    0m6.420s
sys     0m33.250s
(none):~#

I used `-c 2048' because the disk's buffer is only 128KiB, so that
badblocks does not merely test the integrity of the disk cache :-).
Here's what's happening through the eyes of `strace':

[...]
read(0, "Yes\n", 4096)                  = 4
open("/dev/hda1", O_RDONLY|O_LARGEFILE) = 3
brk(0x808d000)                          = 0x808d000
brk(0x808f000)                          = 0x808f000
brk(0x8091000)                          = 0x8091000
brk(0x8093000)                          = 0x8093000
brk(0x8095000)                          = 0x8095000
brk(0x8097000)                          = 0x8097000
_llseek(3, 8192, [8192], SEEK_SET)      = 0
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 409
6
_llseek(3, 65536, [65536], SEEK_SET)    = 0
read(3, "P\377\7\0005\356\0\0\35\223\0\0\22\0\0\0\0\0\0\0\0 \0\0"..., 4096) = 4096
open("/dev/hda1", O_RDONLY|O_LARGEFILE) = 4
_llseek(4, 33628160, [33628160], SEEK_SET) = 0
read(4, "\357M\3\0\324\27\0\0e\0\0\0\22\0\0\0\0\0\0\0\0 \0\0\0\4"..., 4096) = 4096
fstat64(3, {st_mode=S_IFBLK|0660, st_rdev=makedev(3, 1), ...}) = 0
open("/dev/null", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = -1 ENOTDIR (Not a directory)
open("/dev/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 5
fstat64(5, {st_mode=S_IFDIR|0755, st_size=24576, ...}) = 0
fcntl64(5, F_SETFD, FD_CLOEXEC)         = 0
getdents64(0x5, 0x8095128, 0x1000, 0)   = 4088
stat64("/dev/kmem", {st_mode=S_IFCHR|0640, st_rdev=makedev(1, 2), ...}) = 0
stat64("/dev/mem", {st_mode=S_IFCHR|0640, st_rdev=makedev(1, 1), ...}) = 0
stat64("/dev/core", 0xbffffcec)         = -1 ENOENT (No such file or directory)
close(5)                                = 0
time([1044755218])                      = 1044755218
open("/etc/localtime", O_RDONLY)        = 5
fstat64(5, {st_mode=S_IFREG|0644, st_size=870, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40015000
read(5, "TZif\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0\5\0"..., 4096) = 870
close(5)                                = 0
munmap(0x40015000, 4096)                = 0
write(2, "###########\nreiserfsck --check s"..., 79) = 79
brk(0x80a8000)                          = 0x80a8000
_llseek(3, 2146758656, 0xbffffc84, SEEK_SET) = -1 EINVAL (Invalid argument)
write(2, "\nThe problem has occurred looks "..., 94) = 94
write(2, "\nbread: Cannot read the block (5"..., 41) = 41
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
getpid()                                = 71
kill(71, SIGABRT)                       = 0
--- SIGABRT (Aborted) ---
+++ killed by SIGABRT +++

The parameters to that _llseek() command look quite off.  The
filesystem is less than 2GiB!
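
The seek offset can be reconciled with the other numbers in this report. A minimal sketch in Python, assuming a 4096-byte filesystem block size (suggested by the 4096-byte reads in the trace) and assuming that `badblocks' was counting in its default 1024-byte blocks:

```python
# Values copied from the reiserfsck, strace and badblocks output above.
FAILED_BLOCK = 524111          # block named in the "bread" message
SEEK_OFFSET = 2146758656       # offset passed to the failing _llseek()
BADBLOCKS_COUNT = 2000061      # block count reported by badblocks

FS_BLOCK_SIZE = 4096           # assumption: filesystem block size
BADBLOCKS_BLOCK_SIZE = 1024    # assumption: badblocks default block size


def byte_offset(block: int, block_size: int) -> int:
    """Byte offset at which a block of the given size starts."""
    return block * block_size


partition_bytes = BADBLOCKS_COUNT * BADBLOCKS_BLOCK_SIZE
failed_offset = byte_offset(FAILED_BLOCK, FS_BLOCK_SIZE)

print("offset of failed block:", failed_offset)
print("matches the _llseek() offset:", failed_offset == SEEK_OFFSET)
print("partition size in bytes:", partition_bytes)
print("offset lies past the end of the partition:", failed_offset >= partition_bytes)
```

Under those assumptions the offset is exactly block 524111 times 4096 bytes, and it falls beyond the end of the partition, which would be consistent with `_llseek()' returning EINVAL rather than the drive reporting a read error.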

It would be quite a loss in terms of time for me to have to rebuild
this filesystem from scratch.  It's my primary workstation's root
filesystem (and I had only just purchased a backup device; I was in
the process of backing up my data when this happened, Murphy's Law
proving itself).

I'd really appreciate it if you could help me zap that data journal in
the form of brief instructions (& references to kernel struct
typedefs), or perhaps give me a patch to reiserfsprogs that will allow
the --no-journal-available switch to actually do what it says.

Cheers,
--
Sam Vilain, sam@vilain.net

Real Programmers don't write in BASIC.  Actually, no programmers write
in BASIC after reaching puberty.

* Re: reiserfsck --blame-it-on-the-hardware-yeah-yeah
  2003-02-08  0:50 reiserfsck --blame-it-on-the-hardware-yeah-yeah Sam
@ 2003-02-08 10:20 ` Oleg Drokin
       [not found]   ` <20030208224928.A30012@ns.soreal.co.uk>
  0 siblings, 1 reply; 6+ messages in thread
From: Oleg Drokin @ 2003-02-08 10:20 UTC
  To: Sam; +Cc: reiserfs-list

Hello!


> bread: Cannot read the block (524111).
> Aborted
> (none):~# dd if=/dev/hda of=/tmp/foo skip=524100 count=100
> 100+0 records in
> 100+0 records out
> (none):~# od -x /tmp/foo
> 0000000 6974 6e6f 7720 6c69 206c 6562 6920 636e
> 0000020 756c 6564 2064 6e69 7420 6568 6e20 7865
> [... lots of very valid looking data snipped ...]

This is the wrong block; try adding bs=4k to dd.
Also, read from your partition rather than from /dev/hda.
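
A sketch of why the original `dd' command landed on a different block, assuming `dd' falls back to its default 512-byte block size when `bs=' is not given. The helper name `dd_skip_offset` is made up for illustration:

```python
DD_DEFAULT_BS = 512  # dd's block size when bs= is omitted


def dd_skip_offset(skip: int, bs: int = DD_DEFAULT_BS) -> int:
    """Byte offset at which dd starts reading for a given skip= and bs=."""
    return skip * bs


# Command as originally run: skip=524100 with no bs=, against /dev/hda.
original = dd_skip_offset(524100)
# Suggested form: bs=4k and the failing block number, against /dev/hda1.
suggested = dd_skip_offset(524111, bs=4096)

print("original command started at byte", original)
print("bs=4k skip=524111 starts at byte", suggested)
```

The corresponding command would be along the lines of `dd if=/dev/hda1 of=/tmp/foo bs=4k skip=524111 count=1'.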

Bye,
    Oleg

* Re: reiserfsck --blame-it-on-the-hardware-yeah-yeah
       [not found]   ` <20030208224928.A30012@ns.soreal.co.uk>
@ 2003-02-09 10:01     ` Oleg Drokin
  2003-02-10 16:24       ` Sam Vilain
  0 siblings, 1 reply; 6+ messages in thread
From: Oleg Drokin @ 2003-02-09 10:01 UTC
  To: Sam; +Cc: reiserfs-list, vitaly

Hello!

On Sat, Feb 08, 2003 at 10:49:28PM +0000, Sam@Vilain.net wrote:

> Ah, right, well that explains it.  It complained about block 524111,
> which would be physical block number 2096444.  This is off the end of
> the block device, which only has 2000061 blocks.

Aha, so this is indeed the problem.

> I acknowledge that I used `hda' where I should have used `hda1' for
> the simple read-test with dd, but did you not see the `badblocks'
> program output in the same e-mail?  `badblocks' read in the existing

Yes, I saw it.

> then wrote the original data back.  It detected no error anywhere in
> the block device.

That's good, it means your hard drive is probably ok.

> Therefore, your reiserfsck has a bug.  The whole point of a fsck is

Well, currently the logic is "If we cannot read some block, that
usually means this is a bad block".
And so it prints the message. Of course, a check for whether the
block is beyond the partition boundary should probably be added.

> that any data, anywhere, can be corrupted - and reiserfsck should not
> fall over because of it.  So, what you should do is carefully go

Sure, unfortunately the interactive part of reiserfsck is not very mature.
And what do you think it should have done? Shrink the size of the FS
to fit the changed (maybe because of corruption) partition size?
Enlarge the partition? What else?

> through your filesystem data structure, insert garbage in at each
> unique structural location, and run `reiserfsck' on it to see if it
> handles the problem correctly.  Then I'd suggest following that up
> with some randomly corrupted filesystems.

Yup, we are running such tests. But thanks for the suggestion.

> Looking at the source code, I now see why the --no-journal-available
> switch does not do anything if a `standard' journal is used rather
> than an off-device journal.  However, I would suggest that this test
> is superfluous, and the tool has more benefit to the system
> administrator if the test for a `standard' journal with
> fsck_skip_journal is removed, or perhaps replaced with a warning or
> another prompt.

We will think about it. Thanks for the idea.

> I'm going to try removing that test in the 3.x.1b version and see if
> the fsck completes.

Well, 3.x.1b should not actually be used; lots of bugs have been fixed since then.

Thanks for the report.

Vitaly: We need a check that the journal target block is within the range
of the filesystem. Please add this test.
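
The requested test might look roughly like the following Python sketch. The function and parameter names are hypothetical and do not come from reiserfsprogs. The block count used in the example is derived from the `badblocks' figure earlier in the thread, assuming 1 KiB `badblocks' units and 4 KiB filesystem blocks, and the target list is invented apart from block 524111:

```python
from typing import Iterable, List


def out_of_range_targets(targets: Iterable[int], block_count: int) -> List[int]:
    """Return the journal target blocks that lie outside 0..block_count-1."""
    return [b for b in targets if not 0 <= b < block_count]


# 2000061 KiB partition / 4 KiB per block (assumed sizes), whole blocks only.
partition_blocks = (2000061 * 1024) // 4096

# Hypothetical target list that includes the block from the bread failure.
rejected = out_of_range_targets([100, 37661, 524111], partition_blocks)
print("blocks available:", partition_blocks, "rejected targets:", rejected)
```

Running the check before any target is read lets the tool report a bad block number instead of a failed read.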

Bye,
    Oleg

* Re: reiserfsck --blame-it-on-the-hardware-yeah-yeah
  2003-02-10 16:24       ` Sam Vilain
@ 2003-02-10 13:11         ` Oleg Drokin
  2003-02-11  2:53           ` Sam Vilain
  0 siblings, 1 reply; 6+ messages in thread
From: Oleg Drokin @ 2003-02-10 13:11 UTC
  To: Sam Vilain; +Cc: reiserfs-list, vitaly

Hello!

On Tue, Feb 11, 2003 at 05:24:58AM +1300, Sam Vilain wrote:
> > > Therefore, your reiserfsck has a bug.  The whole point of a fsck is
> > Well, currently the logic is "If we cannot read some block, that
> > usually means this is a bad block".
> > And so it prints the message. Of course, a check for whether the
> > block is beyond the partition boundary should probably be added.
> The block is not bad, it's EINVAL :-).  The block *number* is bad; you 

Sure.

> *could* add to your is_block_shagged() function a test for whether the 
> block is out of bounds, but the point is that if it gets as far as that 
> function, chances are that it is too late.

This is being worked on currently.

> (In reiserfsck), you need to do the bounds check when the referring 
> block/data structure is checked.

Sure. We have some checks, though apparently not enough.

[horrors about recompiling fsck with custom-disabled stuff skipped].
If you have really decided to shoot yourself in the foot, you might as well
just fill the journal with zeroes. It would be much easier this way ;)

>   - filesystem now mounts, however about the first 2 levels of directories,
>     and many recently written files, have had their directory entries
>     lost - lost+found contains roughly 11,000 entries (of 150,000 or
>     so).

Hm, probably the corresponding blocks (with names) were only present in
the journal, and you erased that.

>   - thankfully, I can locate the several hundred megabytes of .debs to save
>     myself spending days re-downloading it all over 56k :-).  Mission
>     successful.

At least you have not lost anything valuable. This is good.

> If reiserfsck was built with --no-journal-available in mind (that is, 
> ignoring the data present in an in-partition journal with that switch), 
> then I'm fairly sure that I wouldn't have suffered the last problem.  

How so?

> After the first scan, the journal would have been written back to an empty 
> state.

So what? If the directories' content was only present in the journal, you just lose that info.

> > > I'm going to try removing that test in the 3.x.1b version and see if
> > > the fsck completes.
> > Well, 3.x.1b should not actually be used; lots of bugs have been fixed
> > since then.
> > Vitaly: We need a check that the journal target block is within the
> > range of the filesystem. Please add this test.
> That is not all you must do!

> You need to do one, preferably both of the following:
>   a) allow reiserfsck to ignore the in-partition journal, without producing
>      an insane result (where the filesystem header says there is a journal,
>      but the space where the journal is has filesystem data in it).

This cannot happen in any sane way. (I mean, the root block just cannot live in the journal.)

>   b) make reiserfsck validate the journal as well as the filesystem,
>      probably playing them back itself rather than relying on a mount
>      option that just does the playback for it.  In theory you could decide
>      whether to use the on-disk or the in-journal data structure, depending
>      on which was more consistent!

I was thinking about that already. Maybe we will do something like that in 2.7/2.8,
but certainly not now. And it will cause lots of complications, I fear.
People who forget to upgrade their reiserfsprogs will get into trouble when
upgrading kernels and so on...

Bye,
    Oleg

* Re: reiserfsck --blame-it-on-the-hardware-yeah-yeah
  2003-02-09 10:01     ` Oleg Drokin
@ 2003-02-10 16:24       ` Sam Vilain
  2003-02-10 13:11         ` Oleg Drokin
  0 siblings, 1 reply; 6+ messages in thread
From: Sam Vilain @ 2003-02-10 16:24 UTC
  To: Oleg Drokin; +Cc: reiserfs-list, vitaly

On Sun, 09 Feb 2003 23:01, Oleg Drokin wrote:
> > Therefore, your reiserfsck has a bug.  The whole point of a fsck is
> Well, currently the logic is "If we cannot read some block, that
> usually means this is a bad block".
> And so it prints the message. Of course, a check for whether the
> block is beyond the partition boundary should probably be added.

The block is not bad, it's EINVAL :-).  The block *number* is bad; you 
*could* add to your is_block_shagged() function a test for whether the 
block is out of bounds, but the point is that if it gets as far as that 
function, chances are that it is too late.

(In reiserfsck), you need to do the bounds check when the referring 
block/data structure is checked.
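
The distinction drawn here, between a block that cannot be read and a block number that cannot exist, can be made from the error code alone. A minimal Python sketch; `describe_read_failure` is a hypothetical helper, not a reiserfsprogs function:

```python
import errno


def describe_read_failure(err: OSError, block: int) -> str:
    """Turn a failed seek/read on a block into a more specific diagnosis."""
    if err.errno == errno.EINVAL:
        # The request itself was rejected, as with the _llseek() in the trace.
        return f"block number {block} is invalid for this device"
    if err.errno == errno.EIO:
        # The device accepted the request but could not deliver the data.
        return f"I/O error reading block {block}; possible bad block"
    return f"cannot read block {block}: {err.strerror}"


print(describe_read_failure(OSError(errno.EINVAL, "Invalid argument"), 524111))
```

Only the EIO branch points towards the hardware; an EINVAL points back at whatever structure supplied the block number.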

> > that any data, anywhere, can be corrupted - and reiserfsck should not
> > fall over because of it.  So, what you should do is carefully go
> Sure, unfortunately the interactive part of reiserfsck is not very mature.
> And what do you think it should have done? Shrink the size of the FS
> to fit the changed (maybe because of corruption) partition size?
> Enlarge the partition? What else?

Hmm.  It wasn't a changed partition size.  It was just a junk record in the 
journal.  It almost certainly got there by virtue of a hard hang that I 
experienced just before all this happened.

> > through your filesystem data structure, insert garbage in at each
> > unique structural location, and run `reiserfsck' on it to see if it
> > handles the problem correctly.  Then I'd suggest following that up
> > with some randomly corrupted filesystems.
> Yup, we are running such tests. But thanks for the suggestion.

Good to hear.

> > Looking at the source code, I now see why the --no-journal-available
> > switch does not do anything if a `standard' journal is used rather
> > than an off-device journal.  However, I would suggest that this test
> > is superfluous, and the tool has more benefit to the system
> > administrator if the test for a `standard' journal with
> > fsck_skip_journal is removed, or perhaps replaced with a warning or
> > another prompt.
> We will think about it. Thanks for the idea.

From my experiment, removing the test is not quite all that's required :-).

Here's a brief log of what I did, hopefully you can get an idea of the sort 
of changes that will be required to reiserfsck when ignoring the journal 
on a `standard' journal filesystem:

  - first, I removed the test in main.c from 3.x.1b and latest pre-release
    reiserfsck and re-compiled

  - Latest pre-release still refused to ignore journal contents,
    complaining about invalid block offset.

  - 3.x.1b reiserfsck, however, completes successfully.  Ran
    --rebuild-tree.

  - mounted filesystem, however mount complains that a superblock is in the
    log area (uh-oh).  Force mount with nolog, see filesystem in
    semi-consistent state.  Great!  It looks good.  I look around the FS a
    little bit.  Oops.  Panic.  Reboot.

  - Now that the journal is gone, all of the other reiserfsck modes of
    operation seem to work.  I ran (the latest) reiserfsck --rebuild-sb,
    and then reiserfsck --rebuild-tree

  - filesystem now mounts, however about the first 2 levels of directories,
    and many recently written files, have had their directory entries
    lost - lost+found contains roughly 11,000 entries (of 150,000 or
    so).

  - thankfully, I can locate the several hundred megabytes of .debs to save
    myself spending days re-downloading it all over 56k :-).  Mission
    successful.

If reiserfsck was built with --no-journal-available in mind (that is, 
ignoring the data present in an in-partition journal with that switch), 
then I'm fairly sure that I wouldn't have suffered the last problem.  
After the first scan, the journal would have been written back to an empty 
state.

> > I'm going to try removing that test in the 3.x.1b version and see if
> > the fsck completes.
> Well, 3.x.1b should not actually be used; lots of bugs have been fixed
> since then.
> Vitaly: We need a check that the journal target block is within the
> range of the filesystem. Please add this test.

That is not all you must do!

You need to do one, preferably both of the following:

  a) allow reiserfsck to ignore the in-partition journal, without producing
     an insane result (where the filesystem header says there is a journal,
     but the space where the journal is has filesystem data in it).

  b) make reiserfsck validate the journal as well as the filesystem,
     probably playing them back itself rather than relying on a mount
     option that just does the playback for it.  In theory you could decide
     whether to use the on-disk or the in-journal data structure, depending
     on which was more consistent!


-- 
Sam Vilain, sam@vilain.net

All work and no play make Jack a dull boy and Jill a wealthy widow.
 - anon.

* Re: reiserfsck --blame-it-on-the-hardware-yeah-yeah
  2003-02-10 13:11         ` Oleg Drokin
@ 2003-02-11  2:53           ` Sam Vilain
  0 siblings, 0 replies; 6+ messages in thread
From: Sam Vilain @ 2003-02-11  2:53 UTC
  To: Oleg Drokin; +Cc: reiserfs-list, vitaly

On Tue, 11 Feb 2003 02:11, Oleg Drokin wrote:
> > You need to do one, preferably both of the following:
> >   a) allow reiserfsck to ignore the in-partition journal, without
> > producing an insane result (where the filesystem header says there is
> > a journal, but the space where the journal is has filesystem data in
> > it).
>
> This cannot happen in any sane way. (I mean, the root block just cannot
> live in the journal.)

Yes.  That's what happened when reiserfsck ran with the simple hack I 
detailed.  Granted, that violated a sanity condition of your code, with 
insane results.  GIGO.

I don't care how - there just needs to be some easy way to clear or ignore 
a corrupted journal.  With that junk in there and no `Oh, that's simple, 
just dd /dev/zero to ...' instructions on clearing the journal, I had no 
other option but to try a hack.

If you were to consider the --no-journal-available flag to mean 
--ignore-journal-contents when running with a Journal Inside(tm) 
filesystem, then presumably you would add corresponding changes to the 
code to make sure you don't end up trying to put superblocks in journal 
space.

While we're on the subject of reiserfs bugs, simply running `cddump' on my 
system created bogus directory entries I couldn't remove (on vanilla 
2.4.20).  It's just doing a (selective) `cp -al' and subsequently removing 
the structures, for 650MB of a filesystem at a time.  This was enough to 
trigger those race conditions, possibly the same as first reported by Zygo 
Blaxell - did you get anywhere with those?
-- 
Sam Vilain, sam@vilain.net

But I also made it clear to (Vladimir Putin) that it's important to
think beyond the old days of when we had the concept that if we blew
each other up, the world would be safe.
 - George W. Bush, May 1, 2001
