* corrupted fs: bitmap does not match to the correct one....
@ 2002-05-15 12:59 grobe
  2002-05-15 13:08 ` Oleg Drokin
  0 siblings, 1 reply; 7+ messages in thread
From: grobe @ 2002-05-15 12:59 UTC (permalink / raw)
  To: reiserfs-list

Hi,

I have a bad problem with my reiserfs. I got a lot of SCSI-errors in the
logs, and did a reiserfsck --rebuild-tree. Then, I checked the (hopefully) fixed
fs, and now I get errors:

[...]
node (9614688) with wrong level (1) found in the tree (should be 3)
[...a lot of this]
free block count 70151604 mismatches with a correct one 70141253
on-disk bitmap does not match to the correct one. 8134442 bytes differ
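
For a sense of scale, the free-block mismatch above can be converted into blocks and bytes. This is only a rough sketch: the 4096-byte block size is an assumption (the usual reiserfs default), not something stated in this thread, and the variable names are made up for illustration.

```python
# Values copied from the reiserfsck output quoted above.
on_disk_free = 70151604   # free block count recorded on disk
computed_free = 70141253  # free block count reiserfsck considers correct

# Assumption: 4096-byte blocks (common reiserfs default, not given in the thread).
BLOCK_SIZE = 4096

diff_blocks = on_disk_free - computed_free
diff_bytes = diff_blocks * BLOCK_SIZE

print(f"mismatch: {diff_blocks} blocks, about {diff_bytes / 2**20:.1f} MiB")
```

Under that assumption the two counts disagree by a few tens of MiB, which is small next to the size of the filesystem but enough to show that the on-disk bitmap and the tree no longer agree.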

If I try to mount the fs, mount tells me "no directory"...

I have resized the fs in the past, some months ago... is it possible that
these errors result from a bad resize action? The fs was up for quite a long
time; is it possible that an apparently working fs with hidden errors only
shows them after such a long time?

Ok, I am getting a coffee... having a bunch of angry users behind me ;-) I
just would like to know if there is a chance to restore the fs, or should I
just take the backup and build a new fs?

Thank You, CU, Lars.

-- 
GMX - Die Kommunikationsplattform im Internet.
http://www.gmx.net



* Re: corrupted fs: bitmap does not match to the correct one....
  2002-05-15 12:59 corrupted fs: bitmap does not match to the correct one grobe
@ 2002-05-15 13:08 ` Oleg Drokin
  2002-05-15 13:58   ` grobe
  0 siblings, 1 reply; 7+ messages in thread
From: Oleg Drokin @ 2002-05-15 13:08 UTC (permalink / raw)
  To: grobe; +Cc: reiserfs-list

Hello!

On Wed, May 15, 2002 at 02:59:37PM +0200, grobe@gmx.net wrote:

> I have a bad problem with my reiserfs. I got a lot of SCSI-errors in the

Hm. SCSI errors are bad, and you have performed everything on the same hardware?
What kind of errors were there, by the way?

> logs, and did a reiserfsck --rebuild-tree. Then, I checked the (hopefully) fixed
> fs, and now I get errors:

What version of reiserfsprogs was used?

> [...]
> node (9614688) with wrong level (1) found in the tree (should be 3)
> [...a lot of this]
> free block count 70151604 mismatches with a correct one 70141253
> on-disk bitmap does not match to the correct one. 8134442 bytes differ
> If I try to mount the fs, mount tells me "no directory"...

How about more error messages about scsi errors in the logs?

> I have resized the fs in the past, some months ago... is it possible that
> these errors result from a bad resize action? The fs was up for quite a long

Not likely. If you had extended the fs more than 132M in size, you'd
have noticed instantly if something went wrong.

> just would like to know if there is a chance to restore the fs, or should I
> just take the backup and build a new fs?

There is always a chance; the other question (rather important in your case,
it seems) is which option would be faster.

Bye,
    Oleg


* Re: corrupted fs: bitmap does not match to the correct one....
  2002-05-15 13:08 ` Oleg Drokin
@ 2002-05-15 13:58   ` grobe
  2002-05-15 14:06     ` Oleg Drokin
  0 siblings, 1 reply; 7+ messages in thread
From: grobe @ 2002-05-15 13:58 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: reiserfs-list

Hi!

> > I have a bad problem with my reiserfs. I got a lot of SCSI-errors in the
> Hm. SCSI errors are bad, and you have performed everything on the same
> hardware?

Yes, a Serveraid 4M with an EXP15 Raid 5 attached. Current kernel (2.4.16),
current firmware, current driver (and UNCHANGED!).

> What kind of errors were there, by the way?

Zeus:/tmp # tail -50 /var/log/warn
May 15 12:54:01 Zeus kernel: SCSI disk error : host 2 channel 0 id 1 lun 0
return code = 70000
May 15 12:54:01 Zeus kernel:  I/O error: dev 08:11, sector 211045693
[... a lot of these ...]
May 15 12:54:01 Zeus kernel: SCSI disk error : host 2 channel 0 id 1 lun 0
return code = 70000
May 15 12:54:01 Zeus kernel:  I/O error: dev 08:11, sector 242274461
May 15 14:18:03 Zeus kernel: reiserfs: checking transaction log (device
3a:00) ...
May 15 14:18:07 Zeus kernel: is_tree_node: node level 2 does not match to
the expected one 4
May 15 14:18:07 Zeus kernel: vs-5150: search_by_key: invalid format found in
block 9113. Fsck?
May 15 14:18:07 Zeus kernel: vs-13070: reiserfs_read_inode2: i/o failure
occurred trying to find stat data of [1 2 0x0 SD]
May 15 14:18:07 Zeus kernel: Using r5 hash to sort names
May 15 14:18:07 Zeus kernel: ReiserFS version 3.6.25
May 15 14:18:45 Zeus kernel: FAT: bogus logical sector size 0
May 15 14:18:45 Zeus kernel: VFS: Can't find a valid FAT filesystem on dev
3a:00.
May 15 14:18:45 Zeus kernel: VFS: Can't find a HFS filesystem on dev 3a:00.
May 15 14:19:00 Zeus kernel: reiserfs: checking transaction log (device
3a:00) ...
May 15 14:19:00 Zeus kernel: is_tree_node: node level 2 does not match to
the expected one 4
May 15 14:19:00 Zeus kernel: vs-5150: search_by_key: invalid format found in
block 9113. Fsck?
May 15 14:19:00 Zeus kernel: vs-13070: reiserfs_read_inode2: i/o failure
occurred trying to find stat data of [1 2 0x0 SD]
May 15 14:19:00 Zeus kernel: Using r5 hash to sort names
May 15 14:19:00 Zeus kernel: ReiserFS version 3.6.25
May 15 14:30:57 Zeus kernel: FAT: bogus logical sector size 0
May 15 14:30:57 Zeus kernel: VFS: Can't find a valid FAT filesystem on dev
3a:00.
May 15 14:30:57 Zeus kernel: VFS: Can't find a HFS filesystem on dev 3a:00.
May 15 14:31:11 Zeus kernel: reiserfs: checking transaction log (device
3a:00) ...
May 15 14:31:11 Zeus kernel: is_tree_node: node level 2 does not match to
the expected one 4
May 15 14:31:11 Zeus kernel: vs-5150: search_by_key: invalid format found in
block 9113. Fsck?
May 15 14:31:11 Zeus kernel: vs-13070: reiserfs_read_inode2: i/o failure
occurred trying to find stat data of [1 2 0x0 SD]
May 15 14:31:11 Zeus kernel: Using r5 hash to sort names
May 15 14:31:11 Zeus kernel: ReiserFS version 3.6.25
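
The two device numbers in this log can be decoded with a short script. This is a minimal sketch, assuming the 2.4 kernel prints them as hexadecimal major:minor and counts sectors in 512-byte units; the helper name parse_io_error is hypothetical. Under the conventional Linux device numbering, major 8 is the SCSI disk driver (minor 17 would be the first partition of the second disk) and major 58 (0x3a) is LVM in 2.4 kernels, which would mean the I/O errors are reported below the logical volume that reiserfs sits on.

```python
import re

# Assumption: the sector number in the message counts 512-byte units.
SECTOR_SIZE = 512

def parse_io_error(line):
    """Return (major, minor, sector) from an 'I/O error: dev MM:mm, sector N' line.

    The device pair is read as hexadecimal. Returns None if the line does not match.
    """
    m = re.search(r"I/O error: dev ([0-9a-f]{2}):([0-9a-f]{2}), sector (\d+)", line)
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16), int(m.group(3))

# Line copied from the log above.
line = "May 15 12:54:01 Zeus kernel:  I/O error: dev 08:11, sector 211045693"
major, minor, sector = parse_io_error(line)
byte_offset = sector * SECTOR_SIZE

print(f"major={major} minor={minor} sector={sector}")
print(f"offset on device: about {byte_offset / 2**30:.1f} GiB")
```

Running the same function over the whole log would show whether the failing sectors cluster in one region of the device or are spread across it.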

> What version of reiserfsprogs was used?

3.x.0j

> How about more error messages about scsi errors in the logs?

As I mentioned,  I can't mount any more, so no more messages...

> Not likely. If you had extended the fs more than 132M in size, you'd
> have noticed instantly if something went wrong.

I shrank the fs and created new filesystems (I use lvm, so I shrank the fs,
then reduced the logical volume size, and reused the disk space for other
filesystems).

> There is always a chance; the other question (rather important in your case,
> it seems) is which option would be faster.

That's what I need to know ;-) In fact, the backup is on an IDE-Array, I
don't get more than 10MB/s from the backup server, and I have 300GB of data...
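
As a rough back-of-the-envelope check, the restore time implied by those two figures can be computed directly. This sketch assumes decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes) and a sustained 10 MB/s with no other overhead, so it is a lower bound rather than a prediction.

```python
# Figures from the message above; decimal units are an assumption.
data_bytes = 300 * 10**9       # 300 GB of data to restore
rate_bytes_per_s = 10 * 10**6  # at most 10 MB/s from the backup server

seconds = data_bytes / rate_bytes_per_s
hours = seconds / 3600

print(f"minimum restore time: {seconds:.0f} s, about {hours:.1f} hours")
```

That puts a full restore at a little over eight hours in the best case, which is the number to weigh against however long another repair attempt would take.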

Thank You, CU, Lars.


* Re: corrupted fs: bitmap does not match to the correct one....
  2002-05-15 13:58   ` grobe
@ 2002-05-15 14:06     ` Oleg Drokin
  2002-05-15 15:56       ` grobe
  0 siblings, 1 reply; 7+ messages in thread
From: Oleg Drokin @ 2002-05-15 14:06 UTC (permalink / raw)
  To: grobe; +Cc: reiserfs-list

Hello!

On Wed, May 15, 2002 at 03:58:46PM +0200, grobe@gmx.net wrote:

> > > I have a bad problem with my reiserfs. I got a lot of SCSI-errors in the
> > Hm. SCSI errors are bad, and you have performed everything on the same
> > hardware?
> Yes, a Serveraid 4M with an EXP15 Raid 5 attached. Current kernel (2.4.16),
> current firmware, current driver (and UNCHANGED!).
> > What kind of errors were there, by the way?
> May 15 12:54:01 Zeus kernel:  I/O error: dev 08:11, sector 211045693
> [... a lot of these ...]

Ah, failing hardware.
Unfortunately, we only deal with failing hard drives under the terms at
http://namesys.com/support.html.

> > What version of reiserfsprogs was used?
> 3.x.0j

That's way too old.

> > How about more error messages about scsi errors in the logs?
> As I mentioned,  I can't mount any more, so no more messages...

Hm. So you read corrupted on-disk information with reiserfsck, then wrote it
back. That's going to hurt for sure.

> > There is always a chance; the other question (rather important in your case,
> > it seems) is which option would be faster.
> That's what I need to know ;-) In fact, the backup is on an IDE-Array, I
> don't get more than 10MB/s from the backup server, and I have 300GB of data...

I'd say you had better stick with your backups (on different hard drive(s)).
Also, you probably want to ask whoever sold you the failing hard drive to replace
it.

Bye,
    Oleg


* Re: corrupted fs: bitmap does not match to the correct one....
  2002-05-15 14:06     ` Oleg Drokin
@ 2002-05-15 15:56       ` grobe
  2002-05-15 17:04         ` Valdis.Kletnieks
  2002-05-16  5:13         ` Oleg Drokin
  0 siblings, 2 replies; 7+ messages in thread
From: grobe @ 2002-05-15 15:56 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: reiserfs-list

Hi!

> > May 15 12:54:01 Zeus kernel:  I/O error: dev 08:11, sector 211045693
> > [... a lot of these ...]
> 
> Ah, failing hardware.

That's funny. As I mentioned, we use a Serveraid 4m Raid Controller. So, if
a disk fails, the controller switches it off, sends me a warning message and
downgrades from RAID5 to RAID0. But the controller tells me that there are
no errors. It seems to be something in the driver-kernel-lvm-chain.

> > > What version of reiserfsprogs was used?
> > 3.x.0j

I just installed 3.1b, with reiserfsck being about 8 times faster. I also
upgraded lvm to 1.0.4, as I had an old pre-1 version installed (it was running
w/o problems so far, so I didn't change this before).

> Hm. So you read corrupted on-disk information with reiserfsck, then wrote
> it back. That's going to hurt for sure.

> I'd say you had better stick with your backups (on different hard
> drive(s)).

> Also, you probably want to ask whoever sold you the failing hard drive to
> replace it.

As I wrote, the "harddrive" is a logical volume on a raid 5 array, attached
to a serveraid 4 m - so I would have to bring an entire server if I want to
replace the failing hardware ;-)

The new reiserfsck --rebuild-tree gives me

pass 0: reading block (xxxx) failed

Is this REALLY only possible with bad hardware, or also with lvm-problems,
raid-driver problems etc...?

Thanks and CU, Lars.


* Re: corrupted fs: bitmap does not match to the correct one....
  2002-05-15 15:56       ` grobe
@ 2002-05-15 17:04         ` Valdis.Kletnieks
  2002-05-16  5:13         ` Oleg Drokin
  1 sibling, 0 replies; 7+ messages in thread
From: Valdis.Kletnieks @ 2002-05-15 17:04 UTC (permalink / raw)
  To: grobe; +Cc: Oleg Drokin, reiserfs-list

On Wed, 15 May 2002 17:56:49 +0200, grobe@gmx.net said:

> That's funny. As I mentioned, we use a Serveraid 4m Raid Controller. So, if
> a disk fails, the controller switches it off, sends me a warning message and
> downgrades from RAID5 to RAID0. But the controller tells me that there are
> no errors. It seems to be something in the driver-kernel-lvm-chain.

I once had the joy of getting dragged in after the fact to recover a 100G
database, after the RAID controller had gotten just a *bit* confused over
which disk was which and wrote all of its cache memory into the correct
block numbers on the wrong disks.

It then had the temerity to say that *it* was fine, but that multiple disks
had just gone to the Great Bit Bucket in the Sky.


-- 
				Valdis Kletnieks
				Computer Systems Senior Engineer
				Virginia Tech



* Re: corrupted fs: bitmap does not match to the correct one....
  2002-05-15 15:56       ` grobe
  2002-05-15 17:04         ` Valdis.Kletnieks
@ 2002-05-16  5:13         ` Oleg Drokin
  1 sibling, 0 replies; 7+ messages in thread
From: Oleg Drokin @ 2002-05-16  5:13 UTC (permalink / raw)
  To: grobe; +Cc: reiserfs-list

Hello!

On Wed, May 15, 2002 at 05:56:49PM +0200, grobe@gmx.net wrote:
> > Ah, failing hardware.
> That's funny. As I mentioned, we use a Serveraid 4m Raid Controller. So, if
> a disk fails, the controller switches it off, sends me a warning message and
> downgrades from RAID5 to RAID0. But the controller tells me that there are

Hm. I am not sure it is possible to downgrade from raid5 to raid0.
Also, if you lose more than one disk, the array can no longer operate.

> no errors. It seems to be something in the driver-kernel-lvm-chain.

Maybe so. At least the errors are already appearing at the block layer.

> > Also, you probably want to ask whoever sold you the failing hard drive to
> > replace it.
> As I wrote, the "harddrive" is a logical volume on a raid 5 array, attached
> to a serveraid 4 m - so I would have to bring an entire server if I want to
> replace the failing hardware ;-)

Hm. If I were you, I'd probably go this route, anyway. :)
The other possibility is to dig deep into the LVM and ServeRAID sources to
see why they produce I/O errors if the hardware does not report any errors.

> The new reiserfsck --rebuild-tree gives me
> pass 0: reading block (xxxx) failed

So, it still cannot read blocks.
What are those xxxx, btw? Do they look correct? (in range of actual volume
size?)
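
The range check asked about here is simple to do by hand or with a few lines of code. This is a minimal sketch, not part of reiserfsprogs: the function name block_in_range is hypothetical, the 4096-byte block size is an assumption (the usual reiserfs default), and the 300 GB volume size in the example is a placeholder, since the real logical volume size is not given in this thread.

```python
def block_in_range(block, volume_bytes, block_size=4096):
    """Return True if `block` is a valid block number for a volume of the given size.

    Valid block numbers run from 0 up to (but not including) the number of
    whole blocks that fit on the volume.
    """
    total_blocks = volume_bytes // block_size
    return 0 <= block < total_blocks

# Placeholder volume size; substitute the real logical volume size.
volume_bytes = 300 * 10**9

# Block number taken from the reiserfsck output at the start of the thread.
print(block_in_range(9614688, volume_bytes))
```

If the failing block numbers fall outside the range, the read requests themselves are bogus (for example after a bad resize); if they fall inside it, the storage stack below the filesystem is failing reads it should be able to serve.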

> Is this REALLY only possible with bad hardware, or also with lvm-problems,
> raid-driver problems etc...?

This is possible with bad hardware or bad drivers, but I cannot tell you which
case is yours because I do not know.

Bye,
    Oleg
