* Bad root block 0. (--rebuild-tree did not complete)
@ 2003-12-02 19:28 Marc Schmitt
  2003-12-02 20:06 ` Vitaly Fertman
  0 siblings, 1 reply; 6+ messages in thread
From: Marc Schmitt @ 2003-12-02 19:28 UTC
  To: ReiserFS-List

Hi list,

Attached to a Dell PE8450 (8 CPUs, 4GB RAM) we have a PowerVault 210 
Storage with 12 70GB SCSI disks in a RAID5 container w/ spare disk 
managed by a

scsi2 : Found a MegaRAID controller at 0xf8853000, IRQ: 24
scsi2 : Enabling 64 bit support
megaraid: [1.73:3.27] detected 1 logical drives

The machine runs primarily as an NFS server, previously under kernel 2.4.21 and 
now under 2.4.23, with reiserfsprogs 3.6.11. I have already had to run 
`reiserfsck --rebuild-tree` twice this year due to weird corruption issues. 
For a couple of weeks now (since the server was still running 2.4.21), 
we've seen strange console output, and users have reported that files had been 
lost or that they get "permission denied" errors.

Example console output:
Dec  1 05:28:37 sim0 kernel: vs-5150: search_by_key: invalid format 
found in block 175880844. Fsck?
Dec  1 05:28:37 sim0 kernel: vs-13070: reiserfs_read_inode2: i/o failure 
occurred trying to find stat data of [562045 562059 0x0 SD]
Dec  1 05:28:37 sim0 kernel: is_tree_node: node level 22058 does not 
match to the expected one 1
Dec  1 05:28:37 sim0 kernel: vs-5150: search_by_key: invalid format 
found in block 175880844. Fsck?
Dec  1 05:28:37 sim0 kernel: vs-13070: reiserfs_read_inode2: i/o failure 
occurred trying to find stat data of [562045 562056 0x0 SD]
Dec  1 05:28:37 sim0 kernel: is_tree_node: node level 22058 does not 
match to the expected one 1
Dec  1 05:28:37 sim0 kernel: vs-5150: search_by_key: invalid format 
found in block 175880844. Fsck?
Dec  1 05:28:37 sim0 kernel: vs-13070: reiserfs_read_inode2: i/o failure 
occurred trying to find stat data of [562045 562079 0x0 SD]
Dec  1 05:28:37 sim0 kernel: is_tree_node: node level 22058 does not 
match to the expected one 1
Dec  1 05:28:37 sim0 kernel: vs-5150: search_by_key: invalid format 
found in block 175880844. Fsck?
Dec  1 05:28:37 sim0 kernel: vs-13070: reiserfs_read_inode2: i/o failure 
occurred trying to find stat data of [562045 564468 0x0 SD]
Dec  1 05:28:37 sim0 kernel: is_tree_node: node level 22058 does not 
match to the expected one 1
Dec  1 05:28:37 sim0 kernel: vs-5150: search_by_key: invalid format 
found in block 175880844. Fsck?
Dec  1 05:28:37 sim0 kernel: vs-13070: reiserfs_read_inode2: i/o failure 
occurred trying to find stat data of [562057 562089 0x0 SD]
Dec  1 16:15:33 sim0 kernel: vs-7000: search_by_entry_key: search_by_key 
returned item position == 0<4>zam-7001: io error in reiserfs_find_entry
Dec  1 16:15:36 sim0 kernel: vs-7000: search_by_entry_key: search_by_key 
returned item position == 0<4>vs-7000: search_by_entry_key: 
search_by_key returned item position == 0<4>zam-7001: io error in 
reiserfs_find_entry
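
A minimal sketch for tallying which blocks the kernel complains about, in case it helps when reading a longer log. It assumes each syslog message has been rejoined onto a single line (the lines above are wrapped by the mail client); the function name `count_bad_blocks` and the regular expression are illustrative only, not part of reiserfsprogs. In the excerpt above, all five vs-5150 messages name block 175880844.

```python
import re
from collections import Counter

# Matches reiserfs messages that name a block, for example:
# "vs-5150: search_by_key: invalid format found in block 175880844. Fsck?"
BLOCK_RE = re.compile(r"\b([a-z]+-\d+): \w+: .*?\bblock (\d+)")


def count_bad_blocks(lines):
    """Count how often each (message id, block number) pair is reported."""
    counts = Counter()
    for line in lines:
        match = BLOCK_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Sample lines copied from the console output above, unwrapped.
sample = [
    "Dec  1 05:28:37 sim0 kernel: vs-5150: search_by_key: invalid format "
    "found in block 175880844. Fsck?",
    "Dec  1 05:28:37 sim0 kernel: vs-13070: reiserfs_read_inode2: i/o failure "
    "occurred trying to find stat data of [562045 562059 0x0 SD]",
    "Dec  1 05:28:37 sim0 kernel: vs-5150: search_by_key: invalid format "
    "found in block 175880844. Fsck?",
]

for (msg_id, block), n in count_bad_blocks(sample).most_common():
    print(f"{msg_id} block {block}: {n} time(s)")
```

Messages that do not mention a block number (such as vs-13070) are skipped on purpose.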

As the output suggested, I ran reiserfsck. I don't have the log of that 
check, but it ended by suggesting that I run `reiserfsck --rebuild-tree`. As I 
said, I've already done --rebuild-tree twice within the past year, and there 
were no problems until now. This time, the fsck bombed out with a 
Segmentation Fault after running for about 10 hours. I started it again 
(sorry, I have no log), and it again stopped with a Segmentation Fault 
after about 10 hours.

When I try to mount it, the dmesg output is:
reiserfs: found format "3.6" with standard journal
reiserfs: checking transaction log (device sd(8,38)) ...
for (sd(8,38))
is_tree_node: node level 0 does not match to the expected one 65534
sd(8,38):vs-5150: search_by_key: invalid format found in block 0. Fsck?
sd(8,38):vs-13070: reiserfs_read_inode2: i/o failure occurred trying to 
find stat data of [1 2 0x0 SD]
sd(8,38):Using r5 hash to sort names
is_tree_node: node level 0 does not match to the expected one 65534
sd(8,38):vs-5150: search_by_key: invalid format found in block 0. Fsck?
sd(8,38):vs-2140: finish_unfinished: search_by_key returned -2

When I run a simple `reiserfsck` it says:
Replaying journal..
0 transactions replayed
Checking internal tree..

Bad root block 0. (--rebuild-tree did not complete)

Aborted



Should I start restoring from tape?

Greetz
    Marc



* Re: Bad root block 0. (--rebuild-tree did not complete)
  2003-12-02 19:28 Bad root block 0. (--rebuild-tree did not complete) Marc Schmitt
@ 2003-12-02 20:06 ` Vitaly Fertman
  2003-12-02 20:28   ` Marc Schmitt
  0 siblings, 1 reply; 6+ messages in thread
From: Vitaly Fertman @ 2003-12-02 20:06 UTC
  To: Marc Schmitt, ReiserFS-List

Hi Marc,

On Tuesday 02 December 2003 22:28, Marc Schmitt wrote:
> Hi list,
>
> Attached to a Dell PE8450 (8 CPUs, 4GB RAM) we have a PowerVault 210
> Storage with 12 70GB SCSI disks in a RAID5 container w/ spare disk
> managed by a
>
> scsi2 : Found a MegaRAID controller at 0xf8853000, IRQ: 24
> scsi2 : Enabling 64 bit support
> megaraid: [1.73:3.27] detected 1 logical drives
>
> The machine is primarily running as NFS server under kernel 2.4.21 and
> now under 2.4.23, reiserfsprogs 3.6.11. Twice already I had to run
> `reiserfsck --rebuild-tree` this year due to weird corruption issues.
> Since a couple of weeks (when the server was running under 2.4.21),
> we've seen strange console output and users were claiming files had been
> lost or they would get "permission denied" errors.
>
> As the output suggested, I ran reiserfsck. I don't have the log of that
> check, but it ended suggesting to run `reiserfsck --rebuild-tree`. As I
> said, I've done --rebuild-tree twice already over one year and there
> were no problems so far. This time, the fsck bombed out with a
> Segmentation Fault after running for about 10 hours. I started it again,
> sorry, I have no log, it stopped after 10 hours of running again with a
> Segmentation Fault.

Would you run 
    debugreiserfs -p /dev/problem_device | bzip2 -c > metadata.bz2
and make the result available for download? I will then debug the problem 
locally.

> When I run a simple `reiserfsck` it says:
> Replaying journal..
> 0 transactions replayed
> Checking internal tree..
>
> Bad root block 0. (--rebuild-tree did not complete)
>
> Aborted
>
> Should I start restoring from tape?

Let's try to find the problem first.

-- 
Thanks,
Vitaly Fertman


* Re: Bad root block 0. (--rebuild-tree did not complete)
  2003-12-02 20:06 ` Vitaly Fertman
@ 2003-12-02 20:28   ` Marc Schmitt
  2003-12-02 20:39     ` Carl-Daniel Hailfinger
  0 siblings, 1 reply; 6+ messages in thread
From: Marc Schmitt @ 2003-12-02 20:28 UTC
  To: Vitaly Fertman; +Cc: ReiserFS-List

Hi Vitaly,

Thanks for the fast answer.

Vitaly Fertman wrote:

>Hi Marc,
>
>On Tuesday 02 December 2003 22:28, Marc Schmitt wrote:
>
>>Hi list,
>>
>>Attached to a Dell PE8450 (8 CPUs, 4GB RAM) we have a PowerVault 210
>>Storage with 12 70GB SCSI disks in a RAID5 container w/ spare disk
>>managed by a
>>
>>scsi2 : Found a MegaRAID controller at 0xf8853000, IRQ: 24
>>scsi2 : Enabling 64 bit support
>>megaraid: [1.73:3.27] detected 1 logical drives
>>
>>The machine is primarily running as NFS server under kernel 2.4.21 and
>>now under 2.4.23, reiserfsprogs 3.6.11. Twice already I had to run
>>`reiserfsck --rebuild-tree` this year due to weird corruption issues.
>>Since a couple of weeks (when the server was running under 2.4.21),
>>we've seen strange console output and users were claiming files had been
>>lost or they would get "permission denied" errors.
>>
>>As the output suggested, I ran reiserfsck. I don't have the log of that
>>check, but it ended suggesting to run `reiserfsck --rebuild-tree`. As I
>>said, I've done --rebuild-tree twice already over one year and there
>>were no problems so far. This time, the fsck bombed out with a
>>Segmentation Fault after running for about 10 hours. I started it again,
>>sorry, I have no log, it stopped after 10 hours of running again with a
>>Segmentation Fault.
>
>Would you run 
>    debugreiserfs -p /dev/problem_device | bzip2 -c > metadata.bz2
>and make it available for downloading. I will debug the problem 
>locally then.
>
It is running now. I'll make the file metadata.bz2 available for 
downloading asap. ETA is 10h, right, if the `reiserfsck --rebuild-tree` 
took 10h before it segfaulted?

>Let's try to find the problem first.
>
That gives hope... ;)

Greetz
    Marc



* Re: Bad root block 0. (--rebuild-tree did not complete)
  2003-12-02 20:28   ` Marc Schmitt
@ 2003-12-02 20:39     ` Carl-Daniel Hailfinger
  2003-12-02 20:53       ` Marc Schmitt
  0 siblings, 1 reply; 6+ messages in thread
From: Carl-Daniel Hailfinger @ 2003-12-02 20:39 UTC
  To: Marc Schmitt; +Cc: Vitaly Fertman, ReiserFS-List

Marc Schmitt wrote:
> Hi Vitaly,
> 
> Thanks for the fast answer.
> 
> Vitaly Fertman wrote:
> 
>> Hi Marc,
>>
[...]
>> Would you run
>>     debugreiserfs -p /dev/problem_device | bzip2 -c > metadata.bz2
>> and make it available for downloading. I will debug the problem
>> locally then.
>>
> It is running now. I'll make the file metadata.bz2 available for
> downloading asap. ETA is 10h, right, if the `reiserfsck --rebuild-tree`
> took 10h before it segfaulted?

No. More like a few minutes for disk accesses and some time for
compressing the result. The command is of the "read all metadata and dump
it without any additional checks whatsoever" variety.


Carl-Daniel
-- 
http://www.hailfinger.org/



* Re: Bad root block 0. (--rebuild-tree did not complete)
  2003-12-02 20:39     ` Carl-Daniel Hailfinger
@ 2003-12-02 20:53       ` Marc Schmitt
  2003-12-02 21:17         ` Carl-Daniel Hailfinger
  0 siblings, 1 reply; 6+ messages in thread
From: Marc Schmitt @ 2003-12-02 20:53 UTC
  To: Carl-Daniel Hailfinger; +Cc: Vitaly Fertman, ReiserFS-List

Hi Carl-Daniel,

Carl-Daniel Hailfinger wrote:

>Marc Schmitt wrote:
>
>>Hi Vitaly,
>>
>>Thanks for the fast answer.
>>
>>Vitaly Fertman wrote:
>>
>>>Hi Marc,
>>>
>[...]
>
>>>Would you run
>>>    debugreiserfs -p /dev/problem_device | bzip2 -c > metadata.bz2
>>>and make it available for downloading. I will debug the problem
>>>locally then.
>>>
>>It is running now. I'll make the file metadata.bz2 available for
>>downloading asap. ETA is 10h, right, if the `reiserfsck --rebuild-tree`
>>took 10h before it segfaulted?
>
>No. More like a few minutes for disk accesses and some time for
>compressing the result. The command is of the "read all metadata and dump
>it without any additional checks whatsoever" variety.
>
Hmm, it has been running for 15 minutes now and does not look like it is 
going to finish soon:

debugreiserfs 3.6.11 (2003 www.namesys.com)

Loading on-disk bitmap .. 162980857 bits set - done
super block..ok
bitmaps..(5396).. ok
journal (from 18 to 8210)..ok
Super block, bitmaps, journal - 13590 blocks - done, 162967267 blocks left
0%                                                    left 161169318, 
1250 /sec

The file metadata.bz2 is still growing. During the fsck, the rate also 
started low and rose slowly, reaching 6000 /sec towards the end. At the 
current pace of 1250 /sec, the dump would take about 36h...
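
For reference, a rough sketch of how that estimate falls out of the progress line. It assumes the "/sec" figure counts blocks per second and that the rate stays constant; the helper name `eta_hours` is made up for illustration.

```python
def eta_hours(blocks_left: int, blocks_per_sec: float) -> float:
    """Remaining time in hours at a constant scan rate."""
    return blocks_left / blocks_per_sec / 3600


# "left 161169318, 1250 /sec" from the debugreiserfs progress line.
blocks_left = 161_169_318

print(f"{eta_hours(blocks_left, 1250):.1f} h at 1250 blocks/sec")
print(f"{eta_hours(blocks_left, 6000):.1f} h at 6000 blocks/sec")
```

At 1250 /sec this gives roughly 36 hours; if the rate climbed to the 6000 /sec seen late in the fsck, the same number of blocks would take about 7.5 hours.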

Greetz
    Marc





* Re: Bad root block 0. (--rebuild-tree did not complete)
  2003-12-02 20:53       ` Marc Schmitt
@ 2003-12-02 21:17         ` Carl-Daniel Hailfinger
  0 siblings, 0 replies; 6+ messages in thread
From: Carl-Daniel Hailfinger @ 2003-12-02 21:17 UTC
  To: Marc Schmitt; +Cc: Vitaly Fertman, ReiserFS-List

Hi Marc,

Marc Schmitt wrote:
> 
> Carl-Daniel Hailfinger wrote:
> 
>> Marc Schmitt wrote:
>>  
>>> [...]
>>>
>>> It is running now. I'll make the file metadata.bz2 available for
>>> downloading asap. ETA is 10h, right, if the `reiserfsck --rebuild-tree`
>>> took 10h before it segfaulted?
>>
>> No. More like a few minutes for disk accesses and some time for
>> compressing the result. The command is of the "read all metadata and dump
>> it without any additional checks whatsoever" variety.
>>
> Hmm, it's running for 15' now and it does not look like it's going to
> finish soon:
> 
[...]
> Super block, bitmaps, journal - 13590 blocks - done, 162967267 blocks left
> 0%                                                    left 161169318,
> 1250 /sec
> 
> The file metadata.bz2 is still growing. The 1250 /sec increased slowly
> during fsck and went up to 6000 /sec towards the end. At the current
> pace, it would take 36h...

Sorry, I totally underestimated your disk_size/transfer_rate ratio. Your
disks are probably 1000x larger than mine (4GB), but not 1000x faster.
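
A quick sanity check of the size ratio, using only the hardware figures from the first message. It assumes that RAID5 costs one disk's worth of capacity for parity and that the spare is one of the 12 disks; neither is confirmed in the thread, so treat the result as a ballpark.

```python
DISKS = 12          # from the first message
DISK_GB = 70        # nominal size per disk, from the first message
SMALL_DISK_GB = 4   # the 4GB disk mentioned above

# Assumption: one disk's worth of parity plus one hot spare.
data_disks = DISKS - 2
array_gb = data_disks * DISK_GB
ratio = array_gb / SMALL_DISK_GB

print(f"usable array size: about {array_gb} GB")
print(f"size ratio versus a {SMALL_DISK_GB} GB disk: about {ratio:.0f}x")
```

Under these assumptions the array is a few hundred times larger rather than a thousand times, but the conclusion is the same: capacity has grown far more than transfer rate.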


Regards,
Carl-Daniel
-- 
http://www.hailfinger.org/

