* reiserfsck --rebuild-tree on a big filesystem
@ 2002-11-25 14:36 Mira Tempír
2002-11-25 14:48 ` Oleg Drokin
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Mira Tempír @ 2002-11-25 14:36 UTC (permalink / raw)
To: reiserfs-list
Hello,
my setup:
linux 2.4.18 SMP, HW RAID5, nearly full 240GB reiserfs of mostly small files,
reiserfsprogs 3.6.4.
After some power outages (it is only my guess that these caused
our problems), our array got into a state where parts of the
filesystem structure appeared to be broken (though it was still
usable):
Nov 23 22:24:55 zeus kernel: vs-13070: reiserfs_read_inode2: i/o failure
occurred trying to find stat data of [9221389 9221390 0x0 SD]
Nov 23 22:24:55 zeus kernel: is_tree_node: node level 12664 does not
match to the expected one 1
Nov 23 22:24:55 zeus kernel: vs-5150:
search_by_key: invalid format found in block 56940122. Fsck?
So I've started filesystem check with these results:
node (3117838) with wrong level (0) found in the tree (should be 1)
whole subtree skipped
node (56940122) with wrong level (12664) found in the tree (should be 1)
whole subtree skipped
free block count 41569 mismatches with a correct one 43484.
on-disk bitmap does not match to the correct one.
The check advised running --rebuild-tree.
But after 16 hours of running it, it is still at 0% and the whole process
would take about 1 month to complete (it started at 2000 blocks/sec,
but is now running at 26 blocks/sec).
My question is: is there any way to speed up the filesystem restore?
I still believe that the problem is not in the hardware and that the data is recoverable.
We are ready to pay for support if it can help in some way.
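For reference, the one-month estimate is consistent with simple arithmetic. A minimal sketch, assuming 4096-byte blocks (the block size is not stated in the message) and reading "240GB" as 240 * 10**9 bytes; the function name eta_seconds is made up for the example:

```python
# Rough ETA for one full pass over the device at a constant scan rate.
# Assumptions (not from the message): 4096-byte blocks, 240GB = 240 * 10**9 bytes.
BLOCK_SIZE = 4096
FS_BYTES = 240 * 10**9


def eta_seconds(fs_bytes, blocks_per_sec, block_size=BLOCK_SIZE):
    """Seconds needed to visit every block once at a fixed rate."""
    total_blocks = fs_bytes // block_size
    return total_blocks / blocks_per_sec


fast_hours = eta_seconds(FS_BYTES, 2000) / 3600   # initial rate
slow_days = eta_seconds(FS_BYTES, 26) / 86400     # degraded rate

print(f"2000 blocks/s: {fast_hours:.1f} hours")
print(f"  26 blocks/s: {slow_days:.1f} days")
```

Under these assumptions the initial rate gives roughly 8 hours for one pass over the device and the degraded rate roughly 26 days, which is in line with the "about 1 month" figure.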
--
mira tempír <mira@cekit.cz> -- čekit s.r.o.
bratislavská 2, 602 00 brno, czech republic
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: reiserfsck --rebuild-tree on a big filesystem
2002-11-25 14:36 reiserfsck --rebuild-tree on a big filesystem Mira Tempír
@ 2002-11-25 14:48 ` Oleg Drokin
[not found] ` <20021125153154.GA19254@mail.cekit.cz>
2002-12-19 16:51 ` Zygo Blaxell
2002-11-25 14:59 ` Anders Widman
2002-11-26 10:58 ` SPAMTEST Alexander Lyamin
2 siblings, 2 replies; 17+ messages in thread
From: Oleg Drokin @ 2002-11-25 14:48 UTC (permalink / raw)
To: Mira Tempír; +Cc: reiserfs-list
Hello!
On Mon, Nov 25, 2002 at 03:36:34PM +0100, Mira Tempír wrote:
> my setup:
> linux 2.4.18 SMP, HW RAID5, nearly full 240GB reiserfs of mostly small files,
> reiserprogs 3.6.4.
> With advice to run --rebuild-tree.
> But after 16 hours of running it, it is still on 0% and whole process
> would take about 1 month to complete (started at 2000 blocks/sec,
> but now running at 26b/s).
This is very strange. We've seen reports of reiserfsck --rebuild-tree
on a 500GB volume (filled to 170GB) finishing in 8 hours.
Can you please verify that the read speed of your RAID5 is constant in all
areas of the disk, that it is not operating in some kind of degraded mode,
and that it is generally in a good state?
What RAID5 hardware are you using?
> My question is: is there any way how to speed up filesystem restore?
> I still believe, that problem is not in hardware and data is recoverable.
Data is recoverable for sure, but the speed degradation you are seeing
is strange and I do not yet think it should be attributed to reiserfsck.
Bye,
Oleg
^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <20021125153154.GA19254@mail.cekit.cz>]
* Re: reiserfsck --rebuild-tree on a big filesystem
[not found] ` <20021125153154.GA19254@mail.cekit.cz>
@ 2002-11-25 15:44 ` Oleg Drokin
2002-11-26 22:56 ` Mira Tempir
0 siblings, 1 reply; 17+ messages in thread
From: Oleg Drokin @ 2002-11-25 15:44 UTC (permalink / raw)
To: Mira Tempír; +Cc: reiserfs-list
Hello!
On Mon, Nov 25, 2002 at 04:31:54PM +0100, Mira Tempír wrote:
> Oleg Drokin - 25/11/02 17:48 wrote:
> | > But after 16 hours of running it, it is still on 0% and whole process
> | > would take about 1 month to complete (started at 2000 blocks/sec,
> | > but now running at 26b/s).
> | This is very strange, We've seen reports of reiserfsck --rebuild-tree
> | of 500Gb volume (filled to 170Gb) finished in 8 hours.
> Can be the speed affected with high usage of FS (about 99%)?
Not at this stage at least. How much RAM do you have?
> One point that I've missed in previous e-mail: during rebuild-tree,
> there is a lot of messages in log like this:
> pass0: vpf-10700: block 1615323, item 6: The item with wrong offset or
> length found [9532827 9534032 0x3001 DRCT (2)], len 1000 - deleted
This is a pretty strange thing indeed: a file tail that extends beyond the
allowed boundary. I do not know how it might have happened.
> | Can you please verify that the read speed of your RAID5 is constant in all
> | the areas of disk, that it is not opearting in some kind of degraded mode
> | and is generally in good state?
> raid is reporting to be ok, data read speed are almost the same at
> various places:
> 0:03.80elapsed 18%CPU
> 0:02.97elapsed 19%CPU
> 0:01.76elapsed 24%CPU
> 0:01.51elapsed 36%CPU
> ... (dd reading 40MBytes)
BTW, what is the CPU usage when the speed is degraded?
(Have you already interrupted the reiserfsck process? If not, then attach to it with
gdb and obtain a pair of stack traces for us, please.)
Bye,
Oleg
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: reiserfsck --rebuild-tree on a big filesystem
2002-11-25 15:44 ` Oleg Drokin
@ 2002-11-26 22:56 ` Mira Tempir
2002-11-27 7:33 ` Oleg Drokin
0 siblings, 1 reply; 17+ messages in thread
From: Mira Tempir @ 2002-11-26 22:56 UTC (permalink / raw)
To: Oleg Drokin; +Cc: reiserfs-list
Hello,
Sorry for the delay. In the meantime, I've run some tests
on the hardware, but didn't get any interesting results.
I have no idea what to do now :(
Oleg Drokin - 25/11/02 18:44 wrote:
|
| > Can be the speed affected with high usage of FS (about 99%)?
|
| Not at this stage at least. How much RAM do you have?
I'm sure this is not the bottleneck. There is more than 1GB of cached
memory available for fsck now.
|
| BTW, what is the CPU usage when the speed is degraded?
Nothing strange - about 80% during whole process.
| (have you already interrupted reiserfsck process? if not then attach to it with
| gd and obtain a pair of stacktraces for us please.)
Well, I've run fsck many times over the last few days. I hope this is not a
problem.
I'm not sure if this is what you wanted or how it can help:
(gdb) bt
#0 0x40078fc5 in memmove () at ../sysdeps/generic/memmove.c:108
#1 0x805881f in __mark_objectid_really_used (map=0x8110b20, id=413432, pos=27804)
at uobjectid.c:203
#2 0x80588e4 in mark_objectid_really_used (map=0x8110b20, id=413432)
at uobjectid.c:240
#3 0x804e5ed in pass0_correct_leaf (fs=0x808a9d8, bh=0x80fc0c4)
at pass0.c:1254
#4 0x804ed93 in do_pass_0 (fs=0x808a9d8) at pass0.c:1498
#5 0x804fda2 in pass_0 (fs=0x808a9d8) at pass0.c:1893
#6 0x804a851 in rebuild_tree (fs=0x808a9d8) at main.c:737
#7 0x804b7bf in main (argc=3, argv=0xbffffcd4) at main.c:1102
(gdb) bt
#0 0x40078fc5 in memmove () at ../sysdeps/generic/memmove.c:108
#1 0x805881f in __mark_objectid_really_used (map=0x8110b20, id=4210980, pos=139062)
at uobjectid.c:203
#2 0x80588e4 in mark_objectid_really_used (map=0x8110b20, id=4210980)
at uobjectid.c:240
#3 0x804e5a6 in pass0_correct_leaf (fs=0x808a9d8, bh=0x808ab18) at pass0.c:1251
#4 0x804ed93 in do_pass_0 (fs=0x808a9d8) at pass0.c:1498
#5 0x804fda2 in pass_0 (fs=0x808a9d8) at pass0.c:1893
#6 0x804a851 in rebuild_tree (fs=0x808a9d8) at main.c:737
#7 0x804b7bf in main (argc=3, argv=0xbffffcd4) at main.c:1102
The profiler output may be more interesting:
Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 80.52    118.11   118.11   174768     0.68     0.68  memmove
  8.69    130.86    12.75                             __mcount_internal
  2.07    133.89     3.03    94972     0.03     0.03  read
  1.55    136.16     2.27                             mcount
  1.35    138.14     1.98    96611     0.02     0.02  memset
  1.21    139.92     1.78 64747384     0.00     0.00  reiserfs_bitmap_test_bit
  0.68    140.92     1.00        1  1000.00  2809.34  reiserfs_fetch_ondisk_bitmap
  0.52    141.68     0.76 24852879     0.00     0.00  comp_ids
  0.50    142.42     0.74    72662     0.01     1.69  pass0_correct_leaf
  0.28    142.83     0.41 13535446     0.00     0.00  get_type
  0.26    143.21     0.38  1510128     0.00     0.00  reiserfs_bin_search
  0.20    143.50     0.29    94971     0.00     0.00  llseek
  0.16    143.73     0.23  2254456     0.00     0.00  block_of_bitmap
  0.15    143.95     0.22   351991     0.00     0.00  _IO_vfscanf
  0.12    144.12     0.17   354988     0.00     0.00  vfprintf
  0.10    144.26     0.14  2235113     0.00     0.00  reiserfs_bitmap_set_bit
  0.09    144.39     0.13   704574     0.00     0.00  _IO_str_init_static
  0.09    144.52     0.13    92006     0.00     0.00  time
...
Is it normal for memmove to take this much time?
Thanks a lot.
--
mira tempir <mira#cekit,cz> -- cekit s.r.o.
bratislavska 2, 602 00 brno, czech republic
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: reiserfsck --rebuild-tree on a big filesystem
2002-11-26 22:56 ` Mira Tempir
@ 2002-11-27 7:33 ` Oleg Drokin
2002-11-27 12:46 ` Vitaly Fertman
0 siblings, 1 reply; 17+ messages in thread
From: Oleg Drokin @ 2002-11-27 7:33 UTC (permalink / raw)
To: Mira Tempir; +Cc: reiserfs-list, vitaly
Hello!
On Tue, Nov 26, 2002 at 11:56:25PM +0100, Mira Tempir wrote:
> | > Can be the speed affected with high usage of FS (about 99%)?
> | Not at this stage at least. How much RAM do you have?
> I'm sure, this is not the bottleneck. There is more than 1GB cached
> memory for fsck now.
Ok, just making sure.
> | BTW, what is the CPU usage when the speed is degraded?
> Nothing strange - about 80% during whole process.
Even when the speed is degraded to ~26 blocks/sec?
> | (have you already interrupted reiserfsck process? if not then attach to it with
> | gd and obtain a pair of stacktraces for us please.)
> Well, I've run fsck for many times last few days. I hope, this is not a
> problem.
Hm.
Can you make a FS snapshot for us please?
Please run debugreiserfs -p /dev/your_device | gzip -9c >metadata.gz
and make the file available for download.
I hope this won't take much time; if it does, we may ask for remote
access to that box, if possible.
> I'm not sure, if this is what you wanted and how this can help:
> (gdb) bt
> #0 0x40078fc5 in memmove () at ../sysdeps/generic/memmove.c:108
> #1 0x805881f in __mark_objectid_really_used (map=0x8110b20, id=413432, pos=27804)
> at uobjectid.c:203
> #2 0x80588e4 in mark_objectid_really_used (map=0x8110b20, id=413432)
> at uobjectid.c:240
> #3 0x804e5ed in pass0_correct_leaf (fs=0x808a9d8, bh=0x80fc0c4)
> at pass0.c:1254
> #4 0x804ed93 in do_pass_0 (fs=0x808a9d8) at pass0.c:1498
> #5 0x804fda2 in pass_0 (fs=0x808a9d8) at pass0.c:1893
> #6 0x804a851 in rebuild_tree (fs=0x808a9d8) at main.c:737
> #7 0x804b7bf in main (argc=3, argv=0xbffffcd4) at main.c:1102
>
> (gdb) bt
> #0 0x40078fc5 in memmove () at ../sysdeps/generic/memmove.c:108
> #1 0x805881f in __mark_objectid_really_used (map=0x8110b20, id=4210980, pos=139062)
> at uobjectid.c:203
> #2 0x80588e4 in mark_objectid_really_used (map=0x8110b20, id=4210980)
> at uobjectid.c:240
> #3 0x804e5a6 in pass0_correct_leaf (fs=0x808a9d8, bh=0x808ab18) at pass0.c:1251
> #4 0x804ed93 in do_pass_0 (fs=0x808a9d8) at pass0.c:1498
> #5 0x804fda2 in pass_0 (fs=0x808a9d8) at pass0.c:1893
> #6 0x804a851 in rebuild_tree (fs=0x808a9d8) at main.c:737
> #7 0x804b7bf in main (argc=3, argv=0xbffffcd4) at main.c:1102
>
>
> Maybe more interesting could be output from profiler:
>
> Flat profile:
>
> Each sample counts as 0.01 seconds.
> % cumulative self self total
> time seconds seconds calls ms/call ms/call name
> 80.52 118.11 118.11 174768 0.68 0.68 memmove
> 8.69 130.86 12.75 __mcount_internal
> 2.07 133.89 3.03 94972 0.03 0.03 read
> 1.55 136.16 2.27 mcount
> 1.35 138.14 1.98 96611 0.02 0.02 memset
> 1.21 139.92 1.78 64747384 0.00 0.00 reiserfs_bitmap_test_bit
> 0.68 140.92 1.00 1 1000.00 2809.34 reiserfs_fetch_ondisk_bitmap
> 0.52 141.68 0.76 24852879 0.00 0.00 comp_ids
> 0.50 142.42 0.74 72662 0.01 1.69 pass0_correct_leaf
> 0.28 142.83 0.41 13535446 0.00 0.00 get_type
> 0.26 143.21 0.38 1510128 0.00 0.00 reiserfs_bin_search
> 0.20 143.50 0.29 94971 0.00 0.00 llseek
> 0.16 143.73 0.23 2254456 0.00 0.00 block_of_bitmap
> 0.15 143.95 0.22 351991 0.00 0.00 _IO_vfscanf
> 0.12 144.12 0.17 354988 0.00 0.00 vfprintf
> 0.10 144.26 0.14 2235113 0.00 0.00 reiserfs_bitmap_set_bit
> 0.09 144.39 0.13 704574 0.00 0.00 _IO_str_init_static
> 0.09 144.52 0.13 92006 0.00 0.00 time
> ...
> Is it ok with memmove ?
Vitaly, what do you think about these?
Bye,
Oleg
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: reiserfsck --rebuild-tree on a big filesystem
2002-11-27 7:33 ` Oleg Drokin
@ 2002-11-27 12:46 ` Vitaly Fertman
2002-11-27 13:38 ` Mira Tempir
0 siblings, 1 reply; 17+ messages in thread
From: Vitaly Fertman @ 2002-11-27 12:46 UTC (permalink / raw)
To: Oleg Drokin, Mira Tempir; +Cc: reiserfs-list
>
> > I'm not sure, if this is what you wanted and how this can help:
> > (gdb) bt
> > #0 0x40078fc5 in memmove () at ../sysdeps/generic/memmove.c:108
> > #1 0x805881f in __mark_objectid_really_used (map=0x8110b20, id=413432,
> > pos=27804) at uobjectid.c:203
> > #2 0x80588e4 in mark_objectid_really_used (map=0x8110b20, id=413432)
> > at uobjectid.c:240
> > #3 0x804e5ed in pass0_correct_leaf (fs=0x808a9d8, bh=0x80fc0c4)
> > at pass0.c:1254
> > #4 0x804ed93 in do_pass_0 (fs=0x808a9d8) at pass0.c:1498
> > #5 0x804fda2 in pass_0 (fs=0x808a9d8) at pass0.c:1893
> > #6 0x804a851 in rebuild_tree (fs=0x808a9d8) at main.c:737
> > #7 0x804b7bf in main (argc=3, argv=0xbffffcd4) at main.c:1102
> >
> > (gdb) bt
> > #0 0x40078fc5 in memmove () at ../sysdeps/generic/memmove.c:108
> > #1 0x805881f in __mark_objectid_really_used (map=0x8110b20, id=4210980,
> > pos=139062) at uobjectid.c:203
> > #2 0x80588e4 in mark_objectid_really_used (map=0x8110b20, id=4210980)
> > at uobjectid.c:240
> > #3 0x804e5a6 in pass0_correct_leaf (fs=0x808a9d8, bh=0x808ab18) at pass0.c:1251
> > #4 0x804ed93 in do_pass_0 (fs=0x808a9d8) at pass0.c:1498
> > #5 0x804fda2 in pass_0 (fs=0x808a9d8) at pass0.c:1893
> > #6 0x804a851 in rebuild_tree (fs=0x808a9d8) at main.c:737
> > #7 0x804b7bf in main (argc=3, argv=0xbffffcd4) at main.c:1102
> > Maybe more interesting could be output from profiler:
> >
> > Flat profile:
> >
> > Each sample counts as 0.01 seconds.
> > % cumulative self self total
> > time seconds seconds calls ms/call ms/call name
> > 80.52 118.11 118.11 174768 0.68 0.68 memmove
> > 8.69 130.86 12.75 __mcount_internal
> > 2.07 133.89 3.03 94972 0.03 0.03 read
> > 1.55 136.16 2.27 mcount
> > 1.35 138.14 1.98 96611 0.02 0.02 memset
> > 1.21 139.92 1.78 64747384 0.00 0.00 reiserfs_bitmap_test_bit
> > 0.68 140.92 1.00 1 1000.00 2809.34 reiserfs_fetch_ondisk_bitmap
> > 0.52 141.68 0.76 24852879 0.00 0.00 comp_ids
> > 0.50 142.42 0.74 72662 0.01 1.69 pass0_correct_leaf
> > 0.28 142.83 0.41 13535446 0.00 0.00 get_type
> > 0.26 143.21 0.38 1510128 0.00 0.00 reiserfs_bin_search
> > 0.20 143.50 0.29 94971 0.00 0.00 llseek
> > 0.16 143.73 0.23 2254456 0.00 0.00 block_of_bitmap
> > 0.15 143.95 0.22 351991 0.00 0.00 _IO_vfscanf
> > 0.12 144.12 0.17 354988 0.00 0.00 vfprintf
> > 0.10 144.26 0.14 2235113 0.00 0.00 reiserfs_bitmap_set_bit
> > 0.09 144.39 0.13 704574 0.00 0.00 _IO_str_init_static
> > 0.09 144.52 0.13 92006 0.00 0.00 time
> > ...
> > Is it ok with memmove ?
>
> Vitaly, what do you think about these?
You ran fsck several times over the last few days and saw the
same low speed each time, right?
memmove is called when an item is removed and when an objectid
is marked as used in the objectid map. The backtraces say that it is
the second case here, which is strange. Could you run reiserfsck again,
wait until the speed drops that low, and look at map->m_page_count and
map->m_used_slots_count from mark_objectid_really_used?
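As a rough illustration of why objectid marking could be memmove-bound: the sketch below assumes, hypothetically, that the used-objectid map behaves like a sorted contiguous array, where each new id is inserted in place by shifting the tail (the job memmove would do in C). This is not the code from uobjectid.c, and the names insert_sorted and total_shifted are made up for the example.

```python
import bisect


def insert_sorted(ids, new_id):
    """Insert new_id into the sorted list ids; return how many elements were shifted."""
    pos = bisect.bisect_left(ids, new_id)
    if pos < len(ids) and ids[pos] == new_id:
        return 0  # already marked as used, nothing moves
    shifted = len(ids) - pos  # elements a C memmove would move right by one slot
    ids.insert(pos, new_id)
    return shifted


def total_shifted(n):
    """Insert n ids in descending order (worst case) and count all shifted elements."""
    ids = []
    return sum(insert_sorted(ids, i) for i in range(n, 0, -1))


for n in (1000, 2000, 4000):
    print(n, total_shifted(n))
```

In this worst case (every id lands at the front) doubling the number of ids roughly quadruples the total number of shifted elements. If the real map behaved anything like this, the scan rate would keep falling as the map grows, which is what the reported numbers look like.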
It is also strange that there are so many wrong tails.
Which kernels did you use before? Which kernel do you think
was in use when most of the files were created?
OK, the snapshot of the fs that Oleg asked you for would give
us some answers, at least about the map size.
--
Thanks,
Vitaly Fertman
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: reiserfsck --rebuild-tree on a big filesystem
2002-11-27 12:46 ` Vitaly Fertman
@ 2002-11-27 13:38 ` Mira Tempir
0 siblings, 0 replies; 17+ messages in thread
From: Mira Tempir @ 2002-11-27 13:38 UTC (permalink / raw)
To: Vitaly Fertman; +Cc: Oleg Drokin, reiserfs-list
Hi!
Vitaly Fertman - 27/11/02 15:46 wrote:
|
| Did you run fsck some times during last days and you had the
| same low speed, right?
Last night, I tried the older r-fsck 3.6.2 (I was quite desperate)
and after 14 hours, it was still doing more than 100 blocks per sec
(which was great in comparison with the current r-fsck, but still
not enough to rebuild the whole FS in a few weeks).
| memmove is called when an item is removed and when an objectid
| is marked as used in objectid map. Backtraces say that it is
| the second case here, strange. Could you run reiserfsck again,
| wait for such a low speed and look at map->m_page_count and
| map->m_used_slots_count from mark_objectid_really_used.
|
| And it is strange also that there are so many wrong tails.
| Which kernels did you use before? Which kernel do you think
| you used when the most amount of files were created?
I'm not sure which kernel was first used with reiserfs; it could be
something like 2.4.1x. Most files were created under 2.4.18 (and .17).
Thanks
--
mira tempir <mira#cekit,cz> -- cekit s.r.o.
bratislavska 2, 602 00 brno, czech republic
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: reiserfsck --rebuild-tree on a big filesystem
2002-11-25 14:48 ` Oleg Drokin
[not found] ` <20021125153154.GA19254@mail.cekit.cz>
@ 2002-12-19 16:51 ` Zygo Blaxell
2002-12-19 17:00 ` Oleg Drokin
2002-12-23 7:54 ` Hans Reiser
1 sibling, 2 replies; 17+ messages in thread
From: Zygo Blaxell @ 2002-12-19 16:51 UTC (permalink / raw)
To: reiserfs-list
This thread is rather old, but I have basically the same problem...
In article <20021125174846.A31778@namesys.com>,
Oleg Drokin <green@namesys.com> wrote:
>On Mon, Nov 25, 2002 at 03:36:34PM +0100, Mira Tempír wrote:
>> my setup:
>> linux 2.4.18 SMP, HW RAID5, nearly full 240GB reiserfs of mostly small files,
>> reiserprogs 3.6.4.
>> With advice to run --rebuild-tree.
Mine:
Linux 2.4.18 UP (SMP kernel), SW RAID0 and linear, nearly full 160GB
reiserfs of mostly small files, reiserfsprogs 1:3.6.3-1 (Debian). I had
two machines go through a power failure while doing lots of concurrent
hard links and deletes. The linear system survived apparently intact,
while the RAID0 system had the usual "stat Permission denied"
problem.
This has happened a number of times before on a variety of systems
(it seems to be reiserfs's most common failure mode). The events are
very similar each time. AFAICT you just have to set up a reiserfs on
some kind of RAID system and let it run through a few power failures to
reproduce the problem.
>> But after 16 hours of running it, it is still on 0% and whole process
>> would take about 1 month to complete (started at 2000 blocks/sec,
>> but now running at 26b/s).
My reiserfsck --rebuild-tree has been running for a little over an hour
and has degraded to 49 blocks/sec (about 10 days). r-fsck absorbs all
the CPU it can get. The reiserfsck process's memory usage increases by
about 8 bytes per block read, and as far as I can tell all of the RAM
allocated after the first 24 megs or so is active (I determine this by
creating a large process to swap out all the inactive pages, then
counting what remains). Nothing is physically wrong with the disks,
it was just another power failure.
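The figures in this paragraph can be cross-checked with a few lines of arithmetic. A minimal sketch, assuming 4096-byte blocks (not stated in the message), "160GB" read as 160 * 10**9 bytes, and memory growth that stays linear at 8 bytes per block for one full pass over the device:

```python
# Assumptions (not from the message): 4096-byte blocks, 160GB = 160 * 10**9 bytes.
BLOCK_SIZE = 4096
FS_BYTES = 160 * 10**9
BYTES_PER_BLOCK_READ = 8   # growth per block read, as reported above
BLOCKS_PER_SEC = 49        # degraded scan rate, as reported above

total_blocks = FS_BYTES // BLOCK_SIZE
growth_mib = total_blocks * BYTES_PER_BLOCK_READ / 2**20
eta_days = total_blocks / BLOCKS_PER_SEC / 86400

print(f"blocks to scan: {total_blocks}")
print(f"extra memory after one pass: {growth_mib:.0f} MiB")
print(f"time for one pass at {BLOCKS_PER_SEC} blocks/s: {eta_days:.1f} days")
```

Under these assumptions one pass at 49 blocks/sec takes a little over 9 days, close to the "about 10 days" quoted, and the linear memory growth alone would add roughly 300 MiB by the end of the pass.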
Based on past events, the read speed will probably degrade to zero in a
week or two, if the reiserfsck process doesn't run out of RAM+swap first.
It only takes 20 days to repopulate the disk, so in the past when this
happens I usually just mkreiserfs (or mke2fs -j!) and move on. I've never
seen a reiserfsck --rebuild-tree run to completion.
If you feel that 'debugreiserfs -p ... | bzip2' output would be helpful,
I could generate it, but that's about 40GB of data.
--
Opinions expressed are my own, I don't speak for my employer, and all that.
Encrypted email preferred. Go ahead, you know you want to. ;-)
OpenPGP at work: 3528 A66A A62D 7ACE 7258 E561 E665 AA6F 263D 2C3D
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: reiserfsck --rebuild-tree on a big filesystem
2002-12-19 16:51 ` Zygo Blaxell
@ 2002-12-19 17:00 ` Oleg Drokin
2002-12-19 20:09 ` Zygo Blaxell
2002-12-23 7:54 ` Hans Reiser
1 sibling, 1 reply; 17+ messages in thread
From: Oleg Drokin @ 2002-12-19 17:00 UTC (permalink / raw)
To: Zygo Blaxell; +Cc: reiserfs-list
Hello!
On Thu, Dec 19, 2002 at 04:51:23PM +0000, Zygo Blaxell wrote:
> >> would take about 1 month to complete (started at 2000 blocks/sec,
> >> but now running at 26b/s).
> My reiserfsck --rebuild-tree has been running for a little over an hour
> and has degraded to 49 blocks/sec (about 10 days). r-fsck absorbs all
Part of this problem is solved - get latest 3.6.5-pre version of reiserfsprogs
and give it a try.
> Based on past events, the read speed will probably degrade to zero in a
> week or two, if the reiserfsck process doesn't run out of RAM+swap first.
No it won't.
> If you feel that 'debugreiserfs -p ... | bzip2' output would be helpful,
> I could generate it, but that's about 40GB of data.
No, we know what the problem with the read speed degradation at pass0 is, and
Vitaly has fixed it already.
A similar problem is also expected in the semantic pass; it has not been dealt
with because we are not yet sure how to solve it. But overall
3.6.5-pre1 should be noticeably faster.
Bye,
Oleg
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: reiserfsck --rebuild-tree on a big filesystem
2002-12-19 17:00 ` Oleg Drokin
@ 2002-12-19 20:09 ` Zygo Blaxell
2002-12-20 16:12 ` Zygo Blaxell
0 siblings, 1 reply; 17+ messages in thread
From: Zygo Blaxell @ 2002-12-19 20:09 UTC (permalink / raw)
To: reiserfs-list
In article <20021219200020.A7258@namesys.com>,
Oleg Drokin <green@namesys.com> wrote:
>Part of this problem is solved - get latest 3.6.5-pre version of reiserfsprogs
>and give it a try.
In less than 7 hours pass 0 will be done, at 600-1900 blocks per second.
Memory usage is also constant now--the reiserfsck process has been at
24104K since it started.
>Also there is similar problem expected at semantic pass that was not dealt with
>because we are not yet sure how to solve that.
I guess I'll find out in 7 hours what the semantic pass will be like.
Fortunately I don't need to use this particular filesystem until 2003,
so there is time to run little experiments like this...
--
Opinions expressed are my own, I don't speak for my employer, and all that.
Encrypted email preferred. Go ahead, you know you want to. ;-)
OpenPGP at work: 3528 A66A A62D 7ACE 7258 E561 E665 AA6F 263D 2C3D
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: reiserfsck --rebuild-tree on a big filesystem
2002-12-19 20:09 ` Zygo Blaxell
@ 2002-12-20 16:12 ` Zygo Blaxell
0 siblings, 0 replies; 17+ messages in thread
From: Zygo Blaxell @ 2002-12-20 16:12 UTC (permalink / raw)
To: reiserfs-list
In article <att918$u36$1@genki.hungrycats.org>,
Zygo Blaxell <eazgwmir@umail.furryterror.org> wrote:
>I guess I'll find out in 7 hours what the semantic pass will be like.
>Fortunately I don't need to use this particular filesystem until 2003,
>so there is time to run little experiments like this...
reiserfsck has gotten to pass 3 and is now doing a search of the
filesystem tree file-by-file. It appears to be progressing without
any major CPU or memory problems.
Traversing the filesystem tree on this machine (e.g. with 'find')
normally takes 3 days, so I guess we'll know if it works just in time
for Christmas. ;-)
Is it normal to see hundreds of
vpf-10680: The file [5029518 5034490] has the wrong block count in the StatData (16) - corrected to (8)
messages?
--
Opinions expressed are my own, I don't speak for my employer, and all that.
Encrypted email preferred. Go ahead, you know you want to. ;-)
OpenPGP at work: 3528 A66A A62D 7ACE 7258 E561 E665 AA6F 263D 2C3D
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: reiserfsck --rebuild-tree on a big filesystem
2002-12-19 16:51 ` Zygo Blaxell
2002-12-19 17:00 ` Oleg Drokin
@ 2002-12-23 7:54 ` Hans Reiser
2002-12-23 7:59 ` Oleg Drokin
1 sibling, 1 reply; 17+ messages in thread
From: Hans Reiser @ 2002-12-23 7:54 UTC (permalink / raw)
To: Zygo Blaxell; +Cc: reiserfs-list, Oleg Drokin
Zygo Blaxell wrote:
>This thread is rather old, but I have basically the same problem...
>
>In article <20021125174846.A31778@namesys.com>,
>Oleg Drokin <green@namesys.com> wrote:
>
>
>>On Mon, Nov 25, 2002 at 03:36:34PM +0100, Mira Tempír wrote:
>>
>>
>>>my setup:
>>>linux 2.4.18 SMP, HW RAID5, nearly full 240GB reiserfs of mostly small files,
>>>reiserprogs 3.6.4.
>>>With advice to run --rebuild-tree.
>>>
>>>
>
>Mine:
>
>Linux 2.4.18 UP (SMP kernel), SW RAID0 and linear, nearly full 160GB
>reiserfs of mostly small files, reiserfsprogs 1:3.6.3-1 (Debian). I had
>two machines go through a power failure while doing lots of concurrent
>hard links and deletes. The linear system survived apparently intact,
>while the RAID0 system had the usual "stat Permission denied"
>problem.
>
>This has happened a number of times before on a variety of systems
>(it seems to be reiserfs's most common failure mode). The events are
>very similar each time. AFAICT you just have to set up a reiserfs on
>some kind of RAID system and let it run through a few power failures to
>reproduce the problem.
>
>
>
Oleg, please reproduce this. It is not acceptable....
Thanks for the report Zygo.
Hans
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: reiserfsck --rebuild-tree on a big filesystem
2002-12-23 7:54 ` Hans Reiser
@ 2002-12-23 7:59 ` Oleg Drokin
0 siblings, 0 replies; 17+ messages in thread
From: Oleg Drokin @ 2002-12-23 7:59 UTC (permalink / raw)
To: Hans Reiser; +Cc: Zygo Blaxell, reiserfs-list
Hello!
On Mon, Dec 23, 2002 at 10:54:58AM +0300, Hans Reiser wrote:
> >Linux 2.4.18 UP (SMP kernel), SW RAID0 and linear, nearly full 160GB
> >reiserfs of mostly small files, reiserfsprogs 1:3.6.3-1 (Debian). I had
> >two machines go through a power failure while doing lots of concurrent
> >hard links and deletes. The linear system survived apparently intact,
> >while the RAID0 system had the usual "stat Permission denied"
> >problem.
> >This has happened a number of times before on a variety of systems
> >(it seems to be reiserfs's most common failure mode). The events are
> >very similar each time. AFAICT you just have to set up a reiserfs on
> >some kind of RAID system and let it run through a few power failures to
> >reproduce the problem.
> Oleg, please reproduce this. It is not acceptable....
I believe this is the infamous "write cache enabled" problem.
Bye,
Oleg
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: reiserfsck --rebuild-tree on a big filesystem
2002-11-25 14:36 reiserfsck --rebuild-tree on a big filesystem Mira Tempír
2002-11-25 14:48 ` Oleg Drokin
@ 2002-11-25 14:59 ` Anders Widman
2002-11-26 10:58 ` SPAMTEST Alexander Lyamin
2 siblings, 0 replies; 17+ messages in thread
From: Anders Widman @ 2002-11-25 14:59 UTC (permalink / raw)
To: reiserfs-list
> Hello,
> my setup:
> linux 2.4.18 SMP, HW RAID5, nearly full 240GB reiserfs of mostly small files,
> reiserprogs 3.6.4.
> After some power outages (but this is only my opinion, that this
> caused of our problems), our array run into situation when
> parts of filesystem structure were broken (it seemed so, but were still
> usable):
> Nov 23 22:24:55 zeus kernel: vs-13070: reiserfs_read_inode2: i/o failure
> occurred trying to find stat data of [9221389 9221390 0x0 SD]
> Nov 23 22:24:55 zeus kernel: is_tree_node: node level 12664 does not
> match to the expected one 1
> Nov 23 22:24:55 zeus kernel: vs-5150:
> search_by_key: invalid format found in block 56940122. Fsck?
> So I've started filesystem check with these results:
> node (3117838) with wrong level (0) found in the tree (should be 1)
> whole subtree skipped
> node (56940122) with wrong level (12664) found in the tree (should be 1)
> whole subtree skipped
> free block count 41569 mismatches with a correct one 43484.
> on-disk bitmap does not match to the correct one.
> With advice to run --rebuild-tree.
> But after 16 hours of running it, it is still on 0% and whole process
> would take about 1 month to complete (started at 2000 blocks/sec,
> but now running at 26b/s).
I have had this kind of problem when there were read errors on the disk.
Try checking with badblocks, or with a drive diagnostic tool from your disk
vendor (such as Maxtor PowerMax, IBM DFT, etc.).
> My question is: is there any way how to speed up filesystem restore?
> I still believe, that problem is not in hardware and data is recoverable.
> We are ready to pay for support if it can help in same way.
^ permalink raw reply [flat|nested] 17+ messages in thread
* SPAMTEST
2002-11-25 14:36 reiserfsck --rebuild-tree on a big filesystem Mira Tempír
2002-11-25 14:48 ` Oleg Drokin
2002-11-25 14:59 ` Anders Widman
@ 2002-11-26 10:58 ` Alexander Lyamin
2 siblings, 0 replies; 17+ messages in thread
From: Alexander Lyamin @ 2002-11-26 10:58 UTC (permalink / raw)
To: reiserfs-list
yaddayaddayadda...
--
"Cache remedies via multi-variable logic shorts will leave you crying."(cl)
Lex Lyamin
* Re: reiserfsck --rebuild-tree on a big filesystem
@ 2002-11-25 15:38 Mira Tempír
2002-11-27 4:52 ` Todd Lyons
0 siblings, 1 reply; 17+ messages in thread
From: Mira Tempír @ 2002-11-25 15:38 UTC (permalink / raw)
To: reiserfs-list
(sorry, i forgot to post it back to list)
Hello and thanks for quick replies.
Oleg Drokin - 25/11/02 17:48 wrote:
| > But after 16 hours of running it, it is still on 0% and whole process
| > would take about 1 month to complete (started at 2000 blocks/sec,
| > but now running at 26b/s).
|
| This is very strange, We've seen reports of reiserfsck --rebuild-tree
| of 500Gb volume (filled to 170Gb) finished in 8 hours.
Can the speed be affected by high usage of the FS (about 99%)?
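For reference, the "about 1 month" figure quoted above can be roughly reproduced with simple arithmetic. This is only a sketch: the 4 KiB block size is an assumption (the thread does not state it), and --rebuild-tree makes several passes, so a single linear scan is a simplification.

```python
# Back-of-the-envelope estimate only. BLOCK_SIZE is an assumption; the
# thread gives the filesystem size and the scan rates, nothing more.
FS_BYTES = 240 * 10**9   # "240GB" from the original report
BLOCK_SIZE = 4096        # assumed reiserfs block size (not stated in the thread)


def scan_days(blocks_per_sec, fs_bytes=FS_BYTES, block_size=BLOCK_SIZE):
    """Days needed to touch every block once at a constant rate."""
    total_blocks = fs_bytes // block_size
    return total_blocks / blocks_per_sec / 86400


for rate in (2000, 26):  # the two rates quoted in the report
    print(f"{rate:5d} blocks/s -> {scan_days(rate):6.2f} days")
```

Under these assumptions, 26 blocks/s works out to roughly 26 days for one pass over the volume, while the initial 2000 blocks/s would need only about 8 hours, which is in line with the 8-hour report mentioned above.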
One point that I missed in my previous e-mail: during rebuild-tree,
there are a lot of messages in the log like this:
pass0: vpf-10700: block 1615323, item 6: The item with wrong offset or
length found [9532827 9534032 0x3001 DRCT (2)], len 1000 - deleted
I have no idea what to think about this.
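One way to get a feel for such messages is to count them per region of the disk and see whether they cluster. The sketch below is only illustrative: the regular expression is written against the single sample message above, and both the function name and the region size are made up for this example.

```python
import re
from collections import Counter

# Pattern derived from the one sample line quoted above; other reiserfsck
# messages may be formatted differently, so treat this as an assumption.
PASS0_RE = re.compile(
    r"pass0: (?P<code>vpf-\d+): block (?P<block>\d+), item (?P<item>\d+):"
)


def blocks_per_region(lines, region_blocks=1_000_000):
    """Count pass0 messages per region of `region_blocks` consecutive blocks."""
    counts = Counter()
    for line in lines:
        match = PASS0_RE.search(line)
        if match:
            counts[int(match.group("block")) // region_blocks] += 1
    return counts


sample = [
    "pass0: vpf-10700: block 1615323, item 6: The item with wrong offset or",
    "length found [9532827 9534032 0x3001 DRCT (2)], len 1000 - deleted",
]
print(blocks_per_region(sample))  # the sample falls into region 1
```

If most messages land in a few regions, that would point toward localized damage rather than a problem spread over the whole filesystem.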
| Can you please verify that the read speed of your RAID5 is constant in all
| the areas of disk, that it is not operating in some kind of degraded mode
| and is generally in good state?
The RAID reports that it is OK, and data read speeds are almost the same at
various places:
0:03.80elapsed 18%CPU
0:02.97elapsed 19%CPU
0:01.76elapsed 24%CPU
0:01.51elapsed 36%CPU
... (dd reading 40MBytes)
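The same kind of spot check can be scripted. The following is a minimal sketch, not the command that was actually run: it times a 40 MiB sequential read at each given byte offset. The function name and the device path in the usage comment are hypothetical, and repeated runs may be served from the page cache, which makes them look faster than the disk really is.

```python
import time

CHUNK = 40 * 1024 * 1024  # 40 MiB per sample, as in the dd test above


def time_reads(path, offsets, length=CHUNK, bufsize=1024 * 1024):
    """Return (offset, seconds) for a sequential read of `length` bytes at each offset."""
    results = []
    # buffering=0 avoids Python-level buffering; the OS cache still applies.
    with open(path, "rb", buffering=0) as dev:
        for off in offsets:
            dev.seek(off)
            remaining = length
            start = time.perf_counter()
            while remaining > 0:
                data = dev.read(min(bufsize, remaining))
                if not data:  # reached the end of the device or file
                    break
                remaining -= len(data)
            results.append((off, time.perf_counter() - start))
    return results


# Hypothetical usage (device name is an assumption, needs read permission):
# for off, secs in time_reads("/dev/sda", [0, 60 * 10**9, 120 * 10**9]):
#     print(f"offset {off:>14d}: {secs:.2f} s")
```

Large differences between offsets would suggest a slow or failing region; near-constant times, as reported above, argue against that.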
| What is the RAID5 hardware you are using?
3ware 7410 RAID5 of 4 IDE drives
| Data is recoverable for sure, but the speed degradation you are seeing
| is strange and I do not yet think it should be attributed to reiserfsck.
Yes, I'm afraid I have to agree ... at least, I don't see any other
option.
Thanks anyway
--
mira tempír <mira@cekit.cz> -- čekit s.r.o.
bratislavská 2, 602 00 brno, czech republic
* Re: reiserfsck --rebuild-tree on a big filesystem
2002-11-25 15:38 reiserfsck --rebuild-tree on a big filesystem Mira Tempír
@ 2002-11-27 4:52 ` Todd Lyons
0 siblings, 0 replies; 17+ messages in thread
From: Todd Lyons @ 2002-11-27 4:52 UTC (permalink / raw)
To: reiserfs-list
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Mira Tempír wanted us to know:
>| This is very strange, We've seen reports of reiserfsck --rebuild-tree
>| of 500Gb volume (filled to 170Gb) finished in 8 hours.
>Can be the speed affected with high usage of FS (about 99%)?
I don't know for sure. I was one of the 500 GB users who finished in
8 hours. That particular server was at 90% utilization IIRC. If it is
sensitive to the size of the data on the disks, then it is a nonlinear
relationship. I doubt it has anything to do with it and has more to do
with bad blocks.
- --
Blue skies... Todd
| Get a bigger hammer! | Proponents of MIME in general e-mail |
| http://www.mrball.net | can go multipart/encrypt themselves. |
| http://faq.mrball.net | --Rick Moen |
Linux kernel 2.5.46-1 1 user, load average: 0.10, 0.14, 0.20
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)
iD8DBQE95E+XIBT1264ScBURAgPIAJ98slpieL72fK7/TPURFEbJgw6w3gCgqXv9
brP7koUY9ffRrda1J0oJbAc=
=HEsb
-----END PGP SIGNATURE-----
Thread overview: 17+ messages
2002-11-25 14:36 reiserfsck --rebuild-tree on a big filesystem Mira Tempír
2002-11-25 14:48 ` Oleg Drokin
[not found] ` <20021125153154.GA19254@mail.cekit.cz>
2002-11-25 15:44 ` Oleg Drokin
2002-11-26 22:56 ` Mira Tempir
2002-11-27 7:33 ` Oleg Drokin
2002-11-27 12:46 ` Vitaly Fertman
2002-11-27 13:38 ` Mira Tempir
2002-12-19 16:51 ` Zygo Blaxell
2002-12-19 17:00 ` Oleg Drokin
2002-12-19 20:09 ` Zygo Blaxell
2002-12-20 16:12 ` Zygo Blaxell
2002-12-23 7:54 ` Hans Reiser
2002-12-23 7:59 ` Oleg Drokin
2002-11-25 14:59 ` Anders Widman
2002-11-26 10:58 ` SPAMTEST Alexander Lyamin
-- strict thread matches above, loose matches on Subject: below --
2002-11-25 15:38 reiserfsck --rebuild-tree on a big filesystem Mira Tempír
2002-11-27 4:52 ` Todd Lyons