reiser fs slow on mksf and mount


* reiser fs slow on mksf and mount
@ 2005-08-26 16:45 Ming Zhang
  2005-08-26 17:04 ` Vladimir V. Saveliev
  0 siblings, 1 reply; 28+ messages in thread
From: Ming Zhang @ 2005-08-26 16:45 UTC (permalink / raw)
  To: reiserfs-list

Hi, folks

I am not sure if this is normal or not.

I am trying to create and use a reiserfs on an 8-disk raid0. I found that mkfs
needs ~90 seconds and mount needs ~70 seconds.

Is there anything wrong on my side?

Thanks!

Ming

Detailed info follows.
---------------------------------------------------------------------
[root@bakstor2u root]# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [multipath] [raid6] [raid10] [faulty]
md0 : active raid0 sda[0] sdh[7] sdg[6] sdf[5] sde[4] sdd[3] sdc[2] sdb[1]
      3125690368 blocks 64k chunks

unused devices: <none>

[root@bakstor2u root]# time mkfs.reiserfs /dev/md0 -ff
mkfs.reiserfs 3.6.13 (2003 www.namesys.com)

<...>

Guessing about desired format.. Kernel 2.6.11.12 is running.
Format 3.6 with standard journal
Count of blocks on the device: 781422592
Number of blocks consumed by mkreiserfs formatting process: 32059
Blocksize: 4096
Hash function used to sort names: "r5"
Journal Size 8193 blocks (first block 18)
Journal Max transaction length 1024
inode generation number: 0
UUID: 98d990f3-d54f-43e3-9fde-8c9c9a6d3481
Initializing journal - 0%....20%....40%....60%....80%....100%
Syncing..ok

Tell your friends to use a kernel based on 2.4.18 or later, and especially not a
kernel based on 2.4.9, when you use reiserFS. Have fun.

ReiserFS is successfully created on /dev/md0.

real    1m28.783s
user    0m0.151s
sys     0m0.398s

[root@bakstor2u root]# time mount /dev/md0  t

real    1m11.448s
user    0m0.000s
sys     0m0.225s

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: reiser fs slow on mksf and mount
  2005-08-26 16:45 reiser fs slow on mksf and mount Ming Zhang
@ 2005-08-26 17:04 ` Vladimir V. Saveliev
  2005-08-26 17:08   ` Ming Zhang
  2005-08-29 16:44   ` Ming Zhang
  0 siblings, 2 replies; 28+ messages in thread
From: Vladimir V. Saveliev @ 2005-08-26 17:04 UTC (permalink / raw)
  To: mingz; +Cc: reiserfs-list

Hello

Ming Zhang wrote:
> Hi, folks
> 
> I am not sure if this is normal or not.
> 
> I try to create&use a reiserfs on a 8 disk raid0. Then I found that mkfs
> need ~90 sec and mount need ~70 seconds. 
> 
> Is there anything wrong on my side?
> 

Your device is too big.

> Thanks!
> 
> 
> Ming
> 
> 
> 
> Detailed info followed.
> ---------------------------------------------------------------------
> [root@bakstor2u root]# cat /proc/mdstat
> Personalities : [linear] [raid0] [raid1] [raid5] [multipath] [raid6]
> [raid10] [faulty]
> md0 : active raid0 sda[0] sdh[7] sdg[6] sdf[5] sde[4] sdd[3] sdc[2] sdb
> [1]
>       3125690368 blocks 64k chunks
> 
> unused devices: <none>
> 
> [root@bakstor2u root]# time mkfs.reiserfs /dev/md0 -ff
> mkfs.reiserfs 3.6.13 (2003 www.namesys.com)
> 
> <...>
> 
> Guessing about desired format.. Kernel 2.6.11.12 is running.
> Format 3.6 with standard journal
> Count of blocks on the device: 781422592
> Number of blocks consumed by mkreiserfs formatting process: 32059
> Blocksize: 4096
> Hash function used to sort names: "r5"
> Journal Size 8193 blocks (first block 18)
> Journal Max transaction length 1024
> inode generation number: 0
> UUID: 98d990f3-d54f-43e3-9fde-8c9c9a6d3481
> Initializing journal - 0%....20%....40%....60%....80%....100%
> Syncing..ok
> 
> Tell your friends to use a kernel based on 2.4.18 or later, and
> especially not a
> kernel based on 2.4.9, when you use reiserFS. Have fun.
> 
> ReiserFS is successfully created on /dev/md0.
> 
> real    1m28.783s
> user    0m0.151s
> sys     0m0.398s
> 

Hmm, mkfs.reiserfs had to write 32059 blocks. That is about 131 MB. 1m28s is too much for that.
Could it be that some of the disks used in that raid were not spinning when you started mkreiserfs?
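
As a rough check of the arithmetic above, the figures from the quoted mkfs.reiserfs output can be turned into an average write rate. This is only a back-of-the-envelope sketch: it assumes decimal megabytes and that the whole 1m28s was spent writing those blocks, neither of which the output confirms.

```python
# Figures copied from the quoted mkfs.reiserfs run.
BLOCKS_WRITTEN = 32059    # "Number of blocks consumed by mkreiserfs formatting process"
BLOCK_SIZE = 4096         # bytes, from "Blocksize: 4096"
ELAPSED_S = 60 + 28.783   # "real 1m28.783s"

bytes_written = BLOCKS_WRITTEN * BLOCK_SIZE
mb_written = bytes_written / 1e6          # decimal megabytes (assumption)
throughput_mb_s = mb_written / ELAPSED_S  # average rate over the whole run

print(f"written: {mb_written:.1f} MB")
print(f"average rate: {throughput_mb_s:.2f} MB/s")
```

That works out to roughly 1.5 MB/s, which is much lower than one would expect from sequential writes to an 8-disk raid0, and is why the elapsed time looks suspicious.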

> [root@bakstor2u root]# time mount /dev/md0  t
> 
> real    1m11.448s
> user    0m0.000s
> sys     0m0.225s
> 

There is a patch to cure this problem.
http://www.mail-archive.com/reiserfs-list@namesys.com/msg18442.html
Please note that it is an experimental one.

> 
> 
> 



* Re: reiser fs slow on mksf and mount
  2005-08-26 17:04 ` Vladimir V. Saveliev
@ 2005-08-26 17:08   ` Ming Zhang
  2005-08-26 17:15     ` Ming Zhang
  2005-08-29 16:44   ` Ming Zhang
  1 sibling, 1 reply; 28+ messages in thread
From: Ming Zhang @ 2005-08-26 17:08 UTC (permalink / raw)
  To: Vladimir V. Saveliev; +Cc: reiserfs-list

i think a 3.2TB partition is not that big these days, right?

i would think some people who hold millions of files will have much
larger partitions than this.

so do you think this is a normal speed for a partition of this size?

also, iostat 1 -k shows small reads happening every second during
mount. is this because mount reads metadata, and the metadata is not
contiguous on disk?

ming





* Re: reiser fs slow on mksf and mount
  2005-08-26 17:08   ` Ming Zhang
@ 2005-08-26 17:15     ` Ming Zhang
  2005-08-26 17:32       ` Vladimir V. Saveliev
  0 siblings, 1 reply; 28+ messages in thread
From: Ming Zhang @ 2005-08-26 17:15 UTC (permalink / raw)
  To: Vladimir V. Saveliev; +Cc: reiserfs-list

forgot to mention that the original mount was immediately after mkfs, so
there were no files in the fs at all.

now i created 1048576 4KB files,

then umounted and remounted:

[root@bakstor2u root]# time mount /dev/md0 t

real    1m10.971s
user    0m0.001s
sys     0m0.188s

almost the same as the first one.

this is from dmesg

ReiserFS: md0: found reiserfs format "3.6" with standard journal
ReiserFS: md0: using ordered data mode
ReiserFS: md0: journal params: device md0, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: md0: checking transaction log (md0)
ReiserFS: md0: Using r5 hash to sort names


so i do not understand why mounting a fs with 0 files takes the same time
as mounting a fs with 1M files.

Thanks!

Ming






* Re: reiser fs slow on mksf and mount
  2005-08-26 17:15     ` Ming Zhang
@ 2005-08-26 17:32       ` Vladimir V. Saveliev
  2005-08-26 18:07         ` Ming Zhang
  2005-08-26 18:16         ` Ming Zhang
  0 siblings, 2 replies; 28+ messages in thread
From: Vladimir V. Saveliev @ 2005-08-26 17:32 UTC (permalink / raw)
  To: mingz; +Cc: reiserfs-list

Hello

Ming Zhang wrote:
> forget to mention that original mount is immediately after mkfs. so no
> files in fs at all.
> 
> now i create 1048576 4KB files
> 
> then umount and remount
> 
> [root@bakstor2u root]# time mount /dev/md0 t
> 
> real    1m10.971s
> user    0m0.001s
> sys     0m0.188s
> 
> almost same as first one.
> 
> this is from dmesg
> 
> ReiserFS: md0: found reiserfs format "3.6" with standard journal
> ReiserFS: md0: using ordered data mode
> ReiserFS: md0: journal params: device md0, size 8192, journal first
> block 18, max trans len 1024, max batch 900, max commit age 30, max
> trans age 30
> ReiserFS: md0: checking transaction log (md0)
> ReiserFS: md0: Using r5 hash to sort names
> 
> 
> so i could not understand why mount a fs with 0 files is same time with
> mount a fs with 1M files.
> 
> Thanks!
> 
> Ming
> 
> 
> 
> On Fri, 2005-08-26 at 13:08 -0400, Ming Zhang wrote:
>>i think 3.2TB partition is not that big these days right?
>>
>>i would think some people that hold millions of files will have much
>>larger partition than this.
>>
>>so you think this is a normal speed for such size partition?
>>

No. That was a joke, actually.
The speed of mkfs.reiserfs confuses me.
As you did not answer whether all disks were spinning, I suppose that they were.
Can you please run vmstat 1 or iostat 1 while mkreiserfs is running?

>>also iostat 1 -k shows that during mount, there are small size read
>>happen each second. so this is because read meta data and metadata is
>>not continuous on disk?
>>

Yes. On mount, reiserfs reads all bitmap blocks into memory. Those blocks are spread over the whole disk. There is a workaround for this problem.
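
To get a feel for how many scattered reads that implies on this device, here is a small estimate. It is a sketch under one assumption not stated in the thread: that each bitmap block holds one bit per filesystem block, so a 4096-byte bitmap block covers 4096 * 8 blocks. The other figures come from the mkfs.reiserfs and mount output quoted earlier.

```python
import math

TOTAL_BLOCKS = 781422592  # "Count of blocks on the device"
BLOCK_SIZE = 4096         # bytes, from "Blocksize: 4096"
MOUNT_S = 60 + 11.448     # "real 1m11.448s" for the first mount

# Assumption: one bit per block, so one bitmap block describes BLOCK_SIZE * 8 blocks.
blocks_per_bitmap = BLOCK_SIZE * 8
bitmap_blocks = math.ceil(TOTAL_BLOCKS / blocks_per_bitmap)

# Memory held if every bitmap block stays cached, and the average time
# per bitmap read if the whole mount time went into reading them.
bitmap_mib = bitmap_blocks * BLOCK_SIZE / 2**20
ms_per_read = MOUNT_S / bitmap_blocks * 1000

print(f"bitmap blocks: {bitmap_blocks}")
print(f"bitmap data:   {bitmap_mib:.1f} MiB")
print(f"per read:      {ms_per_read:.2f} ms")
```

Under that assumption there are about 23.8 thousand bitmap blocks, around 93 MiB in total, and the mount time averages out to about 3 ms per read. That is in the range of a single short disk seek, which would be consistent with small reads spread over the whole device, and it would also explain why an empty filesystem mounts no faster than a full one.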

>>>>
>>>There is a patch to cure this problem.
>>>http://www.mail-archive.com/reiserfs-list@namesys.com/msg18442.html
>>>Please note that it is experimental one.
>>>
>>>>
>>>>
> 
> 
> 



* Re: reiser fs slow on mksf and mount
  2005-08-26 17:32       ` Vladimir V. Saveliev
@ 2005-08-26 18:07         ` Ming Zhang
  2005-08-26 18:16         ` Ming Zhang
  1 sibling, 0 replies; 28+ messages in thread
From: Ming Zhang @ 2005-08-26 18:07 UTC (permalink / raw)
  To: Vladimir V. Saveliev; +Cc: reiserfs-list

[-- Attachment #1: Type: text/plain, Size: 1304 bytes --]

On Fri, 2005-08-26 at 21:32 +0400, Vladimir V. Saveliev wrote:
> Hello
> 
> > On Fri, 2005-08-26 at 13:08 -0400, Ming Zhang wrote:
> >>i think 3.2TB partition is not that big these days right?
> >>
> >>i would think some people that hold millions of files will have much
> >>larger partition than this.
> >>
> >>so you think this is a normal speed for such size partition?
> >>
> 
> No. That was joke actually.
> Speed of mkfs.reiserfs confuses me.

:) i captured vmstat 1 and iostat 1. i bet you will be even more confused.

they are pretty big for email, so i attached them.


> As you did answer about whether all disks were spinning - I suppose that they were.

no


> Can you please run vmstat 1  or iostat 1 while mkreiserfs is running?
> 
> >>also iostat 1 -k shows that during mount, there are small size read
> >>happen each second. so this is because read meta data and metadata is
> >>not continuous on disk?
> >>
> 
> Yes. On mount reiserfs reads all bitmap blocks to memory. Those blocks are spread over whole disk. There is workaround for this problem.
> 
> >>>>
> >>>There is a patch to cure this problem.
> >>>http://www.mail-archive.com/reiserfs-list@namesys.com/msg18442.html
> >>>Please note that it is experimental one.
> >>>
> >>>>

Does reiserfs4 have any improvement on this?

Thanks!

Ming


[-- Attachment #2: iostat-log.gz --]
[-- Type: application/x-gzip, Size: 3293 bytes --]

[-- Attachment #3: vmstat-log.gz --]
[-- Type: application/x-gzip, Size: 1207 bytes --]


* Re: reiser fs slow on mksf and mount
  2005-08-26 17:32       ` Vladimir V. Saveliev
  2005-08-26 18:07         ` Ming Zhang
@ 2005-08-26 18:16         ` Ming Zhang
  2005-08-27 19:29           ` Jeff Mahoney
  1 sibling, 1 reply; 28+ messages in thread
From: Ming Zhang @ 2005-08-26 18:16 UTC (permalink / raw)
  To: Vladimir V. Saveliev; +Cc: reiserfs-list

On Fri, 2005-08-26 at 21:32 +0400, Vladimir V. Saveliev wrote:


one more question about these bitmap blocks:

is this bitmap data pinned in memory, so that it will not be swapped out?

Thanks!


Ming


> Yes. On mount reiserfs reads all bitmap blocks to memory. Those blocks are spread over whole disk. There is workaround for this problem.
> 
> >>>>
> >>>There is a patch to cure this problem.
> >>>http://www.mail-archive.com/reiserfs-list@namesys.com/msg18442.html
> >>>Please note that it is experimental one.
> >>>
> >>>>
> >>>>
> > 
> > 
> > 
> 



* Re: reiser fs slow on mksf and mount
  2005-08-26 18:16         ` Ming Zhang
@ 2005-08-27 19:29           ` Jeff Mahoney
  2005-08-27 21:45             ` Christian Iversen
                               ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Jeff Mahoney @ 2005-08-27 19:29 UTC (permalink / raw)
  To: mingz; +Cc: Vladimir V. Saveliev, reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ming Zhang wrote:
> On Fri, 2005-08-26 at 21:32 +0400, Vladimir V. Saveliev wrote:
> 
> 
> one more question about this bitmap blocks
> 
> are this bitmap data is pinned into system thus will not be swapped out?

Yes, any buffers/pages with active reference counts are kept in memory.
Since the current reiserfs bitmap implementation keeps a reference until
filesystem umount, the bitmaps are pinned.

My dynamic bitmap patch fixes both of the problems you've posed so far.
Mount time is reduced to O(1) time, since only the superblock and root
node are read at mount time. On my system, it's something along the
lines of 0.2s. Memory consumption is reduced also, because the bitmap
block is released after the allocation/free that required it is complete.

It's a relatively straightforward patch - the error handling I refer to
is how to handle block read failures, which would only occur if your
disk is failing.

- -Jeff

- --
Jeff Mahoney
SuSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFDEL81LPWxlyuTD7IRAnHWAJ9TmL/5ziKt4ObSUR9c/MJps4HydQCfXj0s
Kd4u+V+PYZQydA/YqelyJvo=
=pHCV
-----END PGP SIGNATURE-----


* Re: reiser fs slow on mksf and mount
  2005-08-27 19:29           ` Jeff Mahoney
@ 2005-08-27 21:45             ` Christian Iversen
  2005-08-27 21:55               ` David Masover
                                 ` (2 more replies)
  2005-08-27 22:53             ` Ming Zhang
  2005-08-29 19:40             ` Hans Reiser
  2 siblings, 3 replies; 28+ messages in thread
From: Christian Iversen @ 2005-08-27 21:45 UTC (permalink / raw)
  To: reiserfs-list

On Saturday 27 August 2005 21:29, Jeff Mahoney wrote:
> Ming Zhang wrote:
> > On Fri, 2005-08-26 at 21:32 +0400, Vladimir V. Saveliev wrote:
> >
> >
> > one more question about this bitmap blocks
> >
> > are this bitmap data is pinned into system thus will not be swapped out?
>
> Yes, any buffers/pages with active reference counts are kept in memory.
> Since the current reiserfs bitmap implementation keeps a reference until
> filesystem umount, the bitmaps are pinned.
>
> My dynamic bitmap patch fixes both of the problems you've posed so far.
> Mount time is reduced to O(1) time, since only the superblock and root
> node are read at mount time. On my system, it's something along the
> lines of 0.2s. Memory consumption is reduced also, because the bitmap
> block is released after the allocation/free that required it is complete.

I've been reading about this patch with quite some interest. Would you say 
it's stable enough for daily use? I have a terabyte array that takes forever 
to mount, and probably uses quite a bit of memory too.

Another thing is that it can easily take several seconds to do "ls -l" on a
directory with 0-10 GB of data in it. Is that normal? There are usually fewer
than 50 files of test data, ranging in size from 200MB to 900MB. I've
disabled atime updates, but that didn't help much. The controller and disks 
are plenty fast, so I feel something is amiss. 

-- 
Regards,
Christian Iversen


* Re: reiser fs slow on mksf and mount
  2005-08-27 21:45             ` Christian Iversen
@ 2005-08-27 21:55               ` David Masover
  2005-08-29 19:44                 ` Hans Reiser
  2005-08-27 22:54               ` Ming Zhang
  2005-08-29 15:07               ` Jeff Mahoney
  2 siblings, 1 reply; 28+ messages in thread
From: David Masover @ 2005-08-27 21:55 UTC (permalink / raw)
  To: Christian Iversen; +Cc: reiserfs-list

Christian Iversen wrote:
> On Saturday 27 August 2005 21:29, Jeff Mahoney wrote:
> 
>>Ming Zhang wrote:

> Another thing is that it can easily take several seconds to do "ls -l" on a 
> directory with a 0-10 GB data in it. Is that normal? There's usually less 
> than 50 files of test data, ranging in size from 200MB to 900MB. I've 
> disabled atime updates, but that didn't help much. The controller and disks 
> are plenty fast, so I feel something is amiss. 

Interesting, I'd always assumed this was an issue with the lazy 
allocation.  On my box, this meant that occasionally, I'd run into a 
situation where some random FS operation would take 5-10 seconds, 
because (I assumed) it would have been the random operation that used up 
enough RAM that the FS decided to flush.


* Re: reiser fs slow on mksf and mount
  2005-08-27 19:29           ` Jeff Mahoney
  2005-08-27 21:45             ` Christian Iversen
@ 2005-08-27 22:53             ` Ming Zhang
  2005-08-28  0:01               ` Jeff Mahoney
  2005-08-29 19:40             ` Hans Reiser
  2 siblings, 1 reply; 28+ messages in thread
From: Ming Zhang @ 2005-08-27 22:53 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: Vladimir V. Saveliev, reiserfs-list

On Sat, 2005-08-27 at 15:29 -0400, Jeff Mahoney wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Ming Zhang wrote:
> > On Fri, 2005-08-26 at 21:32 +0400, Vladimir V. Saveliev wrote:
> > 
> > 
> > one more question about this bitmap blocks
> > 
> > are this bitmap data is pinned into system thus will not be swapped out?
> 
> Yes, any buffers/pages with active reference counts are kept in memory.
> Since the current reiserfs bitmap implementation keeps a reference until
> filesystem umount, the bitmaps are pinned.
so you always keep a reference on all bitmap pages, and thus they cannot
be swapped out. yes, this way any metadata operation can be pretty fast.
but what if the available RAM cannot hold these bitmaps? maybe you
think that if i want to use tens of TB of storage, i will of course have
32GB of RAM. :P



> 
> My dynamic bitmap patch fixes both of the problems you've posed so far.
> Mount time is reduced to O(1) time, since only the superblock and root
> node are read at mount time. On my system, it's something along the
> lines of 0.2s. Memory consumption is reduced also, because the bitmap
> block is released after the allocation/free that required it is complete.
ic. this is like a delayed lazy allocation. 



> 
> It's a relatively straightforward patch - the error handling I refer to
> is how to handle block read failures, which would only occur if your
> disk is failing.
yes, understandable. thanks!

Ming





* Re: reiser fs slow on mksf and mount
  2005-08-27 21:45             ` Christian Iversen
  2005-08-27 21:55               ` David Masover
@ 2005-08-27 22:54               ` Ming Zhang
  2005-08-29 15:07               ` Jeff Mahoney
  2 siblings, 0 replies; 28+ messages in thread
From: Ming Zhang @ 2005-08-27 22:54 UTC (permalink / raw)
  To: Christian Iversen; +Cc: reiserfs-list

On Sat, 2005-08-27 at 23:45 +0200, Christian Iversen wrote:
> On Saturday 27 August 2005 21:29, Jeff Mahoney wrote:
> > Ming Zhang wrote:
> > > On Fri, 2005-08-26 at 21:32 +0400, Vladimir V. Saveliev wrote:
> > >
> > >
> > > one more question about this bitmap blocks
> > >
> > > are this bitmap data is pinned into system thus will not be swapped out?
> >
> > Yes, any buffers/pages with active reference counts are kept in memory.
> > Since the current reiserfs bitmap implementation keeps a reference until
> > filesystem umount, the bitmaps are pinned.
> >
> > My dynamic bitmap patch fixes both of the problems you've posed so far.
> > Mount time is reduced to O(1) time, since only the superblock and root
> > node are read at mount time. On my system, it's something along the
> > lines of 0.2s. Memory consumption is reduced also, because the bitmap
> > block is released after the allocation/free that required it is complete.
> 
> I've been reading about this patch with quite some interest. Would you say 
> it's stable enough for daily use? I have a terabyte array that takes forever 
> to mount, and probably uses quite a bit of memory too.
yes, i think this will be a problem for a 32-bit box with large
storage.


> 
> Another thing is that it can easily take several seconds to do "ls -l" on a 
> directory with a 0-10 GB data in it. Is that normal? There's usually less 
> than 50 files of test data, ranging in size from 200MB to 900MB. I've 
> disabled atime updates, but that didn't help much. The controller and disks 
> are plenty fast, so I feel something is amiss. 
> 



* Re: reiser fs slow on mksf and mount
  2005-08-27 22:53             ` Ming Zhang
@ 2005-08-28  0:01               ` Jeff Mahoney
  2005-08-28 15:40                 ` Ming Zhang
  0 siblings, 1 reply; 28+ messages in thread
From: Jeff Mahoney @ 2005-08-28  0:01 UTC (permalink / raw)
  To: mingz; +Cc: Vladimir V. Saveliev, reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ming Zhang wrote:
> On Sat, 2005-08-27 at 15:29 -0400, Jeff Mahoney wrote:
>>>are this bitmap data is pinned into system thus will not be swapped out?
>> Yes, any buffers/pages with active reference counts are kept in memory.
>> Since the current reiserfs bitmap implementation keeps a reference until
>> filesystem umount, the bitmaps are pinned.
> so u always keep a reference on all bitmap pages and thus they can not
> be umounted. yes, by this way it can be pretty fast to do any meta-data
> operation. but what if current RAM can not hold these bitmap. maybe u
> think if i want to use tens of TB storage, i of course will have 32GB
> RAM. :P

That's been the argument Hans has been presenting so far. I tend to
disagree with it for several reasons:
* It used to be unheard of for huge filesystems to be accessible to
users without high priced RAID arrays. Now, with individual disk
capacities over 500 GB and the ease of software raid, multiple-TB
filesystems are quite possible on a desktop machine. These machines, as
desktops, have no need for 32 GB of RAM, but have a very real demand for
large storage.
* We don't cache any other metadata (other than the superblock, which is
standard practice) specially. In a mostly-reader environment, bitmaps
would rank very low in importance for caching.

In short, we shouldn't be demanding that users of large storage also
have loads of memory for what I feel is a very shaky argument in the
first place.

>> My dynamic bitmap patch fixes both of the problems you've posed so far.
>> Mount time is reduced to O(1) time, since only the superblock and root
>> node are read at mount time. On my system, it's something along the
>> lines of 0.2s. Memory consumption is reduced also, because the bitmap
>> block is released after the allocation/free that required it is complete.
> ic. this is like a delayed lazy allocation. 

If I understand you correctly, you're referring to allocate-on-flush,
which is a different idea entirely. What the dynamic bitmap patch does
is similar to what most other filesystems do -- treat the bitmaps as any
other kind of metadata is treated and read it on-demand.
Allocate-on-flush allows the filesystem to wait until the last possible
moment to allocate the space on disk, which makes performance a little
nicer, but more importantly, allows the allocator to allocate entire
chunks of a file rather than a block-at-a-time.
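
A toy model may make the allocate-on-flush idea concrete. It is a sketch, not how any real filesystem allocator works: the bump allocator and both file classes are hypothetical, and two files are written in an interleaved pattern to show the effect on layout.

```python
class Allocator:
    """Hands out block numbers in increasing order (a simple bump allocator)."""

    def __init__(self):
        self.next_free = 0

    def alloc(self, count):
        start = self.next_free
        self.next_free += count
        return list(range(start, start + count))


class ImmediateFile:
    """Allocates a disk block at the moment each block is written."""

    def __init__(self, allocator):
        self.allocator = allocator
        self.blocks = []

    def write_block(self):
        self.blocks += self.allocator.alloc(1)


class AllocateOnFlushFile:
    """Keeps dirty blocks in memory and allocates them together at flush."""

    def __init__(self, allocator):
        self.allocator = allocator
        self.pending = 0
        self.blocks = []

    def write_block(self):
        self.pending += 1  # data is only cached; no disk location chosen yet

    def flush(self):
        # All pending blocks are allocated in a single request, so the
        # allocator can hand back one contiguous run.
        self.blocks += self.allocator.alloc(self.pending)
        self.pending = 0


def interleave(a, b, n):
    """Write n blocks to each file, alternating between the two."""
    for _ in range(n):
        a.write_block()
        b.write_block()


alloc1 = Allocator()
a1, b1 = ImmediateFile(alloc1), ImmediateFile(alloc1)
interleave(a1, b1, 4)

alloc2 = Allocator()
a2, b2 = AllocateOnFlushFile(alloc2), AllocateOnFlushFile(alloc2)
interleave(a2, b2, 4)
a2.flush()
b2.flush()

print("immediate:", a1.blocks, b1.blocks)
print("on flush: ", a2.blocks, b2.blocks)
```

With immediate allocation the two files end up interleaved block by block, while with allocation deferred to flush each file gets one contiguous run. This is separate from when the bitmaps themselves are read into memory, which is what the dynamic bitmap patch changes.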

- -Jeff

- --
Jeff Mahoney
SuSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFDEP7ALPWxlyuTD7IRAj2+AJ9f2tHTlV3Mrl7m0jDtn50p1egacwCgjbT9
2HSYlvH9sIG53JGjBHgT+9s=
=Suud
-----END PGP SIGNATURE-----


* Re: reiser fs slow on mksf and mount
  2005-08-28  0:01               ` Jeff Mahoney
@ 2005-08-28 15:40                 ` Ming Zhang
  2005-08-28 18:44                   ` Jeff Mahoney
  0 siblings, 1 reply; 28+ messages in thread
From: Ming Zhang @ 2005-08-28 15:40 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: Vladimir V. Saveliev, reiserfs

On Sat, 2005-08-27 at 20:01 -0400, Jeff Mahoney wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Ming Zhang wrote:
> > On Sat, 2005-08-27 at 15:29 -0400, Jeff Mahoney wrote:
> >>>are this bitmap data is pinned into system thus will not be swapped out?
> >> Yes, any buffers/pages with active reference counts are kept in memory.
> >> Since the current reiserfs bitmap implementation keeps a reference until
> >> filesystem umount, the bitmaps are pinned.
> > so u always keep a reference on all bitmap pages and thus they can not
> > be umounted. yes, by this way it can be pretty fast to do any meta-data
> > operation. but what if current RAM can not hold these bitmap. maybe u
> > think if i want to use tens of TB storage, i of course will have 32GB
> > RAM. :P
> 
> That's been the argument Hans has been presenting so far. I tend to
> disagree with it for several reasons:
> * It used to be unheard of for huge filesystems to be accessible to
> users without high priced RAID arrays. Now, with individual disk
> capacities over 500 GB and the ease of software raid, multiple-TB
> filesystems are quite possible on a desktop machine. These machines, as
> desktops, have no need for 32 GB of RAM, but have a very real demand for
> large storage.

yes, i have a 12*400GB SATA MD raid on which i want to store my huge number
of pictures (i am not a good photographer, but a quick shooter). all these
files are named DSCxxxxxx.jpg, but the numbers are not contiguous since some
were really bad and got deleted. so i raised these questions to stress the
scalability of the fs.


> * We don't cache any other metadata (other than the superblock, which is
> standard practice) specially. In a mostly-reader environment, bitmaps
> would rank very low in importance for caching.

could you explain a bit more about the purpose of these bitmaps? what
is the relationship between these bitmaps and other metadata?


> 
> In short, we shouldn't be demanding that users of large storage also
> have loads of memory for what I feel is a very shaky argument in the
> first place.

assume i have 2GB or 4GB of RAM, which is not unusual for a desktop
now. but can all of this RAM be used by a 32-bit arch?


> 
> >> My dynamic bitmap patch fixes both of the problems you've posed so far.
> >> Mount time is reduced to O(1) time, since only the superblock and root
> >> node are read at mount time. On my system, it's something along the
> >> lines of 0.2s. Memory consumption is reduced also, because the bitmap
> >> block is released after the allocation/free that required it is complete.
> > ic. this is like a delayed lazy allocation. 
> 
> If I understand you correctly, you're referring to allocate-on-flush,
> which is a different idea entirely. What the dynamic bitmap patch does
> is similar to what most other filesystems do -- treat the bitmaps as any
> other kind of metadata is treated and read it on-demand.

yes, sorry, it is not lazy allocation. it is on-demand loading (into
RAM), or lazy loading.



> Allocate-on-flush allows the filesystem to wait until the last possible
> moment to allocate the space on disk, which makes performance a little
> nicer, but more importantly, allows the allocator to allocate entire
> chunks of a file rather than a block-at-a-time.

are you talking about allocating space after the file content is cached
in RAM but before it needs to be flushed? then this is like a
write-anywhere file system, where you can write to a place that is still
contiguous and near the current disk head (though the latter is hard to
achieve since it is hidden by LVM/MD/...).


> 
> - -Jeff
> 
> - --
> Jeff Mahoney
> SuSE Labs
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.0 (GNU/Linux)
> 
> iD8DBQFDEP7ALPWxlyuTD7IRAj2+AJ9f2tHTlV3Mrl7m0jDtn50p1egacwCgjbT9
> 2HSYlvH9sIG53JGjBHgT+9s=
> =Suud
> -----END PGP SIGNATURE-----



* Re: reiser fs slow on mksf and mount
  2005-08-28 15:40                 ` Ming Zhang
@ 2005-08-28 18:44                   ` Jeff Mahoney
  2005-08-29 12:39                     ` Ming Zhang
  0 siblings, 1 reply; 28+ messages in thread
From: Jeff Mahoney @ 2005-08-28 18:44 UTC (permalink / raw)
  To: mingz; +Cc: Vladimir V. Saveliev, reiserfs

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ming Zhang wrote:
> On Sat, 2005-08-27 at 20:01 -0400, Jeff Mahoney wrote:
> 
>> yes, i have a 12*400GB SATA MD raid that want to store my huge number of
>> pictures (i am not a good photographer, but a quick shooter.) all these
>> files are named as DSCxxxxxx.jpg but not continuous since some are
>> really bad so deleted. so i jump out these questions to stress the
>> scalability of fs.

ReiserFS is very well suited for lots of files, so you should be all set
in that respect.

> * We don't cache any other metadata (other than the superblock, which is
> standard practice) specially. In a mostly-reader environment, bitmaps
> would rank very low in importance for caching.
> 
>> could u explain a bit more on what is the purpose of these bitmaps? what
>> is the relationship between these bitmap and other metadata?

The bitmaps are used to keep track of which blocks on disk are used, and
which are available for allocation. Every (blocksize * 8) blocks, there
is a block reserved to keep track of which blocks in that range are
allocated or not. On a 4k block filesystem, that boils down to 1 4k
block for every 128 MB. If a block is used, the bit corresponding to it
is set. When the block is freed, the bit is cleared.
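
To make the arithmetic above concrete, here is a minimal Python sketch. It
assumes only what is stated in this thread: one bit per block, a 4k
blocksize, and the block count printed by mkfs.reiserfs in the first
message. The function names are made up for illustration and are not taken
from the reiserfs sources.

```python
# Back-of-the-envelope arithmetic for the block bitmaps described above.
# Function names are hypothetical; only the figures come from the thread.

def blocks_per_bitmap(blocksize: int) -> int:
    """One bit per block, so one bitmap block tracks blocksize * 8 blocks."""
    return blocksize * 8


def bitmap_block_count(total_blocks: int, blocksize: int) -> int:
    """Number of bitmap blocks needed to cover the device, rounded up."""
    per = blocks_per_bitmap(blocksize)
    return (total_blocks + per - 1) // per


BLOCKSIZE = 4096
TOTAL_BLOCKS = 781422592  # "Count of blocks on the device" for /dev/md0

covered_bytes = blocks_per_bitmap(BLOCKSIZE) * BLOCKSIZE
n_bitmaps = bitmap_block_count(TOTAL_BLOCKS, BLOCKSIZE)
bitmap_bytes = n_bitmaps * BLOCKSIZE

print(covered_bytes // 2**20, "MiB covered per bitmap block")
print(n_bitmaps, "bitmap blocks on the device")
print(round(bitmap_bytes / 2**20, 1), "MiB of bitmap data in total")
```

For the 3.2TB array this works out to 23848 bitmap blocks, or roughly 93 MiB
of bitmap data if every one of them is read.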

Well, there are several kinds of metadata on the filesystem: The super
block, the bitmaps, the journal, and the reiserfs s-tree itself. The
journal and bitmaps are only used when writing to the filesystem. The
superblock and s-tree are used for any filesystem access. The
relationship is that before a file data block or an s-tree node can be
allocated on disk, the bitmaps must be checked to see where the block
can be allocated.
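
As a rough illustration of that relationship, the following toy Python
sketch shows a bitmap being consulted before a block is handed out, and the
bit being cleared again on free. It is a simplified in-memory model with a
hypothetical class name and a naive first-free search, not the actual
reiserfs allocator.

```python
class BlockBitmap:
    """Toy in-memory allocation bitmap: a set bit means the block is in use."""

    def __init__(self, nblocks: int) -> None:
        self.nblocks = nblocks
        self.bits = bytearray((nblocks + 7) // 8)

    def is_used(self, block: int) -> bool:
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self) -> int:
        """Find the first clear bit, set it, and return its block number."""
        for block in range(self.nblocks):
            if not self.is_used(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        raise OSError("no free blocks")

    def free(self, block: int) -> None:
        """Clear the bit so the block can be allocated again."""
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF


bm = BlockBitmap(16)
first = bm.allocate()    # bit for block 0 is set
second = bm.allocate()   # bit for block 1 is set
bm.free(first)           # bit for block 0 is cleared
reused = bm.allocate()   # block 0 is found free and handed out again
print(first, second, reused)
```

A read-only workload never calls allocate() or free(), which is why the
bitmaps matter little for caching in a mostly-reader environment.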

>> assumed i have 2GB or 4GB ram, which is not unbelievable for a desktop
>> now. but can these RAM be used by 32BIT arch?

The RAM can be used, sure, but not for the bitmaps. I believe the buffer
heads for the bitmaps need to come out of the memory < 1 GB. It would be
possible to put the bitmaps in high memory (like any other data), but
the patch to do so would likely be more involved than the dynamic bitmap
patch, and still waste the memory anyway.

> Allocate-on-flush allows the filesystem to wait until the last possible
> moment to allocate the space on disk, which makes performance a little
> nicer, but more importantly, allows the allocator to allocate entire
> chunks of a file rather than a block-at-a-time.
> 
>> are u talking about allocating space after that file content is cached
>> in RAM and before need to be flushed? this is then like a write-any file
>> system that you can write to a place where it is still continuous and
>> near to current disk head (though latter is hard to achieve since it is
>> hidden by LVM/MD/...).

Well those are two different issues. Allocate on flush would try to keep
the file as contiguous as possible, whether by appending (ideal) or by
keeping the new chunk of data all together as a separate fragment rather
than individual blocks scattered everywhere. As for writing near the
current disk head, that is an operation that is performed by the block
layer. It can make the best decisions on that, since it is at the
lowest level of abstraction. It's entirely possible that a filesystem be
mounted via file-loopback on an NFS mount. In that case, the local
system has no information at all about where the disk head would be.
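
The difference between allocating a block at a time and allocating on flush
can be sketched with a toy free-space map. This is only an illustration
under simplifying assumptions (a Python list of booleans stands in for the
bitmap, and all function names are hypothetical); it is not how reiserfs or
reiser4 implement their allocators.

```python
def allocate_block_at_a_time(used, count):
    """Allocate each block as it is written: take the first free block."""
    got = []
    for _ in range(count):
        block = used.index(False)  # raises ValueError if nothing is free
        used[block] = True
        got.append(block)
    return got


def find_free_run(used, length):
    """Return the start of the first run of `length` free blocks, or None."""
    run_start, run_len = 0, 0
    for i, in_use in enumerate(used):
        if in_use:
            run_len = 0
            continue
        if run_len == 0:
            run_start = i
        run_len += 1
        if run_len == length:
            return run_start
    return None


def allocate_on_flush(used, dirty_count):
    """At flush time the whole chunk size is known, so look for one run."""
    start = find_free_run(used, dirty_count)
    if start is None:
        return None  # a real allocator would fall back to fragments here
    for block in range(start, start + dirty_count):
        used[block] = True
    return list(range(start, start + dirty_count))


layout = [True, False, True, False, False, False, True, False]
scattered = allocate_block_at_a_time(list(layout), 3)
contiguous = allocate_on_flush(list(layout), 3)
print(scattered, contiguous)
```

With the same starting layout, the block-at-a-time path ends up with blocks
1, 3 and 4, while the flush-time path gets the single run 3 to 5.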

- -Jeff

- --
Jeff Mahoney
SuSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFDEgYNLPWxlyuTD7IRAlkgAKCM8evk+X3FSAw9IzEbeRKyo+N2tgCffyNi
yNcc2G2Uy09X5zMI97AKaJc=
=UzK+
-----END PGP SIGNATURE-----


* Re: reiser fs slow on mksf and mount
  2005-08-28 18:44                   ` Jeff Mahoney
@ 2005-08-29 12:39                     ` Ming Zhang
  2005-08-29 14:26                       ` Jeff Mahoney
  0 siblings, 1 reply; 28+ messages in thread
From: Ming Zhang @ 2005-08-29 12:39 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: Vladimir V. Saveliev, reiserfs

On Sun, 2005-08-28 at 14:44 -0400, Jeff Mahoney wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Ming Zhang wrote:
> > On Sat, 2005-08-27 at 20:01 -0400, Jeff Mahoney wrote:
> > 
> >> yes, i have a 12*400GB SATA MD raid that want to store my huge number of
> >> pictures (i am not a good photographer, but a quick shooter.) all these
> >> files are named as DSCxxxxxx.jpg but not continuous since some are
> >> really bad so deleted. so i jump out these questions to stress the
> >> scalability of fs.
> 
> ReiserFS is very well suited for lots of files, so you should be all set
> in that respect.

ic. thx.


> 
> > * We don't cache any other metadata (other than the superblock, which is
> > standard practice) specially. In a mostly-reader environment, bitmaps
> > would rank very low in importance for caching.
> > 
> >> could u explain a bit more on what is the purpose of these bitmaps? what
> >> is the relationship between these bitmap and other metadata?
> 
> The bitmaps are used to keep track of which blocks on disk are used, and
> which are available for allocation. Every (blocksize * 8) blocks, there

here blocksize is 512 bytes, right, judging from the data that follows?
does this come from the sector size?

so what is the on-disk layout? i ask because when i had a slow reiserfs
mount on top of RAID1, I saw many small writes each second. I guess they
are scattered over the whole disk.

> is a block reserved to keep track of which blocks in that range are
> allocated or not. On a 4k block filesystem, that boils down to 1 4k
> block for every 128 MB. If a block is used, the bit corresponding to it
> is set. When the block is freed, the bit is cleared.
> 
> Well, there are several kinds of metadata on the filesystem: The super
> block, the bitmaps, the journal, and the reiserfs s-tree itself. The
> journal and bitmaps are only used when writing to the filesystem. The
> superblock and s-tree are used for any filesystem access. The
> relationship is that before a file data block or an s-tree node can be
> allocated on disk, the bitmaps must be checked to see where the block
> can be allocated.

ic. so other meta-data is checked as other file systems.


> 
> >> assumed i have 2GB or 4GB ram, which is not unbelievable for a desktop
> >> now. but can these RAM be used by 32BIT arch?
> 
> The RAM can be used, sure, but not for the bitmaps. I believe the buffer
> heads for the bitmaps need to come out of the memory < 1 GB. It would be
> possible to put the bitmaps in high memory (like any other data), but
> the patch to do so would likely be more involved than the dynamic bitmap
> patch, and still waste the memory anyway.

yes, i also suspect this 1GB limit. So 64bit is the way and AMD64 is
cheap anyway rite?

> 
> > Allocate-on-flush allows the filesystem to wait until the last possible
> > moment to allocate the space on disk, which makes performance a little
> > nicer, but more importantly, allows the allocator to allocate entire
> > chunks of a file rather than a block-at-a-time.
> > 
> >> are u talking about allocating space after that file content is cached
> >> in RAM and before need to be flushed? this is then like a write-any file
> >> system that you can write to a place where it is still continuous and
> >> near to current disk head (though latter is hard to achieve since it is
> >> hidden by LVM/MD/...).
> 
> Well those are two different issues. Allocate on flush would try to keep
> the file as contiguous as possible, whether by appending (ideal) or by
> keeping the new chunk of data all together as a separate fragment rather
> than individual blocks scattered everywhere. As for writing near the

ic. yes, delaying until that point gives the allocator the best knowledge.


> current disk head, that is an operation that is performed by the block
> layer. It can make the best decisions on that, since it is at the
> lowest level of abstraction. It's entirely possible that a filesystem be
> mounted via file-loopback on an NFS mount. In that case, the local
> system has no information at all about where the disk head would be.

yes, but then block layer will need another bitmap to track which block
is used or not and also do a mapping again...

the cost of layering?

ming


> 
> - -Jeff
> 
> - --
> Jeff Mahoney
> SuSE Labs
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.0 (GNU/Linux)
> 
> iD8DBQFDEgYNLPWxlyuTD7IRAlkgAKCM8evk+X3FSAw9IzEbeRKyo+N2tgCffyNi
> yNcc2G2Uy09X5zMI97AKaJc=
> =UzK+
> -----END PGP SIGNATURE-----



* Re: reiser fs slow on mksf and mount
  2005-08-29 12:39                     ` Ming Zhang
@ 2005-08-29 14:26                       ` Jeff Mahoney
  2005-08-29 14:41                         ` Ming Zhang
  0 siblings, 1 reply; 28+ messages in thread
From: Jeff Mahoney @ 2005-08-29 14:26 UTC (permalink / raw)
  To: Ming Zhang; +Cc: Vladimir V. Saveliev, reiserfs

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ming Zhang wrote:
> On Sun, 2005-08-28 at 14:44 -0400, Jeff Mahoney wrote:
>>* We don't cache any other metadata (other than the superblock, which is
>>standard practice) specially. In a mostly-reader environment, bitmaps
>>would rank very low in importance for caching.
> 
>>>could u explain a bit more on what is the purpose of these bitmaps? what
>>>is the relationship between these bitmap and other metadata?
> The bitmaps are used to keep track of which blocks on disk are used, and
> which are available for allocation. Every (blocksize * 8) blocks, there
> 
>> here blocksize is 512bytes right from followed data? this comes from
>> sector size?

No. Block size is the declared filesystem blocksize, not the hardware
sector size. It must be a power of 2 between 512 and 8192 bytes. The "standard"
filesystem blocksize is 4k. If you've declared your block size as 512
bytes (using mkreiserfs -b 512), that would certainly be another source
of performance issues.
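
A small Python sketch of that constraint, together with how much disk one
bitmap block covers at each legal size. The helper name is made up for
illustration; the range and the power-of-2 rule are the ones quoted above,
and the coverage figure follows from one bit per block.

```python
def is_valid_blocksize(size: int) -> bool:
    """Power of two in the 512..8192 byte range quoted above."""
    return 512 <= size <= 8192 and (size & (size - 1)) == 0


valid_sizes = [s for s in range(512, 8193) if is_valid_blocksize(s)]
print(valid_sizes)

for s in valid_sizes:
    # blocks tracked by one bitmap block, times bytes per block
    covered = s * 8 * s
    print(f"blocksize {s:5d}: one bitmap block covers {covered // 2**20} MiB")
```

At 512 bytes a bitmap block covers only 2 MiB instead of 128 MiB, so the
same device needs 64 times as many bitmap blocks. That may be one
contributor to the performance issues mentioned, though it is probably not
the only one.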

>> so what is the on disk layout? i asked this because when i have a slow
>> mount reiserfs on top of RAID1, I saw many small write each second. I
>> guess they scatter over whole disk.

Well, two things occur on mount: Reading the bitmaps causes one read per
128M of filesystem, and replaying the journal can cause up to 8192 block
writes. Replaying the journal is generally pretty quick.
Reading the bitmaps on a large filesystem can take a while. This is the
issue you originally asked about.
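
To picture where those reads land, here is a hedged Python sketch that lists
bitmap block numbers for a 4k filesystem. The placement rule is an
assumption on my part, based on the usual description of the current disk
format: the first bitmap block sits right after the superblock at the 64 KiB
offset, and each later one sits at a multiple of (blocksize * 8) blocks. The
function name is hypothetical.

```python
def bitmap_block_numbers(total_blocks: int, blocksize: int = 4096) -> list:
    """Block numbers of all bitmap blocks under the assumed layout."""
    per = blocksize * 8                      # blocks tracked per bitmap block
    count = (total_blocks + per - 1) // per  # bitmap blocks needed, rounded up
    first = 65536 // blocksize + 1           # assumed: block after superblock
    return [first] + [i * per for i in range(1, count)]


locations = bitmap_block_numbers(781422592)  # block count of /dev/md0
stride_bytes = (locations[2] - locations[1]) * 4096

print(len(locations), "bitmap blocks")
print("first few block numbers:", locations[:4])
print("spacing:", stride_bytes // 2**20, "MiB")
```

If the assumption holds, mounting the 3.2TB array without the dynamic bitmap
patch means 23848 separate 4k reads, each 128 MiB further into the device
than the previous one.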

> is a block reserved to keep track of which blocks in that range are
> allocated or not. On a 4k block filesystem, that boils down to 1 4k
> block for every 128 MB. If a block is used, the bit corresponding to it
> is set. When the block is freed, the bit is cleared.
> 
> Well, there are several kinds of metadata on the filesystem: The super
> block, the bitmaps, the journal, and the reiserfs s-tree itself. The
> journal and bitmaps are only used when writing to the filesystem. The
> superblock and s-tree are used for any filesystem access. The
> relationship is that before a file data block or an s-tree node can be
> allocated on disk, the bitmaps must be checked to see where the block
> can be allocated.
> 
>> ic. so other meta-data is checked as other file systems.

No. The bitmaps and journal are still part of the same filesystem. They
are just not part of the s-tree.

>>>assumed i have 2GB or 4GB ram, which is not unbelievable for a desktop
>>>now. but can these RAM be used by 32BIT arch?
> The RAM can be used, sure, but not for the bitmaps. I believe the buffer
> heads for the bitmaps need to come out of the memory < 1 GB. It would be
> possible to put the bitmaps in high memory (like any other data), but
> the patch to do so would likely be more involved than the dynamic bitmap
> patch, and still waste the memory anyway.
> 
>> yes, i also suspect this 1GB limit. So 64bit is the way and AMD64 is
>> cheap anyway rite?

Personally, I think so.

> current disk head, that is an operation that is performed by the block
> layer. It can make the best decisions on that, since it is at the
> lowest level of abstraction. It's entirely possible that a filesystem be
> mounted via file-loopback on an NFS mount. In that case, the local
> system has no information at all about where the disk head would be.
> 
>> yes, but then block layer will need another bitmap to track which block
>> is used or not and also do a mapping again...
> 
>> the cost of layering?

The ideas of "in use" and "available" are purely filesystem abstractions
to keep track of where we already have filesystem data/metadata. The
block layer doesn't know or care about them - it's just a collection of
blocks that the user may do whatever they please with. Now, not to
confuse the issue, but the example of a loopback-mounted filesystem can
cause an allocation if the host file is sparse, but that's really a
corner case.

- -Jeff

- --
Jeff Mahoney
SuSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFDExsrLPWxlyuTD7IRAvGmAJ9QU16I2oz/kkCbqwdeGcIgkey8TgCgqS8s
lI6YzJEJ20j5LiheAqw6eoE=
=YD9V
-----END PGP SIGNATURE-----


* Re: reiser fs slow on mksf and mount
  2005-08-29 14:26                       ` Jeff Mahoney
@ 2005-08-29 14:41                         ` Ming Zhang
  2005-08-29 14:51                           ` Jeff Mahoney
  0 siblings, 1 reply; 28+ messages in thread
From: Ming Zhang @ 2005-08-29 14:41 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: Vladimir V. Saveliev, reiserfs

On Mon, 2005-08-29 at 10:26 -0400, Jeff Mahoney wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Ming Zhang wrote:
> > On Sun, 2005-08-28 at 14:44 -0400, Jeff Mahoney wrote:
> >>* We don't cache any other metadata (other than the superblock, which is
> >>standard practice) specially. In a mostly-reader environment, bitmaps
> >>would rank very low in importance for caching.
> > 
> >>>could u explain a bit more on what is the purpose of these bitmaps? what
> >>>is the relationship between these bitmap and other metadata?
> > The bitmaps are used to keep track of which blocks on disk are used, and
> > which are available for allocation. Every (blocksize * 8) blocks, there
> > 
> >> here blocksize is 512bytes right from followed data? this comes from
> >> sector size?
> 
> No. Block size is the declared filesystem blocksize, not the hardware
> sector size. It must be a power of 2, and 512-8192 bytes. The "standard"
> filesystem blocksize is 4k. If you've declared your block size as 512
> bytes (using mkreiserfs -b 512), that would certainly be another source
> of performance issues.

so 1 block per bit, thus (blocksize * 8) blocks per bitmap block.


> 
> >> so what is the on disk layout? i asked this because when i have a slow
> >> mount reiserfs on top of RAID1, I saw many small write each second. I
> >> guess they scatter over whole disk.
> 
> Well two things occur on mount: Reading the bitmaps causes a read every
> 128M to occur, and replaying the journal can cause up to 8192 block
> writes to occur. Replaying the journal is generally pretty quick.
> Reading the bitmaps on a large filesystem can take a while. This is the
> issue you originally asked about.

since that is a newly formatted fs, there is no journal to replay.
because that FS is big (3.2TB), if the bitmaps are not contiguous on
disk, then the read is like a random read of ~100MB in total, in 4K
pieces. so is this why it is slow?

is there any way to store these bitmaps together?

> 
> > is a block reserved to keep track of which blocks in that range are
> > allocated or not. On a 4k block filesystem, that boils down to 1 4k
> > block for every 128 MB. If a block is used, the bit corresponding to it
> > is set. When the block is freed, the bit is cleared.
> > 
> > Well, there are several kinds of metadata on the filesystem: The super
> > block, the bitmaps, the journal, and the reiserfs s-tree itself. The
> > journal and bitmaps are only used when writing to the filesystem. The
> > superblock and s-tree are used for any filesystem access. The
> > relationship is that before a file data block or an s-tree node can be
> > allocated on disk, the bitmaps must be checked to see where the block
> > can be allocated.
> > 
> >> ic. so other meta-data is checked as other file systems.
> 
> No. The bitmaps and journal are still part of the same filesystem. They
> are just not part of the s-tree.

yes. sorry, i should have said that the file system still uses the s-tree
to locate file data, while the bitmaps assist block allocation and the
journal is for consistency.

> 
> >>>assumed i have 2GB or 4GB ram, which is not unbelievable for a desktop
> >>>now. but can these RAM be used by 32BIT arch?
> > The RAM can be used, sure, but not for the bitmaps. I believe the buffer
> > heads for the bitmaps need to come out of the memory < 1 GB. It would be
> > possible to put the bitmaps in high memory (like any other data), but
> > the patch to do so would likely be more involved than the dynamic bitmap
> > patch, and still waste the memory anyway.
> > 
> >> yes, i also suspect this 1GB limit. So 64bit is the way and AMD64 is
> >> cheap anyway rite?
> 
> Personally, I think so.
> 
> > current disk head, that is an operation that is performed by the block
> > layer. It can make the best decisions on that, since it is at the
> > lowest level of abstraction. It's entirely possible that a filesystem be
> > mounted via file-loopback on an NFS mount. In that case, the local
> > system has no information at all about where the disk head would be.
> > 
> >> yes, but then block layer will need another bitmap to track which block
> >> is used or not and also do a mapping again...
> > 
> >> the cost of layering?
> 
> The ideas of "in use" and "available" are purely filesystem abstractions
> to keep track of where we already have filesystem data/metadata. The
> block layer doesn't know or care about them - it's just a collection of
> blocks that the user may do whatever they please with. Now, not to
> confuse the issue, but the example of a loopback-mounted filesystem can
> cause an allocation if the host file is sparse, but that's really a
> corner case.
> 

yes, that is a cost worth paying. the file system just needs a set of
blocks to work on...


> - -Jeff
> 
> - --
> Jeff Mahoney
> SuSE Labs
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.0 (GNU/Linux)
> 
> iD8DBQFDExsrLPWxlyuTD7IRAvGmAJ9QU16I2oz/kkCbqwdeGcIgkey8TgCgqS8s
> lI6YzJEJ20j5LiheAqw6eoE=
> =YD9V
> -----END PGP SIGNATURE-----



* Re: reiser fs slow on mksf and mount
  2005-08-29 14:41                         ` Ming Zhang
@ 2005-08-29 14:51                           ` Jeff Mahoney
  2005-08-29 15:20                             ` Ming Zhang
  0 siblings, 1 reply; 28+ messages in thread
From: Jeff Mahoney @ 2005-08-29 14:51 UTC (permalink / raw)
  To: mingz; +Cc: Vladimir V. Saveliev, reiserfs

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ming Zhang wrote:
> On Mon, 2005-08-29 at 10:26 -0400, Jeff Mahoney wrote:
> No. Block size is the declared filesystem blocksize, not the hardware
> sector size. It must be a power of 2, and 512-8192 bytes. The "standard"
> filesystem blocksize is 4k. If you've declared your block size as 512
> bytes (using mkreiserfs -b 512), that would certainly be another source
> of performance issues.
> 
>> so 1 block per bit, thus (blocksize * 8) block per block.

Exactly.

>> since that is a newly formatted fs, there is no journal to replay.
>> because that FS is big with 3.2TB, if bitmap is not continuous on disks,
>> then the read is like a random read to read around total ~100MB 4K piece
>> from disk. so this is why it is slow?

I need to look into this some more, but I suspect it may be related to
congestion avoidance. The requests don't get held up waiting for the data
to come back but, rather, in allocating the request in the first place.

>> any way to store these bitmap together?

The "old" reiserfs disk format did exactly that. However, the gain
realized (if any, see above) at mount time is quickly lost when the
filesystem can no longer be dynamically expanded/shrunk, and if the
bitmaps are actually read on-demand, then it causes needless seeks to
the "bitmap section."

- -Jeff

- --
Jeff Mahoney
SuSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFDEyDlLPWxlyuTD7IRArjZAJoCxQCJ8Qs4AM1OQZEJIhz1BvYwDQCeIRk+
VvRxXcyH1puW2vq1xDYygL0=
=FVcM
-----END PGP SIGNATURE-----


* Re: reiser fs slow on mksf and mount
  2005-08-27 21:45             ` Christian Iversen
  2005-08-27 21:55               ` David Masover
  2005-08-27 22:54               ` Ming Zhang
@ 2005-08-29 15:07               ` Jeff Mahoney
  2 siblings, 0 replies; 28+ messages in thread
From: Jeff Mahoney @ 2005-08-29 15:07 UTC (permalink / raw)
  To: Christian Iversen; +Cc: reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Christian Iversen wrote:
> On Saturday 27 August 2005 21:29, Jeff Mahoney wrote:
>>My dynamic bitmap patch fixes both of the problems you've posed so far.
>>Mount time is reduced to O(1) time, since only the superblock and root
>>node are read at mount time. On my system, it's something along the
>>lines of 0.2s. Memory consumption is reduced also, because the bitmap
>>block is released after the allocation/free that required it is complete.
> 
> I've been reading about this patch with quite some interest. Would you say 
> it's stable enough for daily use? I have a terabyte array that takes forever 
> to mount, and probably uses quite a bit of memory too.

I've done testing with it, and it's been ok. I haven't heard any bug
reports, but I haven't heard any "hey it works" comments either. It
should be pretty stable - the only thing I'm concerned about is how it
deals with I/O errors. That part of the patch will be dependent on my
developing better i/o handling for reiserfs in general. That patch is
also mostly done, but needs more testing.

- -Jeff

- --
Jeff Mahoney
SuSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFDEySgLPWxlyuTD7IRAoMXAJ9IozJtv2236HK1R/4bMMaK8/mrIACeMYUu
tz7nPOMFTIY2dJ/j9DXjMG0=
=lGWE
-----END PGP SIGNATURE-----


* Re: reiser fs slow on mksf and mount
  2005-08-29 14:51                           ` Jeff Mahoney
@ 2005-08-29 15:20                             ` Ming Zhang
  2005-08-29 15:28                               ` Jeff Mahoney
  0 siblings, 1 reply; 28+ messages in thread
From: Ming Zhang @ 2005-08-29 15:20 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: Vladimir V. Saveliev, reiserfs

On Mon, 2005-08-29 at 10:51 -0400, Jeff Mahoney wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Ming Zhang wrote:
> > On Mon, 2005-08-29 at 10:26 -0400, Jeff Mahoney wrote:
> > No. Block size is the declared filesystem blocksize, not the hardware
> > sector size. It must be a power of 2, and 512-8192 bytes. The "standard"
> > filesystem blocksize is 4k. If you've declared your block size as 512
> > bytes (using mkreiserfs -b 512), that would certainly be another source
> > of performance issues.
> > 
> >> so 1 block per bit, thus (blocksize * 8) block per block.
> 
> Exactly.
> 
> >> since that is a newly formatted fs, there is no journal to replay.
> >> because that FS is big with 3.2TB, if bitmap is not continuous on disks,
> >> then the read is like a random read to read around total ~100MB 4K piece
> >> from disk. so this is why it is slow?
> 
> I need to look into this some more, but I suspect it may be related to
> congestion avoidance. The requests don't bind up in waiting for the data
> to come back, but, rather, allocating the request in the first place.
> 
> >> any way to store these bitmap together?
> 
> The "old" reiserfs disk format did exactly that. However, the gain
> realized (if any, see above) at mount time is quickly lost when the
> filesystem can no longer be dynamically expanded/shrunk, and if the
> bitmaps are actually read on-demand, then it causes needless seeks to
> the "bitmap section."

but in that case the bitmaps would not be scattered all around the disk,
right?

so where could i find a document about this bitmap layout, and also
detailed information on the whole file system layout?

thanks.

ming



> 
> - -Jeff
> 
> - --
> Jeff Mahoney
> SuSE Labs
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.0 (GNU/Linux)
> 
> iD8DBQFDEyDlLPWxlyuTD7IRArjZAJoCxQCJ8Qs4AM1OQZEJIhz1BvYwDQCeIRk+
> VvRxXcyH1puW2vq1xDYygL0=
> =FVcM
> -----END PGP SIGNATURE-----



* Re: reiser fs slow on mksf and mount
  2005-08-29 15:20                             ` Ming Zhang
@ 2005-08-29 15:28                               ` Jeff Mahoney
  2005-08-29 15:37                                 ` Ming Zhang
  0 siblings, 1 reply; 28+ messages in thread
From: Jeff Mahoney @ 2005-08-29 15:28 UTC (permalink / raw)
  To: mingz; +Cc: Vladimir V. Saveliev, reiserfs, Hans Reiser

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ming Zhang wrote:
>>>any way to store these bitmap together?
> The "old" reiserfs disk format did exactly that. However, the gain
> realized (if any, see above) at mount time is quickly lost when the
> filesystem can no longer be dynamically expanded/shrunk, and if the
> bitmaps are actually read on-demand, then it causes needless seeks to
> the "bitmap section."
> 
>> but anyway the bitmap will not scattered all around the disk rite?

If they were grouped together, no. As I said, though, there are other
reasons not to do that.

>> so where i could find a document about this bitmap layout? also detailed
>> information on whole file system layout?

Since Reiser4 became the focus of Namesys development, Reiser3
information has been somewhat difficult to find on the web site. It's
possible to find it using archive.org, however.

Hans - would you consider restoring reiser3 information to the namesys
web site?

- -Jeff

- --
Jeff Mahoney
SuSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFDEymZLPWxlyuTD7IRAv5wAJ9coGB6bChWSLyK1x7lB1LF2E4mIwCfUsEN
4nZnXzED8ytA5jiTkv0LbcE=
=Tocq
-----END PGP SIGNATURE-----


* Re: reiser fs slow on mksf and mount
  2005-08-29 15:28                               ` Jeff Mahoney
@ 2005-08-29 15:37                                 ` Ming Zhang
  0 siblings, 0 replies; 28+ messages in thread
From: Ming Zhang @ 2005-08-29 15:37 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: Vladimir V. Saveliev, reiserfs, Hans Reiser

On Mon, 2005-08-29 at 11:28 -0400, Jeff Mahoney wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Ming Zhang wrote:
> >>>any way to store these bitmap together?
> > The "old" reiserfs disk format did exactly that. However, the gain
> > realized (if any, see above) at mount time is quickly lost when the
> > filesystem can no longer be dynamically expanded/shrunk, and if the
> > bitmaps are actually read on-demand, then it causes needless seeks to
> > the "bitmap section."
> > 
> >> but anyway the bitmap will not scattered all around the disk rite?
> 
> If they were grouped together, no. As I said, though, there are other
> reasons not to do that.
> 
> >> so where i could find a document about this bitmap layout? also detailed
> >> information on whole file system layout?
> 
> Since Reiser4 became the focus of Namesys development, Reiser3
> information has been somewhat difficult to find on the web site. It's
> possible to find it using archive.org, however.
> 
> Hans - would you consider restoring reiser3 information to the namesys
> web site?

V4 information would be great as well, but namesys.com does not have
detailed info.

ming

> 
> - -Jeff
> 
> - --
> Jeff Mahoney
> SuSE Labs
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.0 (GNU/Linux)
> 
> iD8DBQFDEymZLPWxlyuTD7IRAv5wAJ9coGB6bChWSLyK1x7lB1LF2E4mIwCfUsEN
> 4nZnXzED8ytA5jiTkv0LbcE=
> =Tocq
> -----END PGP SIGNATURE-----



* Re: reiser fs slow on mksf and mount
  2005-08-26 17:04 ` Vladimir V. Saveliev
  2005-08-26 17:08   ` Ming Zhang
@ 2005-08-29 16:44   ` Ming Zhang
  1 sibling, 0 replies; 28+ messages in thread
From: Ming Zhang @ 2005-08-29 16:44 UTC (permalink / raw)
  To: Vladimir V. Saveliev; +Cc: reiserfs

just reproduced this on an 18GB SCSI disk. mount is still slow, so it is
not related to RAID, only to the bitmaps. i just did modprobe aic7xxx,
mkfs, and then mount, so the disk should already be spun up.

ming


--------------------------------------

[root@sc420 root]# time mkfs.reiserfs -ff /dev/sdh1
mkfs.reiserfs 3.6.13 (2003 www.namesys.com)

...

Guessing about desired format.. Kernel 2.6.12.4 is running.
Format 3.6 with standard journal
Count of blocks on the device: 4421872
Number of blocks consumed by mkreiserfs formatting process: 8346
Blocksize: 4096
Hash function used to sort names: "r5"
Journal Size 8193 blocks (first block 18)
Journal Max transaction length 1024
inode generation number: 0
UUID: b3de310a-b494-4415-a921-090d94f2f211
Initializing journal - 0%....20%....40%....60%....80%....100%
Syncing..ok

Tell your friends to use a kernel based on 2.4.18 or later, and
especially not a
kernel based on 2.4.9, when you use reiserFS. Have fun.

ReiserFS is successfully created on /dev/sdh1.

real    0m5.018s
user    0m0.028s
sys     0m0.134s

[root@sc420 root]# time mount /dev/sdh1 t

real    1m3.608s
user    0m0.000s
sys     0m0.052s

[root@sc420 root]# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda5             20161172   6417940  12719092  34% /
/dev/sda3               194449     18990    165419  11% /boot
none                    127544         0    127544   0% /dev/shm
/dev/sdh1             17686944     32840  17654104   1% /root/t

[root@sc420 root]# rpm -q reiserfs-utils
reiserfs-utils-3.6.13-1


On Fri, 2005-08-26 at 21:04 +0400, Vladimir V. Saveliev wrote:
> Hello
> 
> Ming Zhang wrote:
> > Hi, folks
> > 
> > I am not sure if this is normal or not.
> > 
> > I try to create&use a reiserfs on a 8 disk raid0. Then I found that mkfs
> > need ~90 sec and mount need ~70 seconds. 
> > 
> > Is there anything wrong on my side?
> > 
> 
> Your device is too big.
> 
> > Thanks!
> > 
> > 
> > Ming
> > 
> > 
> > 
> > Detailed info followed.
> > ---------------------------------------------------------------------
> > [root@bakstor2u root]# cat /proc/mdstat
> > Personalities : [linear] [raid0] [raid1] [raid5] [multipath] [raid6] [raid10] [faulty]
> > md0 : active raid0 sda[0] sdh[7] sdg[6] sdf[5] sde[4] sdd[3] sdc[2] sdb[1]
> >       3125690368 blocks 64k chunks
> > 
> > unused devices: <none>
> > 
> > [root@bakstor2u root]# time mkfs.reiserfs /dev/md0 -ff
> > mkfs.reiserfs 3.6.13 (2003 www.namesys.com)
> > 
> > <...>
> > 
> > Guessing about desired format.. Kernel 2.6.11.12 is running.
> > Format 3.6 with standard journal
> > Count of blocks on the device: 781422592
> > Number of blocks consumed by mkreiserfs formatting process: 32059
> > Blocksize: 4096
> > Hash function used to sort names: "r5"
> > Journal Size 8193 blocks (first block 18)
> > Journal Max transaction length 1024
> > inode generation number: 0
> > UUID: 98d990f3-d54f-43e3-9fde-8c9c9a6d3481
> > Initializing journal - 0%....20%....40%....60%....80%....100%
> > Syncing..ok
> > 
> > Tell your friends to use a kernel based on 2.4.18 or later, and especially not a
> > kernel based on 2.4.9, when you use reiserFS. Have fun.
> > 
> > ReiserFS is successfully created on /dev/md0.
> > 
> > real    1m28.783s
> > user    0m0.151s
> > sys     0m0.398s
> > 
> 
> Hmm, mkfs.reiserfs had to write 32059 blocks, which is about 131 MB. 1m28s is too much for that.
> Could it be that some of the disks in that raid were not spinning when you started mkreiserfs?
> 
> > [root@bakstor2u root]# time mount /dev/md0  t
> > 
> > real    1m11.448s
> > user    0m0.000s
> > sys     0m0.225s
> > 
> 
> There is a patch to cure this problem.
> http://www.mail-archive.com/reiserfs-list@namesys.com/msg18442.html
> Please note that it is an experimental one.
> 
> > 
> > 
> > 
> 
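
[Editorial note: the "about 131 MB" figure quoted above can be checked with a little arithmetic. The Python sketch below uses only numbers printed by the two mkfs.reiserfs runs in this message (block size, blocks consumed by formatting, wall-clock time). The function names are made up for illustration, and the throughput figure assumes that all of the elapsed time went into writing those blocks, which the thread does not establish.]

```python
BLOCK_SIZE = 4096  # bytes, from "Blocksize: 4096" in the mkfs output


def megabytes_written(blocks, block_size=BLOCK_SIZE):
    """Decimal megabytes written by the formatting process."""
    return blocks * block_size / 1e6


def throughput_mb_per_s(blocks, seconds, block_size=BLOCK_SIZE):
    """Average write rate, assuming the whole run was spent writing."""
    return megabytes_written(blocks, block_size) / seconds


# (blocks consumed by formatting, "real" time in seconds) per device
runs = {
    "/dev/md0": (32059, 60 + 28.783),   # 8-disk raid0
    "/dev/sdh1": (8346, 5.018),         # single partition
}

for dev, (blocks, secs) in runs.items():
    print(f"{dev}: {megabytes_written(blocks):.1f} MB in {secs:.1f} s, "
          f"{throughput_mb_per_s(blocks, secs):.2f} MB/s")
```

Under that assumption the raid0 run averages roughly 1.5 MB/s against roughly 6.8 MB/s for the single partition, which is the sense in which 1m28s looks like too much.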



* Re: reiser fs slow on mksf and mount
  2005-08-27 19:29           ` Jeff Mahoney
  2005-08-27 21:45             ` Christian Iversen
  2005-08-27 22:53             ` Ming Zhang
@ 2005-08-29 19:40             ` Hans Reiser
  2005-08-29 19:44               ` Jeff Mahoney
  2 siblings, 1 reply; 28+ messages in thread
From: Hans Reiser @ 2005-08-29 19:40 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: mingz, Vladimir V. Saveliev, reiserfs-list

Did you ever look into my question about device congestion, and whether
raising that limit would fix the bitmap loading time issue?

Hans

Jeff Mahoney wrote:

> Ming Zhang wrote:
>
> >On Fri, 2005-08-26 at 21:32 +0400, Vladimir V. Saveliev wrote:
>
>
> >One more question about these bitmap blocks:
>
> >is this bitmap data pinned in memory, so that it will not be swapped out?
>
>
> Yes, any buffers/pages with active reference counts are kept in memory.
> Since the current reiserfs bitmap implementation keeps a reference until
> filesystem umount, the bitmaps are pinned.
>
> My dynamic bitmap patch fixes both of the problems you've posed so far.
> Mount time is reduced to O(1) time, since only the superblock and root
> node are read at mount time. On my system, it's something along the
> lines of 0.2s. Memory consumption is reduced also, because the bitmap
> block is released after the allocation/free that required it is complete.
>
> It's a relatively straightforward patch - the error handling I refer to
> is how to handle block read failures, which would only occur if your
> disk is failing.
>
> -Jeff
>
> --
> Jeff Mahoney
> SuSE Labs
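
[Editorial note: to put numbers on the pinned bitmaps discussed above, here is a minimal Python sketch. It assumes the reiserfs 3.6 layout in which each bitmap block tracks blocksize * 8 filesystem blocks, and it takes the block counts from the mkfs.reiserfs output earlier in the thread. The function names are hypothetical.]

```python
def bitmap_blocks(total_blocks, block_size=4096):
    """Number of bitmap blocks, one per (block_size * 8) filesystem blocks."""
    blocks_per_bitmap = block_size * 8          # one bit per block
    return -(-total_blocks // blocks_per_bitmap)  # ceiling division


def pinned_mib(total_blocks, block_size=4096):
    """Memory held by bitmaps if every bitmap block stays referenced."""
    return bitmap_blocks(total_blocks, block_size) * block_size / 2**20


for dev, total in (("/dev/md0", 781422592), ("/dev/sdh1", 4421872)):
    print(f"{dev}: {bitmap_blocks(total)} bitmap blocks, "
          f"{pinned_mib(total):.2f} MiB pinned until umount")
```

For the 8-disk raid0 this comes to 23848 bitmap blocks, or about 93 MiB held until umount under the current implementation; the small /dev/sdh1 partition needs only 135.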



* Re: reiser fs slow on mksf and mount
  2005-08-27 21:55               ` David Masover
@ 2005-08-29 19:44                 ` Hans Reiser
  0 siblings, 0 replies; 28+ messages in thread
From: Hans Reiser @ 2005-08-29 19:44 UTC (permalink / raw)
  To: David Masover; +Cc: Christian Iversen, reiserfs-list

David Masover wrote:

> Christian Iversen wrote:
>
>> On Saturday 27 August 2005 21:29, Jeff Mahoney wrote:
>>
>>> Ming Zhang wrote:
>>
>
>> Another thing is that it can easily take several seconds to do "ls
>> -l" on a directory with 0-10 GB of data in it. Is that normal? There
>> are usually fewer than 50 files of test data, ranging in size from
>> 200MB to 900MB. I've disabled atime updates, but that didn't help
>> much. The controller and disks are plenty fast, so I feel something
>> is amiss.
>
>
> Interesting, I'd always assumed this was an issue with the lazy
> allocation.  On my box, this meant that occasionally, I'd run into a
> situation where some random FS operation would take 5-10 seconds,
> because (I assumed) it would have been the random operation that used
> up enough RAM that the FS decided to flush.
>
>
This performance issue should be fixed in reiser4, please give it a
try.  It has to do with where stat data get stored.


* Re: reiser fs slow on mksf and mount
  2005-08-29 19:40             ` Hans Reiser
@ 2005-08-29 19:44               ` Jeff Mahoney
  2005-08-29 19:53                 ` Hans Reiser
  0 siblings, 1 reply; 28+ messages in thread
From: Jeff Mahoney @ 2005-08-29 19:44 UTC (permalink / raw)
  To: Hans Reiser; +Cc: mingz, Vladimir V. Saveliev, reiserfs-list

Hans Reiser wrote:
> Did you ever look into my question about device congestion, and whether
> raising that limit would fix the bitmap loading time issue?

I haven't really had the time recently. I'll look into it, but it
doesn't change the fact that we're wasting RAM on huge filesystems.

-Jeff

--
Jeff Mahoney
SuSE Labs


* Re: reiser fs slow on mksf and mount
  2005-08-29 19:44               ` Jeff Mahoney
@ 2005-08-29 19:53                 ` Hans Reiser
  0 siblings, 0 replies; 28+ messages in thread
From: Hans Reiser @ 2005-08-29 19:53 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: mingz, Vladimir V. Saveliev, reiserfs-list, Nate Diller

Jeff Mahoney wrote:

> Hans Reiser wrote:
>
> >Did you ever look into my question about device congestion, and whether
> >raising that limit would fix the bitmap loading time issue?
>
>
> I haven't really had the time recently. I'll look into it, but it
> doesn't change the fact that we're wasting RAM on huge filesystems.
>
> -Jeff
>
> --
> Jeff Mahoney
> SuSE Labs

Do you understand the argument, namely that it does not do any good to
have N spindles if the device congestion limit prevents them from going
in parallel?

An easy way to test this would be to see if striped devices mount faster
than concatenated ones.  If yes, then there is crap I/O scheduler and/or
raid device driver code to fix, and it matters more than the original
issue in question.
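
[Editorial note: the argument above can be put into rough numbers with a toy model. The Python sketch below is illustrative only. It assumes that the 1m11.448s mount of /dev/md0 reported at the start of the thread was spent entirely on reading bitmap blocks, that each bitmap block covers blocksize * 8 blocks, and that reads could be spread perfectly evenly over the 8 member disks, which depends on the stripe geometry. None of these assumptions is confirmed in the thread, and the function names are hypothetical.]

```python
import math

TOTAL_BLOCKS = 781422592   # "Count of blocks on the device" for /dev/md0
BLOCK_SIZE = 4096
MOUNT_SECONDS = 71.448     # "real 1m11.448s" for mount /dev/md0


def bitmap_blocks(total_blocks, block_size=BLOCK_SIZE):
    """One bitmap block per (block_size * 8) filesystem blocks, rounded up."""
    return math.ceil(total_blocks / (block_size * 8))


def per_read_ms(mount_seconds, n_bitmaps):
    """Implied time per bitmap read if reads are issued one at a time."""
    return mount_seconds / n_bitmaps * 1000


def mount_estimate(n_bitmaps, read_seconds, in_flight):
    """Toy estimate: reads proceed in batches of `in_flight` at once."""
    return math.ceil(n_bitmaps / in_flight) * read_seconds


n = bitmap_blocks(TOTAL_BLOCKS)
latency = MOUNT_SECONDS / n
print(f"{n} bitmap blocks, about {per_read_ms(MOUNT_SECONDS, n):.1f} ms each")
for in_flight in (1, 8):
    print(f"{in_flight} read(s) in flight: "
          f"about {mount_estimate(n, latency, in_flight):.0f} s")
```

With one read in flight the model simply reproduces the observed 71 s, at about 3 ms per bitmap block; with eight in flight it drops to about 9 s. The single-disk /dev/sdh1 mount quoted earlier in the thread took a similar time with far fewer bitmap blocks, so bitmap reads alone may not explain every case.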


end of thread, other threads:[~2005-08-29 19:53 UTC | newest]

Thread overview: 28+ messages
2005-08-26 16:45 reiser fs slow on mksf and mount Ming Zhang
2005-08-26 17:04 ` Vladimir V. Saveliev
2005-08-26 17:08   ` Ming Zhang
2005-08-26 17:15     ` Ming Zhang
2005-08-26 17:32       ` Vladimir V. Saveliev
2005-08-26 18:07         ` Ming Zhang
2005-08-26 18:16         ` Ming Zhang
2005-08-27 19:29           ` Jeff Mahoney
2005-08-27 21:45             ` Christian Iversen
2005-08-27 21:55               ` David Masover
2005-08-29 19:44                 ` Hans Reiser
2005-08-27 22:54               ` Ming Zhang
2005-08-29 15:07               ` Jeff Mahoney
2005-08-27 22:53             ` Ming Zhang
2005-08-28  0:01               ` Jeff Mahoney
2005-08-28 15:40                 ` Ming Zhang
2005-08-28 18:44                   ` Jeff Mahoney
2005-08-29 12:39                     ` Ming Zhang
2005-08-29 14:26                       ` Jeff Mahoney
2005-08-29 14:41                         ` Ming Zhang
2005-08-29 14:51                           ` Jeff Mahoney
2005-08-29 15:20                             ` Ming Zhang
2005-08-29 15:28                               ` Jeff Mahoney
2005-08-29 15:37                                 ` Ming Zhang
2005-08-29 19:40             ` Hans Reiser
2005-08-29 19:44               ` Jeff Mahoney
2005-08-29 19:53                 ` Hans Reiser
2005-08-29 16:44   ` Ming Zhang
