From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeff Mahoney
Subject: Re: reiser fs slow on mksf and mount
Date: Mon, 29 Aug 2005 10:26:51 -0400
Message-ID: <43131B2B.8020406@suse.com>
References: <1125074717.5549.44.camel@localhost.localdomain>
	<430F4B91.7030909@namesys.com>
	<1125076138.5549.65.camel@localhost.localdomain>
	<1125076558.5549.72.camel@localhost.localdomain>
	<430F5226.50701@namesys.com>
	<1125080213.5549.100.camel@localhost.localdomain>
	<4310BF35.3060603@suse.com>
	<1125183208.5569.3.camel@localhost.localdomain>
	<4310FEC1.5020600@suse.com>
	<1125243608.5544.18.camel@localhost.localdomain>
	<4312060D.8020904@suse.com>
	<1125319151.5544.16.camel@localhost.localdomain>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Return-path:
list-help:
list-unsubscribe:
list-post:
Errors-To: flx@namesys.com
In-Reply-To: <1125319151.5544.16.camel@localhost.localdomain>
List-Id:
Content-Type: text/plain; charset="us-ascii"
To: Ming Zhang
Cc: "Vladimir V. Saveliev" , reiserfs

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ming Zhang wrote:
> On Sun, 2005-08-28 at 14:44 -0400, Jeff Mahoney wrote:
>>* We don't cache any other metadata (other than the superblock, which is
>>standard practice) specially. In a mostly-reader environment, bitmaps
>>would rank very low in importance for caching.
>
>>>could u explain a bit more on what is the purpose of these bitmaps? what
>>>is the relationship between these bitmap and other metadata?
> The bitmaps are used to keep track of which blocks on disk are used, and
> which are available for allocation. Every (blocksize * 8) blocks, there
>
>> here blocksize is 512bytes right from followed data? this comes from
>> sector size?

No. Block size is the declared filesystem blocksize, not the hardware
sector size. It must be a power of 2, and 512-8192 bytes. The "standard"
filesystem blocksize is 4k. If you've declared your block size as 512
bytes (using mkreiserfs -b 512), that would certainly be another source
of performance issues.
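
The bitmap arithmetic discussed in this thread can be sketched in a few
lines of Python. This is only an illustration: the helper names are made
up and are not part of reiserfs or reiserfsprogs, the 1 TiB size is an
arbitrary example, and the sketch assumes each bitmap block covers
exactly (blocksize * 8) blocks while ignoring the superblock and journal
areas.

```python
# Sketch of the bitmap arithmetic; names are hypothetical, not reiserfs APIs.

def bytes_covered_per_bitmap_block(blocksize: int) -> int:
    """One bitmap block holds blocksize * 8 bits, one bit per filesystem block."""
    blocks_tracked = blocksize * 8
    return blocks_tracked * blocksize


def bitmap_blocks_needed(fs_bytes: int, blocksize: int) -> int:
    """Number of bitmap blocks for a filesystem of fs_bytes, rounded up."""
    covered = bytes_covered_per_bitmap_block(blocksize)
    return -(-fs_bytes // covered)  # ceiling division


if __name__ == "__main__":
    one_tib = 1 << 40  # example filesystem size, chosen arbitrarily
    for blocksize in (512, 1024, 2048, 4096, 8192):
        covered_mib = bytes_covered_per_bitmap_block(blocksize) / (1 << 20)
        count = bitmap_blocks_needed(one_tib, blocksize)
        print(f"{blocksize:5d}-byte blocks: one bitmap per {covered_mib:g} MiB, "
              f"{count} bitmap blocks for 1 TiB")
```

Under these assumptions a 4k block size gives one bitmap block per
128 MiB, while 512-byte blocks give one per 2 MiB, so the same
filesystem would need 64 times as many bitmap blocks. That is one
plausible reason a small block size would make things slower whenever
all the bitmaps have to be read.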

>> so what is the on disk layout? i asked this because when i have a slow
>> mount reiserfs on top of RAID1, I saw many small write each second. I
>> guess they scatter over whole disk.

Well two things occur on mount: Reading the bitmaps causes a read every
128M to occur, and replaying the journal can cause up to 8192 block
writes to occur. Replaying the journal is generally pretty quick.
Reading the bitmaps on a large filesystem can take a while. This is the
issue you originally asked about.

> is a block reserved to keep track of which blocks in that range are
> allocated or not. On a 4k block filesystem, that boils down to 1 4k
> block for every 128 MB. If a block is used, the bit corresponding to it
> is set. When the block is freed, the bit is cleared.
>
> Well there are a several kinds of metadata on the filesystem: The super
> block, the bitmaps, the journal, and the reiserfs s-tree itself. The
> journal and bitmaps are only used when writing to the filesystem. The
> superblock and s-tree are used for any filesystem access. The
> relationship is that before a file data block or an s-tree node can be
> allocated on disk, the bitmaps must be checked to see where the block
> can be allocated.
>
>> ic. so other meta-data is checked as other file systems.

No. The bitmaps and journal are still part of the same filesystem. They
are just not part of the s-tree.

>>>assumed i have 2GB or 4GB ram, which is not unbelievable for a desktop
>>>now. but can these RAM be used by 32BIT arch?
> The RAM can be used, sure, but not for the bitmaps. I believe the buffer
> heads for the bitmaps need to come out of the memory < 1 GB. It would be
> possible to put the bitmaps in high memory (like any other data), but
> the patch to do so would likely be more involved than the dynamic bitmap
> patch, and still waste the memory anyway.
>
>> yes, i also suspect this 1GB limit. So 64bit is the way and AMD64 is
>> cheap anyway rite?

Personally, I think so.

> current disk head, that is an operation that is performed by the block
> layer. It can make the best decisions on that, since it its at the
> lowest level of abstraction. It's entirely possible that a filesystem be
> mounted via file-loopback on an NFS mount. In that case, the local
> system has no information at all about where the disk head would be.
>
>> yes, but then block layer will need another bitmap to track which block
>> is used or not and also do a mapping again...
>
>> the cost of layering?

The ideas of "in use" and "available" are purely filesystem abstractions
to keep track of where we already have filesystem data/metadata. The
block layer doesn't know or care about them - it's just a collection of
blocks that the user may do whatever they please with. Now, not to
confuse the issue, but the example of a loopback-mounted filesystem can
cause an allocation if the host file is sparse, but that's really a
corner case.

- -Jeff

- -- 
Jeff Mahoney
SuSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFDExsrLPWxlyuTD7IRAvGmAJ9QU16I2oz/kkCbqwdeGcIgkey8TgCgqS8s
lI6YzJEJ20j5LiheAqw6eoE=
=YD9V
-----END PGP SIGNATURE-----