From mboxrd@z Thu Jan 1 00:00:00 1970
From: Hans Reiser
Subject: Re: Calling stat with millions of files
Date: Tue, 08 Jun 2004 09:28:29 -0700
Message-ID: <40C5E92D.2090101@namesys.com>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Errors-To: flx@namesys.com
Content-Type: text/plain; charset="us-ascii"; format="flowed"
To: Ross Skaliotis
Cc: reiserfs-list@namesys.com

Ross Skaliotis wrote:

>We have a backup system (backupPC) that is responsible for backing up
>millions of files and holding over 2 TB of data. The backup system saves
>space by creating hard links where it can. This actually reduces the total
>size down to 200 GB, however there are still millions of files/hard links.
>
>What does this have to do with reiserfs? Well, the partition runs reiserfs
>3.6. Over the course of a few months (and the addition of millions more
>files/hard links) the performance of the filesystem has become painfully
>slow.
>
>In a directory of ~1000 files, running 'ls -l' or 'stat *' takes 45
>seconds to complete (assuming nothing is previously cached). We are using
>the r5 hash, and mount the filesystem with noatime. No one directory has
>(or will ever have) more than about 2000 files and hard links inside.
>
>Is this performance decrease when calling stat to be expected with
>increasing millions of files/hard links? Would XFS or another filesystem
>get around this somehow? I'd rather not use reiserfs4, as this is a
>production backup server and needs to be as stable as possible.
>
>Thanks very much for your help,
>
>-Ross Skaliotis
>

Hard links destroy locality of reference for stat data; this is probably
your problem. The readahead for the directory is probably broken by hard
links as well. It could likely be optimized, but nobody else is complaining
about it, so I can't really afford to do it just for you, sorry.
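
To make the locality point concrete, here is a minimal Python sketch of a general userspace workaround: read the directory first, then call stat on the entries in ascending inode order rather than in directory order, so that lookups for nearby inodes tend to be issued together. This is not something proposed in the thread, and whether it helps on reiserfs 3.6 with this many hard links is untested. The function names `stat_in_inode_order` and `count_hard_linked` are hypothetical, chosen only for illustration.

```python
import os
import stat


def stat_in_inode_order(directory):
    """Stat every entry in `directory`, visiting entries in ascending inode order.

    Assumption: on-disk stat data for nearby inode numbers tends to be stored
    close together, so sorting may reduce seeks. This is a heuristic, not a
    guarantee on any particular filesystem.
    """
    with os.scandir(directory) as it:
        # On Unix, DirEntry.inode() comes from the directory read itself
        # and does not require a separate stat call.
        entries = [(entry.inode(), entry.path) for entry in it]
    entries.sort()
    # lstat so that symbolic links are reported, not followed.
    return {path: os.lstat(path) for _, path in entries}


def count_hard_linked(stats):
    """Count non-directory entries whose inode has more than one name."""
    return sum(
        1
        for st in stats.values()
        if st.st_nlink > 1 and not stat.S_ISDIR(st.st_mode)
    )
```

Calling `count_hard_linked(stat_in_inode_order(path))` on one of the slow directories would also show how many of its entries are hard links, which is the case described above as breaking locality.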