* tuning a fs for large number of small files
@ 2002-11-06 12:45 Marko Asplund
  0 siblings, 0 replies; 4+ messages in thread
From: Marko Asplund @ 2002-11-06 12:45 UTC
  To: reiserfs-list


i'm testing the performance of ReiserFS with a filesystem which has become
painfully slow on ext3. the filesystem has a large number of small
files (~ 2 million) and many directories (~ 1 million). is there
something i can do to tune the ReiserFS filesystem in this case, e.g. by
selecting a specific hash function to use?

best regards,
-- 
	aspa




* Re: tuning a fs for large number of small files
       [not found] <20021106160733.T23227@vestdata.no>
@ 2002-11-06 15:25 ` Marko Asplund
  2002-11-06 17:16   ` Mike Benoit
  2002-11-10 20:57   ` Hans Reiser
  0 siblings, 2 replies; 4+ messages in thread
From: Marko Asplund @ 2002-11-06 15:25 UTC
  To: reiserfs-list; +Cc: Ragnar Kjørstad

On Wed, 6 Nov 2002, Ragnar Kjørstad wrote:

> ...
> If you have the chance you can tar together all the files, and extract
> them again. This will help in two ways:
> - reduce fragmentation
> - write files to the filesystem in reiserfs-optimal order.
>   If you create the files in increasing hash order, access will
>   typically be much faster. If you create a tar of the files from a
>   reiserfs-filesystem (with the same hash) you will accomplish just
>   that.

i was thinking about using something like the following method for 
migrating the files to the new filesystem:
tar clf - /mnt/oldext3fs | tar xf -

this should reduce fragmentation but will it write the files in
reiserfs-optimal order?
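
A minimal sketch of such a copy with an explicit target directory. The name `copy_tree` and the target mount point `/mnt/newreiserfs` are hypothetical and do not appear in the thread. The `l` flag from the command above is left out because its meaning appears to differ between GNU tar versions (worth checking against the local tar manual). The sketch only shows the copy itself: files are created in whatever order tar reads them from the source, so it says nothing about whether that order is reiserfs-optimal.

```shell
# Copy a directory tree with tar, preserving permissions.
# Usage: copy_tree SRC DST   (both paths are placeholders)
copy_tree() {
    mkdir -p "$2" || return 1
    # -C switches directory before archiving/extracting, so paths are
    # stored relative to SRC and recreated under DST.
    tar -C "$1" -cf - . | tar -C "$2" -xpf -
}

# Example with a hypothetical target mount point:
# copy_tree /mnt/oldext3fs /mnt/newreiserfs
```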

> What hash-function to use doesn't really matter unless you have a lot of
> files in a single directory. Do you have that? Is there any typical
> "pattern" for the filenames, and the order they are created in?

there shouldn't be particularly many files in a single directory. the
filesystem is actually a hierarchy "database" for our backup system
(Arkeia) and it replicates the filesystem trees of the machines being
backed up. i can't come up with any typical file naming or creation order
patterns. most of the directory hierarchies are quite static except for
the home directories perhaps.
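
one way to check the "not many files in a single directory" assumption rather than guess is to count the direct entries of every directory and list the largest ones. a rough sketch follows; the function name `largest_dirs` is made up for illustration, and it assumes a `find` that supports `-mindepth` (GNU and BSD find do) and file names without embedded newlines.

```shell
# Print the N directories with the most direct entries under ROOT.
# Usage: largest_dirs ROOT [N]   (N defaults to 10)
largest_dirs() {
    # List every entry below ROOT, strip the last path component to get
    # its parent directory, then count how often each parent occurs.
    find "$1" -mindepth 1 \
        | sed 's|/[^/]*$||' \
        | sort \
        | uniq -c \
        | sort -rn \
        | head -n "${2:-10}"
}

# Example: largest_dirs /mnt/oldext3fs 20
```

each output line is an entry count followed by a directory path, largest first.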

best regards,
-- 
	aspa



* Re: tuning a fs for large number of small files
  2002-11-06 15:25 ` tuning a fs for large number of small files Marko Asplund
@ 2002-11-06 17:16   ` Mike Benoit
  2002-11-10 20:57   ` Hans Reiser
  1 sibling, 0 replies; 4+ messages in thread
From: Mike Benoit @ 2002-11-06 17:16 UTC
  To: Marko Asplund; +Cc: reiserfs-list

On Wed, 2002-11-06 at 07:25, Marko Asplund wrote:
> On Wed, 6 Nov 2002, Ragnar Kjørstad wrote:
> 
> > ...
> > If you have the chance you can tar together all the files, and extract
> > them again. This will help in two ways:
> > - reduce fragmentation
> > - write files to the filesystem in reiserfs-optimal order.
> >   If you create the files in increasing hash order, access will
> >   typically be much faster. If you create a tar of the files from a
> >   reiserfs-filesystem (with the same hash) you will accomplish just
> >   that.
> 
> i was thinking about using something like the following method for 
> migrating the files to the new filesystem:
> tar clf - /mnt/oldext3fs | tar xf -
> 
> this should reduce fragmentation but will it write the files in
> reiserfs-optimal order?
> 
> > What hash-function to use doesn't really matter unless you have a lot of
> > files in a single directory. Do you have that? Is there any typical
> > "pattern" for the filenames, and the order they are created in?
> 
> there shouldn't be particularly many files in a single directory. the
> filesystem is actually a hierarchy "database" for our backup system
> (Arkeia) and it replicates the filesystem trees of the machines being
> backed up. i can't come up with any typical file naming or creation order
> patterns. most of the directory hierarchies are quite static except for
> the home directories perhaps.
> 

Speak of the devil. (Arkeia) We have a similar setup to what you
described, our Arkeia server handles about 1TB of data per full backup,
consisting of over 25 million files. We keep the Arkeia database on
reiserfs 3.x using these mount options: 

defaults,noatime,notail,nodiratime

Though reiserfs handles all the small files without a problem, we found
that hacking the kernel to disable fdatasync was the only way to get
decent speeds from Arkeia. Beware though, Arkeia does like to crash
often, so this usually means more database restores (like I'm doing
right now), but our backup just wouldn't fit in the window otherwise. 

Regardless, Arkeia just can't handle our load well at all; its database
is just way too inefficient and their support is terrible at best.
Honestly I wouldn't recommend Arkeia to my worst enemy if they need to
back up more than a few gigs.

-- 
Best Regards,
 
Mike Benoit
NetNation Communication Inc.
Systems Engineer
Tel: 604-684-6892 or 888-983-6600
 ---------------------------------------
 
 Disclaimer: Opinions expressed here are my own and not 
 necessarily those of my employer



* Re: tuning a fs for large number of small files
  2002-11-06 15:25 ` tuning a fs for large number of small files Marko Asplund
  2002-11-06 17:16   ` Mike Benoit
@ 2002-11-10 20:57   ` Hans Reiser
  1 sibling, 0 replies; 4+ messages in thread
From: Hans Reiser @ 2002-11-10 20:57 UTC
  To: Mike Benoit; +Cc: Marko Asplund, reiserfs-list

Mike Benoit wrote:

> On Wed, 2002-11-06 at 07:25, Marko Asplund wrote:
> > On Wed, 6 Nov 2002, Ragnar Kjørstad wrote:
> >
> > > ...
> > > If you have the chance you can tar together all the files, and extract
> > > them again. This will help in two ways:
> > > - reduce fragmentation
> > > - write files to the filesystem in reiserfs-optimal order.
> > >   If you create the files in increasing hash order, access will
> > >   typically be much faster. If you create a tar of the files from a
> > >   reiserfs-filesystem (with the same hash) you will accomplish just
> > >   that.
> >
> > i was thinking about using something like the following method for
> > migrating the files to the new filesystem:
> > tar clf - /mnt/oldext3fs | tar xf -
> >
> > this should reduce fragmentation but will it write the files in
> > reiserfs-optimal order?
> >
> > > What hash-function to use doesn't really matter unless you have a lot of
> > > files in a single directory. Do you have that? Is there any typical
> > > "pattern" for the filenames, and the order they are created in?
> >
> > there shouldn't be particularly many files in a single directory. the
> > filesystem is actually a hierarchy "database" for our backup system
> > (Arkeia) and it replicates the filesystem trees of the machines being
> > backed up. i can't come up with any typical file naming or creation order
> > patterns. most of the directory hierarchies are quite static except for
> > the home directories perhaps.
>
> Speak of the devil. (Arkeia) We have a similar setup to what you
> described, our Arkeia server handles about 1TB of data per full backup,
> consisting of over 25 million files. We keep the Arkeia database on
> reiserfs 3.x using these mount options:
>
> defaults,noatime,notail,nodiratime
>
> Though reiserfs handles all the small files without a problem, we found
> that hacking the kernel to disable fdatasync was the only way to get
> decent speeds from Arkeia. Beware though, Arkeia does like to crash
> often, so this usually means more database restores (like I'm doing
> right now), but our backup just wouldn't fit in the window otherwise.
>
> Regardless, Arkeia just can't handle our load well at all; its database
> is just way too inefficient and their support is terrible at best.
> Honestly I wouldn't recommend Arkeia to my worst enemy if they need to
> back up more than a few gigs.

Copy your data twice: once from ext3 to reiserfs, and then again from
that reiserfs copy to reiserfs. This will cause objectids to be assigned
in reiser3 readdir() sorted order (ext3 does not sort).
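
A rough sketch of that two-pass copy, assuming the staging and final directories both live on ReiserFS filesystems with enough free space. The function name `two_pass_copy` and all three paths are placeholders, not anything named in this thread. The script only performs the copies; the ordering effect described above comes from the second pass reading the staging copy in the order its readdir() returns entries, which the script itself does not verify.

```shell
# Copy SRC to STAGE, then STAGE to DST.
# Usage: two_pass_copy SRC STAGE DST   (all three paths are placeholders)
two_pass_copy() {
    mkdir -p "$2" "$3" || return 1
    # Pass 1: old filesystem -> staging area on reiserfs.
    tar -C "$1" -cf - . | tar -C "$2" -xpf - || return 1
    # Pass 2: staging area -> final reiserfs location.
    tar -C "$2" -cf - . | tar -C "$3" -xpf -
}

# Example with hypothetical mount points:
# two_pass_copy /mnt/oldext3fs /mnt/reiserfs-stage /mnt/reiserfs-final
```

The staging copy can be deleted once the second pass has been checked.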



