* Re: tuning a fs for large number of small files
2002-11-06 15:25 ` tuning a fs for large number of small files Marko Asplund
@ 2002-11-06 17:16 ` Mike Benoit
2002-11-10 20:57 ` Hans Reiser
1 sibling, 0 replies; 4+ messages in thread
From: Mike Benoit @ 2002-11-06 17:16 UTC
To: Marko Asplund; +Cc: reiserfs-list
On Wed, 2002-11-06 at 07:25, Marko Asplund wrote:
> On Wed, 6 Nov 2002, Ragnar Kjørstad wrote:
>
> > ...
> > If you have the chance you can tar together all the files, and extract
> > them again. This will help in two ways:
> > - reduce fragmentation
> > - write files to the filesystem in reiserfs-optimal order.
> > If you create the files in increasing hash order, access will
> > typically be much faster. If you create a tar of the files from a
> > reiserfs-filesystem (with the same hash) you will accomplish just
> > that.
>
> i was thinking about using something like the following method for
> migrating the files to the new filesystem:
> tar clf - /mnt/oldext3fs | tar xf -
>
> this should reduce fragmentation but will it write the files in
> reiserfs-optimal order?
>
> > What hash-function to use doesn't really matter unless you have a lot of
> > files in a single directory. Do you have that? Is there any typical
> > "pattern" for the filenames, and the order they are created in?
>
> there shouldn't be particularly many files in a single directory. the
> filesystem is actually a hierarchy "database" for our backup system
> (Arkeia) and it replicates the filesystem trees of the machines being
> backed up. i can't come up with any typical file naming or creation order
> patterns. most of the directory hierarchies are quite static except for
> the home directories perhaps.
>
Speak of the devil (Arkeia). We have a setup similar to the one you
described: our Arkeia server handles about 1TB of data per full backup,
consisting of over 25 million files. We keep the Arkeia database on
reiserfs 3.x using these mount options:
defaults,noatime,notail,nodiratime
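
The option string above reads like the fourth field of an /etc/fstab entry. As a rough sketch only, the helper below prints such an entry so the placement of the options is visible. The function name fstab_entry, the device /dev/sdb1, and the mount point /var/lib/arkeia-db are hypothetical, and the trailing "0 0" dump/fsck fields are placeholders, not values taken from this thread.

```shell
#!/bin/sh
# fstab_entry DEVICE MOUNTPOINT
# Print an /etc/fstab-style line for a reiserfs filesystem using the
# mount options quoted in the message. Nothing is mounted or modified.
# The "0 0" dump/fsck fields are placeholders (an assumption).
fstab_entry() {
    printf '%s %s reiserfs defaults,noatime,notail,nodiratime 0 0\n' "$1" "$2"
}

# Example with a hypothetical device and mount point:
fstab_entry /dev/sdb1 /var/lib/arkeia-db
```

The printed line would have to be reviewed and added to /etc/fstab by hand; the sketch deliberately does not touch the system.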
Though reiserfs handles all the small files without a problem, we found
that hacking the kernel to disable fdatasync was the only way to get
decent speeds from Arkeia. Beware, though: Arkeia likes to crash often,
so this usually means more database restores (like the one I'm doing
right now), but our backup just wouldn't fit in the window otherwise.

Regardless, Arkeia just can't handle our load well at all: its database
is just way too inefficient and their support is terrible at best.
Honestly, I wouldn't recommend Arkeia to my worst enemy if they need to
back up more than a few gigs.
--
Best Regards,
Mike Benoit
NetNation Communication Inc.
Systems Engineer
Tel: 604-684-6892 or 888-983-6600
---------------------------------------
Disclaimer: Opinions expressed here are my own and not
necessarily those of my employer
* Re: tuning a fs for large number of small files
2002-11-06 15:25 ` tuning a fs for large number of small files Marko Asplund
2002-11-06 17:16 ` Mike Benoit
@ 2002-11-10 20:57 ` Hans Reiser
1 sibling, 0 replies; 4+ messages in thread
From: Hans Reiser @ 2002-11-10 20:57 UTC
To: Mike Benoit; +Cc: Marko Asplund, reiserfs-list
Mike Benoit wrote:
>On Wed, 2002-11-06 at 07:25, Marko Asplund wrote:
>
>
>>On Wed, 6 Nov 2002, Ragnar Kjørstad wrote:
>>
>>
>>
>>>...
>>>If you have the chance you can tar together all the files, and extract
>>>them again. This will help in two ways:
>>>- reduce fragmentation
>>>- write files to the filesystem in reiserfs-optimal order.
>>> If you create the files in increasing hash order, access will
>>> typically be much faster. If you create a tar of the files from a
>>> reiserfs-filesystem (with the same hash) you will accomplish just
>>> that.
>>>
>>>
>>i was thinking about using something like the following method for
>>migrating the files to the new filesystem:
>>tar clf - /mnt/oldext3fs | tar xf -
>>
>>this should reduce fragmentation but will it write the files in
>>reiserfs-optimal order?
>>
>>
>>
>>>What hash-function to use doesn't really matter unless you have a lot of
>>>files in a single directory. Do you have that? Is there any typical
>>>"pattern" for the filenames, and the order they are created in?
>>>
>>>
>>there shouldn't be particularly many files in a single directory. the
>>filesystem is actually a hierarchy "database" for our backup system
>>(Arkeia) and it replicates the filesystem trees of the machines being
>>backed up. i can't come up with any typical file naming or creation order
>>patterns. most of the directory hierarchies are quite static except for
>>the home directories perhaps.
>>
>>
>>
>
>Speak of the devil (Arkeia). We have a setup similar to the one you
>described: our Arkeia server handles about 1TB of data per full backup,
>consisting of over 25 million files. We keep the Arkeia database on
>reiserfs 3.x using these mount options:
>
>defaults,noatime,notail,nodiratime
>
>Though reiserfs handles all the small files without a problem, we found
>that hacking the kernel to disable fdatasync was the only way to get
>decent speeds from Arkeia. Beware, though: Arkeia likes to crash often,
>so this usually means more database restores (like the one I'm doing
>right now), but our backup just wouldn't fit in the window otherwise.
>
>Regardless, Arkeia just can't handle our load well at all: its database
>is just way too inefficient and their support is terrible at best.
>Honestly, I wouldn't recommend Arkeia to my worst enemy if they need to
>back up more than a few gigs.
>
>
>
Copy your data twice: once to reiserfs, and then again to reiserfs.
The second copy reads the files back in reiser3 readdir() sorted order,
so objectids are assigned in that order (ext3 does not sort, so a single
copy straight from ext3 will not do this).
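
One way the two-pass copy could look, as a minimal sketch built on the same tar pipe idea discussed earlier in the thread. The function names copy_tree and two_pass_copy and all paths are hypothetical. The functions themselves run on any filesystem; the ordering effect described above would only apply when the staging and final directories are on reiserfs.

```shell
#!/bin/sh
# copy_tree SRC DST
# Stream the contents of SRC into DST through a tar pipe, restoring
# permissions on extraction (-p). DST is created if it does not exist.
copy_tree() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    (cd "$src" && tar cf - .) | (cd "$dst" && tar xpf -)
}

# two_pass_copy SRC STAGING FINAL
# Pass 1: SRC (e.g. the old ext3 tree) -> STAGING (on reiserfs).
# Pass 2: STAGING -> FINAL (on reiserfs), so the second tar walks the
# staging tree in whatever order readdir() on that filesystem returns.
two_pass_copy() {
    copy_tree "$1" "$2" && copy_tree "$2" "$3"
}
```

Usage would be along the lines of "two_pass_copy /mnt/oldext3fs /mnt/reiser/stage /mnt/reiser/final" (hypothetical paths). The staging copy can be deleted once the final copy has been verified; until then both copies need space.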