From: Hans Reiser <reiser@namesys.com>
To: Mike Benoit <mikeb@netnation.com>
Cc: Marko Asplund <aspa@kronodoc.fi>, reiserfs-list@namesys.com
Subject: Re: tuning a fs for large number of small files
Date: Sun, 10 Nov 2002 12:57:52 -0800
Message-ID: <3DCEC850.5080906@namesys.com>
In-Reply-To: <Pine.LNX.4.44.0211061711290.2873-100000@gamay.kronodoc.fi>

Mike Benoit wrote:

>On Wed, 2002-11-06 at 07:25, Marko Asplund wrote:
>  
>
>>On Wed, 6 Nov 2002, Ragnar Kjørstad wrote:
>>
>>    
>>
>>>...
>>>If you have the chance you can tar together all the files, and extract
>>>them again. This will help in two ways:
>>>- reduce fragmentation
>>>- write files to the filesystem in reiserfs-optimal order.
>>>  If you create the files in increasing hash order, access will
>>>  typically be much faster. If you create a tar of the files from a
>>>  reiserfs-filesystem (with the same hash) you will accomplish just
>>>  that.
>>>      
>>>
>>i was thinking about using something like the following method for 
>>migrating the files to the new filesystem:
>>tar clf - /mnt/oldext3fs | tar xf -
>>
>>this should reduce fragmentation but will it write the files in
>>reiserfs-optimal order?
>>
>>    
>>
>>>Which hash function you use doesn't really matter unless you have a lot of
>>>files in a single directory. Do you have that? Is there any typical
>>>"pattern" for the filenames, and the order they are created in?
>>>      
>>>
>>there shouldn't be particularly many files in a single directory. the
>>filesystem is actually a hierarchy "database" for our backup system
>>(Arkeia) and it replicates the filesystem trees of the machines being
>>backed up. i can't come up with a typical file naming or creation order
>>patterns. most of the directory hierarchies are quite static except for
>>the home directories perhaps.
>>
>>    
>>
>
>Speak of the devil. (Arkeia) We have a similar setup to what you
>described, our Arkeia server handles about 1TB of data per full backup,
>consisting of over 25 million files. We keep the Arkeia database on
>reiserfs 3.x using these mount options: 
>
>defaults,noatime,notail,nodiratime
>
>Though reiserfs handles all the small files without a problem, we found
>that hacking the kernel to disable fdatasync was the only way to get
>decent speeds from Arkeia. Beware though, Arkeia does like to crash
>often, so this usually means more database restores (like I'm doing
>right now), but our backup just wouldn't fit in the window otherwise. 
>
>Regardless, Arkeia just can't handle our load well at all; its database
>is just way too inefficient and their support is terrible at best.
>Honestly, I wouldn't recommend Arkeia to my worst enemy if they need to
>back up more than a few gigs.
>
>  
>
Copy your data twice: once to reiserfs, and then again to reiserfs.
The second copy reads the files back in reiser3 readdir() order, so
objectids are assigned in reiser3 readdir() sorted order (ext3 does
not sort).
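
A minimal sketch of the two-pass copy, using the same tar pipeline style as
the command quoted above. The function name and all three paths are
hypothetical placeholders, not anything from the original setup. It assumes
tar archives directory entries in the order readdir() returns them (the
default for GNU tar when no sorting option is given), and that the staging
and final directories live on reiserfs filesystems mounted with the same
hash.

```shell
#!/bin/sh
# Sketch only: copy a tree twice so that the second copy is created in the
# order readdir() returns on the intermediate reiserfs filesystem.
# src, stage and final are hypothetical paths supplied by the caller.
two_pass_copy() {
    src=$1      # original tree, e.g. on the old ext3 filesystem
    stage=$2    # directory on a reiserfs filesystem (first copy)
    final=$3    # directory on the destination reiserfs filesystem

    [ -d "$src" ] || return 1
    mkdir -p "$stage" "$final" || return 1

    # Pass 1: tar reads $src in whatever order the source filesystem's
    # readdir() returns (unsorted on ext3) and creates the files on $stage.
    (cd "$src" && tar cf - .) | (cd "$stage" && tar xf -) || return 1

    # Pass 2: tar now reads $stage in reiserfs readdir() order, so the
    # files are created on $final in that order.
    (cd "$stage" && tar cf - .) | (cd "$final" && tar xf -) || return 1
}
```

For example, `two_pass_copy /mnt/oldext3fs /mnt/stage /mnt/new` would do both
passes, where /mnt/stage and /mnt/new are assumed mount points; the staging
copy can be deleted afterwards.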

