Re: fsck performance. - Rogier Wolff

From: Rogier Wolff <R.E.Wolff@BitWizard.nl>
To: Andreas Dilger <adilger@dilger.ca>
Cc: Paweł Brodacki <pawel.brodacki@googlemail.com>,
	Amir Goldstein <amir73il@gmail.com>, Ted Ts'o <tytso@mit.edu>,
	Rogier Wolff <R.E.Wolff@bitwizard.nl>,
	linux-ext4@vger.kernel.org
Subject: Re: fsck performance.
Date: Tue, 22 Feb 2011 11:20:56 +0100
Message-ID: <20110222102056.GH21917@bitwizard.nl>
In-Reply-To: <C5662C0F-10E3-4AC3-8ACA-4CBBF4F36D70@dilger.ca>

On Mon, Feb 21, 2011 at 11:00:02AM -0700, Andreas Dilger wrote:
> On 2011-02-21, at 9:04 AM, Paweł Brodacki wrote:
> > 2011/2/21 Amir Goldstein <amir73il@gmail.com>:
> >> One thing I am not sure I understand is (excuse my ignorance) why is the
> >> swap space solution good only for 64bit processors?
> > 
> > It's an address space limit on 32 bit processors. Even with PAE the
> > user space process still won't have access to more than 2^32 bytes,
> > that is 4 GiB of memory. Due to practical limitations (e.g. kernel
> > needing some address space) usually a process won't have access to
> > more than 3 GiB.
> 
> Roger,

> are you using the icount allocation reduction patches previously
> posted?  They won't help if you need more than 3GB of address space,
> but they definitely reduce the size of allocations and allow the
> icount data to be swapped.  See the thread "[PATCH]: icount: Replace
> the icount list by a two-level tree".

No, I don't think I'm using those patches (unless they are in the git head).

I wouldn't be surprised if I needed more than 3 GB of memory. When I
extrapolated "more than a few days", fsck was less than 20% of the way
through the filesystem and had already allocated on the order of
800 MB of memory. Now I'm not entirely sure that this extrapolation is
fair: memory use seems to go up quickly in the beginning and then
stabilize, as if it has decided that 800 MB of memory use is
"acceptable" and somehow uses a different strategy once it hits that
limit.

On the other hand, things go reasonably fast until it starts hitting
the CPU bottleneck, so I might have seen the memory usage flatten out
simply because it wasn't making any significant progress anymore.

Anyway, I've increased the hash size to 1M, up from 131. The TDB guys
suggested 10k: their TDB code is being used for MUCH larger cases than
they expected....
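
A rough sketch of why the bucket count matters so much, assuming the
store resolves collisions by chaining and the hash spreads keys evenly:
the average chain a lookup has to walk is simply records divided by
buckets. The record count below is a made-up illustration, not a figure
from this filesystem.

```python
def average_chain_length(records: int, buckets: int) -> float:
    """Expected entries per bucket, assuming a uniformly distributed hash."""
    return records / buckets


# Hypothetical number of stored records (assumption, not measured here).
RECORDS = 50_000_000

# The three hash sizes mentioned above: the old 131, the suggested 10k, and 1M.
for buckets in (131, 10_000, 1_000_000):
    chain = average_chain_length(RECORDS, buckets)
    print(f"{buckets:>9} buckets -> about {chain:,.0f} entries per chain")
```

With a fixed, small bucket count the chains grow linearly with the
number of records, so total insert-and-lookup work grows roughly
quadratically, which would be consistent with the run becoming CPU-bound.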

Success! It's been running for 2 hours, and it's already past the
half-way point of pass 1 (i.e. 2.5 times further than it previously
got after 24 hours). It currently has 1400 MB of memory allocated. I
hope the 3 GB limit doesn't hit me before it finishes....

> >> If it is common knowledge, do you know of an upper limit (depending on fs size,
> >> no. of inodes, etc)?
> >> 
> > 
> > I vaguely remember some estimation of memory requirements of fsck
> > being given somewhere, but I'm not able to find the posts now :(.

> My rule of thumb is about 1 byte of memory per block in the
> filesystem, for "normal" filesystems (i.e. mostly regular files, and
> a small fraction of directories).  For a 3TB filesystem this would
> mean ~768MB of RAM.  One problem is that the current icount
> implementation allocates almost 2x the peak usage when it is
> resizing the array, hence the patch mentioned above for filesystems
> with lots of directories and hard links.
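
The rule of thumb quoted above can be checked with a few lines of
arithmetic. This is only a sketch: the 4 KiB block size is an assumption
(it is what makes 3 TB come out at ~768 MB), and the function name is
made up for illustration.

```python
TIB = 1024 ** 4
MIB = 1024 ** 2


def fsck_memory_estimate(fs_bytes: int, block_size: int = 4096,
                         bytes_per_block: float = 1.0) -> float:
    """Estimated fsck memory in bytes: about one byte per filesystem block.

    block_size defaults to 4 KiB, which is an assumption, not a value
    stated in the thread.
    """
    blocks = fs_bytes // block_size
    return blocks * bytes_per_block


estimate = fsck_memory_estimate(3 * TIB)
print(f"3 TiB filesystem: about {estimate / MIB:.0f} MiB")
```

Note that this covers "normal" filesystems only; a filesystem with many
directories and hard links, like the one described below, can be
expected to need more.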

My filesystem is a bit weird: I make an "rsync" copy of all my data
onto it. Then I run cp -lr to hard-link the current copy into a copy
with the date in its name. Next I run a program that makes a second
copy of a file should its number of links exceed 8000....

In short, I have a HUMONGOUS number of directory entries, and lots and
lots of files. And that is on top of millions of inodes. (Some of the
filesystems backed up contain that many inodes.)
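
To make the effect of the cp -lr step concrete, here is a minimal
sketch of a hard-link snapshot. It is not the actual backup script; the
function name is hypothetical, and it ignores symlinks, permissions and
the 8000-link handling. It shows that every snapshot adds one directory
entry per file while reusing the existing inode.

```python
import os
from pathlib import Path


def hardlink_snapshot(src: Path, dst: Path) -> int:
    """Mirror the tree under src into dst, hard-linking every file.

    Returns the number of new directory entries created for files.
    dst must live on the same filesystem as src and outside of src.
    """
    created = 0
    for root, _dirs, files in os.walk(src):
        target = dst / Path(root).relative_to(src)
        target.mkdir(parents=True, exist_ok=True)  # directories are real copies
        for name in files:
            os.link(Path(root) / name, target / name)  # new name, same inode
            created += 1
    return created
```

After N dated snapshots each unchanged file has N+1 links, so the
number of directory entries grows with every run even though the number
of file inodes does not.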

	Roger. 

-- 
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**    Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233    **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ