Re: [pagevec] resize pagevec to O(lg(NR_CPUS))

public inbox for linux-kernel@vger.kernel.org

From: Nick Piggin <nickpiggin@yahoo.com.au>
To: William Lee Irwin III <wli@holomorphy.com>
Cc: Andrew Morton <akpm@osdl.org>,
	marcelo.tosatti@cyclades.com, linux-kernel@vger.kernel.org
Subject: Re: [pagevec] resize pagevec to O(lg(NR_CPUS))
Date: Fri, 10 Sep 2004 14:56:11 +1000
Message-ID: <414133EB.8020802@yahoo.com.au>
In-Reply-To: <20040910000717.GR3106@holomorphy.com>

William Lee Irwin III wrote:
> William Lee Irwin III <wli@holomorphy.com> wrote:
> 
>>>Reducing arrival rates by an Omega(NR_CPUS) factor would probably help,
>>>though that may blow the stack on e.g. larger Altixen. Perhaps
>>>O(lg(NR_CPUS)), e.g. NR_CPUS > 1 ? 4*lg(NR_CPUS) : 4 etc., will suffice,
>>>though we may have debates about how to evaluate lg(n) at compile-time...
>>>Would be nice if calls to sufficiently simple __attribute__((pure))
>>>functions with constant args were considered constant expressions by gcc.
> 
> 
> On Thu, Sep 09, 2004 at 04:22:45PM -0700, Andrew Morton wrote:
> 
>>Yes, that sort of thing.
>>It wouldn't be surprising if increasing the pagevec up to 64 slots on big
>>ia64 SMP provided a useful increase in some fs-intensive workloads.
>>One needs to watch stack consumption though.
> 
> 
> Okay, Marcelo, looks like we need to do cache alignment work with a
> variable-size pagevec.
> 
> In order to attempt to compensate for arrival rates to zone->lru_lock
> increasing as O(num_cpus_online()), this patch resizes the pagevec to
> O(lg(NR_CPUS)) for lock amortization that adjusts better to the size of
> the system. Compiletested on ia64.
> 

Yuck. I don't like things like this to depend on NR_CPUS, because your
kernel may behave quite differently depending on the value. But in this
case I guess "quite differently" is probably "a little bit differently",
and practical reality may dictate that you need to do something like
this at compile time, and NR_CPUS is your best approximation.
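
As a rough illustration of the compile-time sizing being discussed, the
quoted formula (NR_CPUS > 1 ? 4*lg(NR_CPUS) : 4) can be written with
nothing but conditional expressions, so it stays an integer constant
expression. This is only a sketch and is not taken from the patch:
LG2_CONST, NR_CPUS_SKETCH, PAGEVEC_SKETCH_SIZE and struct pagevec_sketch
are hypothetical names, and void * stands in for struct page *.

```c
#include <assert.h>

/*
 * Hypothetical helper: floor(lg(n)) as an integer constant expression,
 * valid for 1 <= n < 65536. Each level peels off half of the remaining bits.
 */
#define LG2_2(n)     ((n) >= 2 ? 1 : 0)
#define LG2_4(n)     ((n) >= 4 ? 2 + LG2_2((n) >> 2) : LG2_2(n))
#define LG2_16(n)    ((n) >= 16 ? 4 + LG2_4((n) >> 4) : LG2_4(n))
#define LG2_CONST(n) ((n) >= 256 ? 8 + LG2_16((n) >> 8) : LG2_16(n))

/* Assumed configuration value, standing in for the kernel's NR_CPUS. */
#define NR_CPUS_SKETCH 512

/* The sizing rule from the quoted text: 4*lg(NR_CPUS), or 4 on UP. */
#define PAGEVEC_SKETCH_SIZE \
	(NR_CPUS_SKETCH > 1 ? 4 * LG2_CONST(NR_CPUS_SKETCH) : 4)

/* Because the size is a constant expression, it can dimension an array. */
struct pagevec_sketch {
	unsigned long nr;
	void *pages[PAGEVEC_SKETCH_SIZE];
};

/* Checked at compile time: 512 CPUs gives lg = 9, so 36 slots. */
static_assert(LG2_CONST(512) == 9, "lg(512) should be 9");
static_assert(PAGEVEC_SKETCH_SIZE == 36, "4 * lg(512) should be 36");
```

The point of the sketch is only that the array size, and so the stack
footprint of every on-stack pagevec, changes with the configured CPU
count.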

That said, I *don't* think this should go in hastily.

The first reason is that the lru lock is per zone, so the premise that
zone->lru_lock acquisitions increase as O(cpus) is wrong for anything
large enough to care about (i.e. it will be NUMA). It is possible that a
512 CPU Altix will see less lru_lock contention than an 8-way Intel box.

The second is that you might really start putting pressure on small L1
caches (e.g. Itanium 2) if you bite off too much in one go. If you blow
the cache, you'll have to pull all the pages into cache again as you
process the pagevec.
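
To put a rough number on the cache pressure argument, here is a
back-of-the-envelope sketch. It assumes each page in the pagevec costs a
fixed number of cache lines (for its struct page) plus the lines holding
the pointer array itself. The function name and every parameter value
passed to it are illustrative assumptions, not measurements.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical estimate of the bytes pulled into L1 while walking a
 * pagevec of nr_pages entries: the pointer array, rounded up to whole
 * cache lines, plus lines_per_page cache lines for every page touched.
 */
static size_t pagevec_footprint_bytes(size_t nr_pages, size_t line_bytes,
				      size_t lines_per_page)
{
	size_t pointer_bytes, pointer_lines;

	assert(line_bytes > 0);
	pointer_bytes = nr_pages * sizeof(void *);
	pointer_lines = (pointer_bytes + line_bytes - 1) / line_bytes;
	return (pointer_lines + nr_pages * lines_per_page) * line_bytes;
}

/* Share of an L1 cache of l1_bytes that the walk would occupy, in percent. */
static size_t pagevec_l1_percent(size_t nr_pages, size_t line_bytes,
				 size_t lines_per_page, size_t l1_bytes)
{
	assert(l1_bytes > 0);
	return 100 * pagevec_footprint_bytes(nr_pages, line_bytes,
					     lines_per_page) / l1_bytes;
}
```

For example, pagevec_l1_percent(64, 64, 1, 16384) estimates the share of
an assumed 16 KB L1 with 64-byte lines taken by a 64-entry pagevec when
only one line per page is touched. Real processing touches more than
that, so the true footprint is larger.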

I don't think the smallish constant loop overhead (mainly pulling a lock
and a couple of hot cachelines off another CPU) would shrink much from
increasing the pagevec size a lot, either. The overhead should already
be at least an order of magnitude smaller than the actual work cost.

Lock contention isn't a good argument either, because it shouldn't
significantly change as you trade off hold time vs frequency if we
assume that the lock transfer and other overheads aren't significant
(which should be a safe assumption at PAGEVEC_SIZE of >= 16, I think).
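
The hold-time vs frequency trade-off can be sketched with a simple cost
model. Assume each lock acquisition pays a fixed overhead (lock transfer
plus a few cache lines) and each page costs a fixed amount of work under
the lock. The function names and cost units below are hypothetical and
chosen only for illustration.

```c
#include <assert.h>

/* Number of lock acquisitions needed to process 'pages' in batches. */
static unsigned long lock_acquisitions(unsigned long pages,
				       unsigned long batch)
{
	assert(batch > 0);
	return (pages + batch - 1) / batch;
}

/*
 * Share of each lock hold spent on fixed overhead rather than per-page
 * work, in parts per thousand, for a full batch of 'batch' pages.
 */
static unsigned long lock_overhead_permille(unsigned long batch,
					    unsigned long overhead_cost,
					    unsigned long per_page_cost)
{
	unsigned long total = overhead_cost + batch * per_page_cost;

	assert(total > 0);
	return 1000UL * overhead_cost / total;
}
```

Under the deliberately simple assumption that the fixed overhead equals
one page's worth of work, lock_overhead_permille(4, 1, 1) is 200,
lock_overhead_permille(16, 1, 1) is 58 and lock_overhead_permille(64, 1, 1)
is 15. In this model, going from 16 to 64 slots cuts the number of
acquisitions by four but recovers only about 4% of the total lock-held
time.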

Probably a PAGEVEC_SIZE of 4 on UP would be an interesting test, because
your loop overheads get a bit smaller.
