From: Nishanth Aravamudan <nacc@us.ibm.com>
To: Christoph Lameter <clameter@sgi.com>
Cc: melgor@ie.ibm.com, apw@shadowen.org, agl@us.ibm.com,
wli@holomorphy.com, linux-mm@kvack.org
Subject: Re: [RFC][PATCH 2/2] Explicitly retry hugepage allocations
Date: Fri, 8 Feb 2008 09:11:32 -0800
Message-ID: <20080208171132.GE15903@us.ibm.com>
In-Reply-To: <Pine.LNX.4.64.0802061529480.22648@schroedinger.engr.sgi.com>
On 06.02.2008 [15:30:53 -0800], Christoph Lameter wrote:
> On Wed, 6 Feb 2008, Nishanth Aravamudan wrote:
>
> > Add __GFP_REPEAT to hugepage allocations. Do so to not necessitate
> > userspace putting pressure on the VM by repeated echo's into
> > /proc/sys/vm/nr_hugepages to grow the pool. With the previous patch
> > to allow for large-order __GFP_REPEAT attempts to loop for a bit (as
> > opposed to indefinitely), this increases the likelihood of getting
> > hugepages when the system experiences (or recently experienced)
> > load.
> >
> > On a 2-way x86_64, this doubles the number of hugepages (from 10 to
> > 20) obtained while compiling a kernel at the same time. On a 4-way
> > ppc64, a similar scale increase is seen (from 3 to 5 hugepages).
> > Finally, on a 2-way x86, this leads to a 5-fold increase in the
> > hugepages allocatable under load (90 to 554).
>
> Hmmm... How about defaulting to __GFP_REPEAT by default for larger
> page allocations? There are other users of larger allocs that would
> also benefit from the same measure. I think it would be fine as long
> as we are sure to fail at some point.

In thinking about this more, one of the harder parts for me to get my
head around was the implicit promotion of small-order allocations to
__GFP_REPEAT (and thus to __GFP_NOFAIL). I would prefer that large-order
allocations stay explicit about when they want the VM to try harder to
succeed. As far as I understand it, hugepages are the only in-kernel
user that would really make use of this currently? I also feel that,
even if __GFP_REPEAT becomes the default behavior, it is better to use
it as documentation of the caller's intent -- and perhaps as a way to
point us at sites that are over-stressing the VM unnecessarily by
regularly forcing reclaim?
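
To make the distinction above concrete, here is a rough userspace sketch
of the retry decision being discussed. It is not the kernel's code: the
SK_* flag values and the sk_should_retry() helper are hypothetical names
made up for illustration, and the bounded rule for costly orders (keep
retrying until at least 1 << order pages have been reclaimed) is an
assumption about what "loop for a bit" means in patch 1/2.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration only; not the kernel's values. */
#define SK_GFP_REPEAT   0x1u
#define SK_GFP_NOFAIL   0x2u
#define SK_GFP_NORETRY  0x4u

/* Orders above this count as "costly"; 3 mirrors PAGE_ALLOC_COSTLY_ORDER. */
#define SK_COSTLY_ORDER 3u

/*
 * Decide whether a failed allocation attempt should be retried.
 * reclaimed_pages is the number of pages reclaimed so far on behalf
 * of this allocation (an assumed input for this sketch).
 */
static bool sk_should_retry(unsigned int order, unsigned int flags,
                            unsigned long reclaimed_pages)
{
    if (flags & SK_GFP_NORETRY)
        return false;           /* caller asked us to give up early */
    if (flags & SK_GFP_NOFAIL)
        return true;            /* caller cannot handle failure */
    if (order <= SK_COSTLY_ORDER)
        return true;            /* implicit promotion: small orders keep looping */
    if (flags & SK_GFP_REPEAT)  /* explicit opt-in for large orders, bounded */
        return reclaimed_pages < (1UL << order);
    return false;               /* large order without opt-in: fail */
}
```

With this shape, a large-order caller such as the hugepage pool has to
pass the repeat flag to get any retries at all, which is the "explicit
intent" being argued for, while small orders retry regardless of what
the caller passes.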

I am also not 100% sure how I would test the result of such a change,
since there are not that many large-order allocations in the kernel...
Did you have any thoughts on that?

Thanks,
Nish
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>