From: Mel Gorman <mel@csn.ul.ie>
To: Christoph Lameter <cl@linux-foundation.org>
Cc: Lin Ming <ming.m.lin@intel.com>,
Pekka Enberg <penberg@cs.helsinki.fi>,
Linux Memory Management List <linux-mm@kvack.org>,
Rik van Riel <riel@redhat.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Nick Piggin <npiggin@suse.de>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Zhang Yanmin <yanmin_zhang@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@elte.hu>
Subject: Re: [RFC PATCH 00/19] Cleanup and optimise the page allocator V2
Date: Tue, 3 Mar 2009 21:48:42 +0000
Message-ID: <20090303214842.GL10577@csn.ul.ie>
In-Reply-To: <alpine.DEB.1.10.0903031130550.26454@qirst.com>
On Tue, Mar 03, 2009 at 11:31:46AM -0500, Christoph Lameter wrote:
> On Mon, 2 Mar 2009, Mel Gorman wrote:
>
> > Going by the vanilla kernel, a *large* amount of time is spent doing
> > high-order allocations. Over 25% of the cost of buffered_rmqueue() is in
> > the branch dealing with high-order allocations. Does UDP-U-4K mean that 8K
> > pages are required for the packets? That means high-order allocations and
> > high contention on the zone-list. That is bad obviously and has implications
> > for the SLUB-passthru patch because whether 8K allocations are handled by
> > SL*B or the page allocator has a big impact on locking.
> >
> > Next, a little over 50% of the cost of get_page_from_freelist() is being spent
> > acquiring the zone spinlock. The implication is that the SL*B allocators
> > passing in order-1 allocations to the page allocator are currently going to
> > hit scalability problems in a big way. The solution may be to extend the
> > per-cpu allocator to handle magazines up to PAGE_ALLOC_COSTLY_ORDER. I'll
> > check it out.
>
> Then we are increasing the number of queues dramatically in the page
> allocator. More of a memory sink. Less cache hotness.
>
It doesn't have to be more queues. Going by some quick instrumentation,
networking is doing order-1 allocations, so we might be justified in doing
this to avoid contending excessively on the zone lock.

Without the patchset, we search the pcp lists for a page of the
appropriate migrate type. There is a patch that removes this search at
the cost of one cache line per CPU and it works reasonably well.

However, if the search was left in place, you could add pages of other
orders to the lists and just search for those, which should be a lot less
costly. Yes, the search is unfortunate but you avoid acquiring the zone lock
without increasing the size of the per-cpu structure. The search will touch
cache lines, but it's probably cheaper than acquiring the zone lock and going
through the whole buddy allocator for order-1 pages.
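
To make the idea concrete, here is a rough user-space sketch of a single
mixed per-cpu list that is searched by both migrate type and order. It is
only a toy model under my own assumptions, not the kernel code: toy_page,
toy_pcp, toy_pcp_free() and toy_pcp_alloc() are made-up names, and locking,
interrupt handling and list batching are all left out.

```c
#include <stddef.h>

/* Hypothetical toy model; none of these names are kernel APIs. */
enum { MT_UNMOVABLE, MT_RECLAIMABLE, MT_MOVABLE };

struct toy_page {
	int migratetype;        /* migrate type the page was freed as */
	unsigned int order;     /* 0 = one page, 1 = two contiguous pages */
	struct toy_page *next;  /* next entry on the per-cpu list */
};

struct toy_pcp {
	struct toy_page *head;  /* one mixed list, so the structure does not grow */
	int count;              /* number of entries currently on the list */
};

/* Push a freed page of any order onto the per-cpu list. */
static void toy_pcp_free(struct toy_pcp *pcp, struct toy_page *page)
{
	page->next = pcp->head;
	pcp->head = page;
	pcp->count++;
}

/*
 * Walk the list for the first entry matching both migrate type and order
 * and unlink it. A NULL return is a miss: the caller would then have to
 * take the zone lock and go to the buddy allocator.
 */
static struct toy_page *toy_pcp_alloc(struct toy_pcp *pcp, int migratetype,
				      unsigned int order)
{
	struct toy_page **link;

	for (link = &pcp->head; *link; link = &(*link)->next) {
		struct toy_page *page = *link;

		if (page->migratetype == migratetype && page->order == order) {
			*link = page->next;
			page->next = NULL;
			pcp->count--;
			return page;
		}
	}
	return NULL;
}
```

The walk is linear in the length of the list, which is the "unfortunate"
search cost, but a hit never touches the zone lock and the per-cpu structure
stays the same size.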