From: Mel Gorman <mgorman@suse.de>
To: Christoph Lameter <cl@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Minchan Kim <minchan.kim@gmail.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] mm: page allocator: Initialise ZLC for first zone eligible for zone_reclaim
Date: Mon, 18 Jul 2011 22:13:25 +0100 [thread overview]
Message-ID: <20110718211325.GC5349@suse.de> (raw)
In-Reply-To: <alpine.DEB.2.00.1107181208050.31576@router.home>
On Mon, Jul 18, 2011 at 12:20:11PM -0500, Christoph Lameter wrote:
>
>
> On Mon, 18 Jul 2011, Mel Gorman wrote:
>
> > On Mon, Jul 18, 2011 at 09:56:31AM -0500, Christoph Lameter wrote:
> > > On Fri, 15 Jul 2011, Mel Gorman wrote:
> > >
> > > > Currently the zonelist cache is setup only after the first zone has
> > > > been considered and zone_reclaim() has been called. The objective was
> > > > to avoid a costly setup but zone_reclaim is itself quite expensive. If
> > > > it is failing regularly such as the first eligible zone having mostly
> > > > mapped pages, the cost in scanning and allocation stalls is far higher
> > > > than the ZLC initialisation step.
> > >
> > > Would it not be easier to set zlc_active and allowednodes based on the
> > > zone having an active ZLC at the start of get_pages()?
> > >
> >
> > What do you mean by a zones active ZLC? zonelists are on a per-node,
> > not a per-zone basis (see node_zonelist) so a zone doesn't have an
> > active ZLC as such. If the zlc_active is set at the beginning of
>
> Look at get_page_from_freelist(): It sets
> zlc_active = 0 even through the zonelist under consideration may have a
> ZLC. zlc_active = 0 can also mean that the function has not bothered to
> look for the zlc information of the current zonelist.
>
Yes. So? It's only necessary if the watermarks are not met.
> > get_page_from_freelist(), it implies that we are calling zlc_setup()
> > even when the watermarks are met which is unnecessary.
>
> Ok then that decision to not call zlc_setup() for performance reasons is
> what created the problem that you are trying to solve. In case that the
> first zones watermarks are okay we can avoid calling zlc_setup().
>
The original implementation did not check the ZLC in the first loop
at all. It wasn't just about avoiding the cost of setup. I suspect
this problem has been there a long time and it's taking this long
for bug reports to show up because NUMA machines are being used for
generic numa-unaware workloads.
> What we do now have is checking for zlc_active in the loop just so that
> the first time around we do not call zlc_setup().
>
Yes, why incur the cost for the common case?
> We may be able to simplify the function by:
>
> 1. Checking for the special case that the first zone is ok and that we do
> not want to call zlc_setup before we get to the loop.
>
> 2. Do the zlc_setup() before the loop.
>
> 3. Remove the zlc_setup() code as you did from the loop as well as the
> checks for zlc_active. zlc_active becomes not necessary since a zlc
> is always available when we go through the loop.
>
That initial test will involve duplication of things like the cpuset and
no watermarks check just to place the zlc_setup() in a different place.
I might be missing your point but it seems like the gain would be
marginal. Fancy posting a patch?
--
Mel Gorman
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2011-07-18 21:13 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-07-15 15:08 [PATCH 0/2] Reduce frequency of stalls due to zone_reclaim() on NUMA v2r1 Mel Gorman
2011-07-15 15:08 ` [PATCH 1/2] mm: page allocator: Initialise ZLC for first zone eligible for zone_reclaim Mel Gorman
2011-07-18 14:56 ` Christoph Lameter
2011-07-18 16:05 ` Mel Gorman
2011-07-18 17:20 ` Christoph Lameter
2011-07-18 21:13 ` Mel Gorman [this message]
2011-07-18 21:54 ` Christoph Lameter
2011-07-19 14:01 ` Christoph Lameter
2011-07-20 18:08 ` Christoph Lameter
2011-07-20 19:18 ` Mel Gorman
2011-07-20 19:28 ` Christoph Lameter
2011-07-20 19:52 ` Christoph Lameter
2011-07-20 21:17 ` Christoph Lameter
2011-07-20 22:48 ` Mel Gorman
2011-07-21 15:24 ` Christoph Lameter
2011-07-15 15:09 ` [PATCH 2/2] mm: page allocator: Reconsider zones for allocation after direct reclaim Mel Gorman
2011-07-19 11:46 ` [PATCH] mm: page allocator: Reconsider zones for allocation after direct reclaim fix Mel Gorman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110718211325.GC5349@suse.de \
--to=mgorman@suse.de \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=kosaki.motohiro@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=minchan.kim@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).