linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@suse.de>
To: Christoph Lameter <cl@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Minchan Kim <minchan.kim@gmail.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] mm: page allocator: Initialise ZLC for first zone eligible for zone_reclaim
Date: Mon, 18 Jul 2011 22:13:25 +0100	[thread overview]
Message-ID: <20110718211325.GC5349@suse.de> (raw)
In-Reply-To: <alpine.DEB.2.00.1107181208050.31576@router.home>

On Mon, Jul 18, 2011 at 12:20:11PM -0500, Christoph Lameter wrote:
> 
> 
> On Mon, 18 Jul 2011, Mel Gorman wrote:
> 
> > On Mon, Jul 18, 2011 at 09:56:31AM -0500, Christoph Lameter wrote:
> > > On Fri, 15 Jul 2011, Mel Gorman wrote:
> > >
> > > > Currently the zonelist cache is setup only after the first zone has
> > > > been considered and zone_reclaim() has been called. The objective was
> > > > to avoid a costly setup but zone_reclaim is itself quite expensive. If
> > > > it is failing regularly such as the first eligible zone having mostly
> > > > mapped pages, the cost in scanning and allocation stalls is far higher
> > > > than the ZLC initialisation step.
> > >
> > > Would it not be easier to set zlc_active and allowednodes based on the
> > > zone having an active ZLC at the start of get_pages()?
> > >
> >
> > What do you mean by a zones active ZLC? zonelists are on a per-node,
> > not a per-zone basis (see node_zonelist) so a zone doesn't have an
> > active ZLC as such. If the zlc_active is set at the beginning of
> 
> Look at get_page_from_freelist(): It sets
> zlc_active = 0 even through the zonelist under consideration may have a
> ZLC. zlc_active = 0 can also mean that the function has not bothered to
> look for the zlc information of the current zonelist.
> 

Yes. So? It's only necessary if the watermarks are not met.

> > get_page_from_freelist(), it implies that we are calling zlc_setup()
> > even when the watermarks are met which is unnecessary.
> 
> Ok then that decision to not call zlc_setup() for performance reasons is
> what created the problem that you are trying to solve. In case that the
> first zones watermarks are okay we can avoid calling zlc_setup().
> 

The original implementation did not check the ZLC in the first loop
at all. It wasn't just about avoiding the cost of setup. I suspect
this problem has been there a long time and it's taking this long
for bug reports to show up because NUMA machines are being used for
generic numa-unaware workloads.

> What we do now have is checking for zlc_active in the loop just so that
> the first time around we do not call zlc_setup().
> 

Yes, why incur the cost for the common case?

> We may be able to simplify the function by:
> 
> 1.  Checking for the special case that the first zone is ok and that we do
> not want to call zlc_setup before we get to the loop.
> 
> 2. Do the zlc_setup() before the loop.
> 
> 3. Remove the zlc_setup() code as you did from the loop as well as the
> checks for zlc_active. zlc_active becomes not necessary since a zlc
> is always available when we go through the loop.
> 

That initial test will involve duplication of things like the cpuset and
no watermarks check just to place the zlc_setup() in a different place.
I might be missing your point but it seems like the gain would be
marginal. Fancy posting a patch?

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2011-07-18 21:13 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-15 15:08 [PATCH 0/2] Reduce frequency of stalls due to zone_reclaim() on NUMA v2r1 Mel Gorman
2011-07-15 15:08 ` [PATCH 1/2] mm: page allocator: Initialise ZLC for first zone eligible for zone_reclaim Mel Gorman
2011-07-18 14:56   ` Christoph Lameter
2011-07-18 16:05     ` Mel Gorman
2011-07-18 17:20       ` Christoph Lameter
2011-07-18 21:13         ` Mel Gorman [this message]
2011-07-18 21:54           ` Christoph Lameter
2011-07-19 14:01             ` Christoph Lameter
2011-07-20 18:08               ` Christoph Lameter
2011-07-20 19:18                 ` Mel Gorman
2011-07-20 19:28                   ` Christoph Lameter
2011-07-20 19:52                     ` Christoph Lameter
2011-07-20 21:17                       ` Christoph Lameter
2011-07-20 22:48                         ` Mel Gorman
2011-07-21 15:24                           ` Christoph Lameter
2011-07-15 15:09 ` [PATCH 2/2] mm: page allocator: Reconsider zones for allocation after direct reclaim Mel Gorman
2011-07-19 11:46   ` [PATCH] mm: page allocator: Reconsider zones for allocation after direct reclaim fix Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110718211325.GC5349@suse.de \
    --to=mgorman@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan.kim@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).