From: Andrew Morton <akpm@linux-foundation.org>
To: Christoph Lameter <cl@linux.com>
Cc: Mel Gorman <mel@csn.ul.ie>, Shaohua Li <shaohua.li@intel.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	David Rientjes <rientjes@google.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH 1/2] mm: page allocator: Adjust the per-cpu counter threshold when memory is low
Date: Fri, 29 Oct 2010 11:25:41 -0700
Message-ID: <20101029112541.8ab906bb.akpm@linux-foundation.org>
In-Reply-To: <alpine.DEB.2.00.1010290955510.20370@router.home>

On Fri, 29 Oct 2010 09:58:25 -0500 (CDT)
Christoph Lameter <cl@linux.com> wrote:

> On Thu, 28 Oct 2010, Andrew Morton wrote:
> 
> > > To ensure that kswapd wakes up, a safe version of zone_watermark_ok()
> > > is introduced that takes a more accurate reading of NR_FREE_PAGES when
> > > called from wakeup_kswapd, when deciding whether it is really safe to go
> > > back to sleep in sleeping_prematurely() and when deciding if a zone is
> > > really balanced or not in balance_pgdat(). We are still using an expensive
> > > function but limiting how often it is called.
> >
> > Here I go again.  I have a feeling that I already said this, but I
> > can't find versions 2 or 3 in the archives..
> >
> > Did you evaluate using plain old percpu_counters for this?  They won't
> > solve the performance problem as they're basically the same thing as
> > these open-coded counters.  But they'd reduce the amount of noise and
> > custom-coded boilerplate in mm/.
> 
> The zone counters are done using the ZVCs in vmstat.c to save space

well, they actually waste space because of that threshold thing.

> and to
> be in the same cacheline as other hot data necessary for allocation and
> free.

Yes, that'll save some misses.

> >
> > > +	threshold = max(1, (int)(watermark_distance / num_online_cpus()));
> > > +
> > > +	/*
> > > +	 * Maximum threshold is 125
> >
> > Reasoning?
> 
> Differentials are stored in 8 bit signed ints.
> 
> > > +	put_online_cpus();
> > > +}
> >
> > Given that ->stat_threshold is the same for each CPU, why store it for
> > each CPU at all?  Why not put it in the zone and eliminate the inner
> > loop?
> 
> Doing that caused cache misses in the past and reduced the performance of
> the ZVCs. This way the threshold is in the same cacheline as the
> differentials.

This sounds wrong.  As long as that threshold isn't stored in a
cacheline which other CPUs are modifying, all CPUs should be able to
happily cache it.  Maybe it needed a bit of padding inside the zone
struct.
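
For readers following the thread, here is a minimal user-space sketch of
the counter scheme under discussion. It is an illustration only, not the
code in mm/vmstat.c: the names zone_sketch, pcp_sketch, calc_threshold,
set_threshold, mod_counter and snapshot are hypothetical, the CPU count is
fixed at 4, and there is no locking or real per-CPU storage. It shows the
per-CPU differential held in a signed 8-bit value next to its threshold,
the fold into the shared counter once the threshold is exceeded, and the
clamp to 125 that keeps the differential within int8 range.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4   /* hypothetical fixed CPU count */
#define MAX_THRESHOLD  125 /* cap quoted from the patch comment */

/* Per-CPU part: threshold kept beside the differential, as in the patch. */
struct pcp_sketch {
	int8_t stat_threshold;
	int8_t diff;           /* pending delta not yet folded */
};

struct zone_sketch {
	long global_count;     /* stands in for the shared NR_FREE_PAGES value */
	struct pcp_sketch pcp[NR_CPUS_SKETCH];
};

/* Mirrors max(1, distance / cpus), then applies the cap of 125. */
static int calc_threshold(long watermark_distance, int online_cpus)
{
	long t = watermark_distance / online_cpus;

	if (t < 1)
		t = 1;
	if (t > MAX_THRESHOLD)
		t = MAX_THRESHOLD;
	return (int)t;
}

/* Store the same threshold for every CPU (the loop Andrew asks about). */
static void set_threshold(struct zone_sketch *z, int threshold)
{
	assert(threshold >= 1 && threshold <= MAX_THRESHOLD);
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		z->pcp[cpu].stat_threshold = (int8_t)threshold;
}

/* Accumulate locally; fold into the shared counter past the threshold. */
static void mod_counter(struct zone_sketch *z, int cpu, int delta)
{
	struct pcp_sketch *p = &z->pcp[cpu];
	int x = p->diff + delta;

	if (x > p->stat_threshold || x < -p->stat_threshold) {
		z->global_count += x;
		x = 0;
	}
	p->diff = (int8_t)x;   /* |x| <= 125 here, so it fits in int8_t */
}

/* The "more accurate reading": shared value plus all pending deltas. */
static long snapshot(const struct zone_sketch *z)
{
	long sum = z->global_count;

	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += z->pcp[cpu].diff;
	return sum;
}
```

In this sketch the shared counter can lag the true value by up to
threshold * NR_CPUS_SKETCH, which is why a lower threshold near the
watermark narrows the error, and why the summing read is the expensive
path that the patch tries to call rarely.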
