All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dimitri Sivanich <sivanich@sgi.com>
To: Christoph Lameter <cl@gentwo.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Mel Gorman <mel@csn.ul.ie>
Subject: Re: [PATCH] Reduce vm_stat cacheline contention in __vm_enough_memory
Date: Thu, 13 Oct 2011 10:23:55 -0500	[thread overview]
Message-ID: <20111013152355.GB6966@sgi.com> (raw)
In-Reply-To: <alpine.DEB.2.00.1110121452220.31218@router.home>

On Wed, Oct 12, 2011 at 02:57:53PM -0500, Christoph Lameter wrote:
> On Wed, 12 Oct 2011, Andrew Morton wrote:
> 
> > > Note that this patch is simply to illustrate the gains that can be made here.
> > > What I'm looking for is some guidance on an acceptable way to accomplish the
> > > task of reducing contention in this area, either by caching these values in a
> > > way similar to the attached patch, or by some other mechanism if this is
> > > unacceptable.
> >
> > Yes, the global vm_stat[] array is a problem - I'm surprised it's hung
> > around for this long.  Altering the sysctl_overcommit_memory mode will
> > hide the problem, but that's no good.
> 
> The global vm_stat array is keeping the state for the zone. It would be
> even more expensive to calculate this at every point where we need such
> data.
> 
> > I think we've discussed switching vm_stat[] to a contention-avoiding
> > counter scheme.  Simply using <percpu_counter.h> would be the simplest
> > approach.  They'll introduce inaccuracies but hopefully any problems
> > from that will be minor for the global page counters.
> 
> We already have a contention avoiding scheme for counter updates in
> vmstat.c. The problem here is that vm_stat is frequently read. Updates
> from other cpus that fold counter updates in a deferred way into the
> global statistics cause cacheline eviction. The updates occur too frequent
> in this load.

The test I did slowed down the reads by __vm_enough_memory by caching the
values and updating them every two seconds (in the OVERCOMMIT_GUESS area).
> 
> > otoh, I think we've been round this loop before and I don't recall why
> > nothing happened.
> 
> The update behavior can be tuned using /proc/sys/vm/stat_interval.
> Increase the interval to reduce the folding into the global counter (set
> maybe to 10?). This will reduce contention. The other approach is to

Increasing this interval to 10 (or even 100) had no effect on the vm_stat
contention on a 640 cpu test system, so vmstat_update() is not the culprit.

> increase the allowed delta per zone if frequent updates occur via the
> overflow checks in vmstat.c. See calculate_*_threshold there.

I tried changing the threshold in both directions, with slower throughput in
both cases.

> 
> Note that the deltas are current reduced for memory pressure situations
> (after recent patches by Mel). This will cause a significant increase in
> vm_stat cacheline contention compared to earlier kernels.
> 

WARNING: multiple messages have this Message-ID (diff)
From: Dimitri Sivanich <sivanich@sgi.com>
To: Christoph Lameter <cl@gentwo.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Mel Gorman <mel@csn.ul.ie>
Subject: Re: [PATCH] Reduce vm_stat cacheline contention in __vm_enough_memory
Date: Thu, 13 Oct 2011 10:23:55 -0500	[thread overview]
Message-ID: <20111013152355.GB6966@sgi.com> (raw)
In-Reply-To: <alpine.DEB.2.00.1110121452220.31218@router.home>

On Wed, Oct 12, 2011 at 02:57:53PM -0500, Christoph Lameter wrote:
> On Wed, 12 Oct 2011, Andrew Morton wrote:
> 
> > > Note that this patch is simply to illustrate the gains that can be made here.
> > > What I'm looking for is some guidance on an acceptable way to accomplish the
> > > task of reducing contention in this area, either by caching these values in a
> > > way similar to the attached patch, or by some other mechanism if this is
> > > unacceptable.
> >
> > Yes, the global vm_stat[] array is a problem - I'm surprised it's hung
> > around for this long.  Altering the sysctl_overcommit_memory mode will
> > hide the problem, but that's no good.
> 
> The global vm_stat array is keeping the state for the zone. It would be
> even more expensive to calculate this at every point where we need such
> data.
> 
> > I think we've discussed switching vm_stat[] to a contention-avoiding
> > counter scheme.  Simply using <percpu_counter.h> would be the simplest
> > approach.  They'll introduce inaccuracies but hopefully any problems
> > from that will be minor for the global page counters.
> 
> We already have a contention avoiding scheme for counter updates in
> vmstat.c. The problem here is that vm_stat is frequently read. Updates
> from other cpus that fold counter updates in a deferred way into the
> global statistics cause cacheline eviction. The updates occur too frequent
> in this load.

The test I did slowed down the reads by __vm_enough_memory by caching the
values and updating them every two seconds (in the OVERCOMMIT_GUESS area).
> 
> > otoh, I think we've been round this loop before and I don't recall why
> > nothing happened.
> 
> The update behavior can be tuned using /proc/sys/vm/stat_interval.
> Increase the interval to reduce the folding into the global counter (set
> maybe to 10?). This will reduce contention. The other approach is to

Increasing this interval to 10 (or even 100) had no effect on the vm_stat
contention on a 640 cpu test system, so vmstat_update() is not the culprit.

> increase the allowed delta per zone if frequent updates occur via the
> overflow checks in vmstat.c. See calculate_*_threshold there.

I tried changing the threshold in both directions, with slower throughput in
both cases.

> 
> Note that the deltas are current reduced for memory pressure situations
> (after recent patches by Mel). This will cause a significant increase in
> vm_stat cacheline contention compared to earlier kernels.
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2011-10-13 15:23 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-12 16:02 [PATCH] Reduce vm_stat cacheline contention in __vm_enough_memory Dimitri Sivanich
2011-10-12 19:01 ` Andrew Morton
2011-10-12 19:01   ` Andrew Morton
2011-10-12 19:57   ` Christoph Lameter
2011-10-12 19:57     ` Christoph Lameter
2011-10-13 15:06     ` Mel Gorman
2011-10-13 15:06       ` Mel Gorman
2011-10-13 15:59       ` Andi Kleen
2011-10-13 15:59         ` Andi Kleen
2011-10-13 15:23     ` Dimitri Sivanich [this message]
2011-10-13 15:23       ` Dimitri Sivanich
2011-10-13 15:54       ` Christoph Lameter
2011-10-13 15:54         ` Christoph Lameter
2011-10-13 20:50         ` Andrew Morton
2011-10-13 20:50           ` Andrew Morton
2011-10-13 21:02           ` Christoph Lameter
2011-10-13 21:02             ` Christoph Lameter
2011-10-13 21:24             ` Andrew Morton
2011-10-13 21:24               ` Andrew Morton
2011-10-14 12:25               ` Dimitri Sivanich
2011-10-14 12:25                 ` Dimitri Sivanich
2011-10-14 13:50                 ` Dimitri Sivanich
2011-10-14 13:50                   ` Dimitri Sivanich
2011-10-14 13:57                   ` Christoph Lameter
2011-10-14 13:57                     ` Christoph Lameter
2011-10-14 14:19                     ` Dimitri Sivanich
2011-10-14 14:19                       ` Dimitri Sivanich
2011-10-14 14:34                       ` Christoph Lameter
2011-10-14 14:34                         ` Christoph Lameter
2011-10-14 15:18                         ` Christoph Lameter
2011-10-14 15:18                           ` Christoph Lameter
2011-10-14 16:16                           ` Dimitri Sivanich
2011-10-14 16:16                             ` Dimitri Sivanich
2011-10-18 13:48                             ` Dimitri Sivanich
2011-10-18 13:48                               ` Dimitri Sivanich
2011-10-18 14:36                               ` Christoph Lameter
2011-10-18 14:36                                 ` Christoph Lameter
2011-10-18 15:48                               ` Andi Kleen
2011-10-18 15:48                                 ` Andi Kleen
2011-10-19  1:16                                 ` David Rientjes
2011-10-19  1:16                                   ` David Rientjes
2011-10-19 14:54                                   ` Dimitri Sivanich
2011-10-19 14:54                                     ` Dimitri Sivanich
2011-10-19 15:31                                     ` Christoph Lameter
2011-10-19 15:31                                       ` Christoph Lameter
2011-10-24 14:59                                       ` Dimitri Sivanich
2011-10-24 14:59                                         ` Dimitri Sivanich
     [not found]   ` <CADE8fzrdMOBF1RyyEpMVi8aKcgOVKRQSKi0=c1Qvh3p6hHcXRA@mail.gmail.com>
2011-10-13  0:07     ` Tim Chen
2011-10-13  0:07       ` Tim Chen
2011-10-13 14:15       ` Christoph Lameter
2011-10-13 14:15         ` Christoph Lameter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111013152355.GB6966@sgi.com \
    --to=sivanich@sgi.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@gentwo.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.