* Re: [PATCH V3] Fix Committed_AS underflow
  From: KOSAKI Motohiro @ 2009-04-20 9:18 UTC
  To: Eric B Munson
  Cc: kosaki.motohiro, dave, linux-mm, linux-kernel, mel, cl

> This patch makes a small change to an earlier patch by Kosaki Motohiro.
> The threshold calculation was changed to avoid the overhead of calculating
> the number of online cpus each time the threshold is needed.

IIRC, Mel has a patch that kills the num_online_cpus() calculation cost,
but I can accept your patch because Mel's patch has not been merged into
mainline yet.

Thanks :)
* Re: [PATCH V3] Fix Committed_AS underflow
  From: Dave Hansen @ 2009-04-20 16:15 UTC
  To: Eric B Munson
  Cc: kosaki.motohiro, linux-mm, linux-kernel, mel, cl

On Mon, 2009-04-20 at 10:09 +0100, Eric B Munson wrote:
> 1. Change NR_CPUS to min(64, NR_CPUS)
> This will limit the amount of possible skew on kernels compiled for very
> large SMP machines. 64 is an arbitrary number selected to limit the worst
> of the skew without using more cache lines. min(64, NR_CPUS) is used
> instead of nr_online_cpus() because nr_online_cpus() requires a shared
> cache line and a call to hweight to make the calculation. Its runtime
> overhead and keeping this counter accurate showed up in profiles and it's
> possible that nr_online_cpus() would also show.

-- Dave
* Re: [PATCH V3] Fix Committed_AS underflow
  From: Dave Hansen @ 2009-04-20 19:49 UTC
  To: Eric B Munson
  Cc: kosaki.motohiro, linux-mm, linux-kernel, mel, cl

On Mon, 2009-04-20 at 09:15 -0700, Dave Hansen wrote:
> On Mon, 2009-04-20 at 10:09 +0100, Eric B Munson wrote:
> > 1. Change NR_CPUS to min(64, NR_CPUS)
> > This will limit the amount of possible skew on kernels compiled for very
> > large SMP machines. 64 is an arbitrary number selected to limit the worst
> > of the skew without using more cache lines. min(64, NR_CPUS) is used
> > instead of nr_online_cpus() because nr_online_cpus() requires a shared
> > cache line and a call to hweight to make the calculation. Its runtime
> > overhead and keeping this counter accurate showed up in profiles and it's
> > possible that nr_online_cpus() would also show.

Wow, that empty reply was really informative, wasn't it? :)

My worry with this min(64, NR_CPUS) approach is that you effectively
ensure that you're going to be doing a lot more cacheline bouncing, but
it isn't quite as explicit.

Now, every time a mapping (or set of them) is created or destroyed that
nets more than 64 pages, you've got to go get a r/w cacheline to a
possibly highly contended atomic. With a number this low, you're almost
guaranteed to hit it at fork() and exec(). Could you double-check that
this doesn't hurt any of the fork() AIM tests?

Another thought is that, instead of trying to fix this up in meminfo, we
could do this in a way that is guaranteed to never skew the global
counter negative: we always keep the *percpu* skew negative.
This should be the same as what's in the kernel now:

	void vm_acct_memory(long pages)
	{
		long *local;
		long local_min = -ACCT_THRESHOLD;
		long local_max = ACCT_THRESHOLD;
		long local_goal = 0;

		preempt_disable();
		local = &__get_cpu_var(committed_space);
		*local += pages;
		if (*local > local_max || *local < local_min) {
			atomic_long_add(*local - local_goal, &vm_committed_space);
			*local = local_goal;
		}
		preempt_enable();
	}

But now consider if we changed the local_* variables a bit:

	long local_min = -(ACCT_THRESHOLD*2);
	long local_max = 0;
	long local_goal = -ACCT_THRESHOLD;

We'll get some possibly *large* numbers in meminfo, but it will at least
never underflow.

-- Dave
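[Editor's note: the underflow this thread is chasing is easy to reproduce by modeling the batching scheme above in user space. This is an illustrative sketch only — NCPUS, the function names, and the event sequence are invented; just the flush rule mirrors the kernel snippet quoted above.]

```c
/*
 * User-space model of the per-CPU batching in vm_acct_memory().
 * Illustrative only: NCPUS, the names, and the event sequence are
 * invented; the flush rule matches the snippet in Dave's mail.
 */
#include <assert.h>

#define NCPUS 4
#define ACCT_THRESHOLD 64

static long vm_committed;     /* models the global atomic counter */
static long local[NCPUS];     /* models the per-CPU counters */

static void acct(int cpu, long pages)
{
    local[cpu] += pages;
    if (local[cpu] > ACCT_THRESHOLD || local[cpu] < -ACCT_THRESHOLD) {
        vm_committed += local[cpu];   /* flush with local_goal == 0 */
        local[cpu] = 0;
    }
}

/*
 * Two tasks each charge ACCT_THRESHOLD pages on different CPUs, so
 * neither charge crosses the flush threshold and both stay local.
 * Then both tasks exit on a third CPU: the two frees batch up in one
 * per-CPU counter, flush together, and drive the global counter
 * negative even though total committed memory is back to zero.
 */
long demo_underflow(void)
{
    acct(0,  ACCT_THRESHOLD);   /* task A charges; stays local on cpu0 */
    acct(1,  ACCT_THRESHOLD);   /* task B charges; stays local on cpu1 */
    acct(2, -ACCT_THRESHOLD);   /* task A exits on cpu2: local[2] = -64 */
    acct(2, -ACCT_THRESHOLD);   /* task B exits on cpu2: flushes -128 */
    return vm_committed;        /* -128: the Committed_AS underflow */
}
```

Since /proc/meminfo prints the counter as an unsigned quantity, the -128 here is what shows up as the absurdly huge Committed_AS value the patch title refers to.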
* Re: [PATCH V3] Fix Committed_AS underflow
  From: KOSAKI Motohiro @ 2009-04-21 1:41 UTC
  To: Dave Hansen
  Cc: kosaki.motohiro, Eric B Munson, linux-mm, linux-kernel, mel, cl

> void vm_acct_memory(long pages)
> {
> 	long *local;
> 	long local_min = -ACCT_THRESHOLD;
> 	long local_max = ACCT_THRESHOLD;
> 	long local_goal = 0;
>
> 	preempt_disable();
> 	local = &__get_cpu_var(committed_space);
> 	*local += pages;
> 	if (*local > local_max || *local < local_min) {
> 		atomic_long_add(*local - local_goal, &vm_committed_space);
> 		*local = local_goal;
> 	}
> 	preempt_enable();
> }
>
> But now consider if we changed the local_* variables a bit:
>
> 	long local_min = -(ACCT_THRESHOLD*2);
> 	long local_max = 0;
> 	long local_goal = -ACCT_THRESHOLD;
>
> We'll get some possibly *large* numbers in meminfo, but it will at least
> never underflow.

If *local == -(ACCT_THRESHOLD*2), then

	*local - local_goal = -(ACCT_THRESHOLD*2) + ACCT_THRESHOLD
	                    = -ACCT_THRESHOLD

so we still pass a negative value to atomic_long_add(). IOW,
vm_committed_space can still become negative. Am I missing anything?
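[Editor's note: KOSAKI's arithmetic checks out in the same kind of user-space model (again an illustrative sketch with invented names). With the shifted local_* variables, the flush fires once a per-CPU counter drops below -(ACCT_THRESHOLD*2), and the flushed delta *local - local_goal is then about -ACCT_THRESHOLD — still negative, so the global counter can still be driven below zero:]

```c
/*
 * Model of Dave's proposed variant with goal/min/max shifted down by
 * ACCT_THRESHOLD.  Names and the scenario are invented for illustration.
 */
#include <assert.h>

#define ACCT_THRESHOLD 64

static long vm_committed;

static void acct_biased(long *local, long pages)
{
    long local_min  = -(ACCT_THRESHOLD * 2);
    long local_max  = 0;
    long local_goal = -ACCT_THRESHOLD;

    *local += pages;
    if (*local > local_max || *local < local_min) {
        vm_committed += *local - local_goal;   /* can still be negative */
        *local = local_goal;
    }
}

/*
 * One CPU, counter parked at the goal, sees a batched free just large
 * enough to cross local_min.  The flushed delta is negative, exactly
 * as KOSAKI's arithmetic predicts.
 */
long demo_still_negative(void)
{
    long cpu0 = -ACCT_THRESHOLD;   /* per-CPU counter sitting at the goal */

    vm_committed = 0;
    acct_biased(&cpu0, -(ACCT_THRESHOLD + 1));  /* a batched free */
    return vm_committed;   /* -(ACCT_THRESHOLD + 1): still negative */
}
```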
* Re: [PATCH V3] Fix Committed_AS underflow
  From: Dave Hansen @ 2009-04-23 20:38 UTC
  To: Eric B Munson
  Cc: kosaki.motohiro, linux-mm, linux-kernel, mel, cl

On Thu, 2009-04-23 at 17:31 +0100, Eric B Munson wrote:
> On Mon, 20 Apr 2009, Dave Hansen wrote:
> > On Mon, 2009-04-20 at 09:15 -0700, Dave Hansen wrote:
> > > On Mon, 2009-04-20 at 10:09 +0100, Eric B Munson wrote:
> > > > 1. Change NR_CPUS to min(64, NR_CPUS)
> > > > This will limit the amount of possible skew on kernels compiled for very
> > > > large SMP machines. 64 is an arbitrary number selected to limit the worst
> > > > of the skew without using more cache lines. min(64, NR_CPUS) is used
> > > > instead of nr_online_cpus() because nr_online_cpus() requires a shared
> > > > cache line and a call to hweight to make the calculation. Its runtime
> > > > overhead and keeping this counter accurate showed up in profiles and it's
> > > > possible that nr_online_cpus() would also show.
> >
> > Wow, that empty reply was really informative, wasn't it? :)
> >
> > My worry with this min(64, NR_CPUS) approach is that you effectively
> > ensure that you're going to be doing a lot more cacheline bouncing, but
> > it isn't quite as explicit.
>
> Unfortunately this is a choice we have to make: do we want to avoid
> cacheline bouncing on fork-heavy workloads using more than 64 pages, or
> bad information being used for overcommit decisions?

On SLES11, a new bash shell has ~9MB of mapped space. A cat(1) has
~6.7MB. ACCT_THRESHOLD at 64*2 pages with a 64k page size is 8MB. So
you're basically *guaranteed* to go off-cpu every time a shell command
is run, and I'd say 8MB is a bit on the low side.

Note that even with your suggested patch, we can still have 8GB of skew
on a 1024-way machine, since there is still num_online_cpus()*8MB of
per-cpu slack.
This is better than what we have now, certainly.

> > Now, every time there's a mapping (or set of them) created or destroyed
> > that nets greater than 64 pages, you've got to go get a r/w cacheline to
> > a possibly highly contended atomic. With a number this low, you're
> > almost guaranteed to hit it at fork() and exec(). Could you
> > double-check that this doesn't hurt any of the fork() AIM tests?
>
> It is unlikely that the aim9 benchmarks would show whether this patch is
> a problem, because it forks in a tight loop in a process that is not
> necessarily big enough to hit ACCT_THRESHOLD, likely on a single CPU.
> In order to show any problems here we need a fork-heavy workload with
> many threads on many CPUs.

That's true. I'd suspect that aim is somewhere around the size of
'cat'. It's a best-case scenario. But if you see degradation on the
best-case workload, we'll certainly see issues on worse workloads.

Also, I used fork as an obvious example here. This issue will present
itself any time there is a mapping or set of mappings with a net change
in size of 8MB.

There's another alternative. We could temporarily add statistics to
count how many times we go over ACCT_THRESHOLD, to see just how bad the
situation is. For instance, 64 causes 342.43 times as many touches of
the global counter as 8192 (or whatever it is). You could also make
ACCT_THRESHOLD a tunable, at least for the duration of this testing.

There are also ways you can bias the global counter in controlled ways.
You could ensure, for instance, that a local counter never gets a
positive bias, causing the global one to be biased negatively. That
would guarantee no underflows, although it would still be possible to
see vm_committed_space in a temporarily *inflated* state.

-- Dave
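[Editor's note: Dave's idea of counting how often the global counter is touched can be prototyped in user space before instrumenting the kernel. The sketch below is illustrative (the names and workload are invented): for a stream of single-page charges, a batch threshold of T flushes once every T+1 events, so the touch ratio between thresholds 64 and 8192 works out to 8193/65 ≈ 126x — the same order of magnitude as the figure Dave is guessing at above.]

```c
/*
 * Count flushes of a (modeled) global counter for a given batch
 * threshold, over a stream of single-page charge events.  The flush
 * rule mirrors vm_acct_memory(); everything else is illustrative.
 */
#include <assert.h>

static long count_flushes(long threshold, long nr_events)
{
    long local = 0, flushes = 0;

    for (long i = 0; i < nr_events; i++) {
        local += 1;                 /* one page charged per event */
        if (local > threshold) {    /* would touch the global atomic */
            flushes++;
            local = 0;
        }
    }
    return flushes;
}
```

Over 65*8193 single-page events, count_flushes(64, ...) flushes 8193 times while count_flushes(8192, ...) flushes only 65 times, which quantifies the cacheline-bouncing cost of the lower threshold; real workloads batch larger deltas per call, so the in-kernel statistics would still be worth collecting.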