Subject: Re: [PATCH V3] Fix Committed_AS underflow
From: Dave Hansen
To: Eric B Munson
Cc: kosaki.motohiro@jp.fujitsu.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, mel@linux.vnet.ibm.com,
	cl@linux-foundation.org
Date: Mon, 20 Apr 2009 12:49:59 -0700
Message-Id: <1240256999.32604.330.camel@nimitz>
In-Reply-To: <1240244120.32604.278.camel@nimitz>
References: <1240218590-16714-1-git-send-email-ebmunson@us.ibm.com>
	<1240244120.32604.278.camel@nimitz>

On Mon, 2009-04-20 at 09:15 -0700, Dave Hansen wrote:
> On Mon, 2009-04-20 at 10:09 +0100, Eric B Munson wrote:
> > 1. Change NR_CPUS to min(64, NR_CPUS)
> >
> > This will limit the amount of possible skew on kernels compiled for very
> > large SMP machines.  64 is an arbitrary number selected to limit the
> > worst of the skew without using more cache lines.  min(64, NR_CPUS) is
> > used instead of nr_online_cpus() because nr_online_cpus() requires a
> > shared cache line and a call to hweight to make the calculation.  Its
> > runtime overhead and the cost of keeping the counter accurate showed up
> > in profiles, and it's possible that nr_online_cpus() would show up as
> > well.

Wow, that empty reply was really informative, wasn't it? :)

My worry with this min(64, NR_CPUS) approach is that it effectively
guarantees a lot more cacheline bouncing, just less explicitly.  Now,
every time a mapping (or set of them) is created or destroyed that nets
more than 64 pages, you've got to go get a r/w cacheline for a possibly
highly contended atomic.  With a number this low, you're almost
guaranteed to hit it at fork() and exec().  Could you double-check that
this doesn't hurt any of the fork() AIM tests?

Another thought: instead of trying to fix this up in meminfo, we could
do the accounting in a way that guarantees the global counter never
goes negative, by always keeping the *percpu* skew negative.  This
should be equivalent to what's in the kernel now:

void vm_acct_memory(long pages)
{
	long *local;
	long local_min  = -ACCT_THRESHOLD;
	long local_max  = ACCT_THRESHOLD;
	long local_goal = 0;

	preempt_disable();
	local = &__get_cpu_var(committed_space);
	*local += pages;
	if (*local > local_max || *local < local_min) {
		atomic_long_add(*local - local_goal, &vm_committed_space);
		*local = local_goal;
	}
	preempt_enable();
}

But now consider changing the local_* variables a bit:

	long local_min  = -(ACCT_THRESHOLD * 2);
	long local_max  = 0;
	long local_goal = -ACCT_THRESHOLD;

Each percpu counter then stays at or below zero, so the global counter
can only ever over-count, never under-count.  We'll get some possibly
*large* numbers in meminfo, but it will at least never underflow.
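
If it helps, here's a quick userspace sketch of why that works.  This
is NOT kernel code: the percpu array, NCPUS, the ACCT_THRESHOLD value,
and the random workload are all made-up stand-ins for illustration;
only the batching logic matches the function above.

/*
 * Userspace simulation of the batched percpu accounting.  With
 * local_min = -2*THRESHOLD, local_max = 0, local_goal = -THRESHOLD,
 * every percpu delta stays <= 0, so the global counter is always
 * true_committed minus a non-positive sum: it can over-count but
 * never go negative.
 */
#include <stdio.h>
#include <stdlib.h>

#define NCPUS		4	/* stand-in for the online cpu count */
#define ACCT_THRESHOLD	64	/* stand-in for the kernel's value */

static long vm_committed_space;		/* stand-in for the global atomic */
static long committed_space[NCPUS];	/* stand-in for the percpu counters */

static void vm_acct_memory(int cpu, long pages)
{
	long *local = &committed_space[cpu];
	long local_min  = -(ACCT_THRESHOLD * 2);
	long local_max  = 0;
	long local_goal = -ACCT_THRESHOLD;

	*local += pages;
	if (*local > local_max || *local < local_min) {
		/* flush everything beyond the goal into the global */
		vm_committed_space += *local - local_goal;
		*local = local_goal;
	}
	/* postcondition: local_min <= *local <= 0 */
}

int main(void)
{
	long true_committed = 0;
	int i;

	srand(1);
	for (i = 0; i < 1000000; i++) {
		int cpu = rand() % NCPUS;
		long pages = rand() % 33 - 16;	/* mix of commit/uncommit */

		if (true_committed + pages < 0)
			pages = -pages;		/* can't uncommit below zero */
		true_committed += pages;
		vm_acct_memory(cpu, pages);

		if (vm_committed_space < 0) {
			printf("underflow at step %d\n", i);
			return 1;
		}
	}
	printf("global=%ld true=%ld skew=%ld\n", vm_committed_space,
	       true_committed, vm_committed_space - true_committed);
	return 0;
}

Run it and the underflow check never fires; the printed skew stays
>= 0 for any workload mix.  The cost is the one named above:
Committed_AS can read up to NCPUS * 2 * ACCT_THRESHOLD pages high.

-- 
Dave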