From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: from psmtp.com (na3sys010amx152.postini.com [74.125.245.152])
 by kanga.kvack.org (Postfix) with SMTP id 04CFD6B0070
 for ; Sun, 14 Oct 2012 00:57:48 -0400 (EDT)
Date: Sun, 14 Oct 2012 06:57:16 +0200
From: Andrea Arcangeli
Subject: Re: [PATCH 00/33] AutoNUMA27
Message-ID: <20121014045716.GE11663@redhat.com>
References: <1349308275-2174-1-git-send-email-aarcange@redhat.com>
 <20121013184019.GA3837@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20121013184019.GA3837@linux.vnet.ibm.com>
Sender: owner-linux-mm@kvack.org
List-ID: 
To: Srikar Dronamraju
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 torvalds@linux-foundation.org, akpm@linux-foundation.org,
 pzijlstr@redhat.com, mingo@elte.hu, mel@csn.ul.ie, hughd@google.com,
 riel@redhat.com, hannes@cmpxchg.org, dhillf@gmail.com,
 drjones@redhat.com, tglx@linutronix.de, pjt@google.com, cl@linux.com,
 suresh.b.siddha@intel.com, efault@gmx.de, paulmck@linux.vnet.ibm.com,
 alex.shi@intel.com, konrad.wilk@oracle.com, benh@kernel.crashing.org

Hi Srikar,

On Sun, Oct 14, 2012 at 12:10:19AM +0530, Srikar Dronamraju wrote:
> * Andrea Arcangeli [2012-10-04 01:50:42]:
>
> > Hello everyone,
> >
> > This is a new AutoNUMA27 release for Linux v3.6.
> >
>
>
> Here results of autonumabenchmark on a 328GB 64 core with ht disabled
> comparing v3.6 with autonuma27.

*snip*

> numa01: 1805.19 1907.11 1866.39 -3.88%

Interesting. So numa01 should be improved in autonuma28fast. Not sure
why the hard binds show any difference, but I'm more concerned with
optimizing numa01. I get the same results from hard bindings on
upstream or autonuma, strange.

Could you repeat only numa01 with the origin/autonuma28fast branch?
Also, if you could post the two PDF convergence charts generated by
numa01 on autonuma27 and autonuma28fast, I think it would be
interesting to see the full effect and why it is faster.
I only had time for a quick push after adding the idea in
autonuma28fast (which is further improved compared to autonuma28), but
I've already been told that it deals with numa01 on the 8 node very
well, as expected.

numa01 in the 8 node is a workload without a perfect solution (other
than MADV_INTERLEAVE). Full convergence preventing cross-node traffic
is impossible because there are 2 processes spanning over 8 nodes and
all process memory is touched by all threads constantly. Yet
autonuma28fast should deal optimally with that scenario too.

As a side note: numa01 on the 2 node instead converges fully (2
processes + 2 nodes = full convergence). numa01 on 2 nodes or >2 nodes
is a very different kind of test.

I'll release an autonuma29 behaving like 28fast if there are no
surprises. The new algorithm change in 28fast will also save memory
once I rewrite it properly.

Thanks!
Andrea