From: Stefan Lankes
Subject: RE: [RFC PATCH 0/4]: affinity-on-next-touch
Date: Mon, 11 May 2009 16:54:40 +0200
To: 'Andi Kleen'
Cc: linux-kernel@vger.kernel.org, Lee.Schermerhorn@hp.com, linux-numa@vger.kernel.org, brice.goglin@inria.fr, "'Terboven, Christian'", anmey@rz.rwth-aachen.de, 'Boris Bierbaum'

> From: Andi Kleen [mailto:andi@firstfloor.org]
>
> Stefan Lankes writes:
> >
> > [Patch 1/4]: Extend the system call madvise with a new parameter
> > MADV_ACCESS_LWP (the same as used in Solaris). The specified memory area
>
> Linux does NUMA memory policies in mbind(), not madvise().
> Also if there's a new NUMA policy it should be in the standard
> Linux NUMA memory policy framework, not inventing a new one.

By default, mbind() only has an effect on new allocations. I think this is different from what applications with dynamic memory access patterns need: the application gives the kernel a hint that its access pattern has changed, and the kernel then has to redistribute the pages which are already allocated. (A small usage sketch is attached at the end of this mail.)

> > [Patch 4/4]: This part of the patch adds some counters to detect
> > migration errors and publishes these counters via /proc/vmstat.
> > Besides this, the Kconfig file is extended with the parameter
> > CONFIG_AFFINITY_ON_NEXT_TOUCH.
> >
> > With this patch, the kernel reduces the overhead of page distribution
> > via "affinity-on-next-touch" from 2518ms to 366ms compared to the
> > user-level
>
> The interesting part is less how much faster it is compared to a user
> space implementation, but how much this migrate-on-touch approach
> helps in general compared to already existing policies. Some hard
> numbers on that would be appreciated.
>
> Note that for the OpenMP case old kernels sometimes had trouble because
> the threads tended to be not scheduled to the final target CPU
> on the first time slice, so the memory was often first-touched
> on the wrong node. Later kernels avoided that by more aggressively
> moving the threads early.

"Affinity-on-next-touch" is not a data distribution strategy for applications with a static access pattern. Whenever the access pattern changes, the application triggers the "affinity-on-next-touch" mechanism, and afterwards the kernel redistributes the pages which are already allocated.

For instance, Norden's PDE solver using adaptive mesh refinement (AMR) [1] is an application with a dynamic access pattern, and we use this example to evaluate the performance of our patch. We ran the solver on our quad-socket, dual-core Opteron 875 (2.2 GHz) system running CentOS 5.2. The code was already optimized for NUMA architectures: before the arrays are initialized, the threads are bound to one core each. In our test case, the solver needs 5318s; with our kernel extension, it needs 4489s.

Currently, we are testing some other apps.

Stefan

[1] Norden, M., Löf, H., Rantakokko, J., Holmgren, S.: Geographical Locality and Dynamic Data Migration for OpenMP Implementations of Adaptive PDE Solvers.
In: Proceedings of the 2nd International Workshop on OpenMP (IWOMP), Reims, France (June 2006), 382–393.
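
PS: To make the intended usage a bit more concrete, here is a rough userspace sketch. This is not code from the patch: the numeric value of MADV_ACCESS_LWP below is only a placeholder and has to match the value defined by the patched kernel headers, the touch loop merely stands in for the first compute phase after the access pattern has changed, and on an unpatched kernel the madvise() call simply fails with EINVAL.

/*
 * Rough sketch only, not code from the patch: after the access pattern
 * of an already populated buffer has changed, the application marks the
 * range with the new madvise() parameter; every page of the range is
 * then migrated to the node of the thread that touches it next.
 */
#include <stdio.h>
#include <sys/mman.h>

#ifndef MADV_ACCESS_LWP
#define MADV_ACCESS_LWP 7   /* placeholder -- use the value from the patched headers */
#endif

#define N ((size_t)1 << 24) /* 16M doubles, 128 MB */

int main(void)
{
    size_t len = N * sizeof(double);
    double *data = mmap(NULL, len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (data == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* ... first compute phase: each page ends up on the node of the
     * thread that first touched it ... */

    /* the access pattern changes: mark the whole range so that the
     * already allocated pages follow the next toucher */
    if (madvise(data, len, MADV_ACCESS_LWP) != 0)
        perror("madvise(MADV_ACCESS_LWP)");

    /* next compute phase: each thread touches "its" part of the array,
     * and the kernel migrates those pages to the thread's node */
#pragma omp parallel for schedule(static)
    for (size_t i = 0; i < N; i++)
        data[i] += 0.0;     /* stands in for the real computation */

    munmap(data, len);
    return 0;
}

In an application like the AMR solver above, such a call would presumably be issued after each remeshing step, right before the threads start working on the redistributed data.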