From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andre Przywara
Subject: Re: Xen 3.4.1 NUMA support
Date: Tue, 10 Nov 2009 08:49:56 +0100
Message-ID: <4AF91B24.7060207@amd.com>
References: <4AF82F12.6040400@amd.com> <4AF82FD8.6020409@eu.citrix.com> <4AF89D06.9010204@amd.com> <940bcfd20911092256t6c664d09ofced3db50211b6da@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <940bcfd20911092256t6c664d09ofced3db50211b6da@mail.gmail.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Dulloor
Cc: George Dunlap, Dan Magenheimer, "xen-devel@lists.xensource.com", Keir Fraser, Papagiannis Anastasios
List-Id: xen-devel@lists.xenproject.org

Dulloor wrote:
> I am not finding this. Can you please point to the code ?
tools/python/xen/xend/XendDomainInfo.py (around line 2600)
with the core code being:
-------------
index = nodeload.index( min(nodeload) )
cpumask = info['node_to_cpu'][index]
for v in range(0, self.info['VCPUs_max']):
    xc.vcpu_setaffinity(self.domid, v, cpumask)
-------------
The code was introduced with c/s 17131 and later refined with
c/s 17247 and c/s 17709.

>
> numa=on/off is only for setting up numa in xen (similar to the linux
> knob, but turned off by default). The allocation of memory from a
> single node (that you observe) could be because of the way
> alloc_heap_pages is implemented (trying to allocate from all the heaps
> from a node, before trying the next one)
Yes, but if the domain is pinned before it allocates its memory, then
the natural behavior of Xen is to take memory from this local node.

> - try looking at dump_numa
> output. And, affinities are not set anywhere based on the node from
> which allocation happens.
It is the other way round: first the domain is pinned, then the memory
is allocated (based on the node to which the currently scheduled CPU
belongs).

Regards,
Andre.

>
> -dulloor
>
> On Mon, Nov 9, 2009 at 5:51 PM, Andre Przywara wrote:
>> George Dunlap wrote:
>>> Andre Przywara wrote:
>>>> BTW: Shouldn't we set finally numa=on as the default value?
>>>>
>>> Is there any data to support the idea that this helps significantly on
>>> common systems?
>> I don't have any numbers handy, but I will try if I can generate some.
>>
>> Looking from a high level perspective it is a shame that it's not the
>> default: With numa=off the Xen domain loader will allocate physical memory
>> from some node (maybe even from several nodes) and will schedule the guest
>> on some other (even rapidly changing) nodes. According to Murphy's law you
>> will end up with _all_ the memory access of a guest to be remote. But in
>> fact a NUMA architecture is really beneficial for virtualization: As there
>> are close to zero cross domain memory accesses (except for Dom0), each node
>> is more or less self contained and each guest can use the node's memory
>> controller almost exclusively.
>> But this is all spoiled as most people don't know about Xen's NUMA
>> capabilities and don't set numa=on. Using this as a default would solve
>> this.
>>
>> Regards,
>> Andre.
>>

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448 3567 12

----to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Karl-Hammerschmidt-Str. 34, 85609 Dornach b. Muenchen
Geschaeftsfuehrer: Andrew Bowd; Thomas M. McCoy; Giuliano Meroni
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632
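
P.S. For readers who want to try the selection logic outside of xend, the pin-first step from the excerpt in this mail can be sketched as a standalone script. This is only an illustration under assumptions, not the xend implementation: the function names pick_least_loaded_node and pin_domain, the load values, and the CPU layout are all made up, and set_affinity is a stand-in callable for xc.vcpu_setaffinity so the script runs without Xen. How xend computes nodeload is not shown in this mail and is not modeled here.

```python
from typing import Callable, List, Sequence, Tuple


def pick_least_loaded_node(nodeload: Sequence[float],
                           node_to_cpu: Sequence[List[int]]) -> Tuple[int, List[int]]:
    """Return (node index, CPUs of that node) for the least loaded node.

    Mirrors `nodeload.index(min(nodeload))` from the excerpt: on a tie,
    the lowest-numbered node wins.
    """
    loads = list(nodeload)
    index = loads.index(min(loads))
    return index, node_to_cpu[index]


def pin_domain(domid: int, vcpus_max: int, cpumask: List[int],
               set_affinity: Callable[[int, int, List[int]], None]) -> None:
    """Pin every VCPU of the domain to the CPUs of the chosen node.

    `set_affinity` is a hypothetical stand-in for xc.vcpu_setaffinity.
    """
    for v in range(vcpus_max):
        set_affinity(domid, v, cpumask)


if __name__ == "__main__":
    # Assumed two-node machine with four CPUs per node (made-up values).
    node_to_cpu = [[0, 1, 2, 3], [4, 5, 6, 7]]
    nodeload = [3.0, 1.0]  # made-up per-node load figures

    node, cpumask = pick_least_loaded_node(nodeload, node_to_cpu)

    # Record the calls instead of talking to a hypervisor.
    calls = []
    pin_domain(7, 2, cpumask, lambda d, v, m: calls.append((d, v, m)))
    print("chosen node:", node)
    print("affinity calls:", calls)
```

The point of the ordering is that the pinning happens before any guest memory is requested, so the later allocation is served from the node the VCPUs are already restricted to.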