From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andre Przywara
Subject: Re: Xen 3.4.1 NUMA support
Date: Mon, 9 Nov 2009 16:02:42 +0100
Message-ID: <4AF82F12.6040400@amd.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Dan Magenheimer
Cc: George Dunlap, xen-devel@lists.xensource.com, Keir Fraser, Papagiannis Anastasios
List-Id: xen-devel@lists.xenproject.org

Dan Magenheimer wrote:
>> Add Xen boot parameter 'numa=on' to enable NUMA detection.
>> Then it's up to you to, for example, pin domains to specific nodes,
>> using the 'cpus=...' option in the domain config file. See
>> /etc/xen/xmexample1 for an example of its usage.
> VMware has the notion of a "cell" where VMs can be
> scheduled only within a cell, not across cells.
> Cell boundaries are determined by VMware by
> default, though certain settings can override them.

Well, if I got this right, then you are describing the current behaviour
of Xen. It has had a similar feature for some time now (since 3.3, I
guess). When you launch a domain on a numa=on machine, it will pick the
least busy node (which can hold the requested memory) and restrict the
domain to that node (by only allowing CPUs of that node).
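
To illustrate the node selection described above, here is a minimal Python
sketch. It is not the actual Xend code: the Node class, pick_node(), and
all numbers are hypothetical, and it assumes that "least busy" means the
fewest VCPUs already restricted to a node and that memory is counted in
pages.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical per-node bookkeeping; not a Xend data structure."""
    node_id: int
    cpus: List[int]      # physical CPUs that belong to this node
    free_pages: int      # free memory on this node, in pages
    pinned_vcpus: int    # VCPUs of other domains already restricted here


def pick_node(nodes: List[Node], needed_pages: int) -> Optional[Node]:
    """Return the least busy node that can hold needed_pages, or None."""
    # Only nodes with enough free memory are candidates.
    candidates = [n for n in nodes if n.free_pages >= needed_pages]
    if not candidates:
        return None
    # Assumption: "busy" is measured by the number of pinned VCPUs;
    # ties are broken by the lower node number.
    return min(candidates, key=lambda n: (n.pinned_vcpus, n.node_id))


def place_domain(nodes: List[Node], needed_pages: int,
                 vcpus: int) -> Optional[List[int]]:
    """Pick a node, account for the new domain, return its CPU affinity."""
    node = pick_node(nodes, needed_pages)
    if node is None:
        return None
    node.free_pages -= needed_pages
    node.pinned_vcpus += vcpus
    return list(node.cpus)


# Three six-CPU nodes; node 0 is occupied by a pinned six-VCPU Dom0.
nodes = [
    Node(0, list(range(0, 6)), 100, 6),
    Node(1, list(range(6, 12)), 500000, 0),
    Node(2, list(range(12, 18)), 500000, 0),
]
first = place_domain(nodes, 394219, 2)    # lands on node 1
second = place_domain(nodes, 394219, 8)   # node 1 is now short on memory
```

With these made-up numbers the first guest is restricted to CPUs 6-11 and
the second to CPUs 12-17. The real logic in Xend may differ in detail from
this sketch.
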
This placement logic is in XendDomainInfo.py (c/s 17131, 17247, 17709).
It looks like this:

(kernel xen.gz numa=on dom0_mem=6144M dom0_max_vcpus=6 dom0_vcpus_pin)
# xm create opensuse.hvm
# xm create opensuse2.hvm
# xm vcpu-list
Name        ID  VCPU   CPU  State  Time(s)  CPU Affinity
001-LTP      1     0     6  -b-       17.8  6-11
001-LTP      1     1     7  -b-        6.3  6-11
002-LTP      2     0    12  -b-       19.0  12-17
002-LTP      2     1    16  -b-        1.6  12-17
002-LTP      2     2    17  -b-        1.7  12-17
002-LTP      2     3    14  -b-        1.6  12-17
002-LTP      2     4    16  -b-        1.6  12-17
002-LTP      2     5    15  -b-        1.5  12-17
002-LTP      2     6    12  -b-        1.3  12-17
002-LTP      2     7    13  -b-        1.8  12-17
Domain-0     0     0     0  -b-       12.6  0
Domain-0     0     1     1  -b-        7.6  1
Domain-0     0     2     2  -b-        8.0  2
Domain-0     0     3     3  -b-       14.6  3
Domain-0     0     4     4  r--        1.4  4
Domain-0     0     5     5  -b-        0.9  5
# xm debug-keys U
(XEN) Domain 0 (total: 2097152):
(XEN)     Node 0: 2097152
(XEN)     Node 1: 0
(XEN)     Node 2: 0
(XEN)     Node 3: 0
(XEN)     Node 4: 0
(XEN)     Node 5: 0
(XEN)     Node 6: 0
(XEN)     Node 7: 0
(XEN) Domain 1 (total: 394219):
(XEN)     Node 0: 0
(XEN)     Node 1: 394219
(XEN)     Node 2: 0
(XEN)     Node 3: 0
(XEN)     Node 4: 0
(XEN)     Node 5: 0
(XEN)     Node 6: 0
(XEN)     Node 7: 0
(XEN) Domain 2 (total: 394219):
(XEN)     Node 0: 0
(XEN)     Node 1: 0
(XEN)     Node 2: 394219
(XEN)     Node 3: 0
(XEN)     Node 4: 0
(XEN)     Node 5: 0
(XEN)     Node 6: 0
(XEN)     Node 7: 0

Note that there were no cpus= lines in the config files; Xen did that
automatically.

Domains can be localhost-migrated to another node:
# xm migrate --node=4 1 localhost

The only issue is with domains larger than a node. If someone has a
useful use-case, I can start rebasing my old patches for NUMA aware HVM
domains to Xen unstable.

Regards,
Andre.

BTW: Shouldn't we finally set numa=on as the default value?

-- 
Andre Przywara
AMD-OSRC (Dresden)
Tel: x29712