* Ping: c/s 20526 (tools: avoid cpu over-commitment if numa=on)
@ 2010-01-13  8:15 Jan Beulich
  2010-01-13  8:29 ` Keir Fraser
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Jan Beulich @ 2010-01-13  8:15 UTC
  To: andre.przywara; +Cc: xen-devel

Andre,

I'm afraid this change isn't really correct:

>+                cores_per_node = info['nr_cpus'] / info['nr_nodes']
>+                nodes_required = (self.info['VCPUs_max'] + cores_per_node - 1) / cores_per_node

Simply using cores_per_node (as calculated here) as a divisor is bound
to cause division-by-zero issues, namely when limiting the number of
CPUs on the Xen command line (maxcpus=). I'm not sure though, what
a reasonable solution to this might look like, since cores-per-node is
a meaningless thing in an artificial setup like this, and may also be
meaningless in asymmetric configurations. So perhaps we really need
to iterate over nodes while summing up the number of CPUs they
have until the number of needed vCPUs is reached.

Jan
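
A minimal sketch of the failure described above, not code from xend. The
function name and the sample numbers are made up for illustration. xend ran
on Python 2, where "/" on integers truncates, so "//" is used here to get
the same behaviour on Python 3. With maxcpus= cropping the CPU count below
the node count, cores_per_node becomes 0 and the second division fails.

```python
def nodes_required_old(nr_cpus, nr_nodes, vcpus_max):
    """Mirror of the two quoted lines (hypothetical helper name)."""
    # Truncating division: 1 CPU spread over 2 nodes yields 0 cores per node.
    cores_per_node = nr_cpus // nr_nodes
    # Ceiling division; raises ZeroDivisionError when cores_per_node == 0.
    return (vcpus_max + cores_per_node - 1) // cores_per_node


# Symmetric box: 8 CPUs on 2 nodes, guest with 6 vCPUs.
print(nodes_required_old(8, 2, 6))

# Same box booted with maxcpus=1: nr_cpus is 1, nr_nodes is still 2.
try:
    nodes_required_old(1, 2, 4)
except ZeroDivisionError as exc:
    print("division by zero:", exc)
```

The first call works only because the CPUs are spread evenly; the average
says nothing about how many CPUs any one node really has.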


* Re: Ping: c/s 20526 (tools: avoid cpu over-commitment if numa=on)
  2010-01-13  8:15 Ping: c/s 20526 (tools: avoid cpu over-commitment if numa=on) Jan Beulich
@ 2010-01-13  8:29 ` Keir Fraser
  2010-01-13 14:24 ` Andre Przywara
  2010-01-13 22:10 ` [PATCH] NUMA: Fix computation of needed nodes Andre Przywara
  2 siblings, 0 replies; 4+ messages in thread
From: Keir Fraser @ 2010-01-13  8:29 UTC
  To: Jan Beulich, andre.przywara@amd.com; +Cc: xen-devel@lists.xensource.com

On 13/01/2010 08:15, "Jan Beulich" <JBeulich@novell.com> wrote:

> Andre,
> 
> I'm afraid this change isn't really correct:
> 
>> +                cores_per_node = info['nr_cpus'] / info['nr_nodes']
>> +                nodes_required = (self.info['VCPUs_max'] + cores_per_node - 1) / cores_per_node
> 
> Simply using cores_per_node (as calculated here) as a divisor is bound
> to cause division-by-zero issues, namely when limiting the number of
> CPUs on the Xen command line (maxcpus=). I'm not sure though, what
> a reasonable solution to this might look like, since cores-per-node is
> a meaningless thing in an artificial setup like this, and may also be
> meaningless in asymmetric configurations. So perhaps we really need
> to iterate over nodes while summing up the number of CPUs they
> have until the number of needed vCPU-s was reached.

Yes, please!

 K.


* Re: Ping: c/s 20526 (tools: avoid cpu over-commitment if numa=on)
  2010-01-13  8:15 Ping: c/s 20526 (tools: avoid cpu over-commitment if numa=on) Jan Beulich
  2010-01-13  8:29 ` Keir Fraser
@ 2010-01-13 14:24 ` Andre Przywara
  2010-01-13 22:10 ` [PATCH] NUMA: Fix computation of needed nodes Andre Przywara
  2 siblings, 0 replies; 4+ messages in thread
From: Andre Przywara @ 2010-01-13 14:24 UTC
  To: Jan Beulich; +Cc: xen-devel, Keir Fraser

Jan Beulich wrote:
> Andre,
> 
> I'm afraid this change isn't really correct:
> 
>> +                cores_per_node = info['nr_cpus'] / info['nr_nodes']
>> +                nodes_required = (self.info['VCPUs_max'] + cores_per_node - 1) / cores_per_node
> 
> Simply using cores_per_node (as calculated here) as a divisor is bound
> to cause division-by-zero issues, namely when limiting the number of
> CPUs on the Xen command line (maxcpus=). I'm not sure though, what
> a reasonable solution to this might look like, since cores-per-node is
> a meaningless thing in an artificial setup like this, and may also be
> meaningless in asymmetric configurations. So perhaps we really need
> to iterate over nodes while summing up the number of CPUs they
> have until the number of needed vCPU-s was reached.
Thanks for the heads-up. I have created a better patch that does basically
what you described; it is actually even more elegant than the original
version. I am about to test it and will send it out as soon as testing is
finished.

Regards,
Andre.

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448 3567 12
----to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Karl-Hammerschmidt-Str. 34, 85609 Dornach b. Muenchen
Geschaeftsfuehrer: Andrew Bowd; Thomas M. McCoy; Giuliano Meroni
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632


* [PATCH] NUMA: Fix computation of needed nodes
  2010-01-13  8:15 Ping: c/s 20526 (tools: avoid cpu over-commitment if numa=on) Jan Beulich
  2010-01-13  8:29 ` Keir Fraser
  2010-01-13 14:24 ` Andre Przywara
@ 2010-01-13 22:10 ` Andre Przywara
  2 siblings, 0 replies; 4+ messages in thread
From: Andre Przywara @ 2010-01-13 22:10 UTC
  To: Jan Beulich, Keir Fraser; +Cc: xen-devel

[-- Attachment #1: Type: text/plain, Size: 1220 bytes --]

Hi,

as Jan Beulich pointed out:
> I'm afraid this change isn't really correct:
> 
>> +                cores_per_node = info['nr_cpus'] / info['nr_nodes']
>> +                nodes_required = (self.info['VCPUs_max'] + cores_per_node - 1) / cores_per_node
> 
> Simply using cores_per_node (as calculated here) as a divisor is bound
> to cause division-by-zero issues, namely when limiting the number of
> CPUs on the Xen command line (maxcpus=).
Jan's proposed method of picking additional nodes is actually more
elegant, so I implemented it: enumerate the best nodes and add their CPUs
to the affinity mask until every VCPU can be backed by at least one
physical core. This should fix the problems with asymmetric NUMA
configurations and with a cropped number of CPUs in Xen.

Regards,
Andre.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>


[-- Attachment #2: numa_div_by_zero.patch --]
[-- Type: text/plain, Size: 1494 bytes --]

diff -r 13d4e78ede97 tools/python/xen/xend/XendDomainInfo.py
--- a/tools/python/xen/xend/XendDomainInfo.py	Wed Jan 13 08:33:34 2010 +0000
+++ b/tools/python/xen/xend/XendDomainInfo.py	Wed Jan 13 22:55:56 2010 +0100
@@ -2724,13 +2724,12 @@
                         candidate_node_list.append(i)
                 best_node = find_relaxed_node(candidate_node_list)[0]
                 cpumask = info['node_to_cpu'][best_node]
-                cores_per_node = info['nr_cpus'] / info['nr_nodes']
-                nodes_required = (self.info['VCPUs_max'] + cores_per_node - 1) / cores_per_node
-                if nodes_required > 1:
-                    log.debug("allocating %d NUMA nodes", nodes_required)
-                    best_nodes = find_relaxed_node(filter(lambda x: x != best_node, range(0,info['nr_nodes'])))
-                    for i in best_nodes[:nodes_required - 1]:
-                        cpumask = cpumask + info['node_to_cpu'][i]
+                best_nodes = find_relaxed_node(filter(lambda x: x != best_node, range(0,info['nr_nodes'])))
+                for node_idx in best_nodes:
+                    if len(cpumask) >= self.info['VCPUs_max']:
+                        break
+                    cpumask = cpumask + info['node_to_cpu'][node_idx]
+                    log.debug("allocating additional NUMA node %d", node_idx)
                 for v in range(0, self.info['VCPUs_max']):
                     xc.vcpu_setaffinity(self.domid, v, cpumask)
         return index
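
A standalone sketch of the loop the patch introduces, for readers who want
to try it outside xend. It is a simplification under stated assumptions,
not the xend code: build_cpumask and ranked_nodes are hypothetical names,
and ranked_nodes stands in for the ordering that find_relaxed_node would
produce (best node first). node_to_cpu has the same shape as
info['node_to_cpu'] in the patch, a list of CPU lists indexed by node.

```python
def build_cpumask(node_to_cpu, ranked_nodes, vcpus_max):
    """Collect CPUs node by node until there is one per vCPU.

    ranked_nodes is assumed to be ordered best node first; in xend that
    ordering comes from find_relaxed_node, which is not reproduced here.
    """
    best_node = ranked_nodes[0]
    # Copy so the caller's per-node list is never modified.
    cpumask = list(node_to_cpu[best_node])
    for node_idx in ranked_nodes[1:]:
        # Stop as soon as every vCPU can be backed by a physical CPU.
        if len(cpumask) >= vcpus_max:
            break
        cpumask = cpumask + node_to_cpu[node_idx]
    return cpumask


# Asymmetric layout: node 0 has four CPUs, node 1 one, node 2 two.
asymmetric = [[0, 1, 2, 3], [4], [5, 6]]
print(build_cpumask(asymmetric, [0, 1, 2], 6))

# Cropped layout (maxcpus=1): node 1 has no CPUs at all.
cropped = [[0], []]
print(build_cpumask(cropped, [0, 1], 4))
```

No division takes place, so an empty node only contributes nothing to the
mask. If the host has fewer CPUs than the guest has vCPUs, the loop simply
runs out of nodes and returns every CPU it found.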

