xen-devel.lists.xenproject.org archive mirror
From: Wei Liu <wei.liu2@citrix.com>
To: Elena Ufimtseva <ufimtseva@gmail.com>
Cc: keir@xen.org, Ian.Campbell@citrix.com,
	stefano.stabellini@eu.citrix.com, george.dunlap@eu.citrix.com,
	msw@linux.com, dario.faggioli@citrix.com, lccycc123@gmail.com,
	ian.jackson@eu.citrix.com, xen-devel@lists.xen.org,
	JBeulich@suse.com, wei.liu2@citrix.com
Subject: Re: [PATCH v6 00/10] vnuma introduction
Date: Fri, 18 Jul 2014 10:53:59 +0100	[thread overview]
Message-ID: <20140718095359.GA5687@zion.uk.xensource.com> (raw)
In-Reply-To: <1405662609-31486-1-git-send-email-ufimtseva@gmail.com>

Hi! Another new series!

On Fri, Jul 18, 2014 at 01:49:59AM -0400, Elena Ufimtseva wrote:
[...]
> Current problems:
> 
> Warning on CPU bringup on other node
> 
>     The CPUs in the guest which belong to different NUMA nodes are configured
>     to share the same L2 cache and are thus considered to be siblings, which cannot
>     be placed on different nodes. One can see the following WARNING during boot:
> 
> [    0.022750] SMP alternatives: switching to SMP code
> [    0.004000] ------------[ cut here ]------------
> [    0.004000] WARNING: CPU: 1 PID: 0 at arch/x86/kernel/smpboot.c:303 topology_sane.isra.8+0x67/0x79()
> [    0.004000] sched: CPU #1's smt-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
> [    0.004000] Modules linked in:
> [    0.004000] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.15.0-rc8+ #43
> [    0.004000]  0000000000000000 0000000000000009 ffffffff813df458 ffff88007abe7e60
> [    0.004000]  ffffffff81048963 ffff88007abe7e70 ffffffff8102fb08 ffffffff00000100
> [    0.004000]  0000000000000001 ffff8800f6e13900 0000000000000000 000000000000b018
> [    0.004000] Call Trace:
> [    0.004000]  [<ffffffff813df458>] ? dump_stack+0x41/0x51
> [    0.004000]  [<ffffffff81048963>] ? warn_slowpath_common+0x78/0x90
> [    0.004000]  [<ffffffff8102fb08>] ? topology_sane.isra.8+0x67/0x79
> [    0.004000]  [<ffffffff81048a13>] ? warn_slowpath_fmt+0x45/0x4a
> [    0.004000]  [<ffffffff8102fb08>] ? topology_sane.isra.8+0x67/0x79
> [    0.004000]  [<ffffffff8102fd2e>] ? set_cpu_sibling_map+0x1c9/0x3f7
> [    0.004000]  [<ffffffff81042146>] ? numa_add_cpu+0xa/0x18
> [    0.004000]  [<ffffffff8100b4e2>] ? cpu_bringup+0x50/0x8f
> [    0.004000]  [<ffffffff8100b544>] ? cpu_bringup_and_idle+0x1d/0x28
> [    0.004000] ---[ end trace 0e2e2fd5c7b76da5 ]---
> [    0.035371] x86: Booted up 2 nodes, 2 CPUs
> 
> The workaround is to specify cpuid in the config file and not use SMT. But I will
> soon come up with some other acceptable solution.
> 

I've also encountered this. I suspect that even if you disable SMT with
cpuid in the config file, the CPU topology in the guest might still be wrong.
What do hwloc-ls and lscpu show? Do you see any weird topology, like one
core belonging to one node while three belong to another? (I suspect not,
because your vcpus are already pinned to a specific node.)

What I did was manipulate various "id"s in the Linux kernel so as to
create a 1 core : 1 cpu : 1 socket topology. With that mapping the guest
scheduler cannot make any assumptions about individual CPUs sharing
caches with each other.

In any case, we already manipulate various IDs of CPU0, so I don't see
any harm in manipulating the other CPUs as well.
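For completeness, the cpuid workaround mentioned above would look something like this in the guest config file. This is only a sketch: I'm using xl's libxl-style cpuid string syntax, and the `htt` flag name should be checked against the xl.cfg documentation for your Xen version.

```
# Hide hyper-threading from the guest so its kernel never builds
# SMT sibling pairs that could span vNUMA nodes.
cpuid = "host,htt=0"
```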

Thoughts?

P.S. I'm benchmarking your v5; let me know if you're interested in the
results.

Wei.

(This patch should be applied to Linux and is by no means suitable for
upstream as is.)
---8<---
From be2b33088e521284c27d6a7679b652b688dba83d Mon Sep 17 00:00:00 2001
From: Wei Liu <wei.liu2@citrix.com>
Date: Tue, 17 Jun 2014 14:51:57 +0100
Subject: [PATCH] XXX: CPU topology hack!

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 arch/x86/xen/smp.c   |   17 +++++++++++++++++
 arch/x86/xen/vnuma.c |    2 ++
 2 files changed, 19 insertions(+)

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 7005974..89656fe 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -81,6 +81,15 @@ static void cpu_bringup(void)
 	cpu = smp_processor_id();
 	smp_store_cpu_info(cpu);
 	cpu_data(cpu).x86_max_cores = 1;
+	cpu_physical_id(cpu) = cpu;
+	cpu_data(cpu).phys_proc_id = cpu;
+	cpu_data(cpu).cpu_core_id = cpu;
+	cpu_data(cpu).initial_apicid = cpu;
+	cpu_data(cpu).apicid = cpu;
+	per_cpu(cpu_llc_id, cpu) = cpu;
+	if (numa_cpu_node(cpu) != NUMA_NO_NODE)
+		cpu_data(cpu).phys_proc_id = numa_cpu_node(cpu);
+
 	set_cpu_sibling_map(cpu);
 
 	xen_setup_cpu_clockevents();
@@ -326,6 +335,14 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
 
 	smp_store_boot_cpu_info();
 	cpu_data(0).x86_max_cores = 1;
+	cpu_physical_id(0) = 0;
+	cpu_data(0).phys_proc_id = 0;
+	cpu_data(0).cpu_core_id = 0;
+	per_cpu(cpu_llc_id, 0) = 0;
+	cpu_data(0).initial_apicid = 0;
+	cpu_data(0).apicid = 0;
+	if (numa_cpu_node(0) != NUMA_NO_NODE)
+		per_cpu(x86_cpu_to_node_map, 0) = numa_cpu_node(0);
 
 	for_each_possible_cpu(i) {
 		zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
diff --git a/arch/x86/xen/vnuma.c b/arch/x86/xen/vnuma.c
index a02f9c6..418ced2 100644
--- a/arch/x86/xen/vnuma.c
+++ b/arch/x86/xen/vnuma.c
@@ -81,7 +81,9 @@ int __init xen_numa_init(void)
 	setup_nr_node_ids();
 	/* Setting the cpu, apicid to node */
 	for_each_cpu(cpu, cpu_possible_mask) {
+		/* Use cpu id as apicid */
 		set_apicid_to_node(cpu, cpu_to_node[cpu]);
+		cpu_data(cpu).initial_apicid = cpu;
 		numa_set_node(cpu, cpu_to_node[cpu]);
 		cpumask_set_cpu(cpu, node_to_cpumask_map[cpu_to_node[cpu]]);
 	}
-- 
1.7.10.4
