* [PATCH v3] x86: correct socket_cpumask allocation @ 2015-07-09 14:23 Chao Peng 2015-07-09 15:16 ` Andrew Cooper 2015-07-10 14:29 ` Dario Faggioli 0 siblings, 2 replies; 11+ messages in thread From: Chao Peng @ 2015-07-09 14:23 UTC (permalink / raw) To: xen-devel; +Cc: andrew.cooper3, dario.faggioli, keir, boris.ostrovsky, JBeulich For booting cpu, the socket number is not needed to be 0 so it needs to be computed by cpu number. For secondary cpu, phys_proc_id is not valid in CPU_PREPARE notifier(cpu_smpboot_alloc), so cpu_to_socket(cpu) can't be used. Instead, pre-allocate secondary_cpu_mask in cpu_smpboot_alloc() and later consume it in smp_store_cpu_info(). This patch also change socket_cpumask type from 'cpumask_var_t *' to 'cpumask_t **' so that smaller NR_CPUS works. Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: Dario Faggioli <dario.faggioli@citrix.com> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> --- Changes in v3: * use type safe xzalloc(cpumask_t) Changes in v2: * Fix case that booting cpu is on the socket other than socket0. * cpumask_var_t => cpumask_t * to make smaller NR_CPUS builds. --- xen/arch/x86/smpboot.c | 25 ++++++++++++++++++------- xen/include/asm-x86/smp.h | 2 +- 2 files changed, 19 insertions(+), 8 deletions(-) diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index c73aa1b..0f03364 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -61,7 +61,8 @@ cpumask_t cpu_online_map __read_mostly; EXPORT_SYMBOL(cpu_online_map); unsigned int __read_mostly nr_sockets; -cpumask_var_t *__read_mostly socket_cpumask; +cpumask_t **__read_mostly socket_cpumask; +static cpumask_t *secondary_socket_cpumask; struct cpuinfo_x86 cpu_data[NR_CPUS]; @@ -84,11 +85,21 @@ void *stack_base[NR_CPUS]; static void smp_store_cpu_info(int id) { struct cpuinfo_x86 *c = cpu_data + id; + unsigned int socket; *c = boot_cpu_data; if ( id != 0 ) + { identify_cpu(c); + socket = cpu_to_socket(id); + if ( !socket_cpumask[socket] ) + { + socket_cpumask[socket] = secondary_socket_cpumask; + secondary_socket_cpumask = NULL; + } + } + /* * Certain Athlons might work (for various values of 'work') in SMP * but they are not certified as MP capable. @@ -658,7 +669,7 @@ static void cpu_smpboot_free(unsigned int cpu) if ( cpumask_empty(socket_cpumask[socket]) ) { - free_cpumask_var(socket_cpumask[socket]); + xfree(socket_cpumask[socket]); socket_cpumask[socket] = NULL; } @@ -705,7 +716,6 @@ static int cpu_smpboot_alloc(unsigned int cpu) nodeid_t node = cpu_to_node(cpu); struct desc_struct *gdt; unsigned long stub_page; - unsigned int socket = cpu_to_socket(cpu); if ( node != NUMA_NO_NODE ) memflags = MEMF_node(node); @@ -748,8 +758,8 @@ static int cpu_smpboot_alloc(unsigned int cpu) goto oom; per_cpu(stubs.addr, cpu) = stub_page + STUB_BUF_CPU_OFFS(cpu); - if ( !socket_cpumask[socket] && - !zalloc_cpumask_var(socket_cpumask + socket) ) + if ( secondary_socket_cpumask == NULL && + (secondary_socket_cpumask = xzalloc(cpumask_t)) == NULL ) goto oom; if ( zalloc_cpumask_var(&per_cpu(cpu_sibling_mask, cpu)) && @@ -804,8 +814,9 @@ void __init smp_prepare_cpus(unsigned int max_cpus) set_nr_sockets(); - socket_cpumask = xzalloc_array(cpumask_var_t, nr_sockets); - if ( !socket_cpumask || !zalloc_cpumask_var(socket_cpumask) ) + socket_cpumask = xzalloc_array(cpumask_t *, nr_sockets); + if ( socket_cpumask == NULL || + (socket_cpumask[cpu_to_socket(0)] = xzalloc(cpumask_t)) == NULL ) panic("No memory for socket CPU siblings map"); if ( !zalloc_cpumask_var(&per_cpu(cpu_sibling_mask, 0)) || diff --git a/xen/include/asm-x86/smp.h b/xen/include/asm-x86/smp.h index e594062..ea07888 100644 --- a/xen/include/asm-x86/smp.h +++ b/xen/include/asm-x86/smp.h @@ -67,7 +67,7 @@ extern unsigned int nr_sockets; void set_nr_sockets(void); /* Representing HT and core siblings in each socket. */ -extern cpumask_var_t *socket_cpumask; +extern cpumask_t **socket_cpumask; #endif /* !__ASSEMBLY__ */ -- 1.9.1 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v3] x86: correct socket_cpumask allocation 2015-07-09 14:23 [PATCH v3] x86: correct socket_cpumask allocation Chao Peng @ 2015-07-09 15:16 ` Andrew Cooper 2015-07-09 15:36 ` Boris Ostrovsky 2015-07-10 14:29 ` Dario Faggioli 1 sibling, 1 reply; 11+ messages in thread From: Andrew Cooper @ 2015-07-09 15:16 UTC (permalink / raw) To: Chao Peng, xen-devel; +Cc: dario.faggioli, keir, boris.ostrovsky, JBeulich On 09/07/15 15:23, Chao Peng wrote: > For booting cpu, the socket number is not needed to be 0 so > it needs to be computed by cpu number. > > For secondary cpu, phys_proc_id is not valid in CPU_PREPARE > notifier(cpu_smpboot_alloc), so cpu_to_socket(cpu) can't be used. > Instead, pre-allocate secondary_cpu_mask in cpu_smpboot_alloc() > and later consume it in smp_store_cpu_info(). > > This patch also change socket_cpumask type from 'cpumask_var_t *' > to 'cpumask_t **' so that smaller NR_CPUS works. > > Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> > Tested-by: Dario Faggioli <dario.faggioli@citrix.com> > Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> The v2 of this path seems to be holding up fine against the XenServer test pool. No failures to boot encountered so far. Tested-by: Andrew Cooper <andrew.cooper3@citrix.com> As the differences between v2 and v3 appear largely mechanical and not functional in nature, this Tested-by can probably still stand. ~Andrew ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3] x86: correct socket_cpumask allocation 2015-07-09 15:16 ` Andrew Cooper @ 2015-07-09 15:36 ` Boris Ostrovsky 0 siblings, 0 replies; 11+ messages in thread From: Boris Ostrovsky @ 2015-07-09 15:36 UTC (permalink / raw) To: Andrew Cooper, Chao Peng, xen-devel; +Cc: dario.faggioli, keir, JBeulich On 07/09/2015 11:16 AM, Andrew Cooper wrote: > On 09/07/15 15:23, Chao Peng wrote: >> For booting cpu, the socket number is not needed to be 0 so >> it needs to be computed by cpu number. >> >> For secondary cpu, phys_proc_id is not valid in CPU_PREPARE >> notifier(cpu_smpboot_alloc), so cpu_to_socket(cpu) can't be used. >> Instead, pre-allocate secondary_cpu_mask in cpu_smpboot_alloc() >> and later consume it in smp_store_cpu_info(). >> >> This patch also change socket_cpumask type from 'cpumask_var_t *' >> to 'cpumask_t **' so that smaller NR_CPUS works. >> >> Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> >> Tested-by: Dario Faggioli <dario.faggioli@citrix.com> >> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> > The v2 of this path seems to be holding up fine against the XenServer > test pool. No failures to boot encountered so far. > > Tested-by: Andrew Cooper <andrew.cooper3@citrix.com> > > As the differences between v2 and v3 appear largely mechanical and not > functional in nature, this Tested-by can probably still stand. Works for me as well. (You can add 'Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>' although I think by now there are enough of those) -boris ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3] x86: correct socket_cpumask allocation 2015-07-09 14:23 [PATCH v3] x86: correct socket_cpumask allocation Chao Peng 2015-07-09 15:16 ` Andrew Cooper @ 2015-07-10 14:29 ` Dario Faggioli 2015-07-10 14:47 ` Andrew Cooper 1 sibling, 1 reply; 11+ messages in thread From: Dario Faggioli @ 2015-07-10 14:29 UTC (permalink / raw) To: Chao Peng; +Cc: andrew.cooper3, boris.ostrovsky, keir, JBeulich, xen-devel [-- Attachment #1.1: Type: text/plain, Size: 4089 bytes --] On Thu, 2015-07-09 at 22:23 +0800, Chao Peng wrote: > For booting cpu, the socket number is not needed to be 0 so > it needs to be computed by cpu number. > This made my system boot, yes... but now it does not shutdown! :-/ I'm attaching the splat I see, reliably, on my testbox. I'll try have a look at this, but feel free (especially Chao) to preempt me, if you get to the solution quicker. Dario (XEN) ----[ Xen-4.6-unstable x86_64 debug=y Tainted: C ]---- (XEN) CPU: 0 (XEN) RIP: e008:[<ffff82d0801886aa>] cpu_smpboot_free+0x2b/0x255 (XEN) RFLAGS: 0000000000010206 CONTEXT: hypervisor (XEN) rax: ffff83032072a5c0 rbx: 00000000ffffffff rcx: 0000000000000000 (XEN) rdx: ffff82d08031ff00 rsi: 0000000000008008 rdi: 0000000000000001 (XEN) rbp: ffff8300dbaefd40 rsp: ffff8300dbaefd20 r8: ffff830320729df0 (XEN) r9: 00000000003206fd r10: 0000000000000001 r11: 0080000000000000 (XEN) r12: 0000000000000001 r13: ffff82d08029e348 r14: 0000000000008008 (XEN) r15: 0000000000008000 cr0: 000000008005003b cr4: 00000000000026e0 (XEN) cr3: 00000000dba9c000 cr2: ffff830b2072a5b8 (XEN) ds: 002b es: 002b fs: 0000 gs: 0000 ss: e010 cs: e008 (XEN) Xen stack trace from rsp=ffff8300dbaefd20: (XEN) 0000000000008008 0000000000000001 ffff82d08029f140 ffff82d08029e348 (XEN) ffff8300dbaefd70 ffff82d080189225 ffff82d08029f148 ffff82d08029f140 (XEN) ffff82d08029e348 0000000000008008 ffff8300dbaefdc0 ffff82d08011c019 (XEN) 0000000000000000 0000000000000001 ffff8300dbaefdb0 0000000000000000 (XEN) 0000000000000000 0000000000000001 ffff82d080334a88 ffffffffffffffff (XEN) ffff8300dbaefe00 ffff82d08010153e ffff8300dbaefdf0 ffff82d08029e340 (XEN) 0000000052414d44 0000000000000001 0000000000000001 ffff82d08028aca0 (XEN) ffff8300dbaefe30 ffff82d080101744 0000000000000000 0000000000000005 (XEN) ffff82d080334b60 ffff82d080334a88 ffff8300dbaefe80 ffff82d0801a8967 (XEN) ffff8300dbaefe60 ffff82d080165bee ffff82d080334a88 ffff830322da1400 (XEN) ffff8300dbb3b000 ffff82d080334b60 ffff82d080334a88 ffffffffffffffff (XEN) ffff8300dbaefea0 ffff82d080106212 ffff8300dbb3b1d0 0000000000000000 (XEN) ffff8300dbaefec0 ffff82d08012f8ae ffff8300dbaefec0 ffff82d080334b70 (XEN) ffff8300dbaefef0 ffff82d08012fbe4 0000000cdb9d803e ffff8300dbae8000 (XEN) 0000000cdb9d803e ffff8300dbdf4000 ffff8300dbaeff10 ffff82d0801617e0 (XEN) ffff82d08012cb4c ffff8300dbdf4000 ffff8300dbaefe10 00000000001d6000 (XEN) 00000000ffffffed 00000000001d6000 0000000000000000 ffff880012ae3eb0 (XEN) 0000000000000000 0000000000000246 0000000000000040 0000000000000000 (XEN) 00000000000000d2 0000000000000000 ffffffff810013aa 0100000000000000 (XEN) 00000000deadbeef 00000000deadbeef 0000010000000000 ffffffff810013aa (XEN) Xen call trace: (XEN) [<ffff82d0801886aa>] cpu_smpboot_free+0x2b/0x255 (XEN) [<ffff82d080189225>] cpu_smpboot_callback+0x317/0x327 (XEN) [<ffff82d08011c019>] notifier_call_chain+0x67/0x87 (XEN) [<ffff82d08010153e>] cpu_down+0xd9/0x12c (XEN) [<ffff82d080101744>] disable_nonboot_cpus+0x93/0x138 (XEN) [<ffff82d0801a8967>] enter_state_helper+0xbd/0x365 (XEN) [<ffff82d080106212>] continue_hypercall_tasklet_handler+0x4a/0xb1 (XEN) [<ffff82d08012f8ae>] do_tasklet_work+0x78/0xab (XEN) [<ffff82d08012fbe4>] do_tasklet+0x5e/0x8a (XEN) [<ffff82d0801617e0>] idle_loop+0x56/0x6b (XEN) (XEN) Pagetable walk from ffff830b2072a5b8: (XEN) L4[0x106] = 00000000dba9a063 ffffffffffffffff (XEN) L3[0x02c] = 0000000000000000 ffffffffffffffff (XEN) (XEN) **************************************** (XEN) Panic on CPU 0: (XEN) FATAL PAGE FAULT (XEN) [error_code=0000] (XEN) Faulting linear address: ffff830b2072a5b8 (XEN) **************************************** -- <<This happens because I choose it to happen!>> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) [-- Attachment #1.2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 181 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3] x86: correct socket_cpumask allocation 2015-07-10 14:29 ` Dario Faggioli @ 2015-07-10 14:47 ` Andrew Cooper 2015-07-10 14:57 ` Dario Faggioli 0 siblings, 1 reply; 11+ messages in thread From: Andrew Cooper @ 2015-07-10 14:47 UTC (permalink / raw) To: Dario Faggioli, Chao Peng; +Cc: boris.ostrovsky, keir, JBeulich, xen-devel On 10/07/15 15:29, Dario Faggioli wrote: > On Thu, 2015-07-09 at 22:23 +0800, Chao Peng wrote: >> For booting cpu, the socket number is not needed to be 0 so >> it needs to be computed by cpu number. >> > This made my system boot, yes... but now it does not shutdown! :-/ > > I'm attaching the splat I see, reliably, on my testbox. > > I'll try have a look at this, but feel free (especially Chao) to preempt > me, if you get to the solution quicker. > > Dario > > (XEN) ----[ Xen-4.6-unstable x86_64 debug=y Tainted: C ]---- > (XEN) CPU: 0 > (XEN) RIP: e008:[<ffff82d0801886aa>] cpu_smpboot_free+0x2b/0x255 > (XEN) RFLAGS: 0000000000010206 CONTEXT: hypervisor > (XEN) rax: ffff83032072a5c0 rbx: 00000000ffffffff rcx: 0000000000000000 > (XEN) rdx: ffff82d08031ff00 rsi: 0000000000008008 rdi: 0000000000000001 > (XEN) rbp: ffff8300dbaefd40 rsp: ffff8300dbaefd20 r8: ffff830320729df0 > (XEN) r9: 00000000003206fd r10: 0000000000000001 r11: 0080000000000000 > (XEN) r12: 0000000000000001 r13: ffff82d08029e348 r14: 0000000000008008 > (XEN) r15: 0000000000008000 cr0: 000000008005003b cr4: 00000000000026e0 > (XEN) cr3: 00000000dba9c000 cr2: ffff830b2072a5b8 > (XEN) ds: 002b es: 002b fs: 0000 gs: 0000 ss: e010 cs: e008 > (XEN) Xen stack trace from rsp=ffff8300dbaefd20: > (XEN) 0000000000008008 0000000000000001 ffff82d08029f140 ffff82d08029e348 > (XEN) ffff8300dbaefd70 ffff82d080189225 ffff82d08029f148 ffff82d08029f140 > (XEN) ffff82d08029e348 0000000000008008 ffff8300dbaefdc0 ffff82d08011c019 > (XEN) 0000000000000000 0000000000000001 ffff8300dbaefdb0 0000000000000000 > (XEN) 0000000000000000 0000000000000001 ffff82d080334a88 ffffffffffffffff > (XEN) ffff8300dbaefe00 ffff82d08010153e ffff8300dbaefdf0 ffff82d08029e340 > (XEN) 0000000052414d44 0000000000000001 0000000000000001 ffff82d08028aca0 > (XEN) ffff8300dbaefe30 ffff82d080101744 0000000000000000 0000000000000005 > (XEN) ffff82d080334b60 ffff82d080334a88 ffff8300dbaefe80 ffff82d0801a8967 > (XEN) ffff8300dbaefe60 ffff82d080165bee ffff82d080334a88 ffff830322da1400 > (XEN) ffff8300dbb3b000 ffff82d080334b60 ffff82d080334a88 ffffffffffffffff > (XEN) ffff8300dbaefea0 ffff82d080106212 ffff8300dbb3b1d0 0000000000000000 > (XEN) ffff8300dbaefec0 ffff82d08012f8ae ffff8300dbaefec0 ffff82d080334b70 > (XEN) ffff8300dbaefef0 ffff82d08012fbe4 0000000cdb9d803e ffff8300dbae8000 > (XEN) 0000000cdb9d803e ffff8300dbdf4000 ffff8300dbaeff10 ffff82d0801617e0 > (XEN) ffff82d08012cb4c ffff8300dbdf4000 ffff8300dbaefe10 00000000001d6000 > (XEN) 00000000ffffffed 00000000001d6000 0000000000000000 ffff880012ae3eb0 > (XEN) 0000000000000000 0000000000000246 0000000000000040 0000000000000000 > (XEN) 00000000000000d2 0000000000000000 ffffffff810013aa 0100000000000000 > (XEN) 00000000deadbeef 00000000deadbeef 0000010000000000 ffffffff810013aa > (XEN) Xen call trace: > (XEN) [<ffff82d0801886aa>] cpu_smpboot_free+0x2b/0x255 > (XEN) [<ffff82d080189225>] cpu_smpboot_callback+0x317/0x327 > (XEN) [<ffff82d08011c019>] notifier_call_chain+0x67/0x87 > (XEN) [<ffff82d08010153e>] cpu_down+0xd9/0x12c > (XEN) [<ffff82d080101744>] disable_nonboot_cpus+0x93/0x138 > (XEN) [<ffff82d0801a8967>] enter_state_helper+0xbd/0x365 > (XEN) [<ffff82d080106212>] continue_hypercall_tasklet_handler+0x4a/0xb1 > (XEN) [<ffff82d08012f8ae>] do_tasklet_work+0x78/0xab > (XEN) [<ffff82d08012fbe4>] do_tasklet+0x5e/0x8a > (XEN) [<ffff82d0801617e0>] idle_loop+0x56/0x6b > (XEN) > (XEN) Pagetable walk from ffff830b2072a5b8: > (XEN) L4[0x106] = 00000000dba9a063 ffffffffffffffff > (XEN) L3[0x02c] = 0000000000000000 ffffffffffffffff > (XEN) > (XEN) **************************************** > (XEN) Panic on CPU 0: > (XEN) FATAL PAGE FAULT > (XEN) [error_code=0000] > (XEN) Faulting linear address: ffff830b2072a5b8 > (XEN) **************************************** Confirmed - this is also an issue for me. It is only shutdown; reboot appears to be fine. The splat, in case it highlights any further information: Storage is finalized. [ 102.874913] Power down. (XEN) [ 106.477710] Preparing system for ACPI S5 state. (XEN) [ 106.477725] Disabling non-boot CPUs ... (XEN) [ 106.478770] Broke affinity for irq 16 (XEN) [ 106.478780] Broke affinity for irq 17 (XEN) [ 106.478789] Broke affinity for irq 20 (XEN) [ 106.479883] ----[ Xen-4.6.0-xs103036-d x86_64 debug=y Not tainted ]---- (XEN) [ 106.479888] CPU: 0 (XEN) [ 106.479892] RIP: e008:[<ffff82d08018b7b9>] cpu_smpboot_free+0x28/0x24c (XEN) [ 106.479904] RFLAGS: 0000000000010206 CONTEXT: hypervisor (XEN) [ 106.479911] rax: ffff83007fdb9e60 rbx: 00000000ffffffff rcx: 0000000000010001 (XEN) [ 106.479917] rdx: ffff82d080365480 rsi: 0000000000008008 rdi: 0000000000000001 (XEN) [ 106.479922] rbp: ffff83007fd07d60 rsp: ffff83007fd07d40 r8: 0000000000000000 (XEN) [ 106.479927] r9: 0000000000000001 r10: ffff82e000ff5cc0 r11: 0000000000000001 (XEN) [ 106.479932] r12: 0000000000000001 r13: 0000000000008008 r14: ffff82d0802a3ca8 (XEN) [ 106.479937] r15: 0000000000008000 cr0: 000000008005003b cr4: 00000000000026e0 (XEN) [ 106.479942] cr3: 000000007fca1000 cr2: ffff83087fdb9e58 (XEN) [ 106.479947] ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008 (XEN) [ 106.479952] Xen stack trace from rsp=ffff83007fd07d40: (XEN) [ 106.479956] ffff82d0802a3ca8 0000000000000001 ffff82d0802a4820 0000000000008008 (XEN) [ 106.479965] ffff83007fd07d90 ffff82d08018c479 ffff82d0802a4828 ffff82d0802a4820 (XEN) [ 106.479974] 0000000000008008 ffff82d0802a3ca8 ffff83007fd07de0 ffff82d08011ed7a (XEN) [ 106.479983] 0000000000000000 0000000000000001 ffff83007fd07dd0 0000000000000000 (XEN) [ 106.479990] 0000000000000001 0000000000000000 ffff83007fdbb000 ffff83007fae4000 (XEN) [ 106.479999] ffff83007fd07e20 ffff82d08010327e ffff83007fd07e10 ffff82d0802a3ca0 (XEN) [ 106.480007] 0000000000000008 0000000000000001 0000000000000001 ffff82d08028f920 (XEN) [ 106.480015] ffff83007fd07e50 ffff82d080103485 0000000000000000 0000000000000005 (XEN) [ 106.480023] ffff82d08038bfe0 ffff83007fdbb000 ffff83007fd07e80 ffff82d0801abbb9 (XEN) [ 106.480032] ffff8300700b9680 ffff83007fdfc000 ffff82d08038bfe0 ffff83007fdbb000 (XEN) [ 106.480041] ffff83007fd07ea0 ffff82d0801082cf ffff83007fdfc1d0 0000000000000000 (XEN) [ 106.480050] ffff83007fd07ec0 ffff82d08013278d ffff83007fd07ec0 ffff82d08038bff0 (XEN) [ 106.480059] ffff83007fd07ef0 ffff82d080132abc ffff82d08012f992 ffff83007fd00000 (XEN) [ 106.480068] ffff83007fdfc000 00000000ffffffff ffff83007fd07f10 ffff82d08016486b (XEN) [ 106.480077] ffff82d08012f9ea ffff83007fdba000 ffff83007fd07da8 00000000fee1dead (XEN) [ 106.480085] 0000000000000801 0000000000000005 0000000000002801 ffff88002543dd78 (XEN) [ 106.480093] ffffffff81a73208 0000000000000246 666f5f7265776f70 000000000000034b (XEN) [ 106.480102] 0000000000000005 0000000000000000 ffffffff810010ea 0000000000002801 (XEN) [ 106.480110] 0000000000002801 00000000deadbeef 0000010000000000 ffffffff810010ea (XEN) [ 106.480118] 000000000000e033 0000000000000246 ffff88002543dcd0 000000000000e02b (XEN) [ 106.480127] Xen call trace: (XEN) [ 106.480132] [<ffff82d08018b7b9>] cpu_smpboot_free+0x28/0x24c (XEN) [ 106.480139] [<ffff82d08018c479>] cpu_smpboot_callback+0x424/0x444 (XEN) [ 106.480146] [<ffff82d08011ed7a>] notifier_call_chain+0x6a/0x90 (XEN) [ 106.480151] [<ffff82d08010327e>] cpu_down+0xc9/0x11d (XEN) [ 106.480157] [<ffff82d080103485>] disable_nonboot_cpus+0x91/0x13e (XEN) [ 106.480163] [<ffff82d0801abbb9>] enter_state_helper+0xb7/0x376 (XEN) [ 106.480170] [<ffff82d0801082cf>] continue_hypercall_tasklet_handler+0x4a/0xb1 (XEN) [ 106.480176] [<ffff82d08013278d>] do_tasklet_work+0x78/0xab (XEN) [ 106.480181] [<ffff82d080132abc>] do_tasklet+0x5e/0x8a (XEN) [ 106.480187] [<ffff82d08016486b>] idle_loop+0x56/0x70 (XEN) [ 106.480191] (XEN) [ 106.480195] Pagetable walk from ffff83087fdb9e58: (XEN) [ 106.480200] L4[0x106] = 000000007fc9f063 ffffffffffffffff (XEN) [ 106.480205] L3[0x021] = 0000000000000000 ffffffffffffffff (XEN) [ 106.840985] (XEN) [ 106.842975] **************************************** (XEN) [ 106.848426] Panic on CPU 0: (XEN) [ 106.851716] FATAL PAGE FAULT (XEN) [ 106.855090] [error_code=0000] (XEN) [ 106.858554] Faulting linear address: ffff83087fdb9e58 (XEN) [ 106.864090] **************************************** (XEN) [ 106.869544] (XEN) [ 106.871534] Reboot in five seconds... ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3] x86: correct socket_cpumask allocation 2015-07-10 14:47 ` Andrew Cooper @ 2015-07-10 14:57 ` Dario Faggioli 2015-07-10 15:13 ` Jan Beulich 0 siblings, 1 reply; 11+ messages in thread From: Dario Faggioli @ 2015-07-10 14:57 UTC (permalink / raw) To: Andrew Cooper; +Cc: Chao Peng, boris.ostrovsky, keir, JBeulich, xen-devel [-- Attachment #1.1: Type: text/plain, Size: 2384 bytes --] On Fri, 2015-07-10 at 15:47 +0100, Andrew Cooper wrote: > On 10/07/15 15:29, Dario Faggioli wrote: > > (XEN) Xen call trace: > > (XEN) [<ffff82d0801886aa>] cpu_smpboot_free+0x2b/0x255 > > (XEN) [<ffff82d080189225>] cpu_smpboot_callback+0x317/0x327 > > (XEN) [<ffff82d08011c019>] notifier_call_chain+0x67/0x87 > > (XEN) [<ffff82d08010153e>] cpu_down+0xd9/0x12c > > (XEN) [<ffff82d080101744>] disable_nonboot_cpus+0x93/0x138 > > (XEN) [<ffff82d0801a8967>] enter_state_helper+0xbd/0x365 > > (XEN) [<ffff82d080106212>] continue_hypercall_tasklet_handler+0x4a/0xb1 > > (XEN) [<ffff82d08012f8ae>] do_tasklet_work+0x78/0xab > > (XEN) [<ffff82d08012fbe4>] do_tasklet+0x5e/0x8a > > (XEN) [<ffff82d0801617e0>] idle_loop+0x56/0x6b > > (XEN) > > (XEN) Pagetable walk from ffff830b2072a5b8: > > (XEN) L4[0x106] = 00000000dba9a063 ffffffffffffffff > > (XEN) L3[0x02c] = 0000000000000000 ffffffffffffffff > > (XEN) > > (XEN) **************************************** > > (XEN) Panic on CPU 0: > > (XEN) FATAL PAGE FAULT > > (XEN) [error_code=0000] > > (XEN) Faulting linear address: ffff830b2072a5b8 > > (XEN) **************************************** > > Confirmed - this is also an issue for me. It is only shutdown; reboot > appears to be fine. > I added this printk: printk(" cpu=%u cpu_to_socket=%u\n", cpu, socket); Right before this chunk of code, in cpu_smpboot_free(): if ( cpumask_empty(socket_cpumask[socket]) ) { xfree(socket_cpumask[socket]); socket_cpumask[socket] = NULL; } And it says this: ... (XEN) Preparing system for ACPI S5 state. (XEN) Disabling non-boot CPUs ... (XEN) Broke affinity for irq 9 (XEN) cpu=1 cpu_to_socket=4294967295 (XEN) ----[ Xen-4.6-unstable x86_64 debug=y Tainted: C ]---- (XEN) CPU: 0 (XEN) RIP: e008:[<ffff82d0801886c2>] cpu_smpboot_free+0x43/0x28b ... I.e., it looks like phys_proc_id has already been reset to XEN_INVALID_SOCKET_ID, as we're kind-of racing with remove_siblinginfo(). I'll keep looking, but not for too long... Chao, ping? Dario -- <<This happens because I choose it to happen!>> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) [-- Attachment #1.2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 181 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3] x86: correct socket_cpumask allocation 2015-07-10 14:57 ` Dario Faggioli @ 2015-07-10 15:13 ` Jan Beulich 2015-07-10 15:25 ` Jan Beulich 2015-07-10 15:33 ` Dario Faggioli 0 siblings, 2 replies; 11+ messages in thread From: Jan Beulich @ 2015-07-10 15:13 UTC (permalink / raw) To: Dario Faggioli, Chao Peng; +Cc: Andrew Cooper, boris.ostrovsky, keir, xen-devel >>> On 10.07.15 at 16:57, <dario.faggioli@citrix.com> wrote: > I added this printk: > > printk(" cpu=%u cpu_to_socket=%u\n", cpu, socket); > > Right before this chunk of code, in cpu_smpboot_free(): > > if ( cpumask_empty(socket_cpumask[socket]) ) > { > xfree(socket_cpumask[socket]); > socket_cpumask[socket] = NULL; > } > > And it says this: > > ... > (XEN) Preparing system for ACPI S5 state. > (XEN) Disabling non-boot CPUs ... > (XEN) Broke affinity for irq 9 > (XEN) cpu=1 cpu_to_socket=4294967295 > (XEN) ----[ Xen-4.6-unstable x86_64 debug=y Tainted: C ]---- > (XEN) CPU: 0 > (XEN) RIP: e008:[<ffff82d0801886c2>] cpu_smpboot_free+0x43/0x28b > ... > > I.e., it looks like phys_proc_id has already been reset to > XEN_INVALID_SOCKET_ID, as we're kind-of racing with > remove_siblinginfo(). Right. We have cpu_down() stop_machine_run(take_cpu_down, ...) notifier_call_chain(&cpu_chain, CPU_DYING, ...) __cpu_disable() remove_siblinginfo() __cpu_die() notifier_call_chain(&cpu_chain, CPU_DEAD, ...) cpu_smpboot_free() I.e. a clear use-after-invalidate. Jan ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3] x86: correct socket_cpumask allocation 2015-07-10 15:13 ` Jan Beulich @ 2015-07-10 15:25 ` Jan Beulich 2015-07-10 16:03 ` Dario Faggioli 2015-07-10 15:33 ` Dario Faggioli 1 sibling, 1 reply; 11+ messages in thread From: Jan Beulich @ 2015-07-10 15:25 UTC (permalink / raw) To: Dario Faggioli, Chao Peng; +Cc: Andrew Cooper, boris.ostrovsky, keir, xen-devel >>> On 10.07.15 at 17:13, <JBeulich@suse.com> wrote: >>>> On 10.07.15 at 16:57, <dario.faggioli@citrix.com> wrote: >> I added this printk: >> >> printk(" cpu=%u cpu_to_socket=%u\n", cpu, socket); >> >> Right before this chunk of code, in cpu_smpboot_free(): >> >> if ( cpumask_empty(socket_cpumask[socket]) ) >> { >> xfree(socket_cpumask[socket]); >> socket_cpumask[socket] = NULL; >> } >> >> And it says this: >> >> ... >> (XEN) Preparing system for ACPI S5 state. >> (XEN) Disabling non-boot CPUs ... >> (XEN) Broke affinity for irq 9 >> (XEN) cpu=1 cpu_to_socket=4294967295 >> (XEN) ----[ Xen-4.6-unstable x86_64 debug=y Tainted: C ]---- >> (XEN) CPU: 0 >> (XEN) RIP: e008:[<ffff82d0801886c2>] cpu_smpboot_free+0x43/0x28b >> ... >> >> I.e., it looks like phys_proc_id has already been reset to >> XEN_INVALID_SOCKET_ID, as we're kind-of racing with >> remove_siblinginfo(). > > Right. We have > > cpu_down() > stop_machine_run(take_cpu_down, ...) > notifier_call_chain(&cpu_chain, CPU_DYING, ...) > __cpu_disable() > remove_siblinginfo() > __cpu_die() > notifier_call_chain(&cpu_chain, CPU_DEAD, ...) > cpu_smpboot_free() > > I.e. a clear use-after-invalidate. And I can't see a reason why we shouldn't be able to defer invoking remove_siblinginfo() until cpu_smpboot_free(): Other than its counterpart (set_cpu_sibling_map()) it doesn't require to be run on the subject CPU (that function itself doesn't appear to depend on that either, but it depends on identify_cpu() having run). Which would at once allow reducing code: The clearing of cpu_{core,sibling}_mask then becomes redundant with the freeing of these masks. Of course there may be hidden dependencies, so maybe a safer approach would be to just move the zapping of the three IDs (and maybe the clearing of the CPU's cpu_sibling_setup_map bit) into cpu_smpboot_free(). Jan ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3] x86: correct socket_cpumask allocation 2015-07-10 15:25 ` Jan Beulich @ 2015-07-10 16:03 ` Dario Faggioli 2015-07-13 3:19 ` Chao Peng 0 siblings, 1 reply; 11+ messages in thread From: Dario Faggioli @ 2015-07-10 16:03 UTC (permalink / raw) To: Jan Beulich; +Cc: Chao Peng, boris.ostrovsky, xen-devel, keir, Andrew Cooper [-- Attachment #1.1: Type: text/plain, Size: 4612 bytes --] On Fri, 2015-07-10 at 16:25 +0100, Jan Beulich wrote: > >>> On 10.07.15 at 17:13, <JBeulich@suse.com> wrote: > > cpu_down() > > stop_machine_run(take_cpu_down, ...) > > notifier_call_chain(&cpu_chain, CPU_DYING, ...) > > __cpu_disable() > > remove_siblinginfo() > > __cpu_die() > > notifier_call_chain(&cpu_chain, CPU_DEAD, ...) > > cpu_smpboot_free() > > > > I.e. a clear use-after-invalidate. > > And I can't see a reason why we shouldn't be able to defer invoking > remove_siblinginfo() until cpu_smpboot_free(): Other than its > counterpart (set_cpu_sibling_map()) it doesn't require to be run on > the subject CPU (that function itself doesn't appear to depend on > that either, but it depends on identify_cpu() having run). Which > would at once allow reducing code: The clearing of > cpu_{core,sibling}_mask then becomes redundant with the freeing > of these masks. > > Of course there may be hidden dependencies, so maybe a safer > approach would be to just move the zapping of the three IDs > (and maybe the clearing of the CPU's cpu_sibling_setup_map bit) > into cpu_smpboot_free(). > cpu_smpboot_free FWIW, I've tried the patch below (call remove_siblinginfo() from cpu_smpboot_free()), and it works for me. I've tested both shutdown and ACPI S3 suspend. I've got to go now, so I guess I'm leaving it to Chao whether to pick it up, or go with the other approach Jan suggested (or something else). Dario --- diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index 0f03364..9fb027c 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -663,6 +663,30 @@ void cpu_exit_clear(unsigned int cpu) set_cpu_state(CPU_STATE_DEAD); } +static void +remove_siblinginfo(int cpu) +{ + int sibling; + struct cpuinfo_x86 *c = cpu_data; + + cpumask_clear_cpu(cpu, socket_cpumask[cpu_to_socket(cpu)]); + + for_each_cpu ( sibling, per_cpu(cpu_core_mask, cpu) ) + { + cpumask_clear_cpu(cpu, per_cpu(cpu_core_mask, sibling)); + /* Last thread sibling in this cpu core going down. */ + if ( cpumask_weight(per_cpu(cpu_sibling_mask, cpu)) == 1 ) + c[sibling].booted_cores--; + } +cpu_smpboot_free + for_each_cpu(sibling, per_cpu(cpu_sibling_mask, cpu)) + cpumask_clear_cpu(cpu, per_cpu(cpu_sibling_mask, sibling)); + c[cpu].phys_proc_id = XEN_INVALID_SOCKET_ID; + c[cpu].cpu_core_id = XEN_INVALID_CORE_ID; + c[cpu].compute_unit_id = INVALID_CUID; + cpumask_clear_cpu(cpu, &cpu_sibling_setup_map); +} + static void cpu_smpboot_free(unsigned int cpu) { unsigned int order, socket = cpu_to_socket(cpu); @@ -673,6 +697,8 @@ static void cpu_smpboot_free(unsigned int cpu) socket_cpumask[socket] = NULL; } + remove_siblinginfo(cpu); + free_cpumask_var(per_cpu(cpu_sibling_mask, cpu)); free_cpumask_var(per_cpu(cpu_core_mask, cpu)); @@ -878,32 +904,6 @@ void __init smp_prepare_boot_cpu(void) cpumask_set_cpu(smp_processor_id(), &cpu_present_map); } -static void -remove_siblinginfo(int cpu) -{ - int sibling; - struct cpuinfo_x86 *c = cpu_data; - - cpumask_clear_cpu(cpu, socket_cpumask[cpu_to_socket(cpu)]); - - for_each_cpu ( sibling, per_cpu(cpu_core_mask, cpu) ) - { - cpumask_clear_cpu(cpu, per_cpu(cpu_core_mask, sibling)); - /* Last thread sibling in this cpu core going down. */ - if ( cpumask_weight(per_cpu(cpu_sibling_mask, cpu)) == 1 ) - c[sibling].booted_cores--; - } - - for_each_cpu(sibling, per_cpu(cpu_sibling_mask, cpu)) - cpumask_clear_cpu(cpu, per_cpu(cpu_sibling_mask, sibling)); - cpumask_clear(per_cpu(cpu_sibling_mask, cpu)); - cpumask_clear(per_cpu(cpu_core_mask, cpu)); - c[cpu].phys_proc_id = XEN_INVALID_SOCKET_ID; - c[cpu].cpu_core_id = XEN_INVALID_CORE_ID; - c[cpu].compute_unit_id = INVALID_CUID; - cpumask_clear_cpu(cpu, &cpu_sibling_setup_map); -} - void __cpu_disable(void) { int cpu = smp_processor_id(); @@ -919,8 +919,6 @@ void __cpu_disable(void) time_suspend(); - remove_siblinginfo(cpu); - /* It's now safe to remove this processor from the online map */ cpumask_clear_cpu(cpu, &cpu_online_map); fixup_irqs(); -- <<This happens because I choose it to happen!>> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) [-- Attachment #1.2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 181 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v3] x86: correct socket_cpumask allocation 2015-07-10 16:03 ` Dario Faggioli @ 2015-07-13 3:19 ` Chao Peng 0 siblings, 0 replies; 11+ messages in thread From: Chao Peng @ 2015-07-13 3:19 UTC (permalink / raw) To: Dario Faggioli Cc: Andrew Cooper, boris.ostrovsky, keir, Jan Beulich, xen-devel On Fri, Jul 10, 2015 at 06:03:54PM +0200, Dario Faggioli wrote: > On Fri, 2015-07-10 at 16:25 +0100, Jan Beulich wrote: > > >>> On 10.07.15 at 17:13, <JBeulich@suse.com> wrote: > > > > cpu_down() > > > stop_machine_run(take_cpu_down, ...) > > > notifier_call_chain(&cpu_chain, CPU_DYING, ...) > > > __cpu_disable() > > > remove_siblinginfo() > > > __cpu_die() > > > notifier_call_chain(&cpu_chain, CPU_DEAD, ...) > > > cpu_smpboot_free() > > > > > > I.e. a clear use-after-invalidate. > > > > And I can't see a reason why we shouldn't be able to defer invoking > > remove_siblinginfo() until cpu_smpboot_free(): Other than its > > counterpart (set_cpu_sibling_map()) it doesn't require to be run on > > the subject CPU (that function itself doesn't appear to depend on > > that either, but it depends on identify_cpu() having run). Which > > would at once allow reducing code: The clearing of > > cpu_{core,sibling}_mask then becomes redundant with the freeing > > of these masks. > > > > Of course there may be hidden dependencies, so maybe a safer > > approach would be to just move the zapping of the three IDs > > (and maybe the clearing of the CPU's cpu_sibling_setup_map bit) > > into cpu_smpboot_free(). > > cpu_smpboot_free > FWIW, I've tried the patch below (call remove_siblinginfo() from > cpu_smpboot_free()), and it works for me. > > I've tested both shutdown and ACPI S3 suspend. > > I've got to go now, so I guess I'm leaving it to Chao whether to pick it > up, or go with the other approach Jan suggested (or something else). I have no problem if Jan agreed and perhaps especially the OSS test can pass that (as it already looks tricky for this part of code). I can pick this change and send a patch, with the additional change of moving "cpumask_clear_cpu(cpu, socket_cpumask[cpu_to_socket(cpu)])" from remove_siblinginfo() to cpu_smpboot_free(). Chao ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3] x86: correct socket_cpumask allocation 2015-07-10 15:13 ` Jan Beulich 2015-07-10 15:25 ` Jan Beulich @ 2015-07-10 15:33 ` Dario Faggioli 1 sibling, 0 replies; 11+ messages in thread From: Dario Faggioli @ 2015-07-10 15:33 UTC (permalink / raw) To: Jan Beulich; +Cc: Chao Peng, boris.ostrovsky, xen-devel, keir, Andrew Cooper [-- Attachment #1.1: Type: text/plain, Size: 1519 bytes --] On Fri, 2015-07-10 at 16:13 +0100, Jan Beulich wrote: > >>> On 10.07.15 at 16:57, <dario.faggioli@citrix.com> wrote: > > ... > > (XEN) Preparing system for ACPI S5 state. > > (XEN) Disabling non-boot CPUs ... > > (XEN) Broke affinity for irq 9 > > (XEN) cpu=1 cpu_to_socket=4294967295 > > (XEN) ----[ Xen-4.6-unstable x86_64 debug=y Tainted: C ]---- > > (XEN) CPU: 0 > > (XEN) RIP: e008:[<ffff82d0801886c2>] cpu_smpboot_free+0x43/0x28b > > ... > > > > I.e., it looks like phys_proc_id has already been reset to > > XEN_INVALID_SOCKET_ID, as we're kind-of racing with > > remove_siblinginfo(). > > Right. We have > > cpu_down() > stop_machine_run(take_cpu_down, ...) > notifier_call_chain(&cpu_chain, CPU_DYING, ...) > __cpu_disable() > remove_siblinginfo() > __cpu_die() > notifier_call_chain(&cpu_chain, CPU_DEAD, ...) > cpu_smpboot_free() > > I.e. a clear use-after-invalidate. > Exactly. I don't have a box with CAT, but on one, I expect similar problems to happen in: psr_cpu_fini() cat_cpu_fini() --> unsigned int socket = cpu_to_socket(cpu); as that also runs from CPU_DEAD. :-/ I guess this haven't seen any "let's shut the host down" kind of testing... Dario -- <<This happens because I choose it to happen!>> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) [-- Attachment #1.2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 181 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2015-07-13 3:19 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-07-09 14:23 [PATCH v3] x86: correct socket_cpumask allocation Chao Peng 2015-07-09 15:16 ` Andrew Cooper 2015-07-09 15:36 ` Boris Ostrovsky 2015-07-10 14:29 ` Dario Faggioli 2015-07-10 14:47 ` Andrew Cooper 2015-07-10 14:57 ` Dario Faggioli 2015-07-10 15:13 ` Jan Beulich 2015-07-10 15:25 ` Jan Beulich 2015-07-10 16:03 ` Dario Faggioli 2015-07-13 3:19 ` Chao Peng 2015-07-10 15:33 ` Dario Faggioli
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).