From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
To: Dario Faggioli <dario.faggioli@citrix.com>,
	Jan Beulich <JBeulich@suse.com>
Cc: Chao Peng <chao.p.peng@linux.intel.com>,
	xen-devel@lists.xen.org, keir@xen.org, andrew.cooper3@citrix.com
Subject: Re: [PATCH] x86: correct socket_cpumask allocation for AP
Date: Wed, 08 Jul 2015 14:24:34 -0400
Message-ID: <559D6AE2.7040506@oracle.com>
In-Reply-To: <1436372256.22672.246.camel@citrix.com>

On 07/08/2015 12:17 PM, Dario Faggioli wrote:
> On Wed, 2015-07-08 at 16:38 +0100, Jan Beulich wrote:
>>>>> On 08.07.15 at 17:11, <dario.faggioli@citrix.com> wrote:
>>> On Wed, 2015-07-08 at 13:38 +0100, Jan Beulich wrote:
>>>>>>> On 08.07.15 at 11:36, <chao.p.peng@linux.intel.com> wrote:
>>>>> @@ -84,11 +85,21 @@ void *stack_base[NR_CPUS];
>>>>>   static void smp_store_cpu_info(int id)
>>>>>   {
>>>>>       struct cpuinfo_x86 *c = cpu_data + id;
>>>>> +    unsigned int socket;
>>>>>   
>>>>>       *c = boot_cpu_data;
>>>>>       if ( id != 0 )
>>>>> +    {
>>>>>           identify_cpu(c);
>>>>>   
>>>>> +        socket = cpu_to_socket(id);
>>>>> +        if ( !socket_cpumask[socket] )
>>>>> +        {
>>>>> +            socket_cpumask[socket] = secondary_socket_cpumask;
>>>>> +            secondary_socket_cpumask = NULL;
>>>> I don't think this will build with small enough NR_CPUS.
>>>>
>>> And it *does* *not* fix the issue on my box.
>> I.e. bad analysis (albeit it seemed correct to me)
>>
> Same here, and in fact I triple checked that I had the patch really
> applied... and, yes, it is, and it's still crashing, with the same
> (reported) dump as the one we find in Osstest's failure, as reported by
> Ian.
>
>>   _and_ new code not tested.
>>
> Looking another time, both me and Osstest are probably seeing a
> different issue, than the one Boris is facing. I don't see Boris' Oops,
> so I can't be sure, but in my case, this is happening in
> set_cpu_sibling_map(), called from smp_prepare_cpus() on the boot CPU,
> not during secondary CPUs bringup.

I see it from start_secondary():

...
(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
(XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Tainted:    C ]----
(XEN) CPU:    16
(XEN) RIP:    e008:[<ffff82d080189051>] set_cpu_sibling_map+0x65/0x37e
(XEN) RFLAGS: 0000000000010087   CONTEXT: hypervisor
(XEN) rax: 0000000000000000   rbx: 0000000000000000   rcx: 0000000000000010
(XEN) rdx: 0000000000000010   rsi: 00000033bc7c8400   rdi: 0000000000000010
(XEN) rbp: ffff83083be0fec0   rsp: ffff83083be0fe60   r8: ffff83083be0fe88
(XEN) r9:  0000000000014000   r10: ffff82cfffdfb0f0   r11: 0000000000000000
(XEN) r12: 0000000000000000   r13: 0000000000000009   r14: 0000000000000010
(XEN) r15: 0000000000000010   cr0: 000000008005003b   cr4: 00000000000426e0
(XEN) cr3: 00000000bd897000   cr2: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen stack trace from rsp=ffff83083be0fe60:
(XEN)    000000103be0fec0 0000000000000046 0000001000000000 0000008400000000
(XEN)    00000000000426e0 000000103caed000 0000000000000009 0000000000000000
(XEN)    0000000000000000 0000000000000009 0000000000000010 0000000000000010
(XEN)    ffff83083be0ff10 ffff82d08018958a 0000000000000000 0000001000000000
(XEN)    0000000000000000 0000000000000001 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000010 ffff8300bdce0000 00000033bc7c8400 0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff82d080189051>] set_cpu_sibling_map+0x65/0x37e
(XEN)    [<ffff82d08018958a>] start_secondary+0x220/0x277
(XEN)
(XEN) Pagetable walk from 0000000000000000:
(XEN)  L4[0x000] = 000000043ffef063 ffffffffffffffff
(XEN)  L3[0x000] = 000000043ffee063 ffffffffffffffff
(XEN)  L2[0x000] = 000000043ffed063 ffffffffffffffff
(XEN)  L1[0x000] = 0000000000000000 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 16:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0002]
(XEN) Faulting linear address: 0000000000000000
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
(XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
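
For reference, error_code=0002 in the panic above can be decoded with the
architectural x86 page-fault error code layout. A minimal sketch in C,
assuming the standard bit assignments; the macro, struct and function names
are made up for illustration and are not Xen's own identifiers:

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page-fault error code bits.
 * Names below are hypothetical, chosen for this example only. */
#define PF_PRESENT (1u << 0) /* 0: page not present, 1: protection violation */
#define PF_WRITE   (1u << 1) /* 0: read access, 1: write access */
#define PF_USER    (1u << 2) /* 0: supervisor mode, 1: user mode */
#define PF_RSVD    (1u << 3) /* 1: reserved bit set in a paging entry */
#define PF_INSN    (1u << 4) /* 1: fault on instruction fetch */

struct pf_info {
    bool present;    /* fault hit a present page (protection violation) */
    bool write;      /* faulting access was a write */
    bool user;       /* access originated in user mode */
    bool rsvd;       /* reserved-bit violation */
    bool insn_fetch; /* faulting access was an instruction fetch */
};

/* Split a raw error code into its individual flags. */
static struct pf_info decode_pf_error_code(uint32_t ec)
{
    struct pf_info info = {
        .present    = (ec & PF_PRESENT) != 0,
        .write      = (ec & PF_WRITE) != 0,
        .user       = (ec & PF_USER) != 0,
        .rsvd       = (ec & PF_RSVD) != 0,
        .insn_fetch = (ec & PF_INSN) != 0,
    };
    return info;
}
```

Calling decode_pf_error_code(0x0002) gives present = false, write = true,
user = false: a supervisor-mode write to a not-present page. That matches
the faulting linear address of 0 and the empty L1[0x000] entry in the
pagetable walk, so the dump is consistent with a store through a NULL
pointer in set_cpu_sibling_map().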

>
> I think it has to do with the fact that I've got CPU #0 on socket #1,
> while Boris' (and perhaps Chao's too) test box have it on socket #0.
>
> I'd be happy to test patches on my box, if that helps (although, I'm
> about to leave right now, so that will be tomorrow).
>
> Regards,
> Dario
