From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
To: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>,
	Wei Liu <wei.liu2@citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	David Vrabel <david.vrabel@citrix.com>,
	Jan Beulich <JBeulich@suse.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Subject: Re: PV-vNUMA issue: topology is misinterpreted by the guest
Date: Mon, 20 Jul 2015 10:43:03 -0400
Message-ID: <55AD08F7.7020105@oracle.com>
In-Reply-To: <1437401354.5036.19.camel@citrix.com>

On 07/20/2015 10:09 AM, Dario Faggioli wrote:
> On Fri, 2015-07-17 at 14:17 -0400, Boris Ostrovsky wrote:
>> On 07/17/2015 03:27 AM, Dario Faggioli wrote:
>>> In the meanwhile, what should we do? Document this? How? "don't use
>>> vNUMA with PV guest in SMT enabled systems" seems a bit harsh... Is
>>> there a workaround we can put in place/suggest?
>> I haven't been able to reproduce this on my Intel box because I think I
>> have different core enumeration.
>>
> Yes, most likely, that's highly topology dependent. :-(
>
>> Can you try adding
>>     cpuid=['0x1:ebx=xxxxxxxx00000001xxxxxxxxxxxxxxxx']
>> to your config file?
>>
> Done (sorry for the delay, the testbox was busy doing other stuff).
>
> Still no joy (.101 is the IP address of the guest, domain id 3):
>
> root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
> root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
> root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
> root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
> root@Zhaman:~# xl vcpu-list 3
> Name                                ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
> test                                 3     0    4   r--      23.6  all / 0-7
> test                                 3     1    9   r--      19.8  all / 0-7
> test                                 3     2    8   -b-       0.4  all / 8-15
> test                                 3     3    4   -b-       0.2  all / 8-15
>
> *HOWEVER* it seems to have an effect. In fact, now, topology as it is
> shown in /sys/... is different:
>
> root@test:~# cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
> 0
> (it was 0-1)
>
> This, OTOH, is still the same:
> root@test:~# cat /sys/devices/system/cpu/cpu0/topology/core_siblings_list
> 0-3
>
> Also, I now see this:
>
> [    0.150560] ------------[ cut here ]------------
> [    0.150560] WARNING: CPU: 2 PID: 0 at ../arch/x86/kernel/smpboot.c:317 topology_sane.isra.2+0x74/0x88()
> [    0.150560] sched: CPU #2's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
> [    0.150560] Modules linked in:
> [    0.150560] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.19.0+ #1
> [    0.150560]  0000000000000009 ffff88001ee2fdd0 ffffffff81657c7b ffffffff810bbd2c
> [    0.150560]  ffff88001ee2fe20 ffff88001ee2fe10 ffffffff81081510 ffff88001ee2fea0
> [    0.150560]  ffffffff8103aa02 ffff88003ea0a001 0000000000000000 ffff88001f20a040
> [    0.150560] Call Trace:
> [    0.150560]  [<ffffffff81657c7b>] dump_stack+0x4f/0x7b
> [    0.150560]  [<ffffffff810bbd2c>] ? up+0x39/0x3e
> [    0.150560]  [<ffffffff81081510>] warn_slowpath_common+0xa1/0xbb
> [    0.150560]  [<ffffffff8103aa02>] ? topology_sane.isra.2+0x74/0x88
> [    0.150560]  [<ffffffff81081570>] warn_slowpath_fmt+0x46/0x48
> [    0.150560]  [<ffffffff8101eeb1>] ? __cpuid.constprop.0+0x15/0x19
> [    0.150560]  [<ffffffff8103aa02>] topology_sane.isra.2+0x74/0x88
> [    0.150560]  [<ffffffff8103acd0>] set_cpu_sibling_map+0x27a/0x444
> [    0.150560]  [<ffffffff81056ac3>] ? numa_add_cpu+0x98/0x9f
> [    0.150560]  [<ffffffff8100b8f2>] cpu_bringup+0x63/0xa8
> [    0.150560]  [<ffffffff8100b945>] cpu_bringup_and_idle+0xe/0x1a
> [    0.150560] ---[ end trace 63d204896cce9f68 ]---
>
> Notice that it now says 'llc-sibling', while, before, it was saying
> 'smt-sibling'.

Exactly. You are now passing the first topology test, which checks that
sibling threads are on the same node. Since each processor now has only
one thread (as evidenced by thread_siblings_list), that test is satisfied.
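
To illustrate why the mask has this effect, here is a minimal sketch (not
libxl's actual code). CPUID leaf 1 reports the logical processor count per
package in EBX bits 23:16, and the mask string is read most significant bit
first, so the 00000001 group lands on exactly that field. The function names
and the host EBX value are made up for the example, and 'x' is simplified to
"keep whatever value would otherwise be supplied"; only '0', '1' and 'x' are
handled.

```python
def apply_cpuid_mask(default_value: int, mask: str) -> int:
    """Apply a 32-character bit mask (bit 31 first) to a register value.

    '0' forces the bit clear, '1' forces it set, and 'x' keeps the bit
    from default_value (a simplification made for this sketch).
    """
    if len(mask) != 32:
        raise ValueError("mask must describe exactly 32 bits")
    result = 0
    for index, char in enumerate(mask):
        bit = 31 - index  # the first character describes bit 31
        if char == "1":
            result |= 1 << bit
        elif char == "x":
            result |= default_value & (1 << bit)
        elif char != "0":
            raise ValueError(f"unsupported mask character: {char!r}")
    return result


def logical_processor_count(ebx: int) -> int:
    """Extract CPUID.1:EBX[23:16], the logical processor count field."""
    return (ebx >> 16) & 0xFF


MASK = "xxxxxxxx00000001xxxxxxxxxxxxxxxx"  # mask string from this thread
host_ebx = 0x05100800                      # hypothetical value, count = 16
guest_ebx = apply_cpuid_mask(host_ebx, MASK)

print(hex(guest_ebx), logical_processor_count(guest_ebx))
```

With the count forced to 1, the guest has no reason to group two vCPUs as
threads of the same core, which is consistent with thread_siblings_list
changing from 0-1 to 0.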

The second test checks that cores (i.e. CPUs that share the last-level
cache) are on the same node, and they are not.
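
Roughly, the two tests can be modelled as below. This is only a sketch of
the idea, not the kernel's code in smpboot.c: the Cpu class, its fields and
check_topology() are invented for the example. The node numbers for vCPUs 0
and 2 come from the warning above, while those for vCPUs 1 and 3 are assumed
from the soft affinities. The real kernel also warns only once, whereas this
sketch reports every offending pair.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    cpu_id: int
    node: int     # NUMA node the guest believes this CPU belongs to
    core_id: int  # hypothetical: equal core_id means SMT siblings
    llc_id: int   # hypothetical: equal llc_id means a shared last-level cache


def check_topology(cpus):
    """Return (kind, cpu, sibling) for each sibling pair that spans nodes."""
    warnings = []
    for index, cpu in enumerate(cpus):
        # Compare the CPU being brought up against those already online.
        for other in cpus[:index]:
            if cpu.node == other.node:
                continue
            if cpu.core_id == other.core_id:
                warnings.append(("smt", cpu.cpu_id, other.cpu_id))
            elif cpu.llc_id == other.llc_id:
                warnings.append(("llc", cpu.cpu_id, other.cpu_id))
    return warnings


# Guest view after the cpuid mask: one thread per core, all four vCPUs
# reported as sharing one last-level cache, spread over two virtual nodes.
guest = [
    Cpu(cpu_id=0, node=0, core_id=0, llc_id=0),
    Cpu(cpu_id=1, node=0, core_id=1, llc_id=0),
    Cpu(cpu_id=2, node=1, core_id=2, llc_id=0),
    Cpu(cpu_id=3, node=1, core_id=3, llc_id=0),
]

for kind, cpu, sibling in check_topology(guest):
    print(f"CPU #{cpu}'s {kind}-sibling CPU #{sibling} is not on the same node")
```

The first pair reported is CPU #2 against CPU #0 as llc-siblings, matching
the warning in the trace; no smt pair is reported because every vCPU now has
its own core.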


>
>> On AMD, BTW, we fail a different test so some other bits probably need
>> to be tweaked. You may fail it too (the LLC sanity check).
>>
> Yep, that's the one I guess. Should I try something more/else?


I'll need to see how LLC IDs are calculated; they probably also come from
some CPUID bits. The question then will be how we present cache sizes (and
TLB sizes, for that matter) to guests. Do we scale them down per thread?

-boris
