From: Juergen Gross <jgross@suse.com>
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>,
Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>,
Wei Liu <wei.liu2@citrix.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
David Vrabel <david.vrabel@citrix.com>,
Jan Beulich <JBeulich@suse.com>,
"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Subject: Re: PV-vNUMA issue: topology is misinterpreted by the guest
Date: Thu, 23 Jul 2015 06:43:25 +0200
Message-ID: <55B070ED.2040200@suse.com>
In-Reply-To: <55AFAC34.1060606@oracle.com>

On 07/22/2015 04:44 PM, Boris Ostrovsky wrote:
> On 07/22/2015 10:09 AM, Juergen Gross wrote:
>> On 07/22/2015 03:58 PM, Boris Ostrovsky wrote:
>>> On 07/22/2015 09:50 AM, Juergen Gross wrote:
>>>> On 07/22/2015 03:36 PM, Dario Faggioli wrote:
>>>>> On Tue, 2015-07-21 at 16:00 -0400, Boris Ostrovsky wrote:
>>>>>> On 07/20/2015 10:43 AM, Boris Ostrovsky wrote:
>>>>>>> On 07/20/2015 10:09 AM, Dario Faggioli wrote:
>>>>>
>>>>>>> I'll need to see how LLC IDs are calculated, probably also from some
>>>>>>> CPUID bits.
>>>>>>
>>>>>>
>>>>>> No, can't do this: LLC is calculated from CPUID leaf 4 (on Intel) which
>>>>>> use indexes in ECX register and xl syntax doesn't allow you to override
>>>>>> CPUIDs for such leaves.
>>>>>>
>>>>> Right. Which leaves us with the question of what should we do and/or
>>>>> recommend users to do?
>>>>>
>>>>> If there were a workaround that we could put in place, and document
>>>>> somewhere, however tricky it was, I'd say to go for it, and call it
>>>>> acceptable for now.
>>>>>
>>>>> But, if there isn't, should we disable PV vnuma, or warn the user that
>>>>> he may see issues? Can we identify, in Xen or in toolstack, whether an
>>>>> host topology will be problematic, and disable/warn in those cases
>>>>> too?
>>>>>
>>>>> I'm not sure, honestly. Disabling looks too aggressive, but it's an
>>>>> issue I wouldn't like an user to be facing, without at least being
>>>>> informed of the possibility... so, perhaps a (set of) warning(s)?
>>>>> Thoughts?
>>>>
>>>> I think we have 2 possible solutions:
>>>>
>>>> 1. Try to handle this all in the hypervisor via CPUID mangling.
>>>>
>>>> 2. Add PV-topology support to the guest and indicate this capability via
>>>>    elfnote; only enable PV-numa if this note is present.
>>>>
>>>> I'd prefer the second solution. If you are okay with this, I'd try to do
>>>> some patches for the pvops kernel.
>
> Why do you think that kernel patches are preferable to CPUID management?
> This would be all in tools, I'd think. (Well, one problem that I can
> think of is that AMD sometimes pokes at MSRs and/or Northbridge's PCI
> registers to figure out nodeID --- that we may need to have to address
> in the hypervisor)

Doing it via CPUID is more HW specific. Trying to fake a topology for
the guest from outside might lead to weird decisions in the guest e.g.
regarding licenses based on socket counts.

If you are doing it in the guest itself you are able to address the
different problems (scheduling, licensing) in different ways.
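
To illustrate the HW specific part, here is a rough sketch (not the
kernel's code) of how the LLC ID mentioned above falls out of CPUID leaf 4
on Intel. The EAX bit fields follow the Intel SDM, and the masking mirrors
what I understand Linux to do with the APIC ID. The function names and the
sample values are made up for this example.

```python
def decode_leaf4_eax(eax: int) -> dict:
    """Decode EAX of CPUID leaf 4 for one cache subleaf (index passed in ECX)."""
    return {
        "cache_type": eax & 0x1F,                      # bits 4:0, 0 = no more caches
        "cache_level": (eax >> 5) & 0x7,               # bits 7:5
        "threads_sharing": ((eax >> 14) & 0xFFF) + 1,  # bits 25:14, stored minus one
    }


def llc_id(apicid: int, threads_sharing: int) -> int:
    """Mask off the APIC ID bits that enumerate threads sharing the cache."""
    # Round the sharing count up to a power of two (ceil(log2(n)) bits).
    shift = (threads_sharing - 1).bit_length()
    return apicid & ~((1 << shift) - 1)


if __name__ == "__main__":
    # Hypothetical L3 subleaf: unified cache (type 3), level 3, 16 threads sharing.
    l3 = decode_leaf4_eax(3 | (3 << 5) | (15 << 14))
    for apicid in (5, 19):
        print(apicid, llc_id(apicid, l3["threads_sharing"]))
```

With these sample values, APIC IDs 5 and 19 end up with different LLC IDs.
A toolstack would have to get every such field right per vendor and per
leaf index to present a consistent fake topology.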

> And those patches won't help HVM guests, will they? How would they be
> useful by user processes?

HVM can use pv interfaces as well. It's called pv-NUMA :-)

Hmm, I didn't think of user processes. Are you aware of cases where they
are to be considered? The only case involving user processes I can think
of is licensing again. Depending on the licensing model, playing with
CPUID is either good or bad. I can even imagine the CPUID configuration
capabilities in xl are in use today for exactly this purpose. Using them
for pv-NUMA as well would make this feature unusable for those users.

Juergen