From: Dario Faggioli <dario.faggioli@citrix.com>
To: George Dunlap <george.dunlap@citrix.com>
Cc: Juergen Gross <jgross@suse.com>,
	Elena Ufimtseva <elena.ufimtseva@oracle.com>,
	Wei Liu <wei.liu2@citrix.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	David Vrabel <david.vrabel@citrix.com>,
	Jan Beulich <JBeulich@suse.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>
Subject: Re: PV-vNUMA issue: topology is misinterpreted by the guest
Date: Mon, 27 Jul 2015 16:02:27 +0200
Message-ID: <1438005747.5036.151.camel@citrix.com>
In-Reply-To: <55B611F1.80508@citrix.com>


On Mon, 2015-07-27 at 12:11 +0100, George Dunlap wrote:

> 1. Userspace applications are in the habit of reading CPUID to determine
> the topology of the system they're running on
> 
I'd add this item here:

1b. The Linux kernel uses CPUID to configure some bits of its
    scheduler. The result of that will affect *all* in-guest
    applications, even those that do not use CPUID themselves for
    scheduling/performance purposes

Yes, I saw you mention this afterwards, but I think it really should be
here, in the analysis part.

> 2. Many use the topology information to help themselves make better
> scheduling decisions.  Because a vcpu is not typically pinned to a
> specific pcpu, we may need to lie here slightly (e.g., not mention
> threads) to get the optimal behavior overall.
> 
> 3. Others use the topology information to implement licensing
> restrictions.  Because threads are treated differently to cores, we want
> to tell the truth here (i.e., make sure we mention that some of these
> are threads) to get the optimal behavior overall.
> 
Define truth. I mean, is 'telling the truth' always good in this case?
E.g., in a 4-vcpu guest, on a 4-socket box, without any pinning, it may
be that when app X samples CPUID to check the license, each vcpu is
running on a different socket, so the truth means we need a license for
4 sockets... I don't think this is ideal, is it?
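
To make the example concrete, here is a small sketch of what such a
license check would conclude. The host layout (4 sockets, 4 pcpus each)
and the helper names are made up for illustration; nothing here is real
Xen or CPUID code, it only models "which socket is each vcpu on when
sampled".

```python
# Hypothetical host: 4 sockets with 4 pcpus each, so pcpu N is on socket N // 4.
PCPUS_PER_SOCKET = 4


def socket_of(pcpu):
    """Socket that a given pcpu belongs to in this made-up layout."""
    return pcpu // PCPUS_PER_SOCKET


def sockets_seen(placement):
    """Count distinct sockets observed; placement maps vcpu -> current pcpu."""
    return len({socket_of(pcpu) for pcpu in placement.values()})


# Same 4-vcpu guest, sampled at two different moments (no pinning).
spread = {0: 0, 1: 4, 2: 8, 3: 12}  # every vcpu happens to be on its own socket
packed = {0: 0, 1: 1, 2: 2, 3: 3}   # every vcpu happens to be on socket 0

print(sockets_seen(spread))
print(sockets_seen(packed))
```

The same guest would be billed for four sockets or for one, depending
only on when the sample was taken.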

Anyways...
> Numbers #2 and #3 lead to contradictory courses of action; we cannot
> optimize for both at the same time.
> 
...that's certainly true, IMO.

> I think at some level we need to just try to accommodate both -- if the
> user doesn't have licensing issues, or prefers performance over
> licensing, then present a unified topology in PVH / HVM using CPUID,
> ACPI, &c.  I think this should be the default.
> 
> If the user has licensing issues, and doesn't mind having wonky or
> unreliable topology to its guests, then let the raw CPUID through.  But
> it would, in this case, be good to try to give the guest OS scheduler a
> hint that it shouldn't really bother trying to read the topology or do
> placement as a result, as any decisions will be unreliable.
> 
This last sentence is basically Juergen's proposal, AFAICT, and I agree
that it would be good in case #3. But, thinking more about it, would it
do any harm in case #2? I don't think it would.

After all, it'd boil down to making peace with the fact that something
like CPUID, in a (not only PV!) VM, is just not reliable enough to be
used for building in-kernel scheduling-related data structures, like
Linux's scheduling domains. It is unreliable because its content may
conflict with vNUMA, but, really, even with no vNUMA, it is unreliable
because it depends on whether the VM's vcpus are pinned or not, and if
not, it depends on where they actually run, which is pure randomness,
from the guest's point of view.
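
As a rough sketch of why a topology snapshot goes stale: suppose the
guest groups vcpus by the socket they were on at boot, and the
hypervisor later moves them. The layout and names below are
hypothetical, and this is a toy model, not how Linux actually builds
its scheduling domains.

```python
from itertools import combinations

PCPUS_PER_SOCKET = 4  # hypothetical host layout: pcpu N is on socket N // 4


def same_socket_pairs(placement):
    """Set of vcpu pairs sharing a socket; placement maps vcpu -> pcpu."""
    return {
        (a, b)
        for a, b in combinations(sorted(placement), 2)
        if placement[a] // PCPUS_PER_SOCKET == placement[b] // PCPUS_PER_SOCKET
    }


at_boot = {0: 0, 1: 1, 2: 4, 3: 5}  # where the vcpus ran when topology was read
later = {0: 0, 1: 4, 2: 1, 3: 5}    # where the hypervisor scheduled them afterwards

believed = same_socket_pairs(at_boot)  # what the guest built its grouping from
actual = same_socket_pairs(later)      # what is true at this later moment
stale = believed - actual              # pairs the guest still wrongly treats as close

print(sorted(stale))
```

In this toy run every pair the guest believed to be socket siblings has
stopped being one, without the guest being able to notice.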

So, I'm really starting to think that a patch stopping the Linux kernel
from relying on CPUID, whether original or mangled, and for all kinds
of guests, would really be a nice one to have!

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
