From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Tim Deegan <tim@xen.org>,
	Xen-devel List <xen-devel@lists.xen.org>,
	Jun Nakajima <jun.nakajima@intel.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	"xiantao.zhang@intel.com" <xiantao.zhang@intel.com>
Subject: Re: VM Feature levelling improvements proposal (draft C)
Date: Mon, 17 Feb 2014 17:38:09 +0000
Message-ID: <53024901.2000000@citrix.com>
In-Reply-To: <53024B01020000780011CF81@nat28.tlf.novell.com>

On 17/02/14 16:46, Jan Beulich wrote:
>>>> On 17.02.14 at 17:22, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> How XenServer currently does levelling
>> ======================================
>>
>> The _Heterogeneous Pool Levelling_ support in XenServer appears to
>> predate the
>> libxc CPUID policy API, so does not currently use it.  The toolstack has a
>> table of CPU model numbers identifying whether levelling is supported.  It
>> then uses native `CPUID` instructions to look at the first four feature
>> masks,
>> and identifies the subset of features across the pool.
>> `cpuid_mask_{,extd_}{ecx,edx}` is then set on Xen's command line for
>> each host
>> in the pool, and all hosts rebooted.
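
For illustration only: the levelling step described above essentially
reads the four feature words on each host and ANDs them together.  A
minimal standalone sketch (not the XenServer tooling; all names here
are invented) might look like:

    /* Minimal standalone sketch (not the XenServer code): read the four
     * feature words on a host and AND them into a pool-wide accumulator.
     * In practice this runs on every host in the pool, and the results
     * are fed to Xen's cpuid_mask_* command line options. */
    #include <cpuid.h>
    #include <stdio.h>

    struct feat_words {
        unsigned int ecx1, edx1;      /* CPUID.0x00000001:ECX/EDX */
        unsigned int e_ecx1, e_edx1;  /* CPUID.0x80000001:ECX/EDX */
    };

    /* Read the four feature words on the current host. */
    static void read_feat_words(struct feat_words *f)
    {
        unsigned int eax, ebx;

        __get_cpuid(0x00000001, &eax, &ebx, &f->ecx1, &f->edx1);
        __get_cpuid(0x80000001, &eax, &ebx, &f->e_ecx1, &f->e_edx1);
    }

    /* Accumulate the common subset: pool &= host. */
    static void intersect(struct feat_words *pool, const struct feat_words *host)
    {
        pool->ecx1   &= host->ecx1;
        pool->edx1   &= host->edx1;
        pool->e_ecx1 &= host->e_ecx1;
        pool->e_edx1 &= host->e_edx1;
    }

    int main(void)
    {
        struct feat_words pool = { ~0u, ~0u, ~0u, ~0u }, host;

        read_feat_words(&host);   /* repeated for each host in the pool */
        intersect(&pool, &host);

        printf("leaf 1: ecx=%#010x edx=%#010x\n", pool.ecx1, pool.edx1);
        printf("extd 1: ecx=%#010x edx=%#010x\n", pool.e_ecx1, pool.e_edx1);
        return 0;
    }
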
>>
>> This has several limitations:
>>
>> * Xen and dom0 have a reduced feature set despite not needing to migrate
> Xen, at least for most features, doesn't, as it retrieves the feature
> flags before applying the mask. Dom0 indeed is being limited without
> need.

I should have worded this better.  In XenServer there are further
restrictions applied to Xen, mainly in the form of default command line
options, to work around PV guest bugs.  This is purely because of the
lack of per-VM feature levelling, and I am hoping to throw all of it
away as soon as a better implementation exists.

Logic such as "To boot the Ubuntu 12.04 installer on an AMD
Piledriver/Bulldozer system, XSAVE and FMA4 must be hidden until the
guest admin has updated to the latest kernel and glibc" can then be
moved into the toolstack, rather than having to be blindly applied to
the entire system.  (This doesn't actually matter yet: the latest
release of XenServer is still a Xen 4.1-based system, which pre-dates
XSAVE support working correctly in Xen, but it is important to fix
before our next release.)
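
As a sketch of what such toolstack-side logic could look like
(illustrative only; the structure and names below are invented rather
than the real libxc CPUID policy interface):

    /* Hypothetical per-guest fixup, for illustration only.  The policy
     * structure is a simplified stand-in.  Bit positions: XSAVE is
     * CPUID.0x00000001:ECX[26], FMA4 is CPUID.0x80000001:ECX[16]. */
    #include <stdint.h>

    #define FEAT_XSAVE (1u << 26)   /* leaf 0x00000001, ECX */
    #define FEAT_FMA4  (1u << 16)   /* leaf 0x80000001, ECX */

    struct guest_cpuid_policy {     /* invented stand-in structure */
        uint32_t leaf1_ecx;
        uint32_t extd1_ecx;
    };

    /* Hide XSAVE and FMA4 from one guest, leaving the rest of the
     * host feature set, and all other guests, untouched. */
    static void hide_xsave_fma4(struct guest_cpuid_policy *p)
    {
        p->leaf1_ecx &= ~FEAT_XSAVE;
        p->extd1_ecx &= ~FEAT_FMA4;
    }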


>
>> * There is only a single level for all VMs in the pool
>> * The toolstack only understands 4 of the 5 possible masking MSRs, and there
>>   are now feature maps in further `CPUID` leaves which have no masking MSRs
>>
>>
>> Proposal for new implementation
>> ===============================
> Sounds reasonable, but is of course in need of some details when
> getting closer to actually implementing this. I'm in particular not
> in favor of an approach where three more MSR writes would be
> added to the (PV) context switch path (mostly) unconditionally.
>
> Jan
>

If there are no particular objections to the proposed design, I shall
work on a patch series which implements it, and documents its expected use.

I am also fairly loath to put more into the context switch codepath, but
I can see no other way of doing per-VM feature levelling for PV guests.
In the hopefully common case that no masking is needed, the MSRs will be
written once on the first switch and never again, at which point the
overhead is a few failed comparisons.
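
To be concrete about the intent (a rough sketch only; every name below,
including the helper standing in for the vendor-specific mask MSR
writes, is an invented placeholder rather than the eventual patches):

    /* Cache the mask values last written on this CPU, and skip the
     * WRMSRs on context switch whenever the incoming domain wants the
     * same values as are already loaded. */
    #include <stdint.h>

    struct cpuid_masks {
        uint64_t leaf1;   /* leaf 0x00000001 EDX/ECX mask */
        uint64_t extd1;   /* leaf 0x80000001 EDX/ECX mask */
        uint64_t xsave;   /* leaf 0x0000000D subleaf 1 EAX mask */
    };

    /* Stands in for a wrmsr to the vendor-specific masking MSR. */
    static void write_mask_msr(unsigned int idx, uint64_t val)
    {
        (void)idx;
        (void)val;
    }

    /* In Xen this would be per-pCPU state; one copy shown for brevity. */
    static struct cpuid_masks last_written;

    static void ctxt_switch_masking(const struct cpuid_masks *next)
    {
        /* Hopefully-common case: nothing changed, so no MSR writes at
         * all; the cost is just these failed comparisons. */
        if (next->leaf1 == last_written.leaf1 &&
            next->extd1 == last_written.extd1 &&
            next->xsave == last_written.xsave)
            return;

        write_mask_msr(0, next->leaf1);
        write_mask_msr(1, next->extd1);
        write_mask_msr(2, next->xsave);

        last_written = *next;
    }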

It is obviously in the toolstack's best interest not to set different
feature masks for each PV domain, and giving dom0, the idle domain and
all HVM domains the same mask will reduce the switching somewhat, but
correctness in this area, to aid safe migration, is crucial.

I am open to alternative suggestions, which is why this is just a
proposal at this stage.  However, as I said, I can't see another way of
doing per-VM feature levelling.

~Andrew
