From: Andre Przywara <andre.przywara@amd.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
xen-devel <xen-devel@lists.xensource.com>,
Jan Beulich <JBeulich@suse.com>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: [PATCH] RFC: Linux: disable APERF/MPERF feature in PV kernels
Date: Wed, 23 May 2012 15:31:43 +0200
Message-ID: <4FBCE6BF.20400@amd.com>
In-Reply-To: <4FBCE453.5080206@citrix.com>
On 05/23/2012 03:21 PM, Andrew Cooper wrote:
> On 23/05/12 13:18, Jan Beulich wrote:
>>>>> On 23.05.12 at 13:11, Andrew Cooper<andrew.cooper3@citrix.com> wrote:
>>> On 23/05/12 08:34, Jan Beulich wrote:
>>>> First of all I'm of the opinion that this indeed should not be
>>>> masked in the hypervisor - there's no reason to disallow the
>>>> guest to read these registers (but we should of course deny
>>>> writes as long as Xen is controlling P-states, which we do).
>>> I am sorry but I am going to have to disagree with you on this point.
>>>
>>> We should not be advertising this feature to any guest at all if we
>>> can't provide an implementation which works as native expects. Else we
>>> are failing in our job of virtualisation.
>> That's perhaps a matter of the position you take - for HVM, I
>> would agree with yours, but there's many more aspects (not
>> the least related to accessing other MSRs) that we fail to
>> "properly" virtualize for PV guests - my position is that it is the
>> nature of PV that guest kernels have to be aware of being
>> virtualized (and hence stay away from doing certain things
>> unless [they think] they know what they're doing).
>>
>>> There is 'dom0_vcpus_pin'[1] which identity pins dom0 vcpus, and
>>> prevents update of the affinity masks, and appears to conditionally
>>> allow access to certain MSRs. I think it would be fine to expose this
>>> feature iff dom0's vcpus are pinned in this fashion. That way, the
>>> measurement should succeed, even if dom0 only has read access to the MSRs.
>> Restricting it to this case would be too restrictive - it really
>> makes sense at any time where the vCPU's affinity has exactly
>> one bit set (or to be precise, the intersection of it and the set
>> of online pCPU-s).
>>
>> Jan
>>
>
> That is unfortunately too lax. You also need to be able to guarantee
> that the affinity mask is not updated (and vcpu rescheduled) while in
> the middle of a measurement. Xen can't sensibly work out if or when a
> guest is taking a measurement, nor can dom0. So the only safe solution
> I can see is for Xen to prevent the affinity masks from ever being
> updated. With more thought, this would also preclude migration of a
> guest to another host.

If we really care about this feature, we could as well emulate it:
On every VCPU migration we calculate the difference between the two
pCPUs' values of APERF and MPERF. On the trap this difference is added
to the current MSR value, similar to what is done with the TSC in HVM.

We trap on every MSR access anyway, so the performance impact is only
four HV rdmsrs on every VCPU migration.

I am just not sure whether this is really a problem we should solve, or
whether it wouldn't be easier for us and clearer to the user to just
discourage those accesses.

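
A minimal sketch of the offset idea described above, in plain C and not
tied to any real Xen interface. The struct and function names
(perf_sample, vcpu_perf, vcpu_perf_migrate, vcpu_read_aperf,
vcpu_read_mperf) are hypothetical, and the samples are assumed to have
been obtained by reading APERF/MPERF on the old and the new pCPU at
migration time (the four rdmsrs mentioned above). How the old pCPU's
values would actually be sampled is left open here.

```c
#include <stdint.h>

/* Hypothetical snapshot of one pCPU's counters (would come from rdmsr). */
struct perf_sample {
    uint64_t aperf;
    uint64_t mperf;
};

/* Hypothetical per-vCPU state: offsets added to the raw MSR values. */
struct vcpu_perf {
    uint64_t aperf_off;
    uint64_t mperf_off;
};

/*
 * Called when the vCPU moves from pCPU "from" to pCPU "to".
 * Unsigned arithmetic wraps modulo 2^64, so the offset also works
 * when the new pCPU's counter is ahead of the old one.
 */
static void vcpu_perf_migrate(struct vcpu_perf *v,
                              const struct perf_sample *from,
                              const struct perf_sample *to)
{
    v->aperf_off += from->aperf - to->aperf;
    v->mperf_off += from->mperf - to->mperf;
}

/* Called from the (hypothetical) trap handler for a guest APERF read. */
static uint64_t vcpu_read_aperf(const struct vcpu_perf *v,
                                const struct perf_sample *cur)
{
    return cur->aperf + v->aperf_off;
}

/* Called from the (hypothetical) trap handler for a guest MPERF read. */
static uint64_t vcpu_read_mperf(const struct vcpu_perf *v,
                                const struct perf_sample *cur)
{
    return cur->mperf + v->mperf_off;
}
```

With this, the value the guest sees right after a migration equals the
value it would have seen on the old pCPU at the moment of migration, so
the counters stay continuous from the guest's point of view. It does
not by itself decide what should be counted while the vCPU is
descheduled.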
Regards,
Andre.
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany