Alternate p2m design specification

* Alternate p2m design specification
@ 2015-06-10  0:09 Ed White
  2015-06-10  7:43 ` Jan Beulich
  2015-06-10 18:23 ` Andrew Cooper
  0 siblings, 2 replies; 11+ messages in thread
From: Ed White @ 2015-06-10  0:09 UTC (permalink / raw)
  To: xen-devel
  Cc: ravi.sahita, Keir Fraser, Ian Campbell, Tim Deegan, Ian Jackson,
	Jan Beulich, Andrew Cooper

This document describes a new capability for VM introspection, security and privacy in Xen. The new capability, called “altp2m” (short for alternate p2m), allows Xen to host alternate guest physical memory domains for a specific guest domain. This document describes the overall Xen-specific design for your review and feedback.

Background
=========

Intel VT-x2 CPUs support Extended Page Tables (EPTs). Extended Page Tables allow the VMM to restrict permissions for guest physical pages accessed by software operating in the guest (VMX-non-root). The p2m capability in Xen abstracts the architecture-specific details of EPTs. Typically, Xen manages a single p2m for a specific guest domain.

ALTP2M Introduction
================

The altp2m capability enables management of multiple (alternate) p2ms per HVM guest domain, thus allowing for separate physical memory domains per guest. It allows a para-virtualized guest software agent, within or across domains, to enforce memory introspection policies efficiently. Altp2m also allows para-virtualized guest agent components to be isolated within an HVM (in terms of guest physical memory) for secure VM introspection, as well as for various other security and privacy usages that require efficient memory isolation.

Two related Intel CPU features are used as performance enhancements within the altp2m module when operating with an in-domain agent. The altp2m module opportunistically uses these assists when they are enumerated on the CPU. Operations that require frequent switching between p2m domains can incur a high overhead if done via legacy approaches such as a hypercall. VM Functions (VMFUNC) is a new VT-x instruction on Intel's 4th gen Core (Haswell) and Atom (Silvermont) CPUs. In general, VMFUNC is intended to reduce the overhead of services provided by the CPU to an HVM guest (once configured by the VMM); one such leaf (0) is defined to provide a low-latency p2m switching capability (EPTP switching in Intel terminology). VMFUNC leaf 0 is enabled as part of the altp2m functionality to allow para-virtualized agents in an HVM to apply custom p2m domain switching policies without incurring overheads due to VM exits.

#VE (Virtualization Exception) is a feature introduced on Intel’s 5th gen Core (Broadwell) and Atom (Goldmont) CPUs. #VE is a CPU assist defined to allow the VMM to convert EPT violations for specific guest physical page accesses to a guest-IDT-delivered exception (new vector 20), and thus reduce the latency for managing VM introspection policies for guest memory read, write and/or execute attempts – these are induced events configured by a para-virtualized security agent monitoring guest memory accesses based on its isolation/monitoring policies. In legacy (pre-#VE) CPUs, EPT violations require a VM Exit and frequent induced EPT violations can add high hypervisor overhead. #VE reduces the impact of this overhead, whilst reducing the amount of guest-specific policy context to be inserted into the VMM.

Both VMFUNC and #VE are designed such that a VMM can emulate them on legacy CPUs. The altp2m module includes full emulation of VMFUNC leaf 0 and #VE, so in-domain agents can be written to assume both capabilities are available on all hardware.

VMFUNC Introduction
=================

VMFUNC leaf 0 for EPTP-Switching is a hardware-assisted efficient way to switch EPTs configured by the VMM. Software in a Xen guest domain may invoke a VM function with the VMFUNC instruction; the value of EAX selects the specific VM function being invoked.

The VMM enables VM functions generally by setting the “enable VM functions” VM-execution control. A specific VM function is enabled by setting the corresponding VM-function control. To enable EPTP switching (VM function 0), software must set the “activate secondary controls” VM-execution control (bit 31 of the primary processor-based VM-execution controls), the “enable VM functions” VM-execution control (bit 13 of the secondary processor-based VM-execution controls) and the “EPTP switching” VM-function control (bit 0 of the VM-function controls).
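
The three control bits named above can be sketched as follows. This is only an illustration: the macro names and the struct vmx_controls container are hypothetical and are not taken from the Xen sources or the patch series; only the bit positions come from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as quoted in the paragraph above; the names are illustrative. */
#define CTRL_ACTIVATE_SECONDARY_CONTROLS (UINT32_C(1) << 31)
#define CTRL_ENABLE_VM_FUNCTIONS         (UINT32_C(1) << 13)
#define CTRL_VMFUNC_EPTP_SWITCHING       (UINT64_C(1) << 0)

/* Hypothetical container for the three VMCS control fields involved. */
struct vmx_controls {
    uint32_t primary_exec;   /* primary processor-based VM-execution controls */
    uint32_t secondary_exec; /* secondary processor-based VM-execution controls */
    uint64_t vm_function;    /* VM-function controls */
};

/* Set all three bits required for EPTP switching (VM function 0). */
static void enable_eptp_switching(struct vmx_controls *c)
{
    c->primary_exec   |= CTRL_ACTIVATE_SECONDARY_CONTROLS;
    c->secondary_exec |= CTRL_ENABLE_VM_FUNCTIONS;
    c->vm_function    |= CTRL_VMFUNC_EPTP_SWITCHING;
}

/* EPTP switching is usable only when every one of the three bits is set. */
static bool eptp_switching_enabled(const struct vmx_controls *c)
{
    return (c->primary_exec & CTRL_ACTIVATE_SECONDARY_CONTROLS) &&
           (c->secondary_exec & CTRL_ENABLE_VM_FUNCTIONS) &&
           (c->vm_function & CTRL_VMFUNC_EPTP_SWITCHING);
}
```

Clearing any one of the three bits makes eptp_switching_enabled() return false, which mirrors the layered enabling described above.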

The VMFUNC instruction causes an invalid-opcode exception (#UD) if the “enable VM functions” VM-execution control is 0 or the value of EAX is greater than 63 (only VM functions 0–63 can be enabled). Otherwise, the instruction causes a VM exit if the bit at position EAX is 0 in the VM-function controls (the selected VM function is not enabled). If such a VM exit occurs, the basic exit reason used is 59 (3BH), indicating “VMFUNC”, and the length of the VMFUNC instruction is saved into the VM-exit instruction-length field. If the instruction causes neither an invalid-opcode exception nor a VM exit due to a disabled VM function, it performs the functionality of the VM function specified by the value in EAX.
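
The order of checks in the paragraph above can be written down as a small decision function. This is a sketch of the architectural behaviour as described here, not hypervisor code; the enum and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXIT_REASON_VMFUNC 59 /* basic exit reason 3BH, from the text above */

/* Illustrative names for the three possible outcomes. */
enum vmfunc_outcome {
    VMFUNC_RAISE_UD, /* invalid-opcode exception in the guest */
    VMFUNC_VM_EXIT,  /* VM exit with basic exit reason 59 */
    VMFUNC_EXECUTE   /* the selected VM function runs */
};

static enum vmfunc_outcome vmfunc_classify(bool vm_functions_enabled,
                                           uint64_t vm_function_controls,
                                           uint32_t eax)
{
    /* #UD takes priority: control clear, or EAX outside 0..63. */
    if (!vm_functions_enabled || eax > 63)
        return VMFUNC_RAISE_UD;

    /* The function exists but its bit in the VM-function controls is 0. */
    if (!((vm_function_controls >> eax) & 1))
        return VMFUNC_VM_EXIT;

    return VMFUNC_EXECUTE;
}
```

With only bit 0 set in the VM-function controls, EAX = 0 executes EPTP switching, EAX = 1 causes a VM exit, and EAX = 64 raises #UD.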

VMFUNC leaf 0/EPTP switching allows guest software to load a new value for the EPT pointer (EPTP), thereby establishing a different EPT paging-structure hierarchy. Guest software is limited to selecting from a list of potential EPTP values configured in advance by the VMM. Specifically, the value of ECX is used to select an entry from an EPTP list, a 4-KByte structure referenced by the EPTP-list address (a new control field in the VMCS). VMFUNC causes a VM exit for error conditions such as if ECX ≥ 512. If the selected entry is a valid EPTP value (i.e. the EPTP would not cause VM entry to fail), it is stored in the EPTP field of the current VMCS and is used for subsequent accesses using guest-physical addresses.
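
A minimal model of the EPTP-list lookup described above might look like this. The struct, the function names and the validity check are assumptions made for the sketch: real hardware validates the EPTP far more thoroughly, and here a zero entry simply stands in for "not a valid EPTP". An invalid entry is treated as another error condition that ends in a VM exit.

```c
#include <stdbool.h>
#include <stdint.h>

#define EPTP_LIST_ENTRIES 512 /* 4-KByte page / 8 bytes per entry */

/* Hypothetical per-vcpu state: the current EPTP and the VMM-provided list. */
struct vcpu_model {
    uint64_t eptp;
    const uint64_t *eptp_list; /* EPTP_LIST_ENTRIES entries */
};

/* Stand-in for the hardware check that the EPTP would not fail VM entry. */
static bool eptp_is_valid(uint64_t eptp)
{
    return eptp != 0;
}

/* Returns true if the EPTP was switched, false if a VM exit would occur. */
static bool vmfunc_eptp_switch(struct vcpu_model *v, uint32_t ecx)
{
    uint64_t candidate;

    if (ecx >= EPTP_LIST_ENTRIES)
        return false; /* error condition: index out of range */

    candidate = v->eptp_list[ecx];
    if (!eptp_is_valid(candidate))
        return false; /* error condition: no usable EPTP in that slot */

    v->eptp = candidate; /* used for subsequent guest-physical accesses */
    return true;
}
```

The guest only ever supplies an index, never an EPTP value, so it cannot select a hierarchy the VMM did not place in the list.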

The complete spec of VMFUNC can be found in chapter 25.5.5 of the Intel SDM at:
http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html

#VE Introduction
=============

A virtualization exception is a new processor exception. It uses vector 20 and is abbreviated #VE. A virtualization exception can occur only in VMX non-root operation. The 1-setting of the “EPT-violation #VE” VM-execution control causes some EPT violations to generate virtualization exceptions instead of VM exits. The VMM manages how the processor determines whether an EPT violation causes a virtualization exception or a VM exit. When the processor encounters a virtualization exception, it saves information about the exception to the virtualization-exception information area (hosted in a 4-KByte page referenced by a new field in the VMCS). After saving virtualization-exception information, the processor delivers a virtualization exception as it would any other exception.

The values of certain EPT paging-structure entries determine which EPT violations are convertible. Specifically, bit 63 of certain EPT paging-structure entries is defined to suppress #VE: an EPT violation is convertible to #VE if and only if bit 63 of the EPT entry that caused the EPT violation is 0. Note that EPT misconfiguration behavior does not change; a misconfiguration always causes a VM exit.
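
The convertibility rule above reduces to a test of one control and one bit. The sketch below uses invented names and models only what this section states; any further conditions the hardware applies are out of scope.

```c
#include <stdbool.h>
#include <stdint.h>

#define VE_VECTOR       20                  /* #VE exception vector */
#define EPT_SUPPRESS_VE (UINT64_C(1) << 63) /* bit 63 of the EPT entry */

enum ept_fault_outcome {
    EPT_FAULT_VM_EXIT,   /* handled by the VMM */
    EPT_FAULT_DELIVER_VE /* delivered through the guest IDT as vector 20 */
};

/*
 * ve_control:       the "EPT-violation #VE" VM-execution control.
 * misconfiguration: true for an EPT misconfiguration rather than a violation.
 * ept_entry:        the EPT paging-structure entry that caused the fault.
 */
static enum ept_fault_outcome classify_ept_fault(bool ve_control,
                                                 bool misconfiguration,
                                                 uint64_t ept_entry)
{
    if (misconfiguration || !ve_control)
        return EPT_FAULT_VM_EXIT; /* unchanged legacy behaviour */

    if (ept_entry & EPT_SUPPRESS_VE)
        return EPT_FAULT_VM_EXIT; /* #VE suppressed for this entry */

    return EPT_FAULT_DELIVER_VE;
}
```

Because suppression is per entry, the VMM can leave bit 63 set everywhere and clear it only for the pages an agent has asked to monitor.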

The complete spec of #VE can be found in chapter 25.5.6 of the Intel SDM at:
http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html

With VMFUNC and #VE, the Xen hypervisor does not have to be involved in handling guest VM-introspection policies, which reduces hypervisor overhead and complexity (TCB), and works well with VM migration. For a guest domain using VMFUNC and #VE, more CPU cycles can be allocated to the guest, so benchmarks run in a guest domain that uses VM introspection should perform better with VMFUNC and #VE enabled than on a CPU without them.

Design
======

- Altp2m feature enabled via opt-in parameter

A new Xen boot parameter, 'altp2m', is introduced to control altp2m on a global basis – this parameter defaults to 0 (disabled).

- Altp2m enable/disable for particular domain

Additionally, a new domain parameter, 'altp2mhvm', is introduced to control altp2m for an individual HVM domain – this parameter also defaults to 0 (disabled).

Both parameters must be set to 1 (enabled) before altp2m functionality is available in a given domain.

At any point in time, altp2m is enabled for all vcpus of a domain or disabled for all vcpus of that domain. Alternate EPT tables created for the alternate p2m are shared by all vcpus assigned to a domain. Altp2m mode may be dynamically enabled/disabled for a domain. 

- Hypercalls for altp2m

Altp2m mode introduces a new set of hypercalls for altp2m management from software agents operating in Xen HVM guests.

The hypercalls are as follows:
Enable or Disable altp2m mode for domain
Create a new alternate p2m 
Edit permissions for a specific GPA within an alternate p2m 
Destroy an existing alternate p2m

- Core altp2m functionality

A new altp2m type is added to the p2m types (in addition to the previous hostp2m and nestedp2m types). An HVM domain can be started in hostp2m mode and switched over into altp2m mode via a hypercall. Once an HVM domain is in altp2m mode, a set of altp2m objects (currently fixed at 10) is managed by Xen. Altp2m updates are performed in a lazy manner: in effect, the altp2m reflects the same EPT attributes for mappings accessed as the hostp2m unless the permissions for a GPA are modified by the guest agent (for a specific altp2m). Currently, only page permissions and mappings for memory type ram_rw can be modified via the altp2m hypercall.

By default, all GPA mappings are set to suppress #VE (resulting in legacy behavior for Xen); #VE is un-suppressed for a GPA when the in-domain guest agent invokes an altp2m hypercall to modify the permission of that GPA. A subsequent guest access to the GPA that violates the agent-specified EPT permissions will cause a #VE (instead of an EPT-violation VM exit) that is expected to be handled by the guest software. One of the valid responses to a #VE event in the guest is to switch altp2m's to activate a different set of GPA permissions and mappings. Using VMFUNC, this switch can be achieved efficiently for the single vcpu on which the permissions violation occurs. There is also a hypercall to switch altp2m's for every vcpu in a domain, as is typically required during agent initialisation.
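
The lazy-copy and suppress-#VE behaviour described above can be illustrated with a toy model. Everything here is hypothetical (a flat array instead of EPT structures, invented names, an 8-page guest); it is meant only to show the sequence "copy from hostp2m on first use, agent edits permissions, later violation becomes #VE".

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_GFNS 8 /* tiny guest, for illustration only */

enum { P_R = 1, P_W = 2, P_X = 4 };

struct p2m_entry {
    bool present;
    uint8_t perms;
    bool suppress_ve;
};

struct p2m {
    struct p2m_entry e[NR_GFNS];
};

enum access_result { ACCESS_OK, ACCESS_VE, ACCESS_VM_EXIT };

/* Lazy population: copy the hostp2m entry on first use, with #VE suppressed.
 * Callers must pass gfn < NR_GFNS. */
static struct p2m_entry *altp2m_lookup(struct p2m *alt, const struct p2m *host,
                                       unsigned gfn)
{
    if (!alt->e[gfn].present) {
        alt->e[gfn] = host->e[gfn];
        alt->e[gfn].suppress_ve = true;
    }
    return &alt->e[gfn];
}

/* Models the permission-editing hypercall from an in-domain agent. */
static bool altp2m_set_perms(struct p2m *alt, const struct p2m *host,
                             unsigned gfn, uint8_t perms)
{
    struct p2m_entry *e = altp2m_lookup(alt, host, gfn);

    if (!e->present)
        return false;       /* nothing mapped in the hostp2m */
    e->perms = perms;
    e->suppress_ve = false; /* violations on this GPA now raise #VE */
    return true;
}

/* Models a guest access while this altp2m is active. */
static enum access_result altp2m_access(struct p2m *alt, const struct p2m *host,
                                        unsigned gfn, uint8_t wanted)
{
    const struct p2m_entry *e = altp2m_lookup(alt, host, gfn);

    if ((e->perms & wanted) == wanted)
        return ACCESS_OK;
    return e->suppress_ve ? ACCESS_VM_EXIT : ACCESS_VE;
}
```

Until the agent edits a GPA, the altp2m behaves exactly like the hostp2m and any violation still exits to Xen; only edited GPAs can produce #VE, and the hostp2m entry itself is never changed.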

The list of altp2m's is protected by a separate list lock, which must be held during any operations which could change the state of an altp2m from valid to invalid or vice-versa, or when performing any modification to an altp2m which is not the current p2m for the current vcpu. Many operations that must acquire the altp2m list lock occur in code paths where the hostp2m lock has already been acquired. To avoid locking order violations, the p2m lock has been split into two types: altp2ms have a lock type which is lower in the order than other p2m's; and the altp2m list lock is placed between the two.
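
One way to picture the resulting order is a checker that rejects out-of-order acquisition. This sketch reads "lower in the order" as "acquired later", so the sequence is hostp2m lock, then altp2m list lock, then an altp2m's own lock; the rank values, names and tracker structure are assumptions for illustration and do not reflect Xen's actual lock-ordering machinery.

```c
#include <stdbool.h>

/* Acquisition order assumed for this sketch: smaller rank is taken first. */
enum lock_rank {
    RANK_HOSTP2M = 1,     /* hostp2m (and other non-alt p2m) locks */
    RANK_ALTP2M_LIST = 2, /* the altp2m list lock sits in between */
    RANK_ALTP2M = 3       /* an individual altp2m's lock */
};

#define MAX_HELD 8

/* Hypothetical per-CPU record of the locks currently held, innermost last. */
struct lock_tracker {
    enum lock_rank held[MAX_HELD];
    int depth;
};

/* Returns false on an ordering violation instead of taking the lock. */
static bool tracker_acquire(struct lock_tracker *t, enum lock_rank r)
{
    if (t->depth == MAX_HELD)
        return false;
    if (t->depth > 0 && t->held[t->depth - 1] > r)
        return false; /* would take an outer lock while holding an inner one */
    t->held[t->depth++] = r;
    return true;
}

/* Releases the most recently acquired lock. */
static void tracker_release(struct lock_tracker *t)
{
    if (t->depth > 0)
        t->depth--;
}
```

With this ordering, a path that already holds the hostp2m lock can still take the list lock and then an altp2m lock, which is the case the paragraph above is designed to permit.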

- VMExit handler for VMFUNC

When altp2m is enabled on a CPU with VMFUNC enumerated, an erroneous VMFUNC may cause a VM exit with exit reason “VMFUNC”. A new exit handler is added for this exit reason, which injects #UD into the guest.

- Support for intra-domain and inter-domain VM introspection (and XSM)

The altp2m functionality allows the capability to be used via an agent operating in an HVM guest or, alternatively, an agent operating in a separate privileged domain. For cross-domain operation, an XSM hook is defined such that the administrator can define a policy for inter-domain VM introspection.

The way in which permissions violations are reported to an in-domain agent and the expected agent response have been described above. Restrictions imposed by an out-of-domain agent do not have suppress-#VE removed, so they always result in a VM exit. The violation is reported through the existing VM Event mechanism, modified to indicate that the event is an altp2m event and include the current altp2m index; the response can force a change to a different altp2m for the relevant vcpu before VM entry. If an in-domain agent places an altp2m restriction and a violation of that restriction occurs on a vcpu that cannot receive #VE, that will cause a VM exit that will be treated as if the restriction had been imposed by an out-of-domain agent.
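
The reporting rules above boil down to a two-input decision. The following sketch uses invented enum and function names and ignores everything except who placed the restriction and whether the vcpu can take #VE.

```c
#include <stdbool.h>

/* Who placed the restriction that was violated (illustrative names). */
enum restriction_source {
    PLACED_IN_DOMAIN,    /* via a hypercall from an agent inside the guest */
    PLACED_OUT_OF_DOMAIN /* via a hypercall from a separate privileged domain */
};

enum violation_report {
    REPORT_VE,      /* delivered to the guest as #VE, no VM exit */
    REPORT_VM_EVENT /* VM exit, reported through the VM Event mechanism */
};

static enum violation_report report_path(enum restriction_source src,
                                         bool vcpu_can_receive_ve)
{
    /* Only in-domain restrictions have suppress-#VE removed, and even then
     * the vcpu must be able to receive the exception. */
    if (src == PLACED_IN_DOMAIN && vcpu_can_receive_ve)
        return REPORT_VE;

    /* Out-of-domain restrictions, and the in-domain fallback case. */
    return REPORT_VM_EVENT;
}
```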

---------------- END --------------------------

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

* Re: Alternate p2m design specification
  2015-06-10  0:09 Alternate p2m design specification Ed White
@ 2015-06-10  7:43 ` Jan Beulich
  2015-06-10 16:39   ` Ed White
  2015-06-10 18:23 ` Andrew Cooper
  1 sibling, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2015-06-10  7:43 UTC (permalink / raw)
  To: Ed White
  Cc: ravi.sahita, Keir Fraser, Ian Campbell, Andrew Cooper,
	Ian Jackson, Tim Deegan, xen-devel

>>> On 10.06.15 at 02:09, <edmund.h.white@intel.com> wrote:
> Design
> ======

Reads all quite reasonable; just one minor remark:

> - Core altp2m functionality
> 
> A new altp2m type is added to the p2m types (in addition to the previous 
> hostp2m and nestedp2m types). An HVM domain can be started in hostp2m mode 
> and switched over into altp2m mode via a hypercall. Once a HVM domain is in 
> altp2m mode, a set of (currently set size is 10) altp2m objects is managed by 
> Xen.

Rather than hardcoding 10, how about making the Xen command line
option specify the value (instead of being a simple boolean one)?

Jan

* Re: Alternate p2m design specification
  2015-06-10  7:43 ` Jan Beulich
@ 2015-06-10 16:39   ` Ed White
  2015-06-11  7:05     ` Jan Beulich
  0 siblings, 1 reply; 11+ messages in thread
From: Ed White @ 2015-06-10 16:39 UTC (permalink / raw)
  To: Jan Beulich
  Cc: ravi.sahita, Keir Fraser, Ian Campbell, Andrew Cooper,
	Ian Jackson, Tim Deegan, xen-devel

On 06/10/2015 12:43 AM, Jan Beulich wrote:
>>>> On 10.06.15 at 02:09, <edmund.h.white@intel.com> wrote:
>> Design
>> ======
> 
> Reads all quite reasonable; just one minor remark:
> 
>> - Core altp2m functionality
>>
>> A new altp2m type is added to the p2m types (in addition to the previous 
>> hostp2m and nestedp2m types). An HVM domain can be started in hostp2m mode 
>> and switched over into altp2m mode via a hypercall. Once a HVM domain is in 
>> altp2m mode, a set of (currently set size is 10) altp2m objects is managed by 
>> Xen.
> 
> Rather than hardcoding 10, how about making the Xen command line
> option specify the value (instead of being a simple boolean one)?
> 
> Jan
> 

For the use cases we're currently aware of, even 10 is generous. I'd be
wary of allowing the user to specify any number up to and including 512
with the current implementation, because there are a number of places where
we do a linear search of the list.

In practice, 1 alternate p2m is not useful, so in a future version of the
code, when we have more idea of how useful people find it, we could say
0=off, 1=on with 10 alternates, any other number=on with that many
alternates.
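
For illustration only, the encoding floated above (0 = off, 1 = on with 10 alternates, any other number = on with that many) could be decoded like this. Nothing here comes from the patch series: the function and macro names are invented, and clamping at 512 is an extra assumption based on the size of the EPTP list.

```c
#define ALTP2M_DEFAULT_COUNT 10  /* the value used in the current design */
#define ALTP2M_MAX_COUNT     512 /* assumed cap: entries in the EPTP list */

/* Returns the number of alternate p2ms to manage; 0 means altp2m is off. */
static unsigned altp2m_count_from_option(unsigned value)
{
    if (value == 0)
        return 0;                    /* feature disabled */
    if (value == 1)
        return ALTP2M_DEFAULT_COUNT; /* plain "on" keeps today's default */
    if (value > ALTP2M_MAX_COUNT)
        return ALTP2M_MAX_COUNT;
    return value;                    /* explicit count */
}
```

Under this encoding a single alternate can never be requested, which matches the observation that 1 alternate p2m is not useful.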

Does that seem reasonable?

Ed

* Re: Alternate p2m design specification
  2015-06-10 16:39   ` Ed White
@ 2015-06-11  7:05     ` Jan Beulich
  2015-06-11 16:30       ` Ed White
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2015-06-11  7:05 UTC (permalink / raw)
  To: Ed White
  Cc: ravi.sahita, Keir Fraser, Ian Campbell, Andrew Cooper,
	Ian Jackson, Tim Deegan, xen-devel

>>> On 10.06.15 at 18:39, <edmund.h.white@intel.com> wrote:
> On 06/10/2015 12:43 AM, Jan Beulich wrote:
>>>>> On 10.06.15 at 02:09, <edmund.h.white@intel.com> wrote:
>>> Design
>>> ======
>> 
>> Reads all quite reasonable; just one minor remark:
>> 
>>> - Core altp2m functionality
>>>
>>> A new altp2m type is added to the p2m types (in addition to the previous 
>>> hostp2m and nestedp2m types). An HVM domain can be started in hostp2m mode 
>>> and switched over into altp2m mode via a hypercall. Once a HVM domain is in 
>>> altp2m mode, a set of (currently set size is 10) altp2m objects is managed 
> by 
>>> Xen.
>> 
>> Rather than hardcoding 10, how about making the Xen command line
>> option specify the value (instead of being a simple boolean one)?
> 
> For the use cases we're currently aware of, even 10 is generous. I'd be
> wary of allowing the user to specify any number up to and including 512
> with the current implementation, because there are a number of places where
> we do a linear search of the list.

I was actually considering this to allow the user to reduce the value
below 10.

> In practice, 1 alternate p2m is not useful, so in a future version of the
> code, when we have more idea of how useful people find it, we could say
> 0=off, 1=on with 10 alternates, any other number=on with that many
> alternates.
> 
> Does that seem reasonable?

That's certainly an option, but using only a number from the beginning
would simplify the handling of the option in the hypervisor. But of
course that's only a very minor aspect.

Jan

* Re: Alternate p2m design specification
  2015-06-11  7:05     ` Jan Beulich
@ 2015-06-11 16:30       ` Ed White
  0 siblings, 0 replies; 11+ messages in thread
From: Ed White @ 2015-06-11 16:30 UTC (permalink / raw)
  To: Jan Beulich
  Cc: ravi.sahita, Keir Fraser, Ian Campbell, Andrew Cooper,
	Ian Jackson, Tim Deegan, xen-devel

On 06/11/2015 12:05 AM, Jan Beulich wrote:
>>>> On 10.06.15 at 18:39, <edmund.h.white@intel.com> wrote:
>> On 06/10/2015 12:43 AM, Jan Beulich wrote:
>>>>>> On 10.06.15 at 02:09, <edmund.h.white@intel.com> wrote:
>>>> Design
>>>> ======
>>>
>>> Reads all quite reasonable; just one minor remark:
>>>
>>>> - Core altp2m functionality
>>>>
>>>> A new altp2m type is added to the p2m types (in addition to the previous 
>>>> hostp2m and nestedp2m types). An HVM domain can be started in hostp2m mode 
>>>> and switched over into altp2m mode via a hypercall. Once a HVM domain is in 
>>>> altp2m mode, a set of (currently set size is 10) altp2m objects is managed 
>> by 
>>>> Xen.
>>>
>>> Rather than hardcoding 10, how about making the Xen command line
>>> option specify the value (instead of being a simple boolean one)?
>>
>> For the use cases we're currently aware of, even 10 is generous. I'd be
>> wary of allowing the user to specify any number up to and including 512
>> with the current implementation, because there are a number of places where
>> we do a linear search of the list.
> 
> I was actually considering this to allow the user to reduce the value
> below 10.
> 

Understood. I chose 10 as the initial value because it's high enough for
all the likely use cases we know of, and low enough not to be too expensive
at run time. My concern with allowing the value to be set lower is that
it would cause users of the code to fail in unexpected ways.

>> In practice, 1 alternate p2m is not useful, so in a future version of the
>> code, when we have more idea of how useful people find it, we could say
>> 0=off, 1=on with 10 alternates, any other number=on with that many
>> alternates.
>>
>> Does that seem reasonable?
> 
> That's certainly an option, but using only a number from the beginning
> would simplify the handling of the option in the hypervisor. But of
> course that's only a very minor aspect.
> 

The reason I would like to avoid doing it now is simply development and
test time. I'm trying to avoid dropping a large amount of code on you
and the other maintainers and/or reviewers right at the function freeze
deadline, and it's already challenging to meet that aim with the amount
of work still left to do.

Ed

* Re: Alternate p2m design specification
  2015-06-10  0:09 Alternate p2m design specification Ed White
  2015-06-10  7:43 ` Jan Beulich
@ 2015-06-10 18:23 ` Andrew Cooper
  2015-06-10 19:41   ` Ed White
  1 sibling, 1 reply; 11+ messages in thread
From: Andrew Cooper @ 2015-06-10 18:23 UTC (permalink / raw)
  To: Ed White, xen-devel
  Cc: ravi.sahita, Keir Fraser, Ian Campbell, Tim Deegan, Ian Jackson,
	Jan Beulich

On 10/06/15 01:09, Ed White wrote:
> This document describes a new capability for VM Introspection, Security and Privacy in Xen. The new capability is called “altp2m” (short for Alternate p2m) that is used to provide the ability for Xen to host alternate guest physical memory domains for a specific guest-domain. This document describes the overall design specific to Xen for your review and feedback.

This is a very thorough description, and everything here seems in order.

A couple of remarks,

> The altp2m functionality allows the capability to be used via an agent operating in an HVM guest or alternately an agent operating in a separate privileged domain. For cross domain operation, an XSM hook is defined such that the administrator can define a policy for inter-domain VM introspection.

There should be an interlock to prevent both an internal and external
entity from actually being able to use the altp2m infrastructure.

Nothing good can come of an uncoordinated attempt like this, whereas a
coordinated use would already have to be communicating far more than the
hypercalls themselves would allow.

Also, hardware accelerated altp2m is mutually exclusive with EPT PML, as
we have no way of determining which translation was in use when a gpa
was appended to the buffer.  We are going to have to maintain a feature
compatibility matrix.  Even for non-accelerated altp2m, the cost of
working out the real gpa is likely prohibitive, and we should probably
resort to declaring logdirty and altp2m as exclusive features.

~Andrew

* Re: Alternate p2m design specification
  2015-06-10 18:23 ` Andrew Cooper
@ 2015-06-10 19:41   ` Ed White
  2015-06-10 23:09     ` Andrew Cooper
  0 siblings, 1 reply; 11+ messages in thread
From: Ed White @ 2015-06-10 19:41 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: ravi.sahita, Keir Fraser, Ian Campbell, Tim Deegan, Ian Jackson,
	Jan Beulich

On 06/10/2015 11:23 AM, Andrew Cooper wrote:
> On 10/06/15 01:09, Ed White wrote:
>> This document describes a new capability for VM Introspection, Security and Privacy in Xen. The new capability is called “altp2m” (short for Alternate p2m) that is used to provide the ability for Xen to host alternate guest physical memory domains for a specific guest-domain. This document describes the overall design specific to Xen for your review and feedback.
> 
> This is a very thorough description, and everything here seems in order.
> 
> A couple of remarks,
> 
>> The altp2m functionality allows the capability to be used via an agent operating in an HVM guest or alternately an agent operating in a separate privileged domain. For cross domain operation, an XSM hook is defined such that the administrator can define a policy for inter-domain VM introspection.
> 
> There should be an interlock to prevent both an internal and external
> entity from actually being able to use the altp2m infrastructure.
> 
> Nothing good can come of an uncoordinated attempt like this, whereas a
> coordinated use would already have to be communicating far more than the
> hypercalls themselves would allow.
> 

I have to disagree with you on that. Assuming we have the XSM hooks right,
you could enforce either/or through policy, but we have envisaged scenarios
where we would want to mix internal and external use of the hypercalls.

> 
> Also, hardware accelerated altp2m is mutually exclusive with EPT PML, as
> we have no way of determining which translation was in use when a gpa
> was appended to the buffer.  We are going to have to maintain a feature
> compatibility matrix.  Even for non-accelerated altp2m, the cost of
> working out the real gpa is likely prohibitive, and we should probably
> resort to declaring logdirty and altp2m as exclusive features.
> 
> ~Andrew
> 

I haven't investigated the PML code, but just to be clear, log-dirty
without PML is compatible with altp2m. We are already enforcing
either/or for nestedhvm and altp2mhvm domain parameters -- I'll
look into doing something similar for PML.

Ed


* Re: Alternate p2m design specification
  2015-06-10 19:41   ` Ed White
@ 2015-06-10 23:09     ` Andrew Cooper
  2015-06-10 23:41       ` Ed White
  2015-06-11 12:06       ` Tim Deegan
  0 siblings, 2 replies; 11+ messages in thread
From: Andrew Cooper @ 2015-06-10 23:09 UTC (permalink / raw)
  To: Ed White, xen-devel
  Cc: ravi.sahita, Keir Fraser, Ian Campbell, Tim Deegan, Ian Jackson,
	Jan Beulich

On 10/06/15 20:41, Ed White wrote:
> On 06/10/2015 11:23 AM, Andrew Cooper wrote:
>> On 10/06/15 01:09, Ed White wrote:
>>> This document describes a new capability for VM Introspection, Security and Privacy in Xen. The new capability is called “altp2m” (short for Alternate p2m) that is used to provide the ability for Xen to host alternate guest physical memory domains for a specific guest-domain. This document describes the overall design specific to Xen for your review and feedback.
>> This is a very thorough description, and everything here seems in order.
>>
>> A couple of remarks,
>>
>>> The altp2m functionality allows the capability to be used via an agent operating in an HVM guest or alternately an agent operating in a separate privileged domain. For cross domain operation, an XSM hook is defined such that the administrator can define a policy for inter-domain VM introspection.
>> There should be an interlock to prevent both an internal and external
>> entity from actually being able to use the altp2m infrastructure.
>>
>> Nothing good can come of an uncoordinated attempt like this, whereas a
>> coordinated use would already have to be communicating far more than the
>> hypercalls themselves would allow.
>>
> I have to disagree with you on that. Assuming we have the XSM hooks right,
> you could enforce either/or through policy, but we have envisaged scenarios
> where we would want to mix internal and external use of the hypercalls.

I suppose, as long as nothing bad can happen to Xen from mixed use, the
worst that will happen is a very confused guest.

>> Also, hardware accelerated altp2m is mutually exclusive with EPT PML, as
>> we have no way of determining which translation was in use when a gpa
>> was appended to the buffer.  We are going to have to maintain a feature
>> compatibility matrix.  Even for non-accelerated altp2m, the cost of
>> working out the real gpa is likely prohibitive, and we should probably
>> resort to declaring logdirty and altp2m as exclusive features.
>>
>> ~Andrew
>>
> I haven't investigated the PML code, but just to be clear, log-dirty
> without PML is compatible with altp2m.

The logdirty code is built around the notion of a single linear idea of
a guest's physical address space.  Altp2m, by its very nature, introduces
non-linearities into a guest's physical address space.

A reasonable compromise might be "the linear view offered by the host
p2m" (which itself highlights one fact that the host p2m, if it isn't
already, must at all times be a strict superset of anything contained in
any nestp2m/altp2m).  However, you are still left with the problem of
translating a guest physical address as viewed in one altp2m into the
host p2m's view of the same mfn.  I suppose this is possible, given a
reverse lookup in the host p2m, but performance would be unappealing.

As for PML, it is an unspecified length of time until the hardware PML
buffer fills up and Xen comes to drain it into the logdirty bitmap.  The
vcpu in question could have switched EPTs any number of times in the
intervening period.  Xen therefore cannot even tell which altp2m an
individual gpa was logged under.

~Andrew


* Re: Alternate p2m design specification
  2015-06-10 23:09     ` Andrew Cooper
@ 2015-06-10 23:41       ` Ed White
  2015-06-11 12:06       ` Tim Deegan
  1 sibling, 0 replies; 11+ messages in thread
From: Ed White @ 2015-06-10 23:41 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: ravi.sahita, Keir Fraser, Ian Campbell, Tim Deegan, Ian Jackson,
	Jan Beulich

On 06/10/2015 04:09 PM, Andrew Cooper wrote:
> On 10/06/15 20:41, Ed White wrote:
>> On 06/10/2015 11:23 AM, Andrew Cooper wrote:
>>> On 10/06/15 01:09, Ed White wrote:
>>>> This document describes a new capability for VM Introspection, Security and Privacy in Xen. The new capability is called “altp2m” (short for Alternate p2m) that is used to provide the ability for Xen to host alternate guest physical memory domains for a specific guest-domain. This document describes the overall design specific to Xen for your review and feedback.
>>> This is a very thorough description, and everything here seems in order.
>>>
>>> A couple of remarks,
>>>
>>>> The altp2m functionality allows the capability to be used via an agent operating in an HVM guest or alternately an agent operating in a separate privileged domain. For cross domain operation, an XSM hook is defined such that the administrator can define a policy for inter-domain VM introspection.
>>> There should be an interlock to prevent both an internal and external
>>> entity from actually being able to use the altp2m infrastructure.
>>>
>>> Nothing good can come of an uncoordinated attempt like this, whereas a
>>> coordinated use would already have to be communicating far more than the
>>> hypercalls themselves would allow.
>>>
>> I have to disagree with you on that. Assuming we have the XSM hooks right,
>> you could enforce either/or through policy, but we have envisaged scenarios
>> where we would want to mix internal and external use of the hypercalls.
> 
> I suppose, as long as nothing bad can happen to Xen from mixed use, the
> worst that will happen is a very confused guest.
> 

No, nothing bad can happen to Xen.

The *only* difference between internal and external is how a violation is
reported. If the restriction was placed via an internal hypercall, an
attempt is made to report it internally, with the VM event mechanism
used as a fallback. Externally placed restrictions always result in
VM event notifications.

It is possible to end up crashing the domain if there's no listener,
but that's existing VM event behaviour.

>>> Also, hardware accelerated altp2m is mutually exclusive with EPT PML, as
>>> we have no way of determining which translation was in use when a gpa
>>> was appended to the buffer.  We are going to have to maintain a feature
>>> compatibility matrix.  Even for non-accelerated altp2m, the cost of
>>> working out the real gpa is likely prohibitive, and we should probably
>>> resort to declaring logdirty and altp2m as exclusive features.
>>>
>>> ~Andrew
>>>
>> I haven't investigated the PML code, but just to be clear, log-dirty
>> without PML is compatible with altp2m.
> 
> The logdirty code is built around the notion of a single linear idea of
> a guest's physical address space.  Altp2m, by its very nature, introduces
> non-linearities into a guest's physical address space.
> 
> A reasonable compromise might be "the linear view offered by the host
> p2m" (which itself highlights one fact that the host p2m, if it isn't
> already, must at all times be a strict superset of anything contained in
> any nestp2m/altp2m).  However, you are still left with the problem of
> translating a guest physical address as viewed in one altp2m into the
> host p2m's view of the same mfn.  I suppose this is possible, given a
> reverse lookup in the host p2m, but performance would be unappealing.
> 
> As for PML, it is an unspecified length of time until the hardware PML
> buffer fills up and Xen comes to drain it into the logdirty bitmap.  The
> vcpu in question could have switched EPTs any number of times in the
> intervening period.  Xen therefore cannot even tell which altp2m an
> individual gpa was logged under.
> 

To be clear, I was not suggesting that PML might work -- I know it will not.

The reason log-dirty without PML works is that page state transitions are
always done in the host p2m, even when the vcpu is not using that p2m. After
the transition occurs in the host p2m, that change is propagated to any altp2m
that currently has a copy of that gfn mapping, and if the gfn->mfn mapping is
different in the altp2m the entire altp2m is invalidated -- at least that
will be the case by the time I submit a revised patch series.

That is good enough for the common log-dirty use case (VRAM change tracking)
to work.
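
The propagation rule described above can be sketched as follows. This is a
toy model under stated assumptions, not the patch series itself: the p2m is a
small fixed array, and all names (struct p2m_model, host_change_type,
altp2m_invalidate, NR_GFNS) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_GFNS 8 /* toy guest address-space size (assumption) */

enum p2m_type_model { TYPE_RW, TYPE_LOGDIRTY };

struct p2m_entry_model {
    bool present;             /* does this p2m hold a mapping for the gfn? */
    unsigned long mfn;
    enum p2m_type_model type;
};

struct p2m_model {
    struct p2m_entry_model e[NR_GFNS];
};

/* Drop every mapping so the altp2m is repopulated lazily from the host p2m. */
static void altp2m_invalidate(struct p2m_model *ap)
{
    for (size_t i = 0; i < NR_GFNS; i++)
        ap->e[i].present = false;
}

/*
 * The state transition is always applied to the host p2m first. Each altp2m
 * holding a copy of the gfn then either takes the new type (same mfn) or is
 * invalidated as a whole (the gfn was remapped to a different mfn).
 */
void host_change_type(struct p2m_model *host, struct p2m_model *alt,
                      size_t nr_alt, unsigned long gfn, enum p2m_type_model t)
{
    assert(gfn < NR_GFNS && host->e[gfn].present);

    host->e[gfn].type = t;

    for (size_t i = 0; i < nr_alt; i++) {
        struct p2m_entry_model *ae = &alt[i].e[gfn];

        if (!ae->present)
            continue;                   /* no copy of this gfn, nothing to do */
        if (ae->mfn == host->e[gfn].mfn)
            ae->type = t;               /* same frame: propagate the change */
        else
            altp2m_invalidate(&alt[i]); /* remapped: invalidate entire altp2m */
    }
}
```

Invalidating the whole altp2m is the coarse but simple choice; the cost is
that unrelated mappings in that view are dropped as well.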

For a use case where the tracking must be 100% accurate, migration for example,
I don't see any alternative to stating that altp2m mode must be off for the
target domain, at least in the Xen 4.6 time frame.

Ed


* Re: Alternate p2m design specification
  2015-06-10 23:09     ` Andrew Cooper
  2015-06-10 23:41       ` Ed White
@ 2015-06-11 12:06       ` Tim Deegan
  2015-06-11 16:23         ` Ed White
  1 sibling, 1 reply; 11+ messages in thread
From: Tim Deegan @ 2015-06-11 12:06 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: ravi.sahita, Keir Fraser, Ian Campbell, Ian Jackson, Ed White,
	Jan Beulich, xen-devel

At 00:09 +0100 on 11 Jun (1433981379), Andrew Cooper wrote:
> On 10/06/15 20:41, Ed White wrote:
> > On 06/10/2015 11:23 AM, Andrew Cooper wrote:
> >> Also, hardware accelerated altp2m is mutually exclusive with EPT PML, as
> >> we have no way of determining which translation was in use when a gpa
> >> was appended to the buffer.  We are going to have to maintain a feature
> >> compatibility matrix.  Even for non-accelerated altp2m, the cost of
> >> working out the real gpa is likely prohibitive, and we should probably
> >> resort to declaring logdirty and altp2m as exclusive features.
> >>
> >> ~Andrew
> >>
> > I haven't investigated the PML code, but just to be clear, log-dirty
> > without PML is compatible with altp2m.
> 
> The logdirty code is built around the notion of a single linear idea of
> > a guest's physical address space.  Altp2m, by its very nature, introduces
> > non-linearities into a guest's physical address space.

All current users of the log-dirty code, except for PML, operate on
MFNs and use the M2P to get a PFN to use for the dirty-bitmap
operation.  Those paths should all be able to operate with altp2m
active.  (IOW the single linear address space is the 'host' p2m).

Cheers,

Tim.

* Re: Alternate p2m design specification
  2015-06-11 12:06       ` Tim Deegan
@ 2015-06-11 16:23         ` Ed White
  0 siblings, 0 replies; 11+ messages in thread
From: Ed White @ 2015-06-11 16:23 UTC (permalink / raw)
  To: Tim Deegan, Andrew Cooper
  Cc: ravi.sahita, Keir Fraser, Ian Campbell, Ian Jackson, Jan Beulich,
	xen-devel

On 06/11/2015 05:06 AM, Tim Deegan wrote:
> At 00:09 +0100 on 11 Jun (1433981379), Andrew Cooper wrote:
>> On 10/06/15 20:41, Ed White wrote:
>>> On 06/10/2015 11:23 AM, Andrew Cooper wrote:
>>>> Also, hardware accelerated altp2m is mutually exclusive with EPT PML, as
>>>> we have no way of determining which translation was in use when a gpa
>>>> was appended to the buffer.  We are going to have to maintain a feature
>>>> compatibility matrix.  Even for non-accelerated altp2m, the cost of
>>>> working out the real gpa is likely prohibitive, and we should probably
>>>> resort to declaring logdirty and altp2m as exclusive features.
>>>>
>>>> ~Andrew
>>>>
>>> I haven't investigated the PML code, but just to be clear, log-dirty
>>> without PML is compatible with altp2m.
>>
>> The logdirty code is built around the notion of a single linear idea of
>> a guest's physical address space.  Altp2m, by its very nature, introduces
>> non-linearities into a guest's physical address space.
> 
> All current users of the log-dirty code, except for PML, operate on
> MFNs and use the M2P to get a PFN to use for the dirty-bitmap
> operation.  Those paths should all be able to operate with altp2m
> active.  (IOW the single linear address space is the 'host' p2m).
> 

I was going to write something this morning to point this out.

The nested page fault handler does a gfn->mfn translation using
the p2m the hardware is currently using, and then the non-PML
log-dirty code does an mfn->gfn translation that is valid for the
host p2m and uses those gfns to update the log-dirty bitmap and
change the state of the pages in the host p2m.
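
As a rough illustration of that two-step translation, here is a toy model, not
Xen code. The flat arrays standing in for the active p2m and the M2P, and the
names mark_dirty_on_fault, struct logdirty_model, NR_FRAMES and INVALID_FRAME,
are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_FRAMES 16             /* toy number of gfns and mfns (assumption) */
#define INVALID_FRAME UINT32_MAX

struct logdirty_model {
    uint32_t m2p[NR_FRAMES];       /* mfn -> gfn as seen by the host p2m */
    uint8_t dirty[NR_FRAMES / 8];  /* log-dirty bitmap, indexed by host gfn */
};

/*
 * Step 1: gfn -> mfn through whichever p2m the hardware was using at the
 *         time of the fault (host p2m or an altp2m).
 * Step 2: mfn -> gfn through the M2P, which is valid for the host p2m.
 * The dirty bit is set for the host gfn, not for the faulting gfn.
 * Returns the host gfn that was marked, or -1 if a translation failed.
 */
int mark_dirty_on_fault(struct logdirty_model *ld, const uint32_t *active_p2m,
                        uint32_t fault_gfn)
{
    assert(ld != NULL && active_p2m != NULL);

    if (fault_gfn >= NR_FRAMES)
        return -1;

    uint32_t mfn = active_p2m[fault_gfn];
    if (mfn == INVALID_FRAME || mfn >= NR_FRAMES)
        return -1;

    uint32_t host_gfn = ld->m2p[mfn];
    if (host_gfn == INVALID_FRAME || host_gfn >= NR_FRAMES)
        return -1;

    ld->dirty[host_gfn / 8] |= (uint8_t)(1u << (host_gfn % 8));
    return (int)host_gfn;
}
```

If an altp2m maps gfn 9 to the frame that the host p2m exposes as gfn 2, a
write fault on gfn 9 marks bit 2 in this model, so the bitmap stays expressed
in host p2m terms.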

However, migration still won't work with altp2m active, because
of the extra state that it knows nothing about.

Ed
