xen-devel.lists.xenproject.org archive mirror
* (Reluctant) request to revert several changes, due to regressing VM migration
@ 2014-06-04 15:45 Andrew Cooper
  2014-06-04 16:00 ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Cooper @ 2014-06-04 15:45 UTC (permalink / raw)
  To: Jan Beulich, Wu, Feng; +Cc: Xen-devel List

Changeset 31ee951a3 "x86/HVM: correct the SMEP logic for
HVM_CR0_GUEST_RESERVED_BITS" breaks migration for VMs using SMEP.

For migration, the architectural state is restored before the cpuid
policy is written.  This appears to be the behaviour in libxl, and is
certainly the behaviour in Xapi.

As a result, a VM using SMEP will fail the CR4 check in
hvm_load_cpu_ctxt().  This is easy to observe by performing a localhost
migration of a modern HVM Linux VM which enables SMEP.

Changeset 58658992 performs an equivalent action for SMAP, and as such
will be equivalently broken on supporting hardware.


Specifically, c/s f952f9c7f0e which is the backport of 31ee951a3 into
staging-4.4 is the problematic change which is causing regressions in
XenServer testing.


This is a reluctant request, as the changeset is pragmatically correct.

However, there is a chicken & egg problem for libxc trying to calculate
the cpuid policy, which requires bits from the "magic chunks" in the
migration stream to be loaded first.  Therefore, simply moving the cpuid
policy calculation earlier will result in different problems.

~Andrew


* Re: (Reluctant) request to revert several changes, due to regressing VM migration
  2014-06-04 15:45 (Reluctant) request to revert several changes, due to regressing VM migration Andrew Cooper
@ 2014-06-04 16:00 ` Jan Beulich
  2014-06-04 16:08   ` Andrew Cooper
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2014-06-04 16:00 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Feng Wu, Xen-devel List

>>> On 04.06.14 at 17:45, <andrew.cooper3@citrix.com> wrote:
> Changeset 31ee951a3 "x86/HVM: correct the SMEP logic for
> HVM_CR0_GUEST_RESERVED_BITS" breaks migration for VMs using SMEP.
> 
> For migration, the architectural state is restored before the cpuid
> policy is written.  This appears to be the behaviour in libxl, and is
> certainly the behaviour in Xapi.
> 
> As a result, a VM using SMEP will fail the CR4 check in
> hvm_load_cpu_ctxt().  This is easy to observe by performing a localhost
> migration of a modern HVM Linux VM which enables SMEP.
> 
> Changeset 58658992 performs an equivalent action for SMAP, and as such
> will be equivalently broken on supporting hardware.
> 
> 
> Specifically, c/s f952f9c7f0e which is the backport of 31ee951a3 into
> staging-4.4 is the problematic change which is causing regressions in
> XenServer testing.
> 
> 
> This is a reluctant request, as the changeset is pragmatically correct.

So as already hinted at on irc - what's wrong with using
cpu_has_smep as long as the guest's d->arch.cpuid[] is blank?
If the incoming guest didn't see SMEP available, its CR4.SMEP bit
would necessarily be clear (or if it weren't, this would sooner
or later result in a guest crash).

But then again - isn't there another problem here: hvm_cpuid()
assumes it is running on the subject vCPU, which can hardly be the
case for the hvm_load_cpu_ctxt() code path using the macro
in question. So perhaps it even needs to be further relaxed,
using cpu_has_smep when current != v. Which of course
would require care by eventual future users of this macro.

Jan


* Re: (Reluctant) request to revert several changes, due to regressing VM migration
  2014-06-04 16:00 ` Jan Beulich
@ 2014-06-04 16:08   ` Andrew Cooper
  2014-06-05 11:58     ` [PATCH] x86/HVM: refine SMEP/SMAP tests in HVM_CR4_GUEST_RESERVED_BITS() Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Cooper @ 2014-06-04 16:08 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Feng Wu, Xen-devel List

On 04/06/14 17:00, Jan Beulich wrote:
>>>> On 04.06.14 at 17:45, <andrew.cooper3@citrix.com> wrote:
>> Changeset 31ee951a3 "x86/HVM: correct the SMEP logic for
>> HVM_CR0_GUEST_RESERVED_BITS" breaks migration for VMs using SMEP.
>>
>> For migration, the architectural state is restored before the cpuid
>> policy is written.  This appears to be the behaviour in libxl, and is
>> certainly the behaviour in Xapi.
>>
>> As a result, a VM using SMEP will fail the CR4 check in
>> hvm_load_cpu_ctxt().  This is easy to observe by performing a localhost
>> migration of a modern HVM Linux VM which enables SMEP.
>>
>> Changeset 58658992 performs an equivalent action for SMAP, and as such
>> will be equivalently broken on supporting hardware.
>>
>>
>> Specifically, c/s f952f9c7f0e which is the backport of 31ee951a3 into
>> staging-4.4 is the problematic change which is causing regressions in
>> XenServer testing.
>>
>>
>> This is a reluctant request, as the changeset is pragmatically correct.
> So as already hinted at on irc - what's wrong with using
> cpu_has_smep as long as the guest's d->arch.cpuid[] is blank?
> If the incoming guest didn't see SMEP available, its CR4.SMEP bit
> would necessarily be clear (or if it weren't, this would sooner
> or later result in a guest crash).
>
> But then again - isn't there another problem here: hvm_cpuid()
> assumes it is running on the subject vCPU, which can hardly be the
> case for the hvm_load_cpu_ctxt() code path using the macro
> in question. So perhaps it even needs to be further relaxed,
> using cpu_has_smep when current != v. Which of course
> would require care by eventual future users of this macro.
>
> Jan
>

As I have said, both previously on the list and at the hackathon, the
cpuid handling and domain cpuid policy infrastructure is a massive
stinking swamp which gets worse every time I find a new bit of it.

For XenServer and our VM feature levelling support, I am going to have
to fix it somehow.  Although, fixing the 32/64bit migration issue is
still a more important problem.


I am tempted to suggest reverting it back to what it was before and
leaving it in that state until other bits of the infrastructure are
actually working.  This certainly isn't the only place where this code
is fragile.

~Andrew


* [PATCH] x86/HVM: refine SMEP/SMAP tests in HVM_CR4_GUEST_RESERVED_BITS()
  2014-06-04 16:08   ` Andrew Cooper
@ 2014-06-05 11:58     ` Jan Beulich
  2014-06-05 13:07       ` David Vrabel
  2014-06-06  1:20       ` Wu, Feng
  0 siblings, 2 replies; 6+ messages in thread
From: Jan Beulich @ 2014-06-05 11:58 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel List; +Cc: Keir Fraser, Feng Wu


Andrew validly points out that the use of the macro on the restore path
can't rely on the CPUID bits for the guest already being in place (as
their setting by the tool stack in turn requires the other restore
operations already having taken place). And even worse, using
hvm_cpuid() is invalid here because that function assumes it is used
in the context of the vCPU in question.

Reverting to the behavior prior to the change from checking
cpu_has_sm?p to hvm_vcpu_has_sm?p() would break the other (non-restore)
use of the macro. So let's revert to the prior behavior only for the
restore path, by adding a respective second parameter to the macro.

Obviously the two cpu_has_* uses in the macro should really also be
converted to hvm_cpuid() based checks at least for the non-restore
path.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1754,7 +1754,7 @@ static int hvm_load_cpu_ctxt(struct doma
         return -EINVAL;
     }
 
-    if ( ctxt.cr4 & HVM_CR4_GUEST_RESERVED_BITS(v) )
+    if ( ctxt.cr4 & HVM_CR4_GUEST_RESERVED_BITS(v, 1) )
     {
         printk(XENLOG_G_ERR "HVM%d restore: bad CR4 %#" PRIx64 "\n",
                d->domain_id, ctxt.cr4);
@@ -3186,7 +3186,7 @@ int hvm_set_cr4(unsigned long value)
     struct vcpu *v = current;
     unsigned long old_cr;
 
-    if ( value & HVM_CR4_GUEST_RESERVED_BITS(v) )
+    if ( value & HVM_CR4_GUEST_RESERVED_BITS(v, 0) )
     {
         HVM_DBG_LOG(DBG_LEVEL_1,
                     "Guest attempts to set reserved bit in CR4: %lx",
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -406,19 +406,27 @@ static inline bool_t hvm_vcpu_has_smep(v
     (X86_CR4_VMXE | X86_CR4_PAE | X86_CR4_MCE))
 
 /* These bits in CR4 cannot be set by the guest. */
-#define HVM_CR4_GUEST_RESERVED_BITS(_v)                 \
+#define HVM_CR4_GUEST_RESERVED_BITS(v, restore) ({      \
+    const struct vcpu *_v = (v);                        \
+    bool_t _restore = !!(restore);                      \
+    ASSERT((_restore) || _v == current);                \
     (~((unsigned long)                                  \
        (X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD |       \
         X86_CR4_DE  | X86_CR4_PSE | X86_CR4_PAE |       \
         X86_CR4_MCE | X86_CR4_PGE | X86_CR4_PCE |       \
         X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT |           \
-        (hvm_vcpu_has_smep() ? X86_CR4_SMEP : 0) |      \
-        (hvm_vcpu_has_smap() ? X86_CR4_SMAP : 0) |      \
+        (((_restore) ? cpu_has_smep :                   \
+                       hvm_vcpu_has_smep()) ?           \
+         X86_CR4_SMEP : 0) |                            \
+        (((_restore) ? cpu_has_smap :                   \
+                       hvm_vcpu_has_smap()) ?           \
+         X86_CR4_SMAP : 0) |                            \
         (cpu_has_fsgsbase ? X86_CR4_FSGSBASE : 0) |     \
-        ((nestedhvm_enabled((_v)->domain) && cpu_has_vmx)\
-                      ? X86_CR4_VMXE : 0)  |             \
-        (cpu_has_pcid ? X86_CR4_PCIDE : 0) |             \
-        (cpu_has_xsave ? X86_CR4_OSXSAVE : 0))))
+        ((nestedhvm_enabled(_v->domain) && cpu_has_vmx) \
+                      ? X86_CR4_VMXE : 0)  |            \
+        (cpu_has_pcid ? X86_CR4_PCIDE : 0) |            \
+        (cpu_has_xsave ? X86_CR4_OSXSAVE : 0))));       \
+})
 
 /* These exceptions must always be intercepted. */
 #define HVM_TRAP_MASK ((1U << TRAP_machine_check) | (1U << TRAP_invalid_op))






* Re: [PATCH] x86/HVM: refine SMEP/SMAP tests in HVM_CR4_GUEST_RESERVED_BITS()
  2014-06-05 11:58     ` [PATCH] x86/HVM: refine SMEP/SMAP tests in HVM_CR4_GUEST_RESERVED_BITS() Jan Beulich
@ 2014-06-05 13:07       ` David Vrabel
  2014-06-06  1:20       ` Wu, Feng
  1 sibling, 0 replies; 6+ messages in thread
From: David Vrabel @ 2014-06-05 13:07 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper, Xen-devel List; +Cc: Feng Wu, Keir Fraser

On 05/06/14 12:58, Jan Beulich wrote:
> Andrew validly points out that the use of the macro on the restore path
> can't rely on the CPUID bits for the guest already being in place (as
> their setting by the tool stack in turn requires the other restore
> operations already having taken place). And even worse, using
> hvm_cpuid() is invalid here because that function assumes it is used
> in the context of the vCPU in question.
> 
> Reverting to the behavior prior to the change from checking
> cpu_has_sm?p to hvm_vcpu_has_sm?p() would break the other (non-restore)
> use of the macro. So let's revert to the prior behavior only for the
> restore path, by adding a respective second parameter to the macro.
> 
> Obviously the two cpu_has_* uses in the macro should really also be
> converted to hvm_cpuid() based checks at least for the non-restore
> path.

FWIW,

Tested-by: David Vrabel <david.vrabel@citrix.com>

But tested on 4.4.x, not unstable.

David


* Re: [PATCH] x86/HVM: refine SMEP/SMAP tests in HVM_CR4_GUEST_RESERVED_BITS()
  2014-06-05 11:58     ` [PATCH] x86/HVM: refine SMEP/SMAP tests in HVM_CR4_GUEST_RESERVED_BITS() Jan Beulich
  2014-06-05 13:07       ` David Vrabel
@ 2014-06-06  1:20       ` Wu, Feng
  1 sibling, 0 replies; 6+ messages in thread
From: Wu, Feng @ 2014-06-06  1:20 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper, Xen-devel List; +Cc: Keir Fraser

Thanks to Andrew for pointing this out, and thanks a lot to Jan for the fix!

Thanks,
Feng

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, June 05, 2014 7:59 PM
> To: Andrew Cooper; Xen-devel List
> Cc: Wu, Feng; Keir Fraser
> Subject: [PATCH] x86/HVM: refine SMEP/SMAP tests in
> HVM_CR4_GUEST_RESERVED_BITS()
> 
> Andrew validly points out that the use of the macro on the restore path
> can't rely on the CPUID bits for the guest already being in place (as
> their setting by the tool stack in turn requires the other restore
> operations already having taken place). And even worse, using
> hvm_cpuid() is invalid here because that function assumes it is used
> in the context of the vCPU in question.
> 
> Reverting to the behavior prior to the change from checking
> cpu_has_sm?p to hvm_vcpu_has_sm?p() would break the other (non-restore)
> use of the macro. So let's revert to the prior behavior only for the
> restore path, by adding a respective second parameter to the macro.
> 
> Obviously the two cpu_has_* uses in the macro should really also be
> converted to hvm_cpuid() based checks at least for the non-restore
> path.
> 
> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -1754,7 +1754,7 @@ static int hvm_load_cpu_ctxt(struct doma
>          return -EINVAL;
>      }
> 
> -    if ( ctxt.cr4 & HVM_CR4_GUEST_RESERVED_BITS(v) )
> +    if ( ctxt.cr4 & HVM_CR4_GUEST_RESERVED_BITS(v, 1) )
>      {
>          printk(XENLOG_G_ERR "HVM%d restore: bad CR4 %#" PRIx64 "\n",
>                 d->domain_id, ctxt.cr4);
> @@ -3186,7 +3186,7 @@ int hvm_set_cr4(unsigned long value)
>      struct vcpu *v = current;
>      unsigned long old_cr;
> 
> -    if ( value & HVM_CR4_GUEST_RESERVED_BITS(v) )
> +    if ( value & HVM_CR4_GUEST_RESERVED_BITS(v, 0) )
>      {
>          HVM_DBG_LOG(DBG_LEVEL_1,
>                      "Guest attempts to set reserved bit in CR4: %lx",
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -406,19 +406,27 @@ static inline bool_t hvm_vcpu_has_smep(v
>      (X86_CR4_VMXE | X86_CR4_PAE | X86_CR4_MCE))
> 
>  /* These bits in CR4 cannot be set by the guest. */
> -#define HVM_CR4_GUEST_RESERVED_BITS(_v)                 \
> +#define HVM_CR4_GUEST_RESERVED_BITS(v, restore) ({      \
> +    const struct vcpu *_v = (v);                        \
> +    bool_t _restore = !!(restore);                      \
> +    ASSERT((_restore) || _v == current);                \
>      (~((unsigned long)                                  \
>         (X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD |       \
>          X86_CR4_DE  | X86_CR4_PSE | X86_CR4_PAE |       \
>          X86_CR4_MCE | X86_CR4_PGE | X86_CR4_PCE |       \
>          X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT |           \
> -        (hvm_vcpu_has_smep() ? X86_CR4_SMEP : 0) |      \
> -        (hvm_vcpu_has_smap() ? X86_CR4_SMAP : 0) |      \
> +        (((_restore) ? cpu_has_smep :                   \
> +                       hvm_vcpu_has_smep()) ?           \
> +         X86_CR4_SMEP : 0) |                            \
> +        (((_restore) ? cpu_has_smap :                   \
> +                       hvm_vcpu_has_smap()) ?           \
> +         X86_CR4_SMAP : 0) |                            \
>          (cpu_has_fsgsbase ? X86_CR4_FSGSBASE : 0) |     \
> -        ((nestedhvm_enabled((_v)->domain) && cpu_has_vmx)\
> -                      ? X86_CR4_VMXE : 0)  |             \
> -        (cpu_has_pcid ? X86_CR4_PCIDE : 0) |             \
> -        (cpu_has_xsave ? X86_CR4_OSXSAVE : 0))))
> +        ((nestedhvm_enabled(_v->domain) && cpu_has_vmx) \
> +                      ? X86_CR4_VMXE : 0)  |            \
> +        (cpu_has_pcid ? X86_CR4_PCIDE : 0) |            \
> +        (cpu_has_xsave ? X86_CR4_OSXSAVE : 0))));       \
> +})
> 
>  /* These exceptions must always be intercepted. */
>  #define HVM_TRAP_MASK ((1U << TRAP_machine_check) | (1U <<
> TRAP_invalid_op))
> 
> 

