* [PATCH v3] xen: fix reboot/shutdown with running HVM guests
@ 2014-06-05 15:01 Roger Pau Monne
2014-06-05 15:42 ` Aravind Gopalakrishnan
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Roger Pau Monne @ 2014-06-05 15:01 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Keir Fraser, Suravee Suthikulpanit, Jun Nakajima,
Eddie Dong, Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
Roger Pau Monne
If there's a guest using VMX/SVM when the hypervisor shuts down, it
can lead to the following crash due to VMX/SVM functions being called
after hvm_cpu_down has been called. In order to prevent that, check in
{svm/vmx}_ctxt_switch_from that the cpu virtualization extensions are
still enabled.
(XEN) Domain 0 shutdown: rebooting machine.
(XEN) Assertion 'read_cr0() & X86_CR0_TS' failed at vmx.c:644
(XEN) ----[ Xen-4.5-unstable x86_64 debug=y Tainted: C ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82d0801d90ce>] vmx_ctxt_switch_from+0x1e/0x14c
(XEN) RFLAGS: 0000000000010046 CONTEXT: hypervisor
(XEN) rax: 0000000080050033 rbx: ffff8300dfb1c000 rcx: 0000000000000000
(XEN) rdx: 0000000000000000 rsi: ffff82d0802d7fc0 rdi: ffff8300dfb1c000
(XEN) rbp: ffff82d0802d7a88 rsp: ffff82d0802d7a78 r8: 0000000000000000
(XEN) r9: ffff82cffffff000 r10: 0000000b06dca869 r11: 0000003d7d708160
(XEN) r12: ffff8300dfb1c000 r13: 0000000000000000 r14: ffff82d0802d0000
(XEN) r15: ffff82d0802d7f18 cr0: 0000000080050033 cr4: 00000000000026f0
(XEN) cr3: 000000019ed8d000 cr2: 0000000800dcb040
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) Xen stack trace from rsp=ffff82d0802d7a78:
(XEN) ffff8300dfdf2000 ffff8300dfdf2000 ffff82d0802d7ad8 ffff82d08015d129
(XEN) fffffe003d272a38 ffff83019a3f9140 0000000000000000 0000000000000001
(XEN) 0000000000000086 000000000019a400 0000000000000000 0001000000010030
(XEN) ffff82d0802d7af8 ffff82d080160acf ffff83019ec18530 ffff8300dfdf2000
(XEN) ffff82d0802d7b08 ffff82d080160af9 ffff82d0802d7b58 ffff82d080161728
(XEN) ffff82d0802d7b48 ffff82d08013ffbf 0000000000000002 ffff83019ec18530
(XEN) 0000000000000000 0000000000000012 0000000000000000 0001000000010030
(XEN) ffff82d0802d7b68 ffff82d08014e721 ffff82d0802d7bc8 ffff82d08014cda2
(XEN) ffff8300dfb1c000 0000000000000092 ffff83019ec18604 ffff83019ec185f8
(XEN) ffff82d0802d0000 0000000000000001 0000000000000000 ffff82d08016560e
(XEN) ffff82d080272860 0000000000000020 ffff82d0802d7bd8 ffff82d0801448a8
(XEN) ffff82d0802d7be8 ffff82d080165625 ffff82d0802d7c18 ffff82d080166143
(XEN) 0000000000000000 0000000000000001 0000000000000000 ffff82d080272860
(XEN) ffff82d0802d7c48 ffff82d080166aa8 ffff8300dfb1c060 0000000000010000
(XEN) 0000000000000001 ffff82d080272860 ffff82d0802d7c78 ffff82d080166bae
(XEN) 000000000000000a ffff82d080276fe0 00000000000000fb 0000000000000061
(XEN) ffff82d0802d7c98 ffff82d080166f63 ffff82d0802d7c98 ffff82d0801821ff
(XEN) ffff82d0802d7cb8 ffff82d08018228b 0000000000000000 ffff82d0802d7dd8
(XEN) ffff82d0802d7cf8 ffff82d080181aa7 ffff82d0802d7d08 0000000000000206
(XEN) 0000000000000000 ffff82d0802d7dd8 00000000000000fb 0000000000000008
(XEN) Xen call trace:
(XEN) [<ffff82d0801d90ce>] vmx_ctxt_switch_from+0x1e/0x14c
(XEN) [<ffff82d08015d129>] __context_switch+0x127/0x462
(XEN) [<ffff82d080160acf>] __sync_local_execstate+0x6a/0x8b
(XEN) [<ffff82d080160af9>] sync_local_execstate+0x9/0xb
(XEN) [<ffff82d080161728>] map_domain_page+0x88/0x4de
(XEN) [<ffff82d08014e721>] map_vtd_domain_page+0xd/0xf
(XEN) [<ffff82d08014cda2>] io_apic_read_remap_rte+0x158/0x29f
(XEN) [<ffff82d0801448a8>] iommu_read_apic_from_ire+0x27/0x29
(XEN) [<ffff82d080165625>] io_apic_read+0x17/0x65
(XEN) [<ffff82d080166143>] __ioapic_read_entry+0x38/0x61
(XEN) [<ffff82d080166aa8>] clear_IO_APIC_pin+0x1a/0xf3
(XEN) [<ffff82d080166bae>] clear_IO_APIC+0x2d/0x60
(XEN) [<ffff82d080166f63>] disable_IO_APIC+0xd/0x81
(XEN) [<ffff82d08018228b>] smp_send_stop+0x58/0x68
(XEN) [<ffff82d080181aa7>] machine_restart+0x80/0x20a
(XEN) [<ffff82d080181c3c>] __machine_restart+0xb/0xf
(XEN) [<ffff82d080128fb9>] smp_call_function_interrupt+0x99/0xc0
(XEN) [<ffff82d080182330>] call_function_interrupt+0x33/0x43
(XEN) [<ffff82d08016bd89>] do_IRQ+0x9e/0x63a
(XEN) [<ffff82d08016406f>] common_interrupt+0x5f/0x70
(XEN) [<ffff82d0801a8600>] mwait_idle+0x29c/0x2f7
(XEN) [<ffff82d08015cf67>] idle_loop+0x58/0x76
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'read_cr0() & X86_CR0_TS' failed at vmx.c:644
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Suggested-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Eddie Dong <eddie.dong@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Mukesh Rathor <mukesh.rathor@oracle.com>
---
Changes since v2:
- Added comments on the gates added to {svm/vmx}_ctxt_switch_from.
- Move the declaration of the pcpu variable vmxon to vmcs.h.
---
xen/arch/x86/hvm/svm/svm.c | 8 ++++++++
xen/arch/x86/hvm/vmx/vmcs.c | 2 +-
xen/arch/x86/hvm/vmx/vmx.c | 8 ++++++++
xen/include/asm-x86/hvm/vmx/vmcs.h | 2 ++
4 files changed, 19 insertions(+), 1 deletions(-)
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 870e4ee..76616ac 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -966,6 +966,14 @@ static void svm_ctxt_switch_from(struct vcpu *v)
{
int cpu = smp_processor_id();
+ /*
+ * Return early if trying to do a context switch without SVM enabled,
+ * this can happen when the hypervisor shuts down with HVM guests
+ * still running.
+ */
+ if ( unlikely((read_efer() & EFER_SVME) == 0) )
+ return;
+
svm_fpu_leave(v);
svm_save_dr(v);
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 7564895..8ffc562 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -74,7 +74,7 @@ u64 vmx_ept_vpid_cap __read_mostly;
static DEFINE_PER_CPU_READ_MOSTLY(struct vmcs_struct *, vmxon_region);
static DEFINE_PER_CPU(struct vmcs_struct *, current_vmcs);
static DEFINE_PER_CPU(struct list_head, active_vmcs_list);
-static DEFINE_PER_CPU(bool_t, vmxon);
+DEFINE_PER_CPU(bool_t, vmxon);
static u32 vmcs_revision_id __read_mostly;
u64 __read_mostly vmx_basic_msr;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index d45fb7f..2caa04a 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -671,6 +671,14 @@ static void vmx_fpu_leave(struct vcpu *v)
static void vmx_ctxt_switch_from(struct vcpu *v)
{
+ /*
+ * Return early if trying to do a context switch without VMX enabled,
+ * this can happen when the hypervisor shuts down with HVM guests
+ * still running.
+ */
+ if ( unlikely(!this_cpu(vmxon)) )
+ return;
+
vmx_fpu_leave(v);
vmx_save_guest_msrs(v);
vmx_restore_host_msrs();
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index 445b39f..215d93c 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -490,6 +490,8 @@ void virtual_vmcs_exit(void *vvmcs);
u64 virtual_vmcs_vmread(void *vvmcs, u32 vmcs_encoding);
void virtual_vmcs_vmwrite(void *vvmcs, u32 vmcs_encoding, u64 val);
+DECLARE_PER_CPU(bool_t, vmxon);
+
#endif /* ASM_X86_HVM_VMX_VMCS_H__ */
/*
--
1.7.7.5 (Apple Git-26)
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH v3] xen: fix reboot/shutdown with running HVM guests
2014-06-05 15:01 [PATCH v3] xen: fix reboot/shutdown with running HVM guests Roger Pau Monne
@ 2014-06-05 15:42 ` Aravind Gopalakrishnan
2014-06-05 15:46 ` Andrew Cooper
2014-06-05 19:12 ` Tian, Kevin
2 siblings, 0 replies; 7+ messages in thread
From: Aravind Gopalakrishnan @ 2014-06-05 15:42 UTC (permalink / raw)
To: Roger Pau Monne, xen-devel
Cc: Kevin Tian, Keir Fraser, Jan Beulich, Eddie Dong, Jun Nakajima,
Boris Ostrovsky, Suravee Suthikulpanit
On 6/5/2014 10:01 AM, Roger Pau Monne wrote:
> If there's a guest using VMX/SVM when the hypervisor shuts down, it
> can lead to the following crash due to VMX/SVM functions being called
> after hvm_cpu_down has been called. In order to prevent that, check in
> {svm/vmx}_ctxt_switch_from that the cpu virtualization extensions are
> still enabled.
>
> (XEN) Domain 0 shutdown: rebooting machine.
> (XEN) Assertion 'read_cr0() & X86_CR0_TS' failed at vmx.c:644
> (XEN) ----[ Xen-4.5-unstable x86_64 debug=y Tainted: C ]----
> (XEN) CPU: 0
> (XEN) RIP: e008:[<ffff82d0801d90ce>] vmx_ctxt_switch_from+0x1e/0x14c
> (XEN) RFLAGS: 0000000000010046 CONTEXT: hypervisor
> (XEN) rax: 0000000080050033 rbx: ffff8300dfb1c000 rcx: 0000000000000000
> (XEN) rdx: 0000000000000000 rsi: ffff82d0802d7fc0 rdi: ffff8300dfb1c000
> (XEN) rbp: ffff82d0802d7a88 rsp: ffff82d0802d7a78 r8: 0000000000000000
> (XEN) r9: ffff82cffffff000 r10: 0000000b06dca869 r11: 0000003d7d708160
> (XEN) r12: ffff8300dfb1c000 r13: 0000000000000000 r14: ffff82d0802d0000
> (XEN) r15: ffff82d0802d7f18 cr0: 0000000080050033 cr4: 00000000000026f0
> (XEN) cr3: 000000019ed8d000 cr2: 0000000800dcb040
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> (XEN) Xen stack trace from rsp=ffff82d0802d7a78:
> (XEN) ffff8300dfdf2000 ffff8300dfdf2000 ffff82d0802d7ad8 ffff82d08015d129
> (XEN) fffffe003d272a38 ffff83019a3f9140 0000000000000000 0000000000000001
> (XEN) 0000000000000086 000000000019a400 0000000000000000 0001000000010030
> (XEN) ffff82d0802d7af8 ffff82d080160acf ffff83019ec18530 ffff8300dfdf2000
> (XEN) ffff82d0802d7b08 ffff82d080160af9 ffff82d0802d7b58 ffff82d080161728
> (XEN) ffff82d0802d7b48 ffff82d08013ffbf 0000000000000002 ffff83019ec18530
> (XEN) 0000000000000000 0000000000000012 0000000000000000 0001000000010030
> (XEN) ffff82d0802d7b68 ffff82d08014e721 ffff82d0802d7bc8 ffff82d08014cda2
> (XEN) ffff8300dfb1c000 0000000000000092 ffff83019ec18604 ffff83019ec185f8
> (XEN) ffff82d0802d0000 0000000000000001 0000000000000000 ffff82d08016560e
> (XEN) ffff82d080272860 0000000000000020 ffff82d0802d7bd8 ffff82d0801448a8
> (XEN) ffff82d0802d7be8 ffff82d080165625 ffff82d0802d7c18 ffff82d080166143
> (XEN) 0000000000000000 0000000000000001 0000000000000000 ffff82d080272860
> (XEN) ffff82d0802d7c48 ffff82d080166aa8 ffff8300dfb1c060 0000000000010000
> (XEN) 0000000000000001 ffff82d080272860 ffff82d0802d7c78 ffff82d080166bae
> (XEN) 000000000000000a ffff82d080276fe0 00000000000000fb 0000000000000061
> (XEN) ffff82d0802d7c98 ffff82d080166f63 ffff82d0802d7c98 ffff82d0801821ff
> (XEN) ffff82d0802d7cb8 ffff82d08018228b 0000000000000000 ffff82d0802d7dd8
> (XEN) ffff82d0802d7cf8 ffff82d080181aa7 ffff82d0802d7d08 0000000000000206
> (XEN) 0000000000000000 ffff82d0802d7dd8 00000000000000fb 0000000000000008
> (XEN) Xen call trace:
> (XEN) [<ffff82d0801d90ce>] vmx_ctxt_switch_from+0x1e/0x14c
> (XEN) [<ffff82d08015d129>] __context_switch+0x127/0x462
> (XEN) [<ffff82d080160acf>] __sync_local_execstate+0x6a/0x8b
> (XEN) [<ffff82d080160af9>] sync_local_execstate+0x9/0xb
> (XEN) [<ffff82d080161728>] map_domain_page+0x88/0x4de
> (XEN) [<ffff82d08014e721>] map_vtd_domain_page+0xd/0xf
> (XEN) [<ffff82d08014cda2>] io_apic_read_remap_rte+0x158/0x29f
> (XEN) [<ffff82d0801448a8>] iommu_read_apic_from_ire+0x27/0x29
> (XEN) [<ffff82d080165625>] io_apic_read+0x17/0x65
> (XEN) [<ffff82d080166143>] __ioapic_read_entry+0x38/0x61
> (XEN) [<ffff82d080166aa8>] clear_IO_APIC_pin+0x1a/0xf3
> (XEN) [<ffff82d080166bae>] clear_IO_APIC+0x2d/0x60
> (XEN) [<ffff82d080166f63>] disable_IO_APIC+0xd/0x81
> (XEN) [<ffff82d08018228b>] smp_send_stop+0x58/0x68
> (XEN) [<ffff82d080181aa7>] machine_restart+0x80/0x20a
> (XEN) [<ffff82d080181c3c>] __machine_restart+0xb/0xf
> (XEN) [<ffff82d080128fb9>] smp_call_function_interrupt+0x99/0xc0
> (XEN) [<ffff82d080182330>] call_function_interrupt+0x33/0x43
> (XEN) [<ffff82d08016bd89>] do_IRQ+0x9e/0x63a
> (XEN) [<ffff82d08016406f>] common_interrupt+0x5f/0x70
> (XEN) [<ffff82d0801a8600>] mwait_idle+0x29c/0x2f7
> (XEN) [<ffff82d08015cf67>] idle_loop+0x58/0x76
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Assertion 'read_cr0() & X86_CR0_TS' failed at vmx.c:644
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Suggested-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Eddie Dong <eddie.dong@intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Mukesh Rathor <mukesh.rathor@oracle.com>
> ---
> Changes since v2:
> - Added comments on the gates added to {svm/vmx}_ctxt_switch_from.
> - Move the declaration of the pcpu variable vmxon to vmcs.h.
> ---
> xen/arch/x86/hvm/svm/svm.c | 8 ++++++++
> xen/arch/x86/hvm/vmx/vmcs.c | 2 +-
> xen/arch/x86/hvm/vmx/vmx.c | 8 ++++++++
> xen/include/asm-x86/hvm/vmx/vmcs.h | 2 ++
> 4 files changed, 19 insertions(+), 1 deletions(-)
Acked-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH v3] xen: fix reboot/shutdown with running HVM guests
2014-06-05 15:01 [PATCH v3] xen: fix reboot/shutdown with running HVM guests Roger Pau Monne
2014-06-05 15:42 ` Aravind Gopalakrishnan
@ 2014-06-05 15:46 ` Andrew Cooper
2014-06-05 16:02 ` Jan Beulich
2014-06-05 19:12 ` Tian, Kevin
2 siblings, 1 reply; 7+ messages in thread
From: Andrew Cooper @ 2014-06-05 15:46 UTC (permalink / raw)
To: Roger Pau Monne
Cc: Kevin Tian, Keir Fraser, Suravee Suthikulpanit, Eddie Dong,
Jan Beulich, Aravind Gopalakrishnan, Jun Nakajima, xen-devel,
Boris Ostrovsky
On 05/06/14 16:01, Roger Pau Monne wrote:
> If there's a guest using VMX/SVM when the hypervisor shuts down, it
> can lead to the following crash due to VMX/SVM functions being called
> after hvm_cpu_down has been called. In order to prevent that, check in
> {svm/vmx}_ctxt_switch_from that the cpu virtualization extensions are
> still enabled.
>
>
I feel that this is still trying to fix the problem from the wrong end.
It is incorrect to be in a context switch at the point identified in the
stack trace.
How about having the hvm_cpu_down functions look at current, and
optionally run sync_exec_state() ?
~Andrew
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH v3] xen: fix reboot/shutdown with running HVM guests
2014-06-05 15:46 ` Andrew Cooper
@ 2014-06-05 16:02 ` Jan Beulich
2014-06-05 16:16 ` Andrew Cooper
0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2014-06-05 16:02 UTC (permalink / raw)
To: Andrew Cooper
Cc: Kevin Tian, KeirFraser, Suravee Suthikulpanit, Eddie Dong,
Aravind Gopalakrishnan, Jun Nakajima, xen-devel, Boris Ostrovsky,
Roger Pau Monne
>>> On 05.06.14 at 17:46, <andrew.cooper3@citrix.com> wrote:
> On 05/06/14 16:01, Roger Pau Monne wrote:
>> If there's a guest using VMX/SVM when the hypervisor shuts down, it
>> can lead to the following crash due to VMX/SVM functions being called
>> after hvm_cpu_down has been called. In order to prevent that, check in
>> {svm/vmx}_ctxt_switch_from that the cpu virtualization extensions are
>> still enabled.
>
> I feel that this is still trying to fix the problem from the wrong end.
> It is incorrect to be in a context switch at the point identified in the
> stack trace.
>
> How about having the hvm_cpu_down functions look at current, and
> optionally run sync_exec_state() ?
I considered this too before suggesting the other alternative,
but why would getting into the context switch path this way
be any better than through map_domain_page()?
Jan
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH v3] xen: fix reboot/shutdown with running HVM guests
2014-06-05 16:02 ` Jan Beulich
@ 2014-06-05 16:16 ` Andrew Cooper
2014-06-05 19:11 ` Tian, Kevin
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Cooper @ 2014-06-05 16:16 UTC (permalink / raw)
To: Jan Beulich
Cc: Kevin Tian, KeirFraser, Suravee Suthikulpanit, Eddie Dong,
Aravind Gopalakrishnan, Jun Nakajima, xen-devel, Boris Ostrovsky,
Roger Pau Monne
On 05/06/14 17:02, Jan Beulich wrote:
>>>> On 05.06.14 at 17:46, <andrew.cooper3@citrix.com> wrote:
>> On 05/06/14 16:01, Roger Pau Monne wrote:
>>> If there's a guest using VMX/SVM when the hypervisor shuts down, it
>>> can lead to the following crash due to VMX/SVM functions being called
>>> after hvm_cpu_down has been called. In order to prevent that, check in
>>> {svm/vmx}_ctxt_switch_from that the cpu virtualization extensions are
>>> still enabled.
>> I feel that this is still trying to fix the problem from the wrong end.
>> It is incorrect to be in a context switch at the point identified in the
>> stack trace.
>>
>> How about having the hvm_cpu_down functions look at current, and
>> optionally run sync_exec_state() ?
> I considered this too before suggesting the other alternative,
> but why would getting into the context switch path this way
> be any better than through map_domain_page()?
>
> Jan
>
It is a more controlled point on all shutdown/crash paths, but still not
perfect.
What we probably need on all shutdown paths is an early step of
"reschedule the idle vcpu back on all pcpus". This won't fix the issue
for crash paths however.
~Andrew
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH v3] xen: fix reboot/shutdown with running HVM guests
2014-06-05 16:16 ` Andrew Cooper
@ 2014-06-05 19:11 ` Tian, Kevin
0 siblings, 0 replies; 7+ messages in thread
From: Tian, Kevin @ 2014-06-05 19:11 UTC (permalink / raw)
To: Andrew Cooper, Jan Beulich
Cc: KeirFraser, Suravee Suthikulpanit, Dong, Eddie,
Aravind Gopalakrishnan, Nakajima, Jun,
xen-devel@lists.xenproject.org, Boris Ostrovsky, Roger Pau Monne
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Thursday, June 05, 2014 9:16 AM
>
> On 05/06/14 17:02, Jan Beulich wrote:
> >>>> On 05.06.14 at 17:46, <andrew.cooper3@citrix.com> wrote:
> >> On 05/06/14 16:01, Roger Pau Monne wrote:
> >>> If there's a guest using VMX/SVM when the hypervisor shuts down, it
> >>> can lead to the following crash due to VMX/SVM functions being called
> >>> after hvm_cpu_down has been called. In order to prevent that, check in
> >>> {svm/vmx}_ctxt_switch_from that the cpu virtualization extensions are
> >>> still enabled.
> >> I feel that this is still trying to fix the problem from the wrong end.
> >> It is incorrect to be in a context switch at the point identified in the
> >> stack trace.
> >>
> >> How about having the hvm_cpu_down functions look at current, and
> >> optionally run sync_exec_state() ?
> > I considered this too before suggesting the other alternative,
> > but why would getting into the context switch path this way
> > be any better than through map_domain_page()?
> >
> > Jan
> >
>
> It is a more controlled point on all shutdown/crash paths, but still not
> perfect.
>
> What we probably need on all shutdown paths is an early step of
> "reschedule the idle vcpu back on all pcpus". This won't fix the issue
> for crash paths however.
>
possibly we'll need both. I'll go to ack the patch anyway, since the check
itself still makes sense there.
Thanks
Kevin
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3] xen: fix reboot/shutdown with running HVM guests
2014-06-05 15:01 [PATCH v3] xen: fix reboot/shutdown with running HVM guests Roger Pau Monne
2014-06-05 15:42 ` Aravind Gopalakrishnan
2014-06-05 15:46 ` Andrew Cooper
@ 2014-06-05 19:12 ` Tian, Kevin
2 siblings, 0 replies; 7+ messages in thread
From: Tian, Kevin @ 2014-06-05 19:12 UTC (permalink / raw)
To: Roger Pau Monne, xen-devel@lists.xenproject.org
Cc: Keir Fraser, Jan Beulich, Dong, Eddie, Aravind Gopalakrishnan,
Nakajima, Jun, Boris Ostrovsky, Suravee Suthikulpanit
> From: Roger Pau Monne [mailto:roger.pau@citrix.com]
> Sent: Thursday, June 05, 2014 8:01 AM
> If there's a guest using VMX/SVM when the hypervisor shuts down, it
> can lead to the following crash due to VMX/SVM functions being called
> after hvm_cpu_down has been called. In order to prevent that, check in
> {svm/vmx}_ctxt_switch_from that the cpu virtualization extensions are
> still enabled.
>
> (XEN) Domain 0 shutdown: rebooting machine.
> (XEN) Assertion 'read_cr0() & X86_CR0_TS' failed at vmx.c:644
> (XEN) ----[ Xen-4.5-unstable x86_64 debug=y Tainted: C ]----
> (XEN) CPU: 0
> (XEN) RIP: e008:[<ffff82d0801d90ce>] vmx_ctxt_switch_from+0x1e/0x14c
> (XEN) RFLAGS: 0000000000010046 CONTEXT: hypervisor
> (XEN) rax: 0000000080050033 rbx: ffff8300dfb1c000 rcx:
> 0000000000000000
> (XEN) rdx: 0000000000000000 rsi: ffff82d0802d7fc0 rdi:
> ffff8300dfb1c000
> (XEN) rbp: ffff82d0802d7a88 rsp: ffff82d0802d7a78 r8:
> 0000000000000000
> (XEN) r9: ffff82cffffff000 r10: 0000000b06dca869 r11:
> 0000003d7d708160
> (XEN) r12: ffff8300dfb1c000 r13: 0000000000000000 r14:
> ffff82d0802d0000
> (XEN) r15: ffff82d0802d7f18 cr0: 0000000080050033 cr4:
> 00000000000026f0
> (XEN) cr3: 000000019ed8d000 cr2: 0000000800dcb040
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> (XEN) Xen stack trace from rsp=ffff82d0802d7a78:
> (XEN) ffff8300dfdf2000 ffff8300dfdf2000 ffff82d0802d7ad8
> ffff82d08015d129
> (XEN) fffffe003d272a38 ffff83019a3f9140 0000000000000000
> 0000000000000001
> (XEN) 0000000000000086 000000000019a400 0000000000000000
> 0001000000010030
> (XEN) ffff82d0802d7af8 ffff82d080160acf ffff83019ec18530
> ffff8300dfdf2000
> (XEN) ffff82d0802d7b08 ffff82d080160af9 ffff82d0802d7b58
> ffff82d080161728
> (XEN) ffff82d0802d7b48 ffff82d08013ffbf 0000000000000002
> ffff83019ec18530
> (XEN) 0000000000000000 0000000000000012 0000000000000000
> 0001000000010030
> (XEN) ffff82d0802d7b68 ffff82d08014e721 ffff82d0802d7bc8
> ffff82d08014cda2
> (XEN) ffff8300dfb1c000 0000000000000092 ffff83019ec18604
> ffff83019ec185f8
> (XEN) ffff82d0802d0000 0000000000000001 0000000000000000
> ffff82d08016560e
> (XEN) ffff82d080272860 0000000000000020 ffff82d0802d7bd8
> ffff82d0801448a8
> (XEN) ffff82d0802d7be8 ffff82d080165625 ffff82d0802d7c18
> ffff82d080166143
> (XEN) 0000000000000000 0000000000000001 0000000000000000
> ffff82d080272860
> (XEN) ffff82d0802d7c48 ffff82d080166aa8 ffff8300dfb1c060
> 0000000000010000
> (XEN) 0000000000000001 ffff82d080272860 ffff82d0802d7c78
> ffff82d080166bae
> (XEN) 000000000000000a ffff82d080276fe0 00000000000000fb
> 0000000000000061
> (XEN) ffff82d0802d7c98 ffff82d080166f63 ffff82d0802d7c98
> ffff82d0801821ff
> (XEN) ffff82d0802d7cb8 ffff82d08018228b 0000000000000000
> ffff82d0802d7dd8
> (XEN) ffff82d0802d7cf8 ffff82d080181aa7 ffff82d0802d7d08
> 0000000000000206
> (XEN) 0000000000000000 ffff82d0802d7dd8 00000000000000fb
> 0000000000000008
> (XEN) Xen call trace:
> (XEN) [<ffff82d0801d90ce>] vmx_ctxt_switch_from+0x1e/0x14c
> (XEN) [<ffff82d08015d129>] __context_switch+0x127/0x462
> (XEN) [<ffff82d080160acf>] __sync_local_execstate+0x6a/0x8b
> (XEN) [<ffff82d080160af9>] sync_local_execstate+0x9/0xb
> (XEN) [<ffff82d080161728>] map_domain_page+0x88/0x4de
> (XEN) [<ffff82d08014e721>] map_vtd_domain_page+0xd/0xf
> (XEN) [<ffff82d08014cda2>] io_apic_read_remap_rte+0x158/0x29f
> (XEN) [<ffff82d0801448a8>] iommu_read_apic_from_ire+0x27/0x29
> (XEN) [<ffff82d080165625>] io_apic_read+0x17/0x65
> (XEN) [<ffff82d080166143>] __ioapic_read_entry+0x38/0x61
> (XEN) [<ffff82d080166aa8>] clear_IO_APIC_pin+0x1a/0xf3
> (XEN) [<ffff82d080166bae>] clear_IO_APIC+0x2d/0x60
> (XEN) [<ffff82d080166f63>] disable_IO_APIC+0xd/0x81
> (XEN) [<ffff82d08018228b>] smp_send_stop+0x58/0x68
> (XEN) [<ffff82d080181aa7>] machine_restart+0x80/0x20a
> (XEN) [<ffff82d080181c3c>] __machine_restart+0xb/0xf
> (XEN) [<ffff82d080128fb9>] smp_call_function_interrupt+0x99/0xc0
> (XEN) [<ffff82d080182330>] call_function_interrupt+0x33/0x43
> (XEN) [<ffff82d08016bd89>] do_IRQ+0x9e/0x63a
> (XEN) [<ffff82d08016406f>] common_interrupt+0x5f/0x70
> (XEN) [<ffff82d0801a8600>] mwait_idle+0x29c/0x2f7
> (XEN) [<ffff82d08015cf67>] idle_loop+0x58/0x76
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Assertion 'read_cr0() & X86_CR0_TS' failed at vmx.c:644
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Suggested-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Eddie Dong <eddie.dong@intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Mukesh Rathor <mukesh.rathor@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
> ---
> Changes since v2:
> - Added comments on the gates added to {svm/vmx}_ctxt_switch_from.
> - Move the declaration of the pcpu variable vmxon to vmcs.h.
> ---
> xen/arch/x86/hvm/svm/svm.c | 8 ++++++++
> xen/arch/x86/hvm/vmx/vmcs.c | 2 +-
> xen/arch/x86/hvm/vmx/vmx.c | 8 ++++++++
> xen/include/asm-x86/hvm/vmx/vmcs.h | 2 ++
> 4 files changed, 19 insertions(+), 1 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index 870e4ee..76616ac 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -966,6 +966,14 @@ static void svm_ctxt_switch_from(struct vcpu *v)
> {
> int cpu = smp_processor_id();
>
> + /*
> + * Return early if trying to do a context switch without SVM enabled,
> + * this can happen when the hypervisor shuts down with HVM guests
> + * still running.
> + */
> + if ( unlikely((read_efer() & EFER_SVME) == 0) )
> + return;
> +
> svm_fpu_leave(v);
>
> svm_save_dr(v);
> diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
> index 7564895..8ffc562 100644
> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -74,7 +74,7 @@ u64 vmx_ept_vpid_cap __read_mostly;
> static DEFINE_PER_CPU_READ_MOSTLY(struct vmcs_struct *,
> vmxon_region);
> static DEFINE_PER_CPU(struct vmcs_struct *, current_vmcs);
> static DEFINE_PER_CPU(struct list_head, active_vmcs_list);
> -static DEFINE_PER_CPU(bool_t, vmxon);
> +DEFINE_PER_CPU(bool_t, vmxon);
>
> static u32 vmcs_revision_id __read_mostly;
> u64 __read_mostly vmx_basic_msr;
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index d45fb7f..2caa04a 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -671,6 +671,14 @@ static void vmx_fpu_leave(struct vcpu *v)
>
> static void vmx_ctxt_switch_from(struct vcpu *v)
> {
> + /*
> + * Return early if trying to do a context switch without VMX enabled,
> + * this can happen when the hypervisor shuts down with HVM guests
> + * still running.
> + */
> + if ( unlikely(!this_cpu(vmxon)) )
> + return;
> +
> vmx_fpu_leave(v);
> vmx_save_guest_msrs(v);
> vmx_restore_host_msrs();
> diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h
> b/xen/include/asm-x86/hvm/vmx/vmcs.h
> index 445b39f..215d93c 100644
> --- a/xen/include/asm-x86/hvm/vmx/vmcs.h
> +++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
> @@ -490,6 +490,8 @@ void virtual_vmcs_exit(void *vvmcs);
> u64 virtual_vmcs_vmread(void *vvmcs, u32 vmcs_encoding);
> void virtual_vmcs_vmwrite(void *vvmcs, u32 vmcs_encoding, u64 val);
>
> +DECLARE_PER_CPU(bool_t, vmxon);
> +
> #endif /* ASM_X86_HVM_VMX_VMCS_H__ */
>
> /*
> --
> 1.7.7.5 (Apple Git-26)
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-06-05 19:16 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-06-05 15:01 [PATCH v3] xen: fix reboot/shutdown with running HVM guests Roger Pau Monne
2014-06-05 15:42 ` Aravind Gopalakrishnan
2014-06-05 15:46 ` Andrew Cooper
2014-06-05 16:02 ` Jan Beulich
2014-06-05 16:16 ` Andrew Cooper
2014-06-05 19:11 ` Tian, Kevin
2014-06-05 19:12 ` Tian, Kevin
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.