* [PATCH] kvm/x86: clear hlt for intel cpu when resetting vcpu
@ 2023-06-30 7:26 Qi Ai
2023-06-30 10:14 ` Chao Gao
0 siblings, 1 reply; 8+ messages in thread
From: Qi Ai @ 2023-06-30 7:26 UTC (permalink / raw)
To: seanjc, pbonzini, tglx, mingo, bp, dave.hansen, hpa, kvm
Cc: fengzhimin, cenjiahui, fangying.tommy, dengqiao.joey, Qi Ai
when cpu-pm=on is set in qemu, if a crash occurs within the guest,
after kdump has collected the vmcore, the system will be reset.
the ActivityState in the VMCS is set to HLT, because the guest executed
the halt instruction. however, ActivityState is not set to Active
before the restart, resulting in the cpu being in an inactive state
where it doesn't execute instructions.
in the __set_regs function, check whether a reset will occurs.
if it is, set the ActivityState to Active, which ensures that the cpu will
be executing instructions normally.
Signed-off-by: Qi Ai <aiqi.i7@bytedance.com>
---
arch/x86/include/asm/kvm_host.h | 2 ++
arch/x86/kvm/vmx/vmx.c | 2 ++
arch/x86/kvm/x86.c | 6 ++++++
3 files changed, 10 insertions(+)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fb9d1f2d6136..db5a47500b08 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1731,6 +1731,8 @@ struct kvm_x86_ops {
* Returns vCPU specific APICv inhibit reasons
*/
unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
+
+ void (*clear_hlt)(struct kvm_vcpu *vcpu);
};
struct kvm_x86_nested_ops {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 44fb619803b8..11c2fde1ad98 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8266,6 +8266,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
.complete_emulated_msr = kvm_complete_insn_gp,
.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
+
+ .clear_hlt = vmx_clear_hlt,
};
static unsigned int vmx_handle_intel_pt_intr(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7f70207e8689..21360f5ed006 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11258,6 +11258,12 @@ static void __set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
vcpu->arch.exception_vmexit.pending = false;
kvm_make_request(KVM_REQ_EVENT, vcpu);
+
+ if (kvm_x86_ops.clear_hlt) {
+ if (kvm_vcpu_is_bsp(vcpu) && regs->rip == 0xfff0 &&
+ !is_protmode(vcpu))
+ kvm_x86_ops.clear_hlt(vcpu);
+ }
}
int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
--
2.20.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] kvm/x86: clear hlt for intel cpu when resetting vcpu
2023-06-30 7:26 [PATCH] kvm/x86: clear hlt for intel cpu when resetting vcpu Qi Ai
@ 2023-06-30 10:14 ` Chao Gao
2023-06-30 22:23 ` Isaku Yamahata
2023-06-30 22:56 ` Sean Christopherson
0 siblings, 2 replies; 8+ messages in thread
From: Chao Gao @ 2023-06-30 10:14 UTC (permalink / raw)
To: Qi Ai
Cc: seanjc, pbonzini, tglx, mingo, bp, dave.hansen, hpa, kvm,
fengzhimin, cenjiahui, fangying.tommy, dengqiao.joey
On Fri, Jun 30, 2023 at 03:26:12PM +0800, Qi Ai wrote:
>--- a/arch/x86/include/asm/kvm_host.h
>+++ b/arch/x86/include/asm/kvm_host.h
>@@ -1731,6 +1731,8 @@ struct kvm_x86_ops {
> * Returns vCPU specific APICv inhibit reasons
> */
> unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
>+
>+ void (*clear_hlt)(struct kvm_vcpu *vcpu);
add this new ops to arch/x86/include/asm/kvm-x86-ops.h, and declare it
optional, i.e.,
KVM_X86_OP_OPTIONAL(...)
> };
>
> struct kvm_x86_nested_ops {
>diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>index 44fb619803b8..11c2fde1ad98 100644
>--- a/arch/x86/kvm/vmx/vmx.c
>+++ b/arch/x86/kvm/vmx/vmx.c
>@@ -8266,6 +8266,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
> .complete_emulated_msr = kvm_complete_insn_gp,
>
> .vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
>+
>+ .clear_hlt = vmx_clear_hlt,
> };
>
> static unsigned int vmx_handle_intel_pt_intr(void)
>diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>index 7f70207e8689..21360f5ed006 100644
>--- a/arch/x86/kvm/x86.c
>+++ b/arch/x86/kvm/x86.c
>@@ -11258,6 +11258,12 @@ static void __set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
> vcpu->arch.exception_vmexit.pending = false;
>
> kvm_make_request(KVM_REQ_EVENT, vcpu);
>+
>+ if (kvm_x86_ops.clear_hlt) {
>+ if (kvm_vcpu_is_bsp(vcpu) && regs->rip == 0xfff0 &&
it is weird this applies to BSP only.
>+ !is_protmode(vcpu))
>+ kvm_x86_ops.clear_hlt(vcpu);
Use static_call_cond(kvm_x86_clear_hlt)(vcpu) instead.
It looks incorrect that we add this side-effect heuristically here. I
am wondering if we can link vcpu->arch.mp_state to VMCS activity state,
i.e., when mp_state is set to RUNNABLE in KVM_SET_MP_STATE ioctl, KVM
sets VMCS activity state to active.
>+ }
> }
>
> int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
>--
>2.20.1
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] kvm/x86: clear hlt for intel cpu when resetting vcpu
2023-06-30 10:14 ` Chao Gao
@ 2023-06-30 22:23 ` Isaku Yamahata
2023-06-30 22:56 ` Sean Christopherson
1 sibling, 0 replies; 8+ messages in thread
From: Isaku Yamahata @ 2023-06-30 22:23 UTC (permalink / raw)
To: Chao Gao
Cc: Qi Ai, seanjc, pbonzini, tglx, mingo, bp, dave.hansen, hpa, kvm,
fengzhimin, cenjiahui, fangying.tommy, dengqiao.joey,
isaku.yamahata
On Fri, Jun 30, 2023 at 06:14:31PM +0800,
Chao Gao <chao.gao@intel.com> wrote:
> On Fri, Jun 30, 2023 at 03:26:12PM +0800, Qi Ai wrote:
> >--- a/arch/x86/include/asm/kvm_host.h
> >+++ b/arch/x86/include/asm/kvm_host.h
> >@@ -1731,6 +1731,8 @@ struct kvm_x86_ops {
> > * Returns vCPU specific APICv inhibit reasons
> > */
> > unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
> >+
> >+ void (*clear_hlt)(struct kvm_vcpu *vcpu);
>
> add this new ops to arch/x86/include/asm/kvm-x86-ops.h, and declare it
> optional, i.e.,
>
> KVM_X86_OP_OPTIONAL(...)
>
>
> > };
> >
> > struct kvm_x86_nested_ops {
> >diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> >index 44fb619803b8..11c2fde1ad98 100644
> >--- a/arch/x86/kvm/vmx/vmx.c
> >+++ b/arch/x86/kvm/vmx/vmx.c
> >@@ -8266,6 +8266,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
> > .complete_emulated_msr = kvm_complete_insn_gp,
> >
> > .vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
> >+
> >+ .clear_hlt = vmx_clear_hlt,
> > };
> >
> > static unsigned int vmx_handle_intel_pt_intr(void)
> >diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >index 7f70207e8689..21360f5ed006 100644
> >--- a/arch/x86/kvm/x86.c
> >+++ b/arch/x86/kvm/x86.c
> >@@ -11258,6 +11258,12 @@ static void __set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
> > vcpu->arch.exception_vmexit.pending = false;
> >
> > kvm_make_request(KVM_REQ_EVENT, vcpu);
> >+
> >+ if (kvm_x86_ops.clear_hlt) {
> >+ if (kvm_vcpu_is_bsp(vcpu) && regs->rip == 0xfff0 &&
>
> it is weird this applies to BSP only.
>
> >+ !is_protmode(vcpu))
> >+ kvm_x86_ops.clear_hlt(vcpu);
>
> Use static_call_cond(kvm_x86_clear_hlt)(vcpu) instead.
>
> It looks incorrect that we add this side-effect heuristically here. I
> am wondering if we can link vcpu->arch.mp_state to VMCS activity state,
> i.e., when mp_state is set to RUNNABLE in KVM_SET_MP_STATE ioctl, KVM
> sets VMCS activity state to active.
Yes, when vcpu is reset should be checked by mp-state change. Strictly we should
clear other guest non-register state strictly. interruptibility state, pending
debug exceptions, and guest interrupt status. In practice, those status other
than hlt wouldn't matter much, though.
--
Isaku Yamahata <isaku.yamahata@gmail.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] kvm/x86: clear hlt for intel cpu when resetting vcpu
2023-06-30 10:14 ` Chao Gao
2023-06-30 22:23 ` Isaku Yamahata
@ 2023-06-30 22:56 ` Sean Christopherson
2023-07-04 11:34 ` Qi Ai
2023-07-05 9:04 ` Chao Gao
1 sibling, 2 replies; 8+ messages in thread
From: Sean Christopherson @ 2023-06-30 22:56 UTC (permalink / raw)
To: Chao Gao
Cc: Qi Ai, pbonzini, tglx, mingo, bp, dave.hansen, hpa, kvm,
fengzhimin, cenjiahui, fangying.tommy, dengqiao.joey
mn Fri, Jun 30, 2023, Chao Gao wrote:
> On Fri, Jun 30, 2023 at 03:26:12PM +0800, Qi Ai wrote:
> >+ !is_protmode(vcpu))
> >+ kvm_x86_ops.clear_hlt(vcpu);
>
> Use static_call_cond(kvm_x86_clear_hlt)(vcpu) instead.
>
> It looks incorrect that we add this side-effect heuristically here. I
Yeah, adding heuristics to KVM_SET_REGS isn't happening. KVM's existing hack
for "Older userspace" in __set_sregs_common() is bad enough:
/* Older userspace won't unhalt the vcpu on reset. */
if (kvm_vcpu_is_bsp(vcpu) && kvm_rip_read(vcpu) == 0xfff0 &&
sregs->cs.selector == 0xf000 && sregs->cs.base == 0xffff0000 &&
!is_protmode(vcpu))
vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
> am wondering if we can link vcpu->arch.mp_state to VMCS activity state,
Hrm, maybe.
> i.e., when mp_state is set to RUNNABLE in KVM_SET_MP_STATE ioctl, KVM
> sets VMCS activity state to active.
Not in the ioctl(), there needs to be a proper set of APIs, e.g. so that the
existing hack works, and so that KVM actually reports out to userspace that a
vCPU is HALTED if userspace gained control of the vCPU, e.g. after an IRQ exit,
while the vCPU was HALTED. I.e. mp_state versus vmcs.ACTIVITY_STATE needs to be
bidirectional, not one-way. E.g. if a vCPU is live migrated, I'm pretty sure
vmcs.ACTIVITY_STATE is lost, which is wrong.
The downside is that if KVM propagates vmcs.ACTIVITY_STATE to mp_state for the
halted case, then KVM will enter kvm_vcpu_halt() instead of entering the guest
in halted state, which is undesirable. Hmm, that can be handled by treating
the vCPU as running, e.g.
static inline bool kvm_vcpu_running(struct kvm_vcpu *vcpu)
{
return (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE ||
(vcpu->arch.mp_state == KVM_MP_STATE_HALTED &&
kvm_hlt_in_guest(vcpu->kvm))) &&
!vcpu->arch.apf.halted);
}
but that would have cascading effect to a whole pile of things. I don't *think*
they'd be used with kvm_hlt_in_guest(), but we've had weirder stuff.
I'm half tempted to solve this particular issue by stuffing vmcs.ACTIVITY_STATE on
shutdown, similar to what SVM does on shutdown interception. KVM doesn't come
anywhere near faithfully emulating shutdown, so it's unlikely to break anything.
And then the mp_state vs. hlt_in_guest coulbe tackled separately. Ugh, but that
wouldn't cover a synthesized KVM_REQ_TRIPLE_FAULT.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 44fb619803b8..ee4bb37067d1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5312,6 +5312,8 @@ static __always_inline int handle_external_interrupt(struct kvm_vcpu *vcpu)
static int handle_triple_fault(struct kvm_vcpu *vcpu)
{
+ vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
+
vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
vcpu->mmio_needed = 0;
return 0;
I don't suppose QEMU can to blast INIT at all vCPUs for this case?
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] kvm/x86: clear hlt for intel cpu when resetting vcpu
2023-06-30 22:56 ` Sean Christopherson
@ 2023-07-04 11:34 ` Qi Ai
2023-07-05 8:40 ` Chao Gao
2023-07-05 9:04 ` Chao Gao
1 sibling, 1 reply; 8+ messages in thread
From: Qi Ai @ 2023-07-04 11:34 UTC (permalink / raw)
To: seanjc
Cc: aiqi.i7, bp, cenjiahui, chao.gao, dave.hansen, dengqiao.joey,
fangying.tommy, fengzhimin, hpa, kvm, mingo, pbonzini, tglx
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=y, Size: 3780 bytes --]
On Sat, Jul 1, 2023 at 6:56 AM Sean Christopherson <seanjc@google.com> wrote:
>
> mn Fri, Jun 30, 2023, Chao Gao wrote:
> > On Fri, Jun 30, 2023 at 03:26:12PM +0800, Qi Ai wrote:
> > >+ !is_protmode(vcpu))
> > >+ kvm_x86_ops.clear_hlt(vcpu);
> >
> > Use static_call_cond(kvm_x86_clear_hlt)(vcpu) instead.
> >
> > It looks incorrect that we add this side-effect heuristically here. I
>
> Yeah, adding heuristics to KVM_SET_REGS isn't happening. KVM's existing hack
> for "Older userspace" in __set_sregs_common() is bad enough:
>
> /* Older userspace won't unhalt the vcpu on reset. */
> if (kvm_vcpu_is_bsp(vcpu) && kvm_rip_read(vcpu) == 0xfff0 &&
> sregs->cs.selector == 0xf000 && sregs->cs.base == 0xffff0000 &&
> !is_protmode(vcpu))
> vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
>
> > am wondering if we can link vcpu->arch.mp_state to VMCS activity state,
>
> Hrm, maybe.
>
> > i.e., when mp_state is set to RUNNABLE in KVM_SET_MP_STATE ioctl, KVM
> > sets VMCS activity state to active.
>
> Not in the ioctl(), there needs to be a proper set of APIs, e.g. so that the
> existing hack works, and so that KVM actually reports out to userspace that a
> vCPU is HALTED if userspace gained control of the vCPU, e.g. after an IRQ exit,
> while the vCPU was HALTED. I.e. mp_state versus vmcs.ACTIVITY_STATE needs to be
> bidirectional, not one-way. E.g. if a vCPU is live migrated, I'm pretty sure
> vmcs.ACTIVITY_STATE is lost, which is wrong.
>
> The downside is that if KVM propagates vmcs.ACTIVITY_STATE to mp_state for the
> halted case, then KVM will enter kvm_vcpu_halt() instead of entering the guest
> in halted state, which is undesirable. Hmm, that can be handled by treating
> the vCPU as running, e.g.
>
> static inline bool kvm_vcpu_running(struct kvm_vcpu *vcpu)
> {
> return (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE ||
> (vcpu->arch.mp_state == KVM_MP_STATE_HALTED &&
> kvm_hlt_in_guest(vcpu->kvm))) &&
> !vcpu->arch.apf.halted);
> }
>
> but that would have cascading effect to a whole pile of things. I don't *think*
> they'd be used with kvm_hlt_in_guest(), but we've had weirder stuff.
>
> I'm half tempted to solve this particular issue by stuffing vmcs.ACTIVITY_STATE on
> shutdown, similar to what SVM does on shutdown interception. KVM doesn't come
> anywhere near faithfully emulating shutdown, so it's unlikely to break anything.
> And then the mp_state vs. hlt_in_guest coulbe tackled separately. Ugh, but that
> wouldn't cover a synthesized KVM_REQ_TRIPLE_FAULT.
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 44fb619803b8..ee4bb37067d1 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -5312,6 +5312,8 @@ static __always_inline int handle_external_interrupt(struct kvm_vcpu *vcpu)
>
> static int handle_triple_fault(struct kvm_vcpu *vcpu)
> {
> + vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
> +
> vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
> vcpu->mmio_needed = 0;
> return 0;
>
>
> I don't suppose QEMU can to blast INIT at all vCPUs for this case?
Reproduce this problem need to use the cpu_pm=on in QEMU, so execute halt in vm doesn't
cause a vm exit, so mp_state will never be HLT. I am confused why mp_state is considered in this case.
And the bsp's vmcs.ACTIVITY_STATE need to reset to ACTIVITY to solve this problem.
We need a proper set of APIs as you say. In this case, do we only provide a reset ioctl,
or do we need to report vmcs.ACTIVITY_STATE to the userspace?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] kvm/x86: clear hlt for intel cpu when resetting vcpu
2023-07-04 11:34 ` Qi Ai
@ 2023-07-05 8:40 ` Chao Gao
0 siblings, 0 replies; 8+ messages in thread
From: Chao Gao @ 2023-07-05 8:40 UTC (permalink / raw)
To: Qi Ai
Cc: seanjc, bp, cenjiahui, dave.hansen, dengqiao.joey, fangying.tommy,
fengzhimin, hpa, kvm, mingo, pbonzini, tglx
On Tue, Jul 04, 2023 at 07:34:05PM +0800, Qi Ai wrote:
>Reproduce this problem need to use the cpu_pm=on in QEMU, so execute halt in vm doesn't
>cause a vm exit, so mp_state will never be HLT. I am confused why mp_state is considered in this case.
This is just current implementation. It is not necessary to be this way. If
userspace can manipulate vmcs.ACTIVITY_STATE indirectly via mp_state, your
issue will be fixed. But as Sean said, this solution will cause "cascading
effect to a whole pile of things"
>
>And the bsp's vmcs.ACTIVITY_STATE need to reset to ACTIVITY to solve this problem.
>We need a proper set of APIs as you say. In this case, do we only provide a reset ioctl,
>or do we need to report vmcs.ACTIVITY_STATE to the userspace?
The latter I believe. Then userspace can migrate the state. If we go with the
former, the subtle bug pointed out by Sean won't be fixed:
if a vCPU is live migrated, I'm pretty sure vmcs.ACTIVITY_STATE
is lost, which is wrong.
Definitely, we need Sean's confirmation here.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] kvm/x86: clear hlt for intel cpu when resetting vcpu
2023-06-30 22:56 ` Sean Christopherson
2023-07-04 11:34 ` Qi Ai
@ 2023-07-05 9:04 ` Chao Gao
2023-07-31 19:15 ` Sean Christopherson
1 sibling, 1 reply; 8+ messages in thread
From: Chao Gao @ 2023-07-05 9:04 UTC (permalink / raw)
To: Sean Christopherson
Cc: Qi Ai, pbonzini, tglx, mingo, bp, dave.hansen, hpa, kvm,
fengzhimin, cenjiahui, fangying.tommy, dengqiao.joey
On Fri, Jun 30, 2023 at 03:56:14PM -0700, Sean Christopherson wrote:
>mn Fri, Jun 30, 2023, Chao Gao wrote:
>> am wondering if we can link vcpu->arch.mp_state to VMCS activity state,
>
>Hrm, maybe.
>
>> i.e., when mp_state is set to RUNNABLE in KVM_SET_MP_STATE ioctl, KVM
>> sets VMCS activity state to active.
>
>Not in the ioctl(), there needs to be a proper set of APIs, e.g. so that the
>existing hack works,
I don't get why the existing hack will be broken if we piggyback on the
KVM_GET/SET_MP_STATE ioctl(). The hack is for "Older userspace" back to
2008. I suppose the "Older userspace" doesn't support disabling hlt exit.
>and so that KVM actually reports out to userspace that a
>vCPU is HALTED if userspace gained control of the vCPU, e.g. after an IRQ exit,
>while the vCPU was HALTED. I.e. mp_state versus vmcs.ACTIVITY_STATE needs to be
>bidirectional, not one-way. E.g. if a vCPU is live migrated, I'm pretty sure
>vmcs.ACTIVITY_STATE is lost, which is wrong.
Yes. Agreed.
>
>I'm half tempted to solve this particular issue by stuffing vmcs.ACTIVITY_STATE on
>shutdown, similar to what SVM does on shutdown interception. KVM doesn't come
>anywhere near faithfully emulating shutdown, so it's unlikely to break anything.
>And then the mp_state vs. hlt_in_guest coulbe tackled separately. Ugh, but that
>wouldn't cover a synthesized KVM_REQ_TRIPLE_FAULT.
I traced the process of guest reboot but I didn't see triple-fault
VMExit. So, I don't think this can fix the issue.
>
>diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>index 44fb619803b8..ee4bb37067d1 100644
>--- a/arch/x86/kvm/vmx/vmx.c
>+++ b/arch/x86/kvm/vmx/vmx.c
>@@ -5312,6 +5312,8 @@ static __always_inline int handle_external_interrupt(struct kvm_vcpu *vcpu)
>
> static int handle_triple_fault(struct kvm_vcpu *vcpu)
> {
>+ vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
>+
> vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
> vcpu->mmio_needed = 0;
> return 0;
>
>
>I don't suppose QEMU can to blast INIT at all vCPUs for this case?
IIUC, userspace can queue INIT to a vCPU via KVM_SET_VCPU_EVENTS ioctl()
with flags = KVM_VCPUEVENT_VALID_SMM and latched_init = 1.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] kvm/x86: clear hlt for intel cpu when resetting vcpu
2023-07-05 9:04 ` Chao Gao
@ 2023-07-31 19:15 ` Sean Christopherson
0 siblings, 0 replies; 8+ messages in thread
From: Sean Christopherson @ 2023-07-31 19:15 UTC (permalink / raw)
To: Chao Gao
Cc: Qi Ai, pbonzini, tglx, mingo, bp, dave.hansen, hpa, kvm,
fengzhimin, cenjiahui, fangying.tommy, dengqiao.joey
On Wed, Jul 05, 2023, Chao Gao wrote:
> On Fri, Jun 30, 2023 at 03:56:14PM -0700, Sean Christopherson wrote:
> >mn Fri, Jun 30, 2023, Chao Gao wrote:
> >> am wondering if we can link vcpu->arch.mp_state to VMCS activity state,
> >
> >Hrm, maybe.
> >
> >> i.e., when mp_state is set to RUNNABLE in KVM_SET_MP_STATE ioctl, KVM
> >> sets VMCS activity state to active.
> >
> >Not in the ioctl(), there needs to be a proper set of APIs, e.g. so that the
> >existing hack works,
>
> I don't get why the existing hack will be broken if we piggyback on the
> KVM_GET/SET_MP_STATE ioctl(). The hack is for "Older userspace" back to
> 2008. I suppose the "Older userspace" doesn't support disabling hlt exit.
True, it does seem extremely unlikely that the two would collide. And even if
there is some bizarre userspace that relies on the old hack *and* exposes HLT,
fixing this in set_mpstate() wouldn't make the problem worse, i.e. there's no
reason to hold up the fix.
Qi, for v2, can you handle this in kvm_arch_vcpu_ioctl_set_mpstate() instead of
__set_regs()? And wire up the new hook to support a static_call(), e.g. I think
the call site would be:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 05a68d7d99fe..92d255fb00c5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11411,6 +11411,10 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
vcpu->arch.mp_state = mp_state->mp_state;
kvm_make_request(KVM_REQ_EVENT, vcpu);
+ if (kvm_hlt_in_guest(vcpu->kvm) &&
+ mp_state->mp_state != KVM_MP_STATE_HALTED)
+ static_call_cond(kvm_x86_clear_hlt)(vcpu);
+
ret = 0;
out:
vcpu_put(vcpu);
> >and so that KVM actually reports out to userspace that a
> >vCPU is HALTED if userspace gained control of the vCPU, e.g. after an IRQ exit,
> >while the vCPU was HALTED. I.e. mp_state versus vmcs.ACTIVITY_STATE needs to be
> >bidirectional, not one-way. E.g. if a vCPU is live migrated, I'm pretty sure
> >vmcs.ACTIVITY_STATE is lost, which is wrong.
>
> Yes. Agreed.
Fixing this is going to be painful. Reporting KVM_MP_STATE_HALTED would cause
problems if a VM is migrated from a KVM with the fix to a KVM without the fix,
as the "bad" KVM wouldn't know it should treat the guest as running and actually
put the vCPU into HLT. We can't use a new KVM_MP_STATE because that would fully
break migration (*sigh*).
That means we probably need to add a flag somewhere, on top of also teaching KVM
to treat the guest as runnable. So as above, I think it makes sense to get the
immediate bug fixed and worry about migrating HLT state in the future, especially
since KVM *can't* migrate HLT state on SVM.
> >I'm half tempted to solve this particular issue by stuffing vmcs.ACTIVITY_STATE on
> >shutdown, similar to what SVM does on shutdown interception. KVM doesn't come
> >anywhere near faithfully emulating shutdown, so it's unlikely to break anything.
> >And then the mp_state vs. hlt_in_guest coulbe tackled separately. Ugh, but that
> >wouldn't cover a synthesized KVM_REQ_TRIPLE_FAULT.
>
> I traced the process of guest reboot but I didn't see triple-fault
> VMExit. So, I don't think this can fix the issue.
Ah, right. No idea why I jumped to the conclusion that there would be a triple
fault, the changelog even specifically calls out that the reboot happens after
kdump runs. Even if there were a triple fault, that wouldn't cover all vCPUs.
^ permalink raw reply related [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-07-31 19:15 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-30 7:26 [PATCH] kvm/x86: clear hlt for intel cpu when resetting vcpu Qi Ai
2023-06-30 10:14 ` Chao Gao
2023-06-30 22:23 ` Isaku Yamahata
2023-06-30 22:56 ` Sean Christopherson
2023-07-04 11:34 ` Qi Ai
2023-07-05 8:40 ` Chao Gao
2023-07-05 9:04 ` Chao Gao
2023-07-31 19:15 ` Sean Christopherson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).