* Re: [PATCH] KVM: arm64: vgic-v4: Consistently request doorbell irq for blocking vCPU
2023-07-12 13:49 ` Marc Zyngier
@ 2023-07-12 15:56 ` Zenghui Yu
2023-07-13 2:38 ` chenxiang (M)
2023-07-12 20:14 ` Oliver Upton
2023-07-13 5:57 ` chenxiang (M)
2 siblings, 1 reply; 13+ messages in thread
From: Zenghui Yu @ 2023-07-12 15:56 UTC (permalink / raw)
To: Marc Zyngier, Zenghui Yu
Cc: Oliver Upton, kvmarm, James Morse, Suzuki K Poulose, stable,
Xiang Chen
On 2023/7/12 21:49, Marc Zyngier wrote:
> On Wed, 12 Jul 2023 13:09:45 +0100,
> Zenghui Yu <yuzenghui@huawei.com> wrote:
>>
>> On 2023/7/11 15:26, Oliver Upton wrote:
>>> On Tue, Jul 11, 2023 at 08:23:25AM +0100, Marc Zyngier wrote:
>>>> On Mon, 10 Jul 2023 18:55:53 +0100,
>>>> Oliver Upton <oliver.upton@linux.dev> wrote:
>>>>>
>>>>> Xiang reports that VMs occasionally fail to boot on GICv4.1 systems when
>>>>> running a preemptible kernel, as it is possible that a vCPU is blocked
>>>>> without requesting a doorbell interrupt.
>>>>>
>>>>> The issue is that any preemption that occurs between vgic_v4_put() and
>>>>> schedule() on the block path will mark the vPE as nonresident and *not*
>>>>> request a doorbell irq.
>>>>
>>>> It'd be worth spelling out. You need to go via *three* schedule()
>>>> calls: one to be preempted (with DB set), one to be made resident
>>>> again, and then the final one in kvm_vcpu_halt(), clearing the DB on
>>>> vcpu_put() due to the bug.
>>>
>>> Yeah, a bit lazy in the wording. What I had meant to imply was
>>> preemption happening after the doorbell is set up and before the thread
>>> has an opportunity to explicitly schedule out. Perhaps I should just say
>>> that.
>>>
>>>>>
>>>>> Fix it by consistently requesting a doorbell irq in the vcpu put path if
>>>>> the vCPU is blocking.
>>
>> Yup. Agreed!
>>
>>>>> While this technically means we could drop the
>>>>> early doorbell irq request in kvm_vcpu_wfi(), deliberately leave it
>>>>> intact such that vCPU halt polling can properly detect the wakeup
>>>>> condition before actually scheduling out a vCPU.
>>
>> Yeah, just like what we did in commit 07ab0f8d9a12 ("KVM: Call
>> kvm_arch_vcpu_blocking early into the blocking sequence").
>>
>> My only concern is that if the preemption happens before halt polling,
>> we would enter the polling loop with VPE already resident on the RD and
>> can't recognize any firing GICv4.x virtual interrupts (targeting this
>> VPE) in polling. [1]
>
> The status of the pending bit is recorded in pending_last, so we don't
> lose what was snapshot at the point of hitting WFI. But we indeed
> don't have any idea for something firing during the polling loop.
>
>> Given that making VPE resident on the vcpu block path (i.e., in
>> kvm_vcpu_halt()) makes little sense (right?) and leads to this sort of
>> problem, a crude idea is that we can probably keep track of the
>> "nested" vgic_v4_{put,load} calls (instead of a single vpe->resident
>> flag) and keep VPE *not resident* on the whole block path (like what we
>> had before commit 8e01d9a396e6). And we then rely on
>> kvm_vcpu_wfi/vgic_v4_load to actually schedule the VPE on...
>
> I'm not sure about the nested tracking part, but it's easy enough to
> have a vcpu flag indicating that we're in WFI. So an *alternative* to
> the current fix would be something like this:
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index f54ba0a63669..417a0e85456b 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -817,6 +817,8 @@ struct kvm_vcpu_arch {
> #define DBG_SS_ACTIVE_PENDING __vcpu_single_flag(sflags, BIT(5))
> /* PMUSERENR for the guest EL0 is on physical CPU */
> #define PMUSERENR_ON_CPU __vcpu_single_flag(sflags, BIT(6))
> +/* WFI instruction trapped */
> +#define IN_WFI __vcpu_single_flag(sflags, BIT(7))
Ah, trust me that I was thinking about exactly the same vcpu flag
when writing the last email. ;-) So here is my Ack for this
alternative, thanks Marc for your quick reply!
>
> /* vcpu entered with HCR_EL2.E2H set */
> #define VCPU_HCR_E2H __vcpu_single_flag(oflags, BIT(0))
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 236c5f1c9090..cf208d30a9ea 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -725,13 +725,15 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu)
> */
> preempt_disable();
> kvm_vgic_vmcr_sync(vcpu);
> - vgic_v4_put(vcpu, true);
> + vcpu_set_flag(vcpu, IN_WFI);
> + vgic_v4_put(vcpu);
> preempt_enable();
>
> kvm_vcpu_halt(vcpu);
> vcpu_clear_flag(vcpu, IN_WFIT);
>
> preempt_disable();
> + vcpu_clear_flag(vcpu, IN_WFI);
> vgic_v4_load(vcpu);
> preempt_enable();
> }
> @@ -799,7 +801,7 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
> if (kvm_check_request(KVM_REQ_RELOAD_GICv4, vcpu)) {
> /* The distributor enable bits were changed */
> preempt_disable();
> - vgic_v4_put(vcpu, false);
> + vgic_v4_put(vcpu);
> vgic_v4_load(vcpu);
> preempt_enable();
> }
> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
> index 49d35618d576..df61ead7c757 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
> @@ -780,7 +780,7 @@ void vgic_v3_put(struct kvm_vcpu *vcpu)
> * done a vgic_v4_put) and when running a nested guest (the
> * vPE was never resident in order to generate a doorbell).
> */
> - WARN_ON(vgic_v4_put(vcpu, false));
> + WARN_ON(vgic_v4_put(vcpu));
>
> vgic_v3_vmcr_sync(vcpu);
>
> diff --git a/arch/arm64/kvm/vgic/vgic-v4.c b/arch/arm64/kvm/vgic/vgic-v4.c
> index c1c28fe680ba..339a55194b2c 100644
> --- a/arch/arm64/kvm/vgic/vgic-v4.c
> +++ b/arch/arm64/kvm/vgic/vgic-v4.c
> @@ -336,14 +336,14 @@ void vgic_v4_teardown(struct kvm *kvm)
> its_vm->vpes = NULL;
> }
>
> -int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db)
> +int vgic_v4_put(struct kvm_vcpu *vcpu)
> {
> struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe;
>
> if (!vgic_supports_direct_msis(vcpu->kvm) || !vpe->resident)
> return 0;
>
> - return its_make_vpe_non_resident(vpe, need_db);
> + return its_make_vpe_non_resident(vpe, !!vcpu_get_flag(vcpu, IN_WFI));
> }
>
> int vgic_v4_load(struct kvm_vcpu *vcpu)
> @@ -354,6 +354,9 @@ int vgic_v4_load(struct kvm_vcpu *vcpu)
> if (!vgic_supports_direct_msis(vcpu->kvm) || vpe->resident)
> return 0;
>
> + if (vcpu_get_flag(vcpu, IN_WFI))
> + return 0;
> +
> /*
> * Before making the VPE resident, make sure the redistributor
> * corresponding to our current CPU expects us here. See the
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 9b91a8135dac..765d801d1ddc 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -446,7 +446,7 @@ int kvm_vgic_v4_unset_forwarding(struct kvm *kvm, int irq,
>
> int vgic_v4_load(struct kvm_vcpu *vcpu);
> void vgic_v4_commit(struct kvm_vcpu *vcpu);
> -int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db);
> +int vgic_v4_put(struct kvm_vcpu *vcpu);
>
> bool vgic_state_is_nested(struct kvm_vcpu *vcpu);
>
>
> Of course, it is totally untested... ;-) But I like that the doorbell
> request is solely driven by the WFI state, and we avoid leaking the
> knowledge outside of the vgic code.
I'm happy with this approach and will have another look tomorrow. It'd
also be great if Xiang can give this one a go on the appropriate HW.
Thanks,
Zenghui
^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] KVM: arm64: vgic-v4: Consistently request doorbell irq for blocking vCPU
2023-07-12 15:56 ` Zenghui Yu
@ 2023-07-13 2:38 ` chenxiang (M)
0 siblings, 0 replies; 13+ messages in thread
From: chenxiang (M) @ 2023-07-13 2:38 UTC (permalink / raw)
To: Zenghui Yu, Marc Zyngier, Zenghui Yu
Cc: Oliver Upton, kvmarm, James Morse, Suzuki K Poulose, stable
On Wed, 2023/7/12 23:56, Zenghui Yu wrote:
> On 2023/7/12 21:49, Marc Zyngier wrote:
>> On Wed, 12 Jul 2023 13:09:45 +0100,
>> Zenghui Yu <yuzenghui@huawei.com> wrote:
>>>
>>> On 2023/7/11 15:26, Oliver Upton wrote:
>>>> On Tue, Jul 11, 2023 at 08:23:25AM +0100, Marc Zyngier wrote:
>>>>> On Mon, 10 Jul 2023 18:55:53 +0100,
>>>>> Oliver Upton <oliver.upton@linux.dev> wrote:
>>>>>>
>>>>>> Xiang reports that VMs occasionally fail to boot on GICv4.1
>>>>>> systems when
>>>>>> running a preemptible kernel, as it is possible that a vCPU is
>>>>>> blocked
>>>>>> without requesting a doorbell interrupt.
>>>>>>
>>>>>> The issue is that any preemption that occurs between
>>>>>> vgic_v4_put() and
>>>>>> schedule() on the block path will mark the vPE as nonresident and
>>>>>> *not*
>>>>>> request a doorbell irq.
>>>>>
>>>>> It'd be worth spelling out. You need to go via *three* schedule()
>>>>> calls: one to be preempted (with DB set), one to be made resident
>>>>> again, and then the final one in kvm_vcpu_halt(), clearing the DB on
>>>>> vcpu_put() due to the bug.
>>>>
>>>> Yeah, a bit lazy in the wording. What I had meant to imply was
>>>> preemption happening after the doorbell is set up and before the
>>>> thread
>>>> has an opportunity to explicitly schedule out. Perhaps I should
>>>> just say
>>>> that.
>>>>
>>>>>>
>>>>>> Fix it by consistently requesting a doorbell irq in the vcpu put
>>>>>> path if
>>>>>> the vCPU is blocking.
>>>
>>> Yup. Agreed!
>>>
>>>>>> While this technically means we could drop the
>>>>>> early doorbell irq request in kvm_vcpu_wfi(), deliberately leave it
>>>>>> intact such that vCPU halt polling can properly detect the wakeup
>>>>>> condition before actually scheduling out a vCPU.
>>>
>>> Yeah, just like what we did in commit 07ab0f8d9a12 ("KVM: Call
>>> kvm_arch_vcpu_blocking early into the blocking sequence").
>>>
>>> My only concern is that if the preemption happens before halt polling,
>>> we would enter the polling loop with VPE already resident on the RD and
>>> can't recognize any firing GICv4.x virtual interrupts (targeting this
>>> VPE) in polling. [1]
>>
>> The status of the pending bit is recorded in pending_last, so we don't
>> lose what was snapshot at the point of hitting WFI. But we indeed
>> don't have any idea for something firing during the polling loop.
>>
>>> Given that making VPE resident on the vcpu block path (i.e., in
>>> kvm_vcpu_halt()) makes little sense (right?) and leads to this sort of
>>> problem, a crude idea is that we can probably keep track of the
>>> "nested" vgic_v4_{put,load} calls (instead of a single vpe->resident
>>> flag) and keep VPE *not resident* on the whole block path (like what we
>>> had before commit 8e01d9a396e6). And we then rely on
>>> kvm_vcpu_wfi/vgic_v4_load to actually schedule the VPE on...
>>
>> I'm not sure about the nested tracking part, but it's easy enough to
>> have a vcpu flag indicating that we're in WFI. So an *alternative* to
>> the current fix would be something like this:
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h
>> b/arch/arm64/include/asm/kvm_host.h
>> index f54ba0a63669..417a0e85456b 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -817,6 +817,8 @@ struct kvm_vcpu_arch {
>> #define DBG_SS_ACTIVE_PENDING __vcpu_single_flag(sflags, BIT(5))
>> /* PMUSERENR for the guest EL0 is on physical CPU */
>> #define PMUSERENR_ON_CPU __vcpu_single_flag(sflags, BIT(6))
>> +/* WFI instruction trapped */
>> +#define IN_WFI __vcpu_single_flag(sflags, BIT(7))
>
> Ah, trust me that I was thinking about exactly the same vcpu flag
> when writing the last email. ;-) So here is my Ack for this
> alternative, thanks Marc for your quick reply!
>
>>
>> /* vcpu entered with HCR_EL2.E2H set */
>> #define VCPU_HCR_E2H __vcpu_single_flag(oflags, BIT(0))
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index 236c5f1c9090..cf208d30a9ea 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -725,13 +725,15 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu)
>> */
>> preempt_disable();
>> kvm_vgic_vmcr_sync(vcpu);
>> - vgic_v4_put(vcpu, true);
>> + vcpu_set_flag(vcpu, IN_WFI);
>> + vgic_v4_put(vcpu);
>> preempt_enable();
>>
>> kvm_vcpu_halt(vcpu);
>> vcpu_clear_flag(vcpu, IN_WFIT);
>>
>> preempt_disable();
>> + vcpu_clear_flag(vcpu, IN_WFI);
>> vgic_v4_load(vcpu);
>> preempt_enable();
>> }
>> @@ -799,7 +801,7 @@ static int check_vcpu_requests(struct kvm_vcpu
>> *vcpu)
>> if (kvm_check_request(KVM_REQ_RELOAD_GICv4, vcpu)) {
>> /* The distributor enable bits were changed */
>> preempt_disable();
>> - vgic_v4_put(vcpu, false);
>> + vgic_v4_put(vcpu);
>> vgic_v4_load(vcpu);
>> preempt_enable();
>> }
>> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c
>> b/arch/arm64/kvm/vgic/vgic-v3.c
>> index 49d35618d576..df61ead7c757 100644
>> --- a/arch/arm64/kvm/vgic/vgic-v3.c
>> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
>> @@ -780,7 +780,7 @@ void vgic_v3_put(struct kvm_vcpu *vcpu)
>> * done a vgic_v4_put) and when running a nested guest (the
>> * vPE was never resident in order to generate a doorbell).
>> */
>> - WARN_ON(vgic_v4_put(vcpu, false));
>> + WARN_ON(vgic_v4_put(vcpu));
>>
>> vgic_v3_vmcr_sync(vcpu);
>>
>> diff --git a/arch/arm64/kvm/vgic/vgic-v4.c
>> b/arch/arm64/kvm/vgic/vgic-v4.c
>> index c1c28fe680ba..339a55194b2c 100644
>> --- a/arch/arm64/kvm/vgic/vgic-v4.c
>> +++ b/arch/arm64/kvm/vgic/vgic-v4.c
>> @@ -336,14 +336,14 @@ void vgic_v4_teardown(struct kvm *kvm)
>> its_vm->vpes = NULL;
>> }
>>
>> -int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db)
>> +int vgic_v4_put(struct kvm_vcpu *vcpu)
>> {
>> struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe;
>>
>> if (!vgic_supports_direct_msis(vcpu->kvm) || !vpe->resident)
>> return 0;
>>
>> - return its_make_vpe_non_resident(vpe, need_db);
>> + return its_make_vpe_non_resident(vpe, !!vcpu_get_flag(vcpu,
>> IN_WFI));
>> }
>>
>> int vgic_v4_load(struct kvm_vcpu *vcpu)
>> @@ -354,6 +354,9 @@ int vgic_v4_load(struct kvm_vcpu *vcpu)
>> if (!vgic_supports_direct_msis(vcpu->kvm) || vpe->resident)
>> return 0;
>>
>> + if (vcpu_get_flag(vcpu, IN_WFI))
>> + return 0;
>> +
>> /*
>> * Before making the VPE resident, make sure the redistributor
>> * corresponding to our current CPU expects us here. See the
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index 9b91a8135dac..765d801d1ddc 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -446,7 +446,7 @@ int kvm_vgic_v4_unset_forwarding(struct kvm *kvm,
>> int irq,
>>
>> int vgic_v4_load(struct kvm_vcpu *vcpu);
>> void vgic_v4_commit(struct kvm_vcpu *vcpu);
>> -int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db);
>> +int vgic_v4_put(struct kvm_vcpu *vcpu);
>>
>> bool vgic_state_is_nested(struct kvm_vcpu *vcpu);
>>
>>
>> Of course, it is totally untested... ;-) But I like that the doorbell
>> request is solely driven by the WFI state, and we avoid leaking the
>> knowledge outside of the vgic code.
>
> I'm happy with this approach and will have another look tomorrow. It'd
> also be great if Xiang can give this one a go on the appropriate HW.
Ok, I will test this approach and report back with the result when finished.
>
> Thanks,
> Zenghui
> .
>
* Re: [PATCH] KVM: arm64: vgic-v4: Consistently request doorbell irq for blocking vCPU
2023-07-12 13:49 ` Marc Zyngier
2023-07-12 15:56 ` Zenghui Yu
@ 2023-07-12 20:14 ` Oliver Upton
2023-07-13 5:57 ` chenxiang (M)
2 siblings, 0 replies; 13+ messages in thread
From: Oliver Upton @ 2023-07-12 20:14 UTC (permalink / raw)
To: Marc Zyngier
Cc: Zenghui Yu, kvmarm, James Morse, Suzuki K Poulose, stable,
Xiang Chen
On Wed, Jul 12, 2023 at 02:49:15PM +0100, Marc Zyngier wrote:
> > Given that making VPE resident on the vcpu block path (i.e., in
> > kvm_vcpu_halt()) makes little sense (right?) and leads to this sort of
> > problem, a crude idea is that we can probably keep track of the
> > "nested" vgic_v4_{put,load} calls (instead of a single vpe->resident
> > flag) and keep VPE *not resident* on the whole block path (like what we
> > had before commit 8e01d9a396e6). And we then rely on
> > kvm_vcpu_wfi/vgic_v4_load to actually schedule the VPE on...
>
> I'm not sure about the nested tracking part, but it's easy enough to
> have a vcpu flag indicating that we're in WFI. So an *alternative* to
> the current fix would be something like this:
Yeah, I like your approach better. I've gone ahead and backed out my
change and can take this instead once someone tests it out :)
--
Thanks,
Oliver
* Re: [PATCH] KVM: arm64: vgic-v4: Consistently request doorbell irq for blocking vCPU
2023-07-12 13:49 ` Marc Zyngier
2023-07-12 15:56 ` Zenghui Yu
2023-07-12 20:14 ` Oliver Upton
@ 2023-07-13 5:57 ` chenxiang (M)
2023-07-13 6:01 ` Oliver Upton
2 siblings, 1 reply; 13+ messages in thread
From: chenxiang (M) @ 2023-07-13 5:57 UTC (permalink / raw)
To: Marc Zyngier, Zenghui Yu
Cc: Oliver Upton, kvmarm, James Morse, Suzuki K Poulose, stable
Hi,
On Wed, 2023/7/12 21:49, Marc Zyngier wrote:
> On Wed, 12 Jul 2023 13:09:45 +0100,
> Zenghui Yu <yuzenghui@huawei.com> wrote:
>> On 2023/7/11 15:26, Oliver Upton wrote:
>>> On Tue, Jul 11, 2023 at 08:23:25AM +0100, Marc Zyngier wrote:
>>>> On Mon, 10 Jul 2023 18:55:53 +0100,
>>>> Oliver Upton <oliver.upton@linux.dev> wrote:
>>>>> Xiang reports that VMs occasionally fail to boot on GICv4.1 systems when
>>>>> running a preemptible kernel, as it is possible that a vCPU is blocked
>>>>> without requesting a doorbell interrupt.
>>>>>
>>>>> The issue is that any preemption that occurs between vgic_v4_put() and
>>>>> schedule() on the block path will mark the vPE as nonresident and *not*
>>>>> request a doorbell irq.
>>>> It'd be worth spelling out. You need to go via *three* schedule()
>>>> calls: one to be preempted (with DB set), one to be made resident
>>>> again, and then the final one in kvm_vcpu_halt(), clearing the DB on
>>>> vcpu_put() due to the bug.
>>> Yeah, a bit lazy in the wording. What I had meant to imply was
>>> preemption happening after the doorbell is set up and before the thread
>>> has an opportunity to explicitly schedule out. Perhaps I should just say
>>> that.
>>>
>>>>> Fix it by consistently requesting a doorbell irq in the vcpu put path if
>>>>> the vCPU is blocking.
>> Yup. Agreed!
>>
>>>>> While this technically means we could drop the
>>>>> early doorbell irq request in kvm_vcpu_wfi(), deliberately leave it
>>>>> intact such that vCPU halt polling can properly detect the wakeup
>>>>> condition before actually scheduling out a vCPU.
>> Yeah, just like what we did in commit 07ab0f8d9a12 ("KVM: Call
>> kvm_arch_vcpu_blocking early into the blocking sequence").
>>
>> My only concern is that if the preemption happens before halt polling,
>> we would enter the polling loop with VPE already resident on the RD and
>> can't recognize any firing GICv4.x virtual interrupts (targeting this
>> VPE) in polling. [1]
> The status of the pending bit is recorded in pending_last, so we don't
> lose what was snapshot at the point of hitting WFI. But we indeed
> don't have any idea for something firing during the polling loop.
>
>> Given that making VPE resident on the vcpu block path (i.e., in
>> kvm_vcpu_halt()) makes little sense (right?) and leads to this sort of
>> problem, a crude idea is that we can probably keep track of the
>> "nested" vgic_v4_{put,load} calls (instead of a single vpe->resident
>> flag) and keep VPE *not resident* on the whole block path (like what we
>> had before commit 8e01d9a396e6). And we then rely on
>> kvm_vcpu_wfi/vgic_v4_load to actually schedule the VPE on...
> I'm not sure about the nested tracking part, but it's easy enough to
> have a vcpu flag indicating that we're in WFI. So an *alternative* to
> the current fix would be something like this:
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index f54ba0a63669..417a0e85456b 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -817,6 +817,8 @@ struct kvm_vcpu_arch {
> #define DBG_SS_ACTIVE_PENDING __vcpu_single_flag(sflags, BIT(5))
> /* PMUSERENR for the guest EL0 is on physical CPU */
> #define PMUSERENR_ON_CPU __vcpu_single_flag(sflags, BIT(6))
> +/* WFI instruction trapped */
> +#define IN_WFI __vcpu_single_flag(sflags, BIT(7))
>
> /* vcpu entered with HCR_EL2.E2H set */
> #define VCPU_HCR_E2H __vcpu_single_flag(oflags, BIT(0))
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 236c5f1c9090..cf208d30a9ea 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -725,13 +725,15 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu)
> */
> preempt_disable();
> kvm_vgic_vmcr_sync(vcpu);
> - vgic_v4_put(vcpu, true);
> + vcpu_set_flag(vcpu, IN_WFI);
> + vgic_v4_put(vcpu);
> preempt_enable();
>
> kvm_vcpu_halt(vcpu);
> vcpu_clear_flag(vcpu, IN_WFIT);
>
> preempt_disable();
> + vcpu_clear_flag(vcpu, IN_WFI);
> vgic_v4_load(vcpu);
> preempt_enable();
> }
> @@ -799,7 +801,7 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
> if (kvm_check_request(KVM_REQ_RELOAD_GICv4, vcpu)) {
> /* The distributor enable bits were changed */
> preempt_disable();
> - vgic_v4_put(vcpu, false);
> + vgic_v4_put(vcpu);
> vgic_v4_load(vcpu);
> preempt_enable();
> }
> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
> index 49d35618d576..df61ead7c757 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
> @@ -780,7 +780,7 @@ void vgic_v3_put(struct kvm_vcpu *vcpu)
> * done a vgic_v4_put) and when running a nested guest (the
> * vPE was never resident in order to generate a doorbell).
> */
> - WARN_ON(vgic_v4_put(vcpu, false));
> + WARN_ON(vgic_v4_put(vcpu));
>
> vgic_v3_vmcr_sync(vcpu);
>
> diff --git a/arch/arm64/kvm/vgic/vgic-v4.c b/arch/arm64/kvm/vgic/vgic-v4.c
> index c1c28fe680ba..339a55194b2c 100644
> --- a/arch/arm64/kvm/vgic/vgic-v4.c
> +++ b/arch/arm64/kvm/vgic/vgic-v4.c
> @@ -336,14 +336,14 @@ void vgic_v4_teardown(struct kvm *kvm)
> its_vm->vpes = NULL;
> }
>
> -int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db)
> +int vgic_v4_put(struct kvm_vcpu *vcpu)
> {
> struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe;
>
> if (!vgic_supports_direct_msis(vcpu->kvm) || !vpe->resident)
> return 0;
>
> - return its_make_vpe_non_resident(vpe, need_db);
> + return its_make_vpe_non_resident(vpe, !!vcpu_get_flag(vcpu, IN_WFI));
> }
>
> int vgic_v4_load(struct kvm_vcpu *vcpu)
> @@ -354,6 +354,9 @@ int vgic_v4_load(struct kvm_vcpu *vcpu)
> if (!vgic_supports_direct_msis(vcpu->kvm) || vpe->resident)
> return 0;
>
> + if (vcpu_get_flag(vcpu, IN_WFI))
> + return 0;
> +
> /*
> * Before making the VPE resident, make sure the redistributor
> * corresponding to our current CPU expects us here. See the
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 9b91a8135dac..765d801d1ddc 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -446,7 +446,7 @@ int kvm_vgic_v4_unset_forwarding(struct kvm *kvm, int irq,
>
> int vgic_v4_load(struct kvm_vcpu *vcpu);
> void vgic_v4_commit(struct kvm_vcpu *vcpu);
> -int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db);
> +int vgic_v4_put(struct kvm_vcpu *vcpu);
>
> bool vgic_state_is_nested(struct kvm_vcpu *vcpu);
>
>
> Of course, it is totally untested... ;-) But I like that the doorbell
> request is solely driven by the WFI state, and we avoid leaking the
> knowledge outside of the vgic code.
I have tested this approach and it also solves the issue. Please feel
free to add:
Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
>
> Thoughts?
>
> M.
>
* Re: [PATCH] KVM: arm64: vgic-v4: Consistently request doorbell irq for blocking vCPU
2023-07-13 5:57 ` chenxiang (M)
@ 2023-07-13 6:01 ` Oliver Upton
2023-07-13 7:11 ` Marc Zyngier
0 siblings, 1 reply; 13+ messages in thread
From: Oliver Upton @ 2023-07-13 6:01 UTC (permalink / raw)
To: chenxiang (M)
Cc: Marc Zyngier, Zenghui Yu, kvmarm, James Morse, Suzuki K Poulose,
stable
On Thu, Jul 13, 2023 at 01:57:46PM +0800, chenxiang (M) wrote:
> > Of course, it is totally untested... ;-) But I like that the doorbell
> > request is solely driven by the WFI state, and we avoid leaking the
> > knowledge outside of the vgic code.
>
> I have tested this approach and it also solves the issue. Please feel free
> to add:
> Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
Excellent, thanks for testing a couple of iterations for us here. Marc,
do you want to add a changelog to this?
--
Thanks,
Oliver
* Re: [PATCH] KVM: arm64: vgic-v4: Consistently request doorbell irq for blocking vCPU
2023-07-13 6:01 ` Oliver Upton
@ 2023-07-13 7:11 ` Marc Zyngier
0 siblings, 0 replies; 13+ messages in thread
From: Marc Zyngier @ 2023-07-13 7:11 UTC (permalink / raw)
To: Oliver Upton
Cc: chenxiang (M), Zenghui Yu, kvmarm, James Morse, Suzuki K Poulose,
stable
On Thu, 13 Jul 2023 07:01:39 +0100,
Oliver Upton <oliver.upton@linux.dev> wrote:
>
> On Thu, Jul 13, 2023 at 01:57:46PM +0800, chenxiang (M) wrote:
> > > Of course, it is totally untested... ;-) But I like that the doorbell
> > > request is solely driven by the WFI state, and we avoid leaking the
> > > knowledge outside of the vgic code.
> >
> > I have tested this approach and it also solves the issue. Please feel free
> > to add:
> > Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
>
> Excellent, thanks for testing a couple of iterations for us here. Marc,
> do you want to add a changelog to this?
Done [1]. I've nicked most of your changelog, so you get a CDB tag ;-)
Thanks,
M.
[1] https://lore.kernel.org/all/20230713070657.3873244-1-maz@kernel.org/
--
Without deviation from the norm, progress is not possible.