* Re: [PATCH] KVM: arm64: vgic-v4: Consistently request doorbell irq for blocking vCPU
2023-07-12 13:49 ` Marc Zyngier
@ 2023-07-12 15:56 ` Zenghui Yu
2023-07-13 2:38 ` chenxiang (M)
2023-07-12 20:14 ` Oliver Upton
2023-07-13 5:57 ` chenxiang (M)
2 siblings, 1 reply; 13+ messages in thread
From: Zenghui Yu @ 2023-07-12 15:56 UTC (permalink / raw)
To: Marc Zyngier, Zenghui Yu
Cc: Oliver Upton, kvmarm, James Morse, Suzuki K Poulose, stable,
Xiang Chen
On 2023/7/12 21:49, Marc Zyngier wrote:
> On Wed, 12 Jul 2023 13:09:45 +0100,
> Zenghui Yu <yuzenghui@huawei.com> wrote:
>>
>> On 2023/7/11 15:26, Oliver Upton wrote:
>>> On Tue, Jul 11, 2023 at 08:23:25AM +0100, Marc Zyngier wrote:
>>>> On Mon, 10 Jul 2023 18:55:53 +0100,
>>>> Oliver Upton <oliver.upton@linux.dev> wrote:
>>>>>
>>>>> Xiang reports that VMs occasionally fail to boot on GICv4.1 systems when
>>>>> running a preemptible kernel, as it is possible that a vCPU is blocked
>>>>> without requesting a doorbell interrupt.
>>>>>
>>>>> The issue is that any preemption that occurs between vgic_v4_put() and
>>>>> schedule() on the block path will mark the vPE as nonresident and *not*
>>>>> request a doorbell irq.
>>>>
>>>> It'd be worth spelling out. You need to go via *three* schedule()
>>>> calls: one to be preempted (with DB set), one to be made resident
>>>> again, and then the final one in kvm_vcpu_halt(), clearing the DB on
>>>> vcpu_put() due to the bug.
>>>
>>> Yeah, a bit lazy in the wording. What I had meant to imply was
>>> preemption happening after the doorbell is set up and before the thread
>>> has an opportunity to explicitly schedule out. Perhaps I should just say
>>> that.
>>>
>>>>>
>>>>> Fix it by consistently requesting a doorbell irq in the vcpu put path if
>>>>> the vCPU is blocking.
>>
>> Yup. Agreed!
>>
>>>>> While this technically means we could drop the
>>>>> early doorbell irq request in kvm_vcpu_wfi(), deliberately leave it
>>>>> intact such that vCPU halt polling can properly detect the wakeup
>>>>> condition before actually scheduling out a vCPU.
>>
>> Yeah, just like what we did in commit 07ab0f8d9a12 ("KVM: Call
>> kvm_arch_vcpu_blocking early into the blocking sequence").
>>
>> My only concern is that if the preemption happens before halt polling,
>> we would enter the polling loop with VPE already resident on the RD and
>> can't recognize any firing GICv4.x virtual interrupts (targeting this
>> VPE) in polling. [1]
>
> The status of the pending bit is recorded in pending_last, so we don't
> lose what was snapshot at the point of hitting WFI. But we indeed
> don't have any idea for something firing during the polling loop.
>
>> Given that making VPE resident on the vcpu block path (i.e., in
>> kvm_vcpu_halt()) makes little sense (right?) and leads to this sort of
>> problem, a crude idea is that we can probably keep track of the
>> "nested" vgic_v4_{put,load} calls (instead of a single vpe->resident
>> flag) and keep VPE *not resident* on the whole block path (like what we
>> had before commit 8e01d9a396e6). And we then rely on
>> kvm_vcpu_wfi/vgic_v4_load to actually schedule the VPE on...
>
> I'm not sure about the nested tracking part, but it's easy enough to
> have a vcpu flag indicating that we're in WFI. So an *alternative* to
> the current fix would be something like this:
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index f54ba0a63669..417a0e85456b 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -817,6 +817,8 @@ struct kvm_vcpu_arch {
> #define DBG_SS_ACTIVE_PENDING __vcpu_single_flag(sflags, BIT(5))
> /* PMUSERENR for the guest EL0 is on physical CPU */
> #define PMUSERENR_ON_CPU __vcpu_single_flag(sflags, BIT(6))
> +/* WFI instruction trapped */
> +#define IN_WFI __vcpu_single_flag(sflags, BIT(7))
Ah, trust me that I was thinking about exactly the same vcpu flag
when writing the last email. ;-) So here is my Ack for this
alternative, thanks Marc for your quick reply!
>
> /* vcpu entered with HCR_EL2.E2H set */
> #define VCPU_HCR_E2H __vcpu_single_flag(oflags, BIT(0))
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 236c5f1c9090..cf208d30a9ea 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -725,13 +725,15 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu)
> */
> preempt_disable();
> kvm_vgic_vmcr_sync(vcpu);
> - vgic_v4_put(vcpu, true);
> + vcpu_set_flag(vcpu, IN_WFI);
> + vgic_v4_put(vcpu);
> preempt_enable();
>
> kvm_vcpu_halt(vcpu);
> vcpu_clear_flag(vcpu, IN_WFIT);
>
> preempt_disable();
> + vcpu_clear_flag(vcpu, IN_WFI);
> vgic_v4_load(vcpu);
> preempt_enable();
> }
> @@ -799,7 +801,7 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
> if (kvm_check_request(KVM_REQ_RELOAD_GICv4, vcpu)) {
> /* The distributor enable bits were changed */
> preempt_disable();
> - vgic_v4_put(vcpu, false);
> + vgic_v4_put(vcpu);
> vgic_v4_load(vcpu);
> preempt_enable();
> }
> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
> index 49d35618d576..df61ead7c757 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
> @@ -780,7 +780,7 @@ void vgic_v3_put(struct kvm_vcpu *vcpu)
> * done a vgic_v4_put) and when running a nested guest (the
> * vPE was never resident in order to generate a doorbell).
> */
> - WARN_ON(vgic_v4_put(vcpu, false));
> + WARN_ON(vgic_v4_put(vcpu));
>
> vgic_v3_vmcr_sync(vcpu);
>
> diff --git a/arch/arm64/kvm/vgic/vgic-v4.c b/arch/arm64/kvm/vgic/vgic-v4.c
> index c1c28fe680ba..339a55194b2c 100644
> --- a/arch/arm64/kvm/vgic/vgic-v4.c
> +++ b/arch/arm64/kvm/vgic/vgic-v4.c
> @@ -336,14 +336,14 @@ void vgic_v4_teardown(struct kvm *kvm)
> its_vm->vpes = NULL;
> }
>
> -int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db)
> +int vgic_v4_put(struct kvm_vcpu *vcpu)
> {
> struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe;
>
> if (!vgic_supports_direct_msis(vcpu->kvm) || !vpe->resident)
> return 0;
>
> - return its_make_vpe_non_resident(vpe, need_db);
> + return its_make_vpe_non_resident(vpe, !!vcpu_get_flag(vcpu, IN_WFI));
> }
>
> int vgic_v4_load(struct kvm_vcpu *vcpu)
> @@ -354,6 +354,9 @@ int vgic_v4_load(struct kvm_vcpu *vcpu)
> if (!vgic_supports_direct_msis(vcpu->kvm) || vpe->resident)
> return 0;
>
> + if (vcpu_get_flag(vcpu, IN_WFI))
> + return 0;
> +
> /*
> * Before making the VPE resident, make sure the redistributor
> * corresponding to our current CPU expects us here. See the
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 9b91a8135dac..765d801d1ddc 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -446,7 +446,7 @@ int kvm_vgic_v4_unset_forwarding(struct kvm *kvm, int irq,
>
> int vgic_v4_load(struct kvm_vcpu *vcpu);
> void vgic_v4_commit(struct kvm_vcpu *vcpu);
> -int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db);
> +int vgic_v4_put(struct kvm_vcpu *vcpu);
>
> bool vgic_state_is_nested(struct kvm_vcpu *vcpu);
>
>
> Of course, it is totally untested... ;-) But I like that the doorbell
> request is solely driven by the WFI state, and we avoid leaking the
> knowledge outside of the vgic code.
I'm happy with this approach and will have another look tomorrow. It'd
also be great if Xiang can give this one a go on the appropriate HW.
Thanks,
Zenghui
^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] KVM: arm64: vgic-v4: Consistently request doorbell irq for blocking vCPU
2023-07-12 15:56 ` Zenghui Yu
@ 2023-07-13 2:38 ` chenxiang (M)
0 siblings, 0 replies; 13+ messages in thread
From: chenxiang (M) @ 2023-07-13 2:38 UTC (permalink / raw)
To: Zenghui Yu, Marc Zyngier, Zenghui Yu
Cc: Oliver Upton, kvmarm, James Morse, Suzuki K Poulose, stable
On Wed, 2023/7/12 23:56, Zenghui Yu wrote:
> On 2023/7/12 21:49, Marc Zyngier wrote:
>> On Wed, 12 Jul 2023 13:09:45 +0100,
>> Zenghui Yu <yuzenghui@huawei.com> wrote:
>>>
>>> On 2023/7/11 15:26, Oliver Upton wrote:
>>>> On Tue, Jul 11, 2023 at 08:23:25AM +0100, Marc Zyngier wrote:
>>>>> On Mon, 10 Jul 2023 18:55:53 +0100,
>>>>> Oliver Upton <oliver.upton@linux.dev> wrote:
>>>>>>
>>>>>> Xiang reports that VMs occasionally fail to boot on GICv4.1
>>>>>> systems when
>>>>>> running a preemptible kernel, as it is possible that a vCPU is
>>>>>> blocked
>>>>>> without requesting a doorbell interrupt.
>>>>>>
>>>>>> The issue is that any preemption that occurs between
>>>>>> vgic_v4_put() and
>>>>>> schedule() on the block path will mark the vPE as nonresident and
>>>>>> *not*
>>>>>> request a doorbell irq.
>>>>>
>>>>> It'd be worth spelling out. You need to go via *three* schedule()
>>>>> calls: one to be preempted (with DB set), one to be made resident
>>>>> again, and then the final one in kvm_vcpu_halt(), clearing the DB on
>>>>> vcpu_put() due to the bug.
>>>>
>>>> Yeah, a bit lazy in the wording. What I had meant to imply was
>>>> preemption happening after the doorbell is set up and before the
>>>> thread
>>>> has an opportunity to explicitly schedule out. Perhaps I should
>>>> just say
>>>> that.
>>>>
>>>>>>
>>>>>> Fix it by consistently requesting a doorbell irq in the vcpu put
>>>>>> path if
>>>>>> the vCPU is blocking.
>>>
>>> Yup. Agreed!
>>>
>>>>>> While this technically means we could drop the
>>>>>> early doorbell irq request in kvm_vcpu_wfi(), deliberately leave it
>>>>>> intact such that vCPU halt polling can properly detect the wakeup
>>>>>> condition before actually scheduling out a vCPU.
>>>
>>> Yeah, just like what we did in commit 07ab0f8d9a12 ("KVM: Call
>>> kvm_arch_vcpu_blocking early into the blocking sequence").
>>>
>>> My only concern is that if the preemption happens before halt polling,
>>> we would enter the polling loop with VPE already resident on the RD and
>>> can't recognize any firing GICv4.x virtual interrupts (targeting this
>>> VPE) in polling. [1]
>>
>> The status of the pending bit is recorded in pending_last, so we don't
>> lose what was snapshot at the point of hitting WFI. But we indeed
>> don't have any idea for something firing during the polling loop.
>>
>>> Given that making VPE resident on the vcpu block path (i.e., in
>>> kvm_vcpu_halt()) makes little sense (right?) and leads to this sort of
>>> problem, a crude idea is that we can probably keep track of the
>>> "nested" vgic_v4_{put,load} calls (instead of a single vpe->resident
>>> flag) and keep VPE *not resident* on the whole block path (like what we
>>> had before commit 8e01d9a396e6). And we then rely on
>>> kvm_vcpu_wfi/vgic_v4_load to actually schedule the VPE on...
>>
>> I'm not sure about the nested tracking part, but it's easy enough to
>> have a vcpu flag indicating that we're in WFI. So an *alternative* to
>> the current fix would be something like this:
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h
>> b/arch/arm64/include/asm/kvm_host.h
>> index f54ba0a63669..417a0e85456b 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -817,6 +817,8 @@ struct kvm_vcpu_arch {
>> #define DBG_SS_ACTIVE_PENDING __vcpu_single_flag(sflags, BIT(5))
>> /* PMUSERENR for the guest EL0 is on physical CPU */
>> #define PMUSERENR_ON_CPU __vcpu_single_flag(sflags, BIT(6))
>> +/* WFI instruction trapped */
>> +#define IN_WFI __vcpu_single_flag(sflags, BIT(7))
>
> Ah, trust me that I was thinking about exactly the same vcpu flag
> when writing the last email. ;-) So here is my Ack for this
> alternative, thanks Marc for your quick reply!
>
>>
>> /* vcpu entered with HCR_EL2.E2H set */
>> #define VCPU_HCR_E2H __vcpu_single_flag(oflags, BIT(0))
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index 236c5f1c9090..cf208d30a9ea 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -725,13 +725,15 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu)
>> */
>> preempt_disable();
>> kvm_vgic_vmcr_sync(vcpu);
>> - vgic_v4_put(vcpu, true);
>> + vcpu_set_flag(vcpu, IN_WFI);
>> + vgic_v4_put(vcpu);
>> preempt_enable();
>>
>> kvm_vcpu_halt(vcpu);
>> vcpu_clear_flag(vcpu, IN_WFIT);
>>
>> preempt_disable();
>> + vcpu_clear_flag(vcpu, IN_WFI);
>> vgic_v4_load(vcpu);
>> preempt_enable();
>> }
>> @@ -799,7 +801,7 @@ static int check_vcpu_requests(struct kvm_vcpu
>> *vcpu)
>> if (kvm_check_request(KVM_REQ_RELOAD_GICv4, vcpu)) {
>> /* The distributor enable bits were changed */
>> preempt_disable();
>> - vgic_v4_put(vcpu, false);
>> + vgic_v4_put(vcpu);
>> vgic_v4_load(vcpu);
>> preempt_enable();
>> }
>> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c
>> b/arch/arm64/kvm/vgic/vgic-v3.c
>> index 49d35618d576..df61ead7c757 100644
>> --- a/arch/arm64/kvm/vgic/vgic-v3.c
>> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
>> @@ -780,7 +780,7 @@ void vgic_v3_put(struct kvm_vcpu *vcpu)
>> * done a vgic_v4_put) and when running a nested guest (the
>> * vPE was never resident in order to generate a doorbell).
>> */
>> - WARN_ON(vgic_v4_put(vcpu, false));
>> + WARN_ON(vgic_v4_put(vcpu));
>>
>> vgic_v3_vmcr_sync(vcpu);
>>
>> diff --git a/arch/arm64/kvm/vgic/vgic-v4.c
>> b/arch/arm64/kvm/vgic/vgic-v4.c
>> index c1c28fe680ba..339a55194b2c 100644
>> --- a/arch/arm64/kvm/vgic/vgic-v4.c
>> +++ b/arch/arm64/kvm/vgic/vgic-v4.c
>> @@ -336,14 +336,14 @@ void vgic_v4_teardown(struct kvm *kvm)
>> its_vm->vpes = NULL;
>> }
>>
>> -int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db)
>> +int vgic_v4_put(struct kvm_vcpu *vcpu)
>> {
>> struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe;
>>
>> if (!vgic_supports_direct_msis(vcpu->kvm) || !vpe->resident)
>> return 0;
>>
>> - return its_make_vpe_non_resident(vpe, need_db);
>> + return its_make_vpe_non_resident(vpe, !!vcpu_get_flag(vcpu,
>> IN_WFI));
>> }
>>
>> int vgic_v4_load(struct kvm_vcpu *vcpu)
>> @@ -354,6 +354,9 @@ int vgic_v4_load(struct kvm_vcpu *vcpu)
>> if (!vgic_supports_direct_msis(vcpu->kvm) || vpe->resident)
>> return 0;
>>
>> + if (vcpu_get_flag(vcpu, IN_WFI))
>> + return 0;
>> +
>> /*
>> * Before making the VPE resident, make sure the redistributor
>> * corresponding to our current CPU expects us here. See the
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index 9b91a8135dac..765d801d1ddc 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -446,7 +446,7 @@ int kvm_vgic_v4_unset_forwarding(struct kvm *kvm,
>> int irq,
>>
>> int vgic_v4_load(struct kvm_vcpu *vcpu);
>> void vgic_v4_commit(struct kvm_vcpu *vcpu);
>> -int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db);
>> +int vgic_v4_put(struct kvm_vcpu *vcpu);
>>
>> bool vgic_state_is_nested(struct kvm_vcpu *vcpu);
>>
>>
>> Of course, it is totally untested... ;-) But I like that the doorbell
>> request is solely driven by the WFI state, and we avoid leaking the
>> knowledge outside of the vgic code.
>
> I'm happy with this approach and will have another look tomorrow. It'd
> also be great if Xiang can give this one a go on the appropriate HW.
Ok, I will test this approach and report back with the result when finished.
>
> Thanks,
> Zenghui
> .
>
* Re: [PATCH] KVM: arm64: vgic-v4: Consistently request doorbell irq for blocking vCPU
2023-07-12 13:49 ` Marc Zyngier
2023-07-12 15:56 ` Zenghui Yu
@ 2023-07-12 20:14 ` Oliver Upton
2023-07-13 5:57 ` chenxiang (M)
2 siblings, 0 replies; 13+ messages in thread
From: Oliver Upton @ 2023-07-12 20:14 UTC (permalink / raw)
To: Marc Zyngier
Cc: Zenghui Yu, kvmarm, James Morse, Suzuki K Poulose, stable,
Xiang Chen
On Wed, Jul 12, 2023 at 02:49:15PM +0100, Marc Zyngier wrote:
> > Given that making VPE resident on the vcpu block path (i.e., in
> > kvm_vcpu_halt()) makes little sense (right?) and leads to this sort of
> > problem, a crude idea is that we can probably keep track of the
> > "nested" vgic_v4_{put,load} calls (instead of a single vpe->resident
> > flag) and keep VPE *not resident* on the whole block path (like what we
> > had before commit 8e01d9a396e6). And we then rely on
> > kvm_vcpu_wfi/vgic_v4_load to actually schedule the VPE on...
>
> I'm not sure about the nested tracking part, but it's easy enough to
> have a vcpu flag indicating that we're in WFI. So an *alternative* to
> the current fix would be something like this:
Yeah, I like your approach better. I've gone ahead and backed out my
change and can take this instead once someone tests it out :)
--
Thanks,
Oliver
* Re: [PATCH] KVM: arm64: vgic-v4: Consistently request doorbell irq for blocking vCPU
2023-07-12 13:49 ` Marc Zyngier
2023-07-12 15:56 ` Zenghui Yu
2023-07-12 20:14 ` Oliver Upton
@ 2023-07-13 5:57 ` chenxiang (M)
2023-07-13 6:01 ` Oliver Upton
2 siblings, 1 reply; 13+ messages in thread
From: chenxiang (M) @ 2023-07-13 5:57 UTC (permalink / raw)
To: Marc Zyngier, Zenghui Yu
Cc: Oliver Upton, kvmarm, James Morse, Suzuki K Poulose, stable
Hi,
On Wed, 2023/7/12 21:49, Marc Zyngier wrote:
> On Wed, 12 Jul 2023 13:09:45 +0100,
> Zenghui Yu <yuzenghui@huawei.com> wrote:
>> On 2023/7/11 15:26, Oliver Upton wrote:
>>> On Tue, Jul 11, 2023 at 08:23:25AM +0100, Marc Zyngier wrote:
>>>> On Mon, 10 Jul 2023 18:55:53 +0100,
>>>> Oliver Upton <oliver.upton@linux.dev> wrote:
>>>>> Xiang reports that VMs occasionally fail to boot on GICv4.1 systems when
>>>>> running a preemptible kernel, as it is possible that a vCPU is blocked
>>>>> without requesting a doorbell interrupt.
>>>>>
>>>>> The issue is that any preemption that occurs between vgic_v4_put() and
>>>>> schedule() on the block path will mark the vPE as nonresident and *not*
>>>>> request a doorbell irq.
>>>> It'd be worth spelling out. You need to go via *three* schedule()
>>>> calls: one to be preempted (with DB set), one to be made resident
>>>> again, and then the final one in kvm_vcpu_halt(), clearing the DB on
>>>> vcpu_put() due to the bug.
>>> Yeah, a bit lazy in the wording. What I had meant to imply was
>>> preemption happening after the doorbell is set up and before the thread
>>> has an opportunity to explicitly schedule out. Perhaps I should just say
>>> that.
>>>
>>>>> Fix it by consistently requesting a doorbell irq in the vcpu put path if
>>>>> the vCPU is blocking.
>> Yup. Agreed!
>>
>>>>> While this technically means we could drop the
>>>>> early doorbell irq request in kvm_vcpu_wfi(), deliberately leave it
>>>>> intact such that vCPU halt polling can properly detect the wakeup
>>>>> condition before actually scheduling out a vCPU.
>> Yeah, just like what we did in commit 07ab0f8d9a12 ("KVM: Call
>> kvm_arch_vcpu_blocking early into the blocking sequence").
>>
>> My only concern is that if the preemption happens before halt polling,
>> we would enter the polling loop with VPE already resident on the RD and
>> can't recognize any firing GICv4.x virtual interrupts (targeting this
>> VPE) in polling. [1]
> The status of the pending bit is recorded in pending_last, so we don't
> lose what was snapshot at the point of hitting WFI. But we indeed
> don't have any idea for something firing during the polling loop.
>
>> Given that making VPE resident on the vcpu block path (i.e., in
>> kvm_vcpu_halt()) makes little sense (right?) and leads to this sort of
>> problem, a crude idea is that we can probably keep track of the
>> "nested" vgic_v4_{put,load} calls (instead of a single vpe->resident
>> flag) and keep VPE *not resident* on the whole block path (like what we
>> had before commit 8e01d9a396e6). And we then rely on
>> kvm_vcpu_wfi/vgic_v4_load to actually schedule the VPE on...
> I'm not sure about the nested tracking part, but it's easy enough to
> have a vcpu flag indicating that we're in WFI. So an *alternative* to
> the current fix would be something like this:
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index f54ba0a63669..417a0e85456b 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -817,6 +817,8 @@ struct kvm_vcpu_arch {
> #define DBG_SS_ACTIVE_PENDING __vcpu_single_flag(sflags, BIT(5))
> /* PMUSERENR for the guest EL0 is on physical CPU */
> #define PMUSERENR_ON_CPU __vcpu_single_flag(sflags, BIT(6))
> +/* WFI instruction trapped */
> +#define IN_WFI __vcpu_single_flag(sflags, BIT(7))
>
> /* vcpu entered with HCR_EL2.E2H set */
> #define VCPU_HCR_E2H __vcpu_single_flag(oflags, BIT(0))
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 236c5f1c9090..cf208d30a9ea 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -725,13 +725,15 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu)
> */
> preempt_disable();
> kvm_vgic_vmcr_sync(vcpu);
> - vgic_v4_put(vcpu, true);
> + vcpu_set_flag(vcpu, IN_WFI);
> + vgic_v4_put(vcpu);
> preempt_enable();
>
> kvm_vcpu_halt(vcpu);
> vcpu_clear_flag(vcpu, IN_WFIT);
>
> preempt_disable();
> + vcpu_clear_flag(vcpu, IN_WFI);
> vgic_v4_load(vcpu);
> preempt_enable();
> }
> @@ -799,7 +801,7 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
> if (kvm_check_request(KVM_REQ_RELOAD_GICv4, vcpu)) {
> /* The distributor enable bits were changed */
> preempt_disable();
> - vgic_v4_put(vcpu, false);
> + vgic_v4_put(vcpu);
> vgic_v4_load(vcpu);
> preempt_enable();
> }
> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
> index 49d35618d576..df61ead7c757 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
> @@ -780,7 +780,7 @@ void vgic_v3_put(struct kvm_vcpu *vcpu)
> * done a vgic_v4_put) and when running a nested guest (the
> * vPE was never resident in order to generate a doorbell).
> */
> - WARN_ON(vgic_v4_put(vcpu, false));
> + WARN_ON(vgic_v4_put(vcpu));
>
> vgic_v3_vmcr_sync(vcpu);
>
> diff --git a/arch/arm64/kvm/vgic/vgic-v4.c b/arch/arm64/kvm/vgic/vgic-v4.c
> index c1c28fe680ba..339a55194b2c 100644
> --- a/arch/arm64/kvm/vgic/vgic-v4.c
> +++ b/arch/arm64/kvm/vgic/vgic-v4.c
> @@ -336,14 +336,14 @@ void vgic_v4_teardown(struct kvm *kvm)
> its_vm->vpes = NULL;
> }
>
> -int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db)
> +int vgic_v4_put(struct kvm_vcpu *vcpu)
> {
> struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe;
>
> if (!vgic_supports_direct_msis(vcpu->kvm) || !vpe->resident)
> return 0;
>
> - return its_make_vpe_non_resident(vpe, need_db);
> + return its_make_vpe_non_resident(vpe, !!vcpu_get_flag(vcpu, IN_WFI));
> }
>
> int vgic_v4_load(struct kvm_vcpu *vcpu)
> @@ -354,6 +354,9 @@ int vgic_v4_load(struct kvm_vcpu *vcpu)
> if (!vgic_supports_direct_msis(vcpu->kvm) || vpe->resident)
> return 0;
>
> + if (vcpu_get_flag(vcpu, IN_WFI))
> + return 0;
> +
> /*
> * Before making the VPE resident, make sure the redistributor
> * corresponding to our current CPU expects us here. See the
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 9b91a8135dac..765d801d1ddc 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -446,7 +446,7 @@ int kvm_vgic_v4_unset_forwarding(struct kvm *kvm, int irq,
>
> int vgic_v4_load(struct kvm_vcpu *vcpu);
> void vgic_v4_commit(struct kvm_vcpu *vcpu);
> -int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db);
> +int vgic_v4_put(struct kvm_vcpu *vcpu);
>
> bool vgic_state_is_nested(struct kvm_vcpu *vcpu);
>
>
> Of course, it is totally untested... ;-) But I like that the doorbell
> request is solely driven by the WFI state, and we avoid leaking the
> knowledge outside of the vgic code.
I have tested this approach and it also solves the issue. Please feel
free to add:
Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
>
> Thoughts?
>
> M.
>
* Re: [PATCH] KVM: arm64: vgic-v4: Consistently request doorbell irq for blocking vCPU
2023-07-13 5:57 ` chenxiang (M)
@ 2023-07-13 6:01 ` Oliver Upton
2023-07-13 7:11 ` Marc Zyngier
0 siblings, 1 reply; 13+ messages in thread
From: Oliver Upton @ 2023-07-13 6:01 UTC (permalink / raw)
To: chenxiang (M)
Cc: Marc Zyngier, Zenghui Yu, kvmarm, James Morse, Suzuki K Poulose,
stable
On Thu, Jul 13, 2023 at 01:57:46PM +0800, chenxiang (M) wrote:
> > Of course, it is totally untested... ;-) But I like that the doorbell
> > request is solely driven by the WFI state, and we avoid leaking the
> > knowledge outside of the vgic code.
>
> I have tested this approach and it also solves the issue. Please feel free
> to add:
> Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
Excellent, thanks for testing a couple of iterations for us here. Marc,
do you want to add a changelog to this?
--
Thanks,
Oliver
* Re: [PATCH] KVM: arm64: vgic-v4: Consistently request doorbell irq for blocking vCPU
2023-07-13 6:01 ` Oliver Upton
@ 2023-07-13 7:11 ` Marc Zyngier
0 siblings, 0 replies; 13+ messages in thread
From: Marc Zyngier @ 2023-07-13 7:11 UTC (permalink / raw)
To: Oliver Upton
Cc: chenxiang (M), Zenghui Yu, kvmarm, James Morse, Suzuki K Poulose,
stable
On Thu, 13 Jul 2023 07:01:39 +0100,
Oliver Upton <oliver.upton@linux.dev> wrote:
>
> On Thu, Jul 13, 2023 at 01:57:46PM +0800, chenxiang (M) wrote:
> > > Of course, it is totally untested... ;-) But I like that the doorbell
> > > request is solely driven by the WFI state, and we avoid leaking the
> > > knowledge outside of the vgic code.
> >
> > I have tested this approach and it also solves the issue. Please feel free
> > to add:
> > Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
>
> Excellent, thanks for testing a couple of iterations for us here. Marc,
> do you want to add a changelog to this?
Done [1]. I've nicked most of your changelog, so you get a CDB tag ;-)
Thanks,
M.
[1] https://lore.kernel.org/all/20230713070657.3873244-1-maz@kernel.org/
--
Without deviation from the norm, progress is not possible.