linux-arm-kernel.lists.infradead.org archive mirror
* [Question] Received vtimer interrupt but ISTATUS is 0
@ 2025-10-14 14:45 Kunkun Jiang
  2025-10-14 16:32 ` Marc Zyngier
  0 siblings, 1 reply; 4+ messages in thread
From: Kunkun Jiang @ 2025-10-14 14:45 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon
  Cc: moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64),
	open list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64), open list,
	wanghaibin.wang@huawei.com, Zenghui Yu

Hi all,

I'm having a very strange problem that can be simplified to a vtimer 
interrupt being received while ISTATUS is 0. Why does this happen? 
My analysis is that the timer condition was met and the interrupt was 
generated, then some action in the VM (cancelling the timer?) cleared 
the condition, so ISTATUS became 0 and the hardware needed to clear the 
interrupt. But the clear command was sent too slowly: the OS had already 
read ICC_IAR_EL1, so the hypervisor executed kvm_arch_timer_handler with 
ISTATUS equal to 0.
The code flow is as follows:
kvm_arch_timer_handler
     ->if (kvm_timer_should_fire)
         ->the value of SYS_CNTV_CTL is 0b001(ISTATUS=0,IMASK=0,ENABLE=1)
     ->return IRQ_HANDLED

Because ISTATUS is 0, kvm_timer_update_irq will not be executed to 
inject this interrupt into the VM. Since EOImode is 1 and the vtimer 
interrupt has the IRQD_FORWARDED_TO_VCPU flag, the hypervisor will not 
write ICC_DIR_EL1 to deactivate the interrupt. This interrupt remains in 
the active state, blocking subsequent interrupts from being processed. 
Fortunately, kvm_timer_vcpu_load will determine again whether an 
interrupt needs to be injected into the VM, but the delay will 
definitely increase.

What I want to discuss is a solution to this problem. My proposal is 
to add a deactivation step:
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index dbd74e4885e2..46baba531d51 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -228,8 +228,13 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
         else
                 ctx = map.direct_ptimer;

-       if (kvm_timer_should_fire(ctx))
+       if (kvm_timer_should_fire(ctx)) {
                 kvm_timer_update_irq(vcpu, true, ctx);
+       } else {
+               struct vgic_irq *irq;
+               irq = vgic_get_vcpu_irq(vcpu, timer_irq(ctx));
+               gic_write_dir(irq->hwintid);
+       }

         if (userspace_irqchip(vcpu->kvm) &&
             !static_branch_unlikely(&has_gic_active_state))

If you have any new ideas or other solutions to this problem, please let 
me know.

Looking forward to your reply.

Kunkun Jiang



* Re: [Question] Received vtimer interrupt but ISTATUS is 0
  2025-10-14 14:45 [Question] Received vtimer interrupt but ISTATUS is 0 Kunkun Jiang
@ 2025-10-14 16:32 ` Marc Zyngier
  2025-10-21 13:38   ` Kunkun Jiang
  0 siblings, 1 reply; 4+ messages in thread
From: Marc Zyngier @ 2025-10-14 16:32 UTC (permalink / raw)
  To: Kunkun Jiang
  Cc: Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon,
	moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64),
	open list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64), open list,
	wanghaibin.wang@huawei.com

On Tue, 14 Oct 2025 15:45:37 +0100,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
> 
> Hi all,
> 
> I'm having a very strange problem that can be simplified to a vtimer
> interrupt being received but ISTATUS is 0. Why does this happen?
> According to analysis, it may be the timer condition is met and the
> interrupt is generated. Maybe some actions(cancel timer?) are done in
> the VM, ISTATUS becomes 0 and the hardware needs to clear the
> interrupt. But the clear command is sent too slowly, the OS has
> already read the ICC_IAR_EL1. So hypervisor executed
> kvm_arch_timer_handler but ISTATUS is 0.

If what you describe is accurate, and the HW takes so long to
retire the timer interrupt that we cannot trust having taken an
interrupt, how long until we can trust that what we have is actually
correct?

Given that it takes a full exit from the guest before we can handle
the interrupt, I am rather puzzled that you observe this sort of bad
behaviour on modern HW. You either have an insanely fast CPU with a
very slow GIC, or a very bizarre machine (a bit like a ThunderX -- not
a compliment).

How does it work when context-switching from a vcpu that has a pending
timer interrupt to one that doesn't? Do you also see spurious
interrupts?

> The code flow is as follows:
> kvm_arch_timer_handler
>     ->if (kvm_timer_should_fire)
>         ->the value of SYS_CNTV_CTL is 0b001(ISTATUS=0,IMASK=0,ENABLE=1)
>     ->return IRQ_HANDLED
> 
> Because ISTATUS is 0, kvm_timer_update_irq will not be executed to
> inject this interrupt into the VM. Since EOImode is 1 and the vtimer
> interrupt has IRQD_FORWARDED_TO_VCPU flag, hypervisor will not write
> ICC_DIR_EL1 to deactivate the interrupt. This interrupt remains in
> active state, blocking subsequent interrupts from being
> processed. Fortunately, in kvm_timer_vcpu_load it will be determined
> again whether an interrupt needs to be injected into the VM. But the
> delay will definitely increase.

Right, so you are at most a context switch away from your next
interrupt, just like in the !vcpu case. While not ideal, that's not
fatal.

> 
> What I want to discuss is the solution to this problem. My solution is
> to add a deactivation action:
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index dbd74e4885e2..46baba531d51 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -228,8 +228,13 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
>         else
>                 ctx = map.direct_ptimer;
> 
> -       if (kvm_timer_should_fire(ctx))
> +       if (kvm_timer_should_fire(ctx)) {
>                 kvm_timer_update_irq(vcpu, true, ctx);
> +       } else {
> +               struct vgic_irq *irq;
> +               irq = vgic_get_vcpu_irq(vcpu, timer_irq(ctx));
> +               gic_write_dir(irq->hwintid);
> +       }
> 
>         if (userspace_irqchip(vcpu->kvm) &&
>             !static_branch_unlikely(&has_gic_active_state))
> 
> If you have any new ideas or other solutions to this problem, please
> let me know.

That's not right.

For a start, this is GICv3 specific, and will break on everything
else. Also, why the round-trip via the vgic_irq when you already have
the interrupt number that has fired *as a parameter*?

Finally, this breaks with NV, as you could have switched between EL1
and EL2 timers, and since you cannot trust you are in the correct
interrupt context (interrupt firing out of context), you can't trust
irq->hwintid either, as the mappings will have changed.

Something like the patchlet below should do the trick, but I'm
definitely not happy about this sort of sorry hacks.

	M.

diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index dbd74e4885e24..3db7c6bdffbc0 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -206,6 +206,13 @@ static void soft_timer_cancel(struct hrtimer *hrt)
 	hrtimer_cancel(hrt);
 }
 
+static void set_timer_irq_phys_active(struct arch_timer_context *ctx, bool active)
+{
+	int r;
+	r = irq_set_irqchip_state(ctx->host_timer_irq, IRQCHIP_STATE_ACTIVE, active);
+	WARN_ON(r);
+}
+
 static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
 {
 	struct kvm_vcpu *vcpu = *(struct kvm_vcpu **)dev_id;
@@ -230,6 +237,8 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
 
 	if (kvm_timer_should_fire(ctx))
 		kvm_timer_update_irq(vcpu, true, ctx);
+	else
+		set_timer_irq_phys_active(ctx, false);
 
 	if (userspace_irqchip(vcpu->kvm) &&
 	    !static_branch_unlikely(&has_gic_active_state))
@@ -659,13 +668,6 @@ static void timer_restore_state(struct arch_timer_context *ctx)
 	local_irq_restore(flags);
 }
 
-static inline void set_timer_irq_phys_active(struct arch_timer_context *ctx, bool active)
-{
-	int r;
-	r = irq_set_irqchip_state(ctx->host_timer_irq, IRQCHIP_STATE_ACTIVE, active);
-	WARN_ON(r);
-}
-
 static void kvm_timer_vcpu_load_gic(struct arch_timer_context *ctx)
 {
 	struct kvm_vcpu *vcpu = ctx->vcpu;

-- 
Jazz isn't dead. It just smells funny.



* Re: [Question] Received vtimer interrupt but ISTATUS is 0
  2025-10-14 16:32 ` Marc Zyngier
@ 2025-10-21 13:38   ` Kunkun Jiang
  2025-10-21 14:46     ` Marc Zyngier
  0 siblings, 1 reply; 4+ messages in thread
From: Kunkun Jiang @ 2025-10-21 13:38 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon,
	moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64),
	open list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64), open list,
	wanghaibin.wang@huawei.com

Hi Marc,

On 2025/10/15 0:32, Marc Zyngier wrote:
> On Tue, 14 Oct 2025 15:45:37 +0100,
> Kunkun Jiang <jiangkunkun@huawei.com> wrote:
>>
>> Hi all,
>>
>> I'm having a very strange problem that can be simplified to a vtimer
>> interrupt being received but ISTATUS is 0. Why does this happen?
>> According to analysis, it may be the timer condition is met and the
>> interrupt is generated. Maybe some actions(cancel timer?) are done in
>> the VM, ISTATUS becomes 0 and the hardware needs to clear the
>> interrupt. But the clear command is sent too slowly, the OS has
>> already read the ICC_IAR_EL1. So hypervisor executed
>> kvm_arch_timer_handler but ISTATUS is 0.
> 
> If what you describe is accurate, and that the HW takes so long to
> retire the timer interrupt that we cannot trust having taken an
> interrupt, how long until we can trust that what we have is actually
> correct?
> 
> Given that it takes a full exit from the guest before we can handle
> the interrupt, I am rather puzzled that you observe this sort of bad
> behaviours on modern HW. You either have an insanely fast CPU with a
> very slow GIC, or a very bizarre machine (a bit like a ThunderX -- not
> a compliment).
I added dump_stack in the exception branch, and the following is the 
stack when the problem occurred.
> [ 2669.521569] Call trace:
> [ 2669.521577]  dump_backtrace+0x0/0x220
> [ 2669.521579]  show_stack+0x20/0x2c
> [ 2669.521583]  dump_stack+0xf0/0x138
> [ 2669.521588]  kvm_arch_timer_handler+0x138/0x194
> [ 2669.521592]  handle_percpu_devid_irq+0x90/0x1f4
> [ 2669.521598]  __handle_domain_irq+0x84/0xfc
> [ 2669.521600]  gic_handle_irq+0xfc/0x320
> [ 2669.521601]  el1_irq+0xb8/0x140
> [ 2669.521604]  kvm_arch_vcpu_ioctl_run+0x258/0x6fc
> [ 2669.521607]  kvm_vcpu_ioctl+0x334/0xa94
> [ 2669.521612]  __arm64_sys_ioctl+0xb0/0xf4
> [ 2669.521614]  el0_svc_common.constprop.0+0x7c/0x1bc
> [ 2669.521616]  do_el0_svc+0x2c/0xa4
> [ 2669.521619]  el0_svc+0x20/0x30
> [ 2669.521620]  el0_sync_handler+0xb0/0xb4
> [ 2669.521621]  el0_sync+0x160/0x180
By analyzing this stack, it should indeed take a full exit from the 
guest. Do you think this is a hardware issue?
> 
> How does it work when context-switching from a vcpu that has a pending
> timer interrupt to one that doesn't? Do you also see spurious
> interrupts?
I added a log under the 'if (!vcpu)' branch and tested it, but it never 
took this branch. In addition, I have pinned each vcpu to a core, with 
only one vcpu running per core.
> 
>> The code flow is as follows:
>> kvm_arch_timer_handler
>>      ->if (kvm_timer_should_fire)
>>          ->the value of SYS_CNTV_CTL is 0b001(ISTATUS=0,IMASK=0,ENABLE=1)
>>      ->return IRQ_HANDLED
>>
>> Because ISTATUS is 0, kvm_timer_update_irq will not be executed to
>> inject this interrupt into the VM. Since EOImode is 1 and the vtimer
>> interrupt has IRQD_FORWARDED_TO_VCPU flag, hypervisor will not write
>> ICC_DIR_EL1 to deactivate the interrupt. This interrupt remains in
>> active state, blocking subsequent interrupts from being
>> processed. Fortunately, in kvm_timer_vcpu_load it will be determined
>> again whether an interrupt needs to be injected into the VM. But the
>> delay will definitely increase.
> 
> Right, so you are at most a context switch away from your next
> interrupt, just like in the !vcpu case. While not ideal, that's not
> fatal.
> 
>>
>> What I want to discuss is the solution to this problem. My solution is
>> to add a deactivation action:
>> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
>> index dbd74e4885e2..46baba531d51 100644
>> --- a/arch/arm64/kvm/arch_timer.c
>> +++ b/arch/arm64/kvm/arch_timer.c
>> @@ -228,8 +228,13 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
>>          else
>>                  ctx = map.direct_ptimer;
>>
>> -       if (kvm_timer_should_fire(ctx))
>> +       if (kvm_timer_should_fire(ctx)) {
>>                  kvm_timer_update_irq(vcpu, true, ctx);
>> +       } else {
>> +               struct vgic_irq *irq;
>> +               irq = vgic_get_vcpu_irq(vcpu, timer_irq(ctx));
>> +               gic_write_dir(irq->hwintid);
>> +       }
>>
>>          if (userspace_irqchip(vcpu->kvm) &&
>>              !static_branch_unlikely(&has_gic_active_state))
>>
>> If you have any new ideas or other solutions to this problem, please
>> let me know.
> 
> That's not right.
> 
> For a start, this is GICv3 specific, and will break on everything
> else. Also, why the round-trip via the vgic_irq when you already have
> the interrupt number that has fired *as a parameter*?
> 
> Finally, this breaks with NV, as you could have switched between EL1
> and EL2 timers, and since you cannot trust you are in the correct
> interrupt context (interrupt firing out of context), you can't trust
> irq->hwintid either, as the mappings will have changed.
> 
> Something like the patchlet below should do the trick, but I'm
> definitely not happy about this sort of sorry hacks.
> 
> 	M.
> 
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index dbd74e4885e24..3db7c6bdffbc0 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -206,6 +206,13 @@ static void soft_timer_cancel(struct hrtimer *hrt)
>   	hrtimer_cancel(hrt);
>   }
>   
> +static void set_timer_irq_phys_active(struct arch_timer_context *ctx, bool active)
> +{
> +	int r;
> +	r = irq_set_irqchip_state(ctx->host_timer_irq, IRQCHIP_STATE_ACTIVE, active);
> +	WARN_ON(r);
> +}
> +
>   static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
>   {
>   	struct kvm_vcpu *vcpu = *(struct kvm_vcpu **)dev_id;
> @@ -230,6 +237,8 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
>   
>   	if (kvm_timer_should_fire(ctx))
>   		kvm_timer_update_irq(vcpu, true, ctx);
> +	else
> +		set_timer_irq_phys_active(ctx, false);
>   
>   	if (userspace_irqchip(vcpu->kvm) &&
>   	    !static_branch_unlikely(&has_gic_active_state))
> @@ -659,13 +668,6 @@ static void timer_restore_state(struct arch_timer_context *ctx)
>   	local_irq_restore(flags);
>   }
>   
> -static inline void set_timer_irq_phys_active(struct arch_timer_context *ctx, bool active)
> -{
> -	int r;
> -	r = irq_set_irqchip_state(ctx->host_timer_irq, IRQCHIP_STATE_ACTIVE, active);
> -	WARN_ON(r);
> -}
> -
>   static void kvm_timer_vcpu_load_gic(struct arch_timer_context *ctx)
>   {
>   	struct kvm_vcpu *vcpu = ctx->vcpu;
> 
After extensive testing, this patch resolved the issue I encountered.
Tested-by: Kunkun Jiang <jiangkunkun@huawei.com>

Thanks,
Kunkun Jiang



* Re: [Question] Received vtimer interrupt but ISTATUS is 0
  2025-10-21 13:38   ` Kunkun Jiang
@ 2025-10-21 14:46     ` Marc Zyngier
  0 siblings, 0 replies; 4+ messages in thread
From: Marc Zyngier @ 2025-10-21 14:46 UTC (permalink / raw)
  To: Kunkun Jiang
  Cc: Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon,
	moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64),
	open list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64), open list,
	wanghaibin.wang@huawei.com

On Tue, 21 Oct 2025 14:38:26 +0100,
Kunkun Jiang <jiangkunkun@huawei.com> wrote:
> 
> Hi Marc,
> 
> On 2025/10/15 0:32, Marc Zyngier wrote:
> > On Tue, 14 Oct 2025 15:45:37 +0100,
> > Kunkun Jiang <jiangkunkun@huawei.com> wrote:
> >> 
> >> Hi all,
> >> 
> >> I'm having a very strange problem that can be simplified to a vtimer
> >> interrupt being received but ISTATUS is 0. Why does this happen?
> >> According to analysis, it may be the timer condition is met and the
> >> interrupt is generated. Maybe some actions(cancel timer?) are done in
> >> the VM, ISTATUS becomes 0 and the hardware needs to clear the
> >> interrupt. But the clear command is sent too slowly, the OS has
> >> already read the ICC_IAR_EL1. So hypervisor executed
> >> kvm_arch_timer_handler but ISTATUS is 0.
> > 
> > If what you describe is accurate, and that the HW takes so long to
> > retire the timer interrupt that we cannot trust having taken an
> > interrupt, how long until we can trust that what we have is actually
> > correct?
> > 
> > Given that it takes a full exit from the guest before we can handle
> > the interrupt, I am rather puzzled that you observe this sort of bad
> > behaviours on modern HW. You either have an insanely fast CPU with a
> > very slow GIC, or a very bizarre machine (a bit like a ThunderX -- not
> > a compliment).
> I added dump_stack in the exception branch, and the following is the
> stack when the problem occurred.
> > [ 2669.521569] Call trace:
> > [ 2669.521577]  dump_backtrace+0x0/0x220
> > [ 2669.521579]  show_stack+0x20/0x2c
> > [ 2669.521583]  dump_stack+0xf0/0x138
> > [ 2669.521588]  kvm_arch_timer_handler+0x138/0x194
> > [ 2669.521592]  handle_percpu_devid_irq+0x90/0x1f4
> > [ 2669.521598]  __handle_domain_irq+0x84/0xfc
> > [ 2669.521600]  gic_handle_irq+0xfc/0x320
> > [ 2669.521601]  el1_irq+0xb8/0x140
> > [ 2669.521604]  kvm_arch_vcpu_ioctl_run+0x258/0x6fc
> > [ 2669.521607]  kvm_vcpu_ioctl+0x334/0xa94
> > [ 2669.521612]  __arm64_sys_ioctl+0xb0/0xf4
> > [ 2669.521614]  el0_svc_common.constprop.0+0x7c/0x1bc
> > [ 2669.521616]  do_el0_svc+0x2c/0xa4
> > [ 2669.521619]  el0_svc+0x20/0x30
> > [ 2669.521620]  el0_sync_handler+0xb0/0xb4
> > [ 2669.521621]  el0_sync+0x160/0x180
> By analyzing this stack, it should indeed take a full exit from the
> guest. Do you think this is a hardware issue?

Of course this is a HW issue. Your GIC is slow to retire a pending
interrupt, and you pay the consequences.

> > 
> > How does it work when context-switching from a vcpu that has a pending
> > timer interrupt to one that doesn't? Do you also see spurious
> > interrupts?
> I added a log under the 'if(!vcpu)' branch and tested it, but it did
> not go to this branch. In addition, I have set the vcpu to be bound to
> the core, and only one vcpu is running on one core.

Well, that's hardly testing the conditions I have outlined.

> > 
> >> The code flow is as follows:
> >> kvm_arch_timer_handler
> >>      ->if (kvm_timer_should_fire)
> >>          ->the value of SYS_CNTV_CTL is 0b001(ISTATUS=0,IMASK=0,ENABLE=1)
> >>      ->return IRQ_HANDLED
> >> 
> >> Because ISTATUS is 0, kvm_timer_update_irq will not be executed to
> >> inject this interrupt into the VM. Since EOImode is 1 and the vtimer
> >> interrupt has IRQD_FORWARDED_TO_VCPU flag, hypervisor will not write
> >> ICC_DIR_EL1 to deactivate the interrupt. This interrupt remains in
> >> active state, blocking subsequent interrupts from being
> >> processed. Fortunately, in kvm_timer_vcpu_load it will be determined
> >> again whether an interrupt needs to be injected into the VM. But the
> >> delay will definitely increase.
> > 
> > Right, so you are at most a context switch away from your next
> > interrupt, just like in the !vcpu case. While not ideal, that's not
> > fatal.
> > 
> >> 
> >> What I want to discuss is the solution to this problem. My solution is
> >> to add a deactivation action:
> >> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> >> index dbd74e4885e2..46baba531d51 100644
> >> --- a/arch/arm64/kvm/arch_timer.c
> >> +++ b/arch/arm64/kvm/arch_timer.c
> >> @@ -228,8 +228,13 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
> >>          else
> >>                  ctx = map.direct_ptimer;
> >> 
> >> -       if (kvm_timer_should_fire(ctx))
> >> +       if (kvm_timer_should_fire(ctx)) {
> >>                  kvm_timer_update_irq(vcpu, true, ctx);
> >> +       } else {
> >> +               struct vgic_irq *irq;
> >> +               irq = vgic_get_vcpu_irq(vcpu, timer_irq(ctx));
> >> +               gic_write_dir(irq->hwintid);
> >> +       }
> >> 
> >>          if (userspace_irqchip(vcpu->kvm) &&
> >>              !static_branch_unlikely(&has_gic_active_state))
> >> 
> >> If you have any new ideas or other solutions to this problem, please
> >> let me know.
> > 
> > That's not right.
> > 
> > For a start, this is GICv3 specific, and will break on everything
> > else. Also, why the round-trip via the vgic_irq when you already have
> > the interrupt number that has fired *as a parameter*?
> > 
> > Finally, this breaks with NV, as you could have switched between EL1
> > and EL2 timers, and since you cannot trust you are in the correct
> > interrupt context (interrupt firing out of context), you can't trust
> > irq->hwintid either, as the mappings will have changed.
> > 
> > Something like the patchlet below should do the trick, but I'm
> > definitely not happy about this sort of sorry hacks.
> > 
> > 	M.
> > 
> > diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> > index dbd74e4885e24..3db7c6bdffbc0 100644
> > --- a/arch/arm64/kvm/arch_timer.c
> > +++ b/arch/arm64/kvm/arch_timer.c
> > @@ -206,6 +206,13 @@ static void soft_timer_cancel(struct hrtimer *hrt)
> >  	hrtimer_cancel(hrt);
> >  }
> >  
> > +static void set_timer_irq_phys_active(struct arch_timer_context *ctx, bool active)
> > +{
> > +	int r;
> > +	r = irq_set_irqchip_state(ctx->host_timer_irq, IRQCHIP_STATE_ACTIVE, active);
> > +	WARN_ON(r);
> > +}
> > +
> >  static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
> >  {
> >  	struct kvm_vcpu *vcpu = *(struct kvm_vcpu **)dev_id;
> > @@ -230,6 +237,8 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
> >  
> >  	if (kvm_timer_should_fire(ctx))
> >  		kvm_timer_update_irq(vcpu, true, ctx);
> > +	else
> > +		set_timer_irq_phys_active(ctx, false);
> >  
> >  	if (userspace_irqchip(vcpu->kvm) &&
> >  	    !static_branch_unlikely(&has_gic_active_state))
> > @@ -659,13 +668,6 @@ static void timer_restore_state(struct arch_timer_context *ctx)
> >  	local_irq_restore(flags);
> >  }
> >  
> > -static inline void set_timer_irq_phys_active(struct arch_timer_context *ctx, bool active)
> > -{
> > -	int r;
> > -	r = irq_set_irqchip_state(ctx->host_timer_irq, IRQCHIP_STATE_ACTIVE, active);
> > -	WARN_ON(r);
> > -}
> > -
> >  static void kvm_timer_vcpu_load_gic(struct arch_timer_context *ctx)
> >  {
> >  	struct kvm_vcpu *vcpu = ctx->vcpu;
> > 
> After extensive testing, this patch was able to resolve the issue I
> encountered.
> Tested-by: Kunkun Jiang <jiangkunkun@huawei.com>

Just to be clear: a similar discussion took place over 5 years ago on
the same subject[1], and I was pretty clear about the conclusion.

There is no bug here. Only a slow HW implementation that leads to
suboptimal behaviours. There is no state loss, no lack of forward
progress, and the interrupts still get delivered in finite time. As
far as I am concerned, things work OK.

Thanks,

	M.

[1] https://lore.kernel.org/r/1595584037-6877-1-git-send-email-zhangshaokun@hisilicon.com

-- 
Without deviation from the norm, progress is not possible.


