Date: Tue, 14 Oct 2025 17:32:56 +0100
Message-ID: <87frblxx3b.wl-maz@kernel.org>
From: Marc Zyngier
To: Kunkun Jiang
Cc: Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
    Catalin Marinas, Will Deacon,
    "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)",
    "open list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)",
    open list, "wanghaibin.wang@huawei.com"
Subject: Re: [Question] Received vtimer interrupt but ISTATUS is 0
In-Reply-To: <14b30b59-12bb-fc69-8447-aae86fcafcd1@huawei.com>
References: <14b30b59-12bb-fc69-8447-aae86fcafcd1@huawei.com>

On Tue, 14 Oct 2025 15:45:37 +0100,
Kunkun Jiang wrote:
>
> Hi all,
>
> I'm having a very strange problem that can be simplified to a vtimer
> interrupt being received but ISTATUS is 0. Why does this happen?
> According to analysis, it may be the timer condition is met and the
> interrupt is generated. Maybe some actions (cancel timer?) are done in
> the VM, ISTATUS becomes 0 and the hardware needs to clear the
> interrupt. But the clear command is sent too slowly, the OS has
> already read the ICC_IAR_EL1. So hypervisor executed
> kvm_arch_timer_handler but ISTATUS is 0.

If what you describe is accurate, and the HW takes so long to retire
the timer interrupt that we cannot trust having taken an interrupt,
how long until we can trust that what we have is actually correct?

Given that it takes a full exit from the guest before we can handle
the interrupt, I am rather puzzled that you observe this sort of bad
behaviour on modern HW. You either have an insanely fast CPU with a
very slow GIC, or a very bizarre machine (a bit like a ThunderX -- not
a compliment).

How does it work when context-switching from a vcpu that has a pending
timer interrupt to one that doesn't? Do you also see spurious
interrupts?

> The code flow is as follows:
> kvm_arch_timer_handler
>     ->if (kvm_timer_should_fire)
>         ->the value of SYS_CNTV_CTL is 0b001(ISTATUS=0,IMASK=0,ENABLE=1)
>     ->return IRQ_HANDLED
>
> Because ISTATUS is 0, kvm_timer_update_irq will not be executed to
> inject this interrupt into the VM. Since EOImode is 1 and the vtimer
> interrupt has IRQD_FORWARDED_TO_VCPU flag, hypervisor will not write
> ICC_DIR_EL1 to deactivate the interrupt. This interrupt remains in
> active state, blocking subsequent interrupts from being
> processed. Fortunately, in kvm_timer_vcpu_load it will be determined
> again whether an interrupt needs to be injected into the VM. But the
> delay will definitely increase.

Right, so you are at most a context switch away from your next
interrupt, just like in the !vcpu case. While not ideal, that's not
fatal.

>
> What I want to discuss is the solution to this problem. My solution is
> to add a deactivation action:
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index dbd74e4885e2..46baba531d51 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -228,8 +228,13 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
>  	else
>  		ctx = map.direct_ptimer;
>
> -	if (kvm_timer_should_fire(ctx))
> +	if (kvm_timer_should_fire(ctx)) {
>  		kvm_timer_update_irq(vcpu, true, ctx);
> +	} else {
> +		struct vgic_irq *irq;
> +		irq = vgic_get_vcpu_irq(vcpu, timer_irq(timer_ctx));
> +		gic_write_dir(irq->hwintid);
> +	}
>
>  	if (userspace_irqchip(vcpu->kvm) &&
>  	    !static_branch_unlikely(&has_gic_active_state))
>
> If you have any new ideas or other solutions to this problem, please
> let me know.

That's not right. For a start, this is GICv3 specific, and will break
on everything else. Also, why the round-trip via the vgic_irq when you
already have the interrupt number that has fired *as a parameter*?

Finally, this breaks with NV, as you could have switched between EL1
and EL2 timers, and since you cannot trust you are in the correct
interrupt context (interrupt firing out of context), you can't trust
irq->hwintid either, as the mappings will have changed.

Something like the patchlet below should do the trick, but I'm
definitely not happy about this sort of sorry hacks.

	M.

diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index dbd74e4885e24..3db7c6bdffbc0 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -206,6 +206,13 @@ static void soft_timer_cancel(struct hrtimer *hrt)
 	hrtimer_cancel(hrt);
 }
 
+static void set_timer_irq_phys_active(struct arch_timer_context *ctx, bool active)
+{
+	int r;
+	r = irq_set_irqchip_state(ctx->host_timer_irq, IRQCHIP_STATE_ACTIVE, active);
+	WARN_ON(r);
+}
+
 static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
 {
 	struct kvm_vcpu *vcpu = *(struct kvm_vcpu **)dev_id;
@@ -230,6 +237,8 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
 
 	if (kvm_timer_should_fire(ctx))
 		kvm_timer_update_irq(vcpu, true, ctx);
+	else
+		set_timer_irq_phys_active(ctx, false);
 
 	if (userspace_irqchip(vcpu->kvm) &&
 	    !static_branch_unlikely(&has_gic_active_state))
@@ -659,13 +668,6 @@ static void timer_restore_state(struct arch_timer_context *ctx)
 	local_irq_restore(flags);
 }
 
-static inline void set_timer_irq_phys_active(struct arch_timer_context *ctx, bool active)
-{
-	int r;
-	r = irq_set_irqchip_state(ctx->host_timer_irq, IRQCHIP_STATE_ACTIVE, active);
-	WARN_ON(r);
-}
-
 static void kvm_timer_vcpu_load_gic(struct arch_timer_context *ctx)
 {
 	struct kvm_vcpu *vcpu = ctx->vcpu;

-- 
Jazz isn't dead. It just smells funny.
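
For readers following the thread, the condition being discussed can be
sketched in userspace C. This is only an illustration, not kernel code:
it assumes the architectural CNTV_CTL_EL0 layout (ENABLE in bit 0, IMASK
in bit 1, ISTATUS in bit 2), and the names ctl_should_fire,
handler_action_for and the enum are hypothetical, made up for this
example. It shows why the reported value 0b001 leads to the "do not
inject" branch, which is where the patchlet above clears the physical
active state instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed CNTV_CTL_EL0 layout: ENABLE [0], IMASK [1], ISTATUS [2]. */
#define CTL_ENABLE   (UINT64_C(1) << 0)
#define CTL_IMASK    (UINT64_C(1) << 1)
#define CTL_ISTATUS  (UINT64_C(1) << 2)

static_assert((CTL_ENABLE | CTL_IMASK | CTL_ISTATUS) == 0x7,
	      "control bits are expected to occupy bits [2:0]");

/*
 * Hypothetical helper: the timer output is considered asserted only
 * when the timer is enabled, its condition is met and it is not masked.
 */
static bool ctl_should_fire(uint64_t ctl)
{
	return (ctl & CTL_ENABLE) && (ctl & CTL_ISTATUS) && !(ctl & CTL_IMASK);
}

/* Hypothetical labels for the two branches of the handler. */
enum handler_action {
	INJECT_TO_GUEST,	/* condition still holds: forward to the vcpu */
	DEACTIVATE_ON_HOST,	/* spurious: clear the physical active state */
};

static enum handler_action handler_action_for(uint64_t ctl)
{
	return ctl_should_fire(ctl) ? INJECT_TO_GUEST : DEACTIVATE_ON_HOST;
}
```

With the value from the report, handler_action_for(0x1) selects
DEACTIVATE_ON_HOST, while 0x5 (ENABLE and ISTATUS set, IMASK clear)
selects INJECT_TO_GUEST.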