Date: Wed, 01 Oct 2025 08:23:09 +0100
Message-ID: <86seg3ytk2.wl-maz@kernel.org>
From: Marc Zyngier
To: Volodymyr Babchuk
Cc: "linux-arm-kernel@lists.infradead.org", Dmytro Terletskyi, kvmarm
Subject: Re: KVM: Nested VGIC emulation leads to infinite IRQ exceptions
In-Reply-To: <87bjmrprvq.fsf@epam.com>
References: <87bjmrprvq.fsf@epam.com>

Please use the kvmarm mailing list for KVM related discussions (added
for your convenience).

On Tue, 30 Sep 2025 22:11:54 +0100,
Volodymyr Babchuk wrote:
>
>
> Hi all,
>
> We are trying to run Xen as KVM nested hypervisor (again!) and we have
> encountered strange issue with GIC nested emulation. I am certain that
> we'll dig to the root cause, but probably someone on the ML will save us
> a couple of days of debugging by providing some insights.
>
> So, setup is following: QEMU 9.2 is running Xen 4.20 with KVM (latest
> Linux master branch) as accelerator.

9.2 is an odd choice, especially as it doesn't have any NV support.
ISTR that 10.1 is the first version to have some NV support, although
without E2H0 enablement, which I expect Xen requires.

Anyway, if you're already running something, then I expect you've
patched QEMU to death to get there.

> QEMU provides a couple of virtio
> devices to the VM and some of these devices are passed through to DomU
> (we had to hook these devices to vSMMU, but this is another
> story). Sometimes we observe the following sequence of events:
>
> 1. DomU gets IRQ from a virtio device
> 2. DomU acknowledges the IRQ by reading IAR1 register
> 3. DomU is unable to deactivate the IRQ (there is no write to the
>    EOI register)
>
> We are not sure why this happens, but our current theory is that DomU's
> vvcpu0 is interrupted during handling of the IRQ by Xen's timer
> interrupt. Also, we are not able to catch this specific moment in KVM
> trace because of lots of lost events. Anyways, after this we are seeing
> the following loop:
>
> 4. vCPU switches to Xen via IRQ Exception
> 5. Xen reads IAR1 to get IRQ nr, but gets 1023 (aka no IRQs)
> 6. Xen issues ERET to return back to guest
> 7. GOTO 4.

What is the configuration of ICH_HCR_EL2 in Xen at the point of
reading IAR?
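
For reference, a minimal sketch of how a captured ICH_HCR_EL2 value could be decoded. The bit positions follow the GICv3 architecture description of the register. The helper names (`ich_hcr_enabled`, `ich_hcr_eoicount`, `ich_hcr_dump`) are hypothetical and are not taken from Xen or KVM.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit positions as described for ICH_HCR_EL2 in the GICv3 architecture. */
#define ICH_HCR_EN        (1ULL << 0)  /* virtual CPU interface enable */
#define ICH_HCR_UIE       (1ULL << 1)  /* underflow maintenance IRQ enable */
#define ICH_HCR_LRENPIE   (1ULL << 2)  /* LR entry not present IRQ enable */
#define ICH_HCR_NPIE      (1ULL << 3)  /* no pending IRQ enable */
#define ICH_HCR_VGRP0EIE  (1ULL << 4)  /* vGroup0 enabled IRQ enable */
#define ICH_HCR_VGRP0DIE  (1ULL << 5)  /* vGroup0 disabled IRQ enable */
#define ICH_HCR_VGRP1EIE  (1ULL << 6)  /* vGroup1 enabled IRQ enable */
#define ICH_HCR_VGRP1DIE  (1ULL << 7)  /* vGroup1 disabled IRQ enable */
#define ICH_HCR_EOICOUNT_SHIFT 27
#define ICH_HCR_EOICOUNT_MASK  0x1fULL

/* Hypothetical lookup table, for illustration only. */
static const struct {
	uint64_t bit;
	const char *name;
} ich_hcr_bits[] = {
	{ ICH_HCR_EN, "En" },             { ICH_HCR_UIE, "UIE" },
	{ ICH_HCR_LRENPIE, "LRENPIE" },   { ICH_HCR_NPIE, "NPIE" },
	{ ICH_HCR_VGRP0EIE, "VGrp0EIE" }, { ICH_HCR_VGRP0DIE, "VGrp0DIE" },
	{ ICH_HCR_VGRP1EIE, "VGrp1EIE" }, { ICH_HCR_VGRP1DIE, "VGrp1DIE" },
};

/* Returns 1 if the virtual CPU interface is enabled, 0 otherwise. */
static int ich_hcr_enabled(uint64_t hcr)
{
	return (hcr & ICH_HCR_EN) != 0;
}

/* Extracts the EOIcount field (bits [31:27]). */
static unsigned int ich_hcr_eoicount(uint64_t hcr)
{
	return (unsigned int)((hcr >> ICH_HCR_EOICOUNT_SHIFT) &
			      ICH_HCR_EOICOUNT_MASK);
}

/* Prints the names of the set control bits, then EOIcount. */
static void ich_hcr_dump(uint64_t hcr)
{
	size_t i;

	printf("ICH_HCR_EL2=0x%016llx:", (unsigned long long)hcr);
	for (i = 0; i < sizeof(ich_hcr_bits) / sizeof(ich_hcr_bits[0]); i++)
		if (hcr & ich_hcr_bits[i].bit)
			printf(" %s", ich_hcr_bits[i].name);
	printf(" EOIcount=%u\n", ich_hcr_eoicount(hcr));
}
```

If En reads as 0 at the point where IAR is read, the virtual CPU interface is disabled and IAR returning 1023 would be expected rather than surprising.
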
My hunch is that you are taking a maintenance interrupt and disabling
the virtual CPU interface on taking that interrupt, which of course
results in no interrupt to acknowledge. Reading ICH_MISR_EL2 at the
point of entering Xen should give you a clue as to why this is
happening -- assuming my hunch is correct.

> This basically renders the whole vCPU stuck. Also we noticed that DomU's
> vvCPU is stuck right after accessing virtio mmio register. So looks like
> this is what happens:
>
> 1. QEMU sends virtio IRQ to the VM
> 2. Xen handles the IRQ and injects it into DomU
> 3. DomU tries to handle it and accesses a virtio mmio register
> 4. This produces a memory fault that leads to switch back to KVM (and
>    then to QEMU of course) so QEMU can handle MMIO access
> 5. When QEMU continues vCPU thread, it immediately gets switched back to
>    vEL2 (probably due to timer IRQ, but this is my speculation)
> 6. the vCPU is spinning in the aforementioned loop
>
> Looks like this happens because of empty LRs, but we still haven't
> confirmed this because the issue is not 100% reproducible. I'll be glad
> to hear any suggestions.

I don't think this is likely. If the guest hasn't done an EOI, then
the interrupt should still be in the LR in the context of DomU, with
at least an Active state. You want to try and look at what Xen sees
there.

> This is a part of the KVM trace, where you can see that vCPU in question
> tries to perform ERET to Linux in DomU but is being brought back to
> vEL2. In this particular case this is vCPU1 / vvCPU0. I filtered out
> other vCPUs to reduce clutter.
>
> qemu-system-aar-41290 [000] d.... 12023.695620: kvm_entry: PC: 0x00000a0000267c80
> qemu-system-aar-41290 [000] d.... 12023.695620: vgic_update_irq_pending: VCPU: 1, IRQ 25, level: 0
> qemu-system-aar-41290 [000] d.... 12023.695621: kvm_get_timer_map: VCPU: 1, dv: 2, dp: 3, ev: 1, ep: 0
> qemu-system-aar-41290 [000] d.... 12023.695621: kvm_timer_emulate: arch_timer_ctx_index: 1 (should_fire: 1)
> qemu-system-aar-41290 [000] d.... 12023.695621: kvm_timer_emulate: arch_timer_ctx_index: 0 (should_fire: 0)
> qemu-system-aar-41290 [000] ..... 12023.695621: kvm_exit: TRAP: HSR_EC: 0x001a (ERET), PC: 0x00000a00002674e0

[...]

There isn't much to go on here, as we mostly see the timers, which do
not help at all. I'd suggest you look at the maintenance interrupt,
and how Xen manipulates ICH_HCR_EL2, but that's the extent of what I
can do.

To help you further, I'd need a reproducer. I've asked you more than
once to provide a way to reproduce your setup, but got no answer. The
Debian package doesn't boot (it just messes up grub), and I don't have
the time to learn how to deal with Xen from scratch.

Until then, you'll have to apply some debugging by yourself.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.