From: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
To: Marc Zyngier <maz@kernel.org>
Cc: "linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>,
Dmytro Terletskyi <Dmytro_Terletskyi@epam.com>,
kvmarm <kvmarm@lists.linux.dev>
Subject: Re: KVM: Nested VGIC emulation leads to infinite IRQ exceptions
Date: Thu, 2 Oct 2025 12:29:42 +0000
Message-ID: <873481pjuz.fsf@epam.com>
In-Reply-To: <86seg3ytk2.wl-maz@kernel.org> (Marc Zyngier's message of "Wed, 01 Oct 2025 08:23:09 +0100")
Hi Marc,
Marc Zyngier <maz@kernel.org> writes:
> Please use the kvmarm mailing list for KVM related discussions (added
> for your convenience).
Oops, sorry. I missed that MAINTAINERS has two "L:" entries.
> On Tue, 30 Sep 2025 22:11:54 +0100,
> Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> wrote:
>>
>>
>> Hi all,
>>
>> We are trying to run Xen as KVM nested hypervisor (again!) and we have
>> encountered strange issue with GIC nested emulation. I am certain that
>> we'll dig to the root cause, but probably someone on the ML will save us
>> a couple of days of debugging by providing with some insights.
>>
>> So, setup is following: QEMU 9.2 is running Xen 4.20 with KVM (latest
>> Linux master branch) as accelerator.
>
> 9.2 is an odd choice, specially as it doesn't have any NV support.
> ISTR that 10.1 is the first version to have some NV support, although
> without E2H0 enablement which I expect Xen requires.
Yep, I had to patch QEMU to enable E2H0 (among other things).
>
> Anyway, if you're already running something, then I expect you're
> patched QEMU to death to get there.
You are certainly correct.
[...]
>
> To help you further, I'd need a reproducer. I've asked you more than
> once to provide a way to reproduce your setup, but got no answer. The
> Debian package doesn't boot (it just messes up grub), and I don't have
> the time to learn how to deal with Xen from scratch.
The current setup is quite complex, as it involves a whole Android build,
so there is no easy way to share a reproducer.
> Until then, you'll have to apply some debugging by yourself.
This is what Dmytro and I are doing, and it looks like I found the
problem. I added some more traces, and here we go:
Xen wants to return to the vvCPU:
qemu-system-aar-3378 [085] ..... 246.770716: kvm_inject_nested_exception: IRQ: esr_el2 0x0 elr_el2: 0xffffffc0010e5508 spsr_el2: 0x024000c5 (M: EL1h) hcr_el2: 807c663f
qemu-system-aar-3378 [085] ..... 246.770716: kvm_get_timer_map: VCPU: 1, dv: 2, dp: 3, ev: 1, ep: 0
qemu-system-aar-3378 [085] ..... 246.770716: kvm_timer_update_irq: VCPU: 1, IRQ 28, level 0
qemu-system-aar-3378 [085] ..... 246.770716: vgic_update_irq_pending: VCPU: 1, IRQ 28, level: 0
qemu-system-aar-3378 [085] ..... 246.770717: kvm_timer_update_irq: VCPU: 1, IRQ 26, level 1
We have a pending timer IRQ for Xen:
qemu-system-aar-3378 [085] ..... 246.770717: vgic_update_irq_pending: VCPU: 1, IRQ 26, level: 1
qemu-system-aar-3378 [085] d.... 246.770717: kvm_timer_restore_state: CTL: 0x000000 CVAL: 0x0 arch_timer_ctx_index: 2
qemu-system-aar-3378 [085] d.... 246.770717: kvm_timer_restore_state: CTL: 0x000005 CVAL: 0x3e6c59a71a95 arch_timer_ctx_index: 3
qemu-system-aar-3378 [085] ..... 246.770717: kvm_timer_emulate: arch_timer_ctx_index: 1 (should_fire: 1)
qemu-system-aar-3378 [085] ..... 246.770718: kvm_timer_emulate: arch_timer_ctx_index: 0 (should_fire: 0)
qemu-system-aar-3378 [085] d.... 246.770719: vgic_update_irq_pending: VCPU: 1, IRQ 25, level: 0
But we also have a bunch of ACTIVE interrupts that fill all available
LRs:
qemu-system-aar-3378 [085] d.... 246.770720: vgic_populate_lr: VCPU 1 lr 0 = 90a000000000004f
qemu-system-aar-3378 [085] d.... 246.770720: vgic_populate_lr: VCPU 1 lr 1 = 90a000000000004e
qemu-system-aar-3378 [085] d.... 246.770720: vgic_populate_lr: VCPU 1 lr 2 = d0a000000000004a
qemu-system-aar-3378 [085] d.... 246.770720: vgic_populate_lr: VCPU 1 lr 3 = d0a000000000004b
As all LR entries have the ACTIVE bit set, a read from IAR1 will return
1023, of course. The problem is that Xen itself can't deactivate these 4
IRQs, as they are directed to DomU, so DomU should deactivate them first.
But DomU can't do this, as it is never executed.
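For reference, here is a minimal sketch that decodes the four LR values from the trace above. It assumes the ICH_LR<n>_EL2 field layout from the GICv3 architecture (State in bits [63:62], HW in bit 61, Group in bit 60, Priority in bits [55:48], vINTID in bits [31:0]). The helper name decode_lr is made up for illustration and is not a KVM function.

```python
# Decode the ICH_LR<n>_EL2 values printed by the vgic_populate_lr trace.
# Assumed layout (GICv3): State [63:62], HW [61], Group [60],
# Priority [55:48], vINTID [31:0].

STATES = {
    0b00: "invalid",
    0b01: "pending",
    0b10: "active",
    0b11: "pending+active",
}


def decode_lr(value):
    """Split a 64-bit list register value into its main fields."""
    return {
        "state": STATES[(value >> 62) & 0x3],
        "hw": (value >> 61) & 0x1,
        "group": (value >> 60) & 0x1,
        "priority": (value >> 48) & 0xFF,
        "vintid": value & 0xFFFFFFFF,
    }


# Values copied from the trace (lr 0..3).
LR_VALUES = [
    0x90A000000000004F,
    0x90A000000000004E,
    0xD0A000000000004A,
    0xD0A000000000004B,
]

decoded = [decode_lr(v) for v in LR_VALUES]
for n, lr in enumerate(decoded):
    print(f"lr {n}: vINTID {lr['vintid']} state={lr['state']} "
          f"hw={lr['hw']} group={lr['group']} prio={lr['priority']:#x}")

# In this model, only an LR in the pending-only state can be acknowledged.
ackable = [lr for lr in decoded if lr["state"] == "pending"]
print("acknowledgeable LRs:", len(ackable))
```

Under that layout, lr 0 and lr 1 (vINTIDs 79 and 78) decode as active, lr 2 and lr 3 (vINTIDs 74 and 75) decode as pending+active, and none is in the pending-only state, which is consistent with IAR1 returning 1023.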
I am not sure what the correct fix is, but I see two options:
- Prioritize timer IRQs so they are always present in LRs
- De-prioritize ACTIVE IRQs so they are inserted into LRs last.
It looks like the second one is better.
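To illustrate the second option, here is a toy model of LR allocation. It is not KVM code: the Irq class, fill_lrs, and the two sort keys are hypothetical, the number of LRs is taken from the trace (lr 0..3), and the priority of the timer IRQ 26 is an assumption (set equal to the 0xa0 seen in the LRs).

```python
from dataclasses import dataclass

NR_LRS = 4  # assumption: four list registers, matching lr 0..3 in the trace


@dataclass
class Irq:
    intid: int
    priority: int  # lower value means higher priority
    active: bool
    pending: bool


def active_first(irq):
    # Model of an ordering that reproduces the trace: active IRQs sort first.
    return (not irq.active, irq.priority)


def active_last(irq):
    # Option 2: any IRQ with the active state sorts after non-active ones.
    return (irq.active, irq.priority)


def fill_lrs(irqs, key):
    """Return the INTIDs that fit into the LRs under the given ordering."""
    return [irq.intid for irq in sorted(irqs, key=key)[:NR_LRS]]


irqs = [
    Irq(79, 0xA0, active=True, pending=False),
    Irq(78, 0xA0, active=True, pending=False),
    Irq(74, 0xA0, active=True, pending=True),
    Irq(75, 0xA0, active=True, pending=True),
    Irq(26, 0xA0, active=False, pending=True),  # timer IRQ, priority assumed
]

print("active first:", fill_lrs(irqs, active_first))
print("active last: ", fill_lrs(irqs, active_last))
```

With the active-first ordering, the four active IRQs take every LR and IRQ 26 is left out. With the active-last ordering, IRQ 26 gets the first LR, so Xen has something to acknowledge.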
--
WBR, Volodymyr