From: "Philippe Mathieu-Daudé" <philmd@linaro.org>
To: David Woodhouse <dwmw2@infradead.org>,
qemu-devel@nongnu.org, Thomas Huth <thuth@redhat.com>
Cc: Paul Durrant <paul@xen.org>, Paolo Bonzini <pbonzini@redhat.com>,
Richard Henderson <richard.henderson@linaro.org>,
Eduardo Habkost <eduardo@habkost.net>,
"Michael S. Tsirkin" <mst@redhat.com>,
Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
Bernhard Beschow <shentey@gmail.com>
Subject: Re: [PATCH v2] hw/i386/pc: Fix level interrupt sharing for Xen event channel GSI
Date: Tue, 7 Jan 2025 17:44:56 +0100 [thread overview]
Message-ID: <f383b5c0-3bce-4c6c-bef7-ae34535083e8@linaro.org> (raw)
In-Reply-To: <975f9dab5718b441e6d4463a43845dc22b5ddf8f.camel@infradead.org>
(Cc'ing Bernhard)
On 7/1/25 17:32, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
>
> The system GSIs are not designed for sharing. One device might assert a
> shared interrupt with qemu_set_irq() and another might deassert it, and
> the level from the first device is lost.
>
> This could be solved by using a multiplexer which functions as an OR
> gate, much like the PCI code already implements for pci_set_irq() for
> muxing the INTx lines.
>
> Alternatively, it could be solved by having a 'resample' callback which
> is invoked when the interrupt is acked at the interrupt controller, and
> causes the devices to re-trigger the interrupt if it should still be
> pending. This is the model that VFIO in Linux uses, with a 'resampler'
> eventfd that actually unmasks the interrupt on the hardware device and
> thus triggers a new interrupt from it if needed. QEMU currently doesn't
> use that VFIO interface correctly, and just bashes on the resampler for
> every MMIO access to the device "just in case".
>
> This does neither of those. The Xen event channel GSI support *already*
> has hooks into the PC gsi_handler() code, for routing GSIs to PIRQs. So
> we can implement the logical OR of the external input (from PCI INTx,
> serial etc.) with the Xen event channel GSI by allowing that existing
> hook to modify the 'level' being asserted.
>
> Closes: https://gitlab.com/qemu-project/qemu/-/issues/2731
> Reported-by: Thomas Huth <thuth@redhat.com>
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> v2:
> • Fix deassertion of PIRQ GSI.
>
> hw/i386/kvm/xen_evtchn.c | 50 +++++++++++++++++++++++++++++++---------
> hw/i386/kvm/xen_evtchn.h | 2 +-
> hw/i386/x86-common.c | 32 ++++++++++++++++---------
> 3 files changed, 61 insertions(+), 23 deletions(-)
>
> diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
> index bd2a3cbee0..93f68790b0 100644
> --- a/hw/i386/kvm/xen_evtchn.c
> +++ b/hw/i386/kvm/xen_evtchn.c
> @@ -140,6 +140,8 @@ struct XenEvtchnState {
>
> uint64_t callback_param;
> bool evtchn_in_kernel;
> + bool setting_callback_gsi;
> + int extern_gsi_level;
> uint32_t callback_gsi;
>
> QEMUBH *gsi_bh;
> @@ -431,7 +433,16 @@ void xen_evtchn_set_callback_level(int level)
> }
>
> if (s->callback_gsi && s->callback_gsi < s->nr_callback_gsis) {
> - qemu_set_irq(s->callback_gsis[s->callback_gsi], level);
> + /*
> + * Ugly, but since we hold the BQL we can set this flag so that
> + * xen_evtchn_set_gsi() can tell the difference between this code
> + * setting the GSI, and an external device (PCI INTx) doing so.
> + */
> + s->setting_callback_gsi = true;
> + /* Do not deassert the line if an external device is asserting it. */
> + qemu_set_irq(s->callback_gsis[s->callback_gsi],
> + level || s->extern_gsi_level);
> + s->setting_callback_gsi = false;
> if (level) {
> /* Ensure the vCPU polls for deassertion */
> kvm_xen_set_callback_asserted();
> @@ -1596,7 +1607,7 @@ static int allocate_pirq(XenEvtchnState *s, int type, int gsi)
> return pirq;
> }
>
> -bool xen_evtchn_set_gsi(int gsi, int level)
> +bool xen_evtchn_set_gsi(int gsi, int *level)
> {
> XenEvtchnState *s = xen_evtchn_singleton;
> int pirq;
> @@ -1608,16 +1619,33 @@ bool xen_evtchn_set_gsi(int gsi, int level)
> }
>
> /*
> - * Check that that it *isn't* the event channel GSI, and thus
> - * that we are not recursing and it's safe to take s->port_lock.
> - *
> - * Locking aside, it's perfectly sane to bail out early for that
> - * special case, as it would make no sense for the event channel
> - * GSI to be routed back to event channels, when the delivery
> - * method is to raise the GSI... that recursion wouldn't *just*
> - * be a locking issue.
> + * For the callback_gsi we need to implement a logical OR of the event
> + * channel GSI and the external input (e.g. from PCI INTx), because
> + * QEMU itself doesn't support shared level interrupts via demux or
> + * resamplers.
> */
> if (gsi && gsi == s->callback_gsi) {
> + /* Remember the external state of the GSI pin (e.g. from PCI INTx) */
> + if (!s->setting_callback_gsi) {
> + s->extern_gsi_level = *level;
> +
> + /*
> + * Don't allow the external device to deassert the line if the
> + * eveht channel GSI should still be asserted.
> + */
> + if (!s->extern_gsi_level) {
> + struct vcpu_info *vi = kvm_xen_get_vcpu_info_hva(0);
> + if (vi && vi->evtchn_upcall_pending) {
> + *level = 1;
> + }
> + }
> + }
> +
> + /*
> + * The event channel GSI cannot be routed to PIRQ, as that would make
> + * no sense. It could also deadlock on s->port_lock, if we proceed.
> + * So bail out now.
> + */
> return false;
> }
>
> @@ -1628,7 +1656,7 @@ bool xen_evtchn_set_gsi(int gsi, int level)
> return false;
> }
>
> - if (level) {
> + if (*level) {
> int port = s->pirq[pirq].port;
>
> s->pirq_gsi_set |= (1U << gsi);
> diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h
> index b740acfc0d..0521ebc092 100644
> --- a/hw/i386/kvm/xen_evtchn.h
> +++ b/hw/i386/kvm/xen_evtchn.h
> @@ -23,7 +23,7 @@ void xen_evtchn_set_callback_level(int level);
>
> int xen_evtchn_set_port(uint16_t port);
>
> -bool xen_evtchn_set_gsi(int gsi, int level);
> +bool xen_evtchn_set_gsi(int gsi, int *level);
> void xen_evtchn_snoop_msi(PCIDevice *dev, bool is_msix, unsigned int vector,
> uint64_t addr, uint32_t data, bool is_masked);
> void xen_evtchn_remove_pci_device(PCIDevice *dev);
> diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c
> index a7d46c3105..13badc26a5 100644
> --- a/hw/i386/x86-common.c
> +++ b/hw/i386/x86-common.c
> @@ -450,8 +450,27 @@ static long get_file_size(FILE *f)
> void gsi_handler(void *opaque, int n, int level)
> {
> GSIState *s = opaque;
> + bool bypass_ioapic = false;
>
> trace_x86_gsi_interrupt(n, level);
> +
> +#ifdef CONFIG_XEN_EMU
> + /*
> + * Xen delivers the GSI to the Legacy PIC (not that Legacy PIC
> + * routing actually works properly under Xen). And then to
> + * *either* the PIRQ handling or the I/OAPIC depending on
> + * whether the former wants it.
> + *
> + * Additionally, this hook allows the Xen event channel GSI to
> + * work around QEMU's lack of support for shared level interrupts,
> + * by keeping track of the externally driven state of the pin and
> + * implementing a logical OR with the state of the evtchn GSI.
> + */
> + if (xen_mode == XEN_EMULATE) {
> + bypass_ioapic = xen_evtchn_set_gsi(n, &level);
> + }
> +#endif
> +
> switch (n) {
> case 0 ... ISA_NUM_IRQS - 1:
> if (s->i8259_irq[n]) {
> @@ -460,18 +479,9 @@ void gsi_handler(void *opaque, int n, int level)
> }
> /* fall through */
> case ISA_NUM_IRQS ... IOAPIC_NUM_PINS - 1:
> -#ifdef CONFIG_XEN_EMU
> - /*
> - * Xen delivers the GSI to the Legacy PIC (not that Legacy PIC
> - * routing actually works properly under Xen). And then to
> - * *either* the PIRQ handling or the I/OAPIC depending on
> - * whether the former wants it.
> - */
> - if (xen_mode == XEN_EMULATE && xen_evtchn_set_gsi(n, level)) {
> - break;
> + if (!bypass_ioapic) {
> + qemu_set_irq(s->ioapic_irq[n], level);
> }
> -#endif
> - qemu_set_irq(s->ioapic_irq[n], level);
> break;
> case IO_APIC_SECONDARY_IRQBASE
> ... IO_APIC_SECONDARY_IRQBASE + IOAPIC_NUM_PINS - 1:
next prev parent reply other threads:[~2025-01-07 16:45 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-07 16:32 [PATCH v2] hw/i386/pc: Fix level interrupt sharing for Xen event channel GSI David Woodhouse
2025-01-07 16:38 ` Paul Durrant
2025-01-07 16:58 ` David Woodhouse
2025-01-07 16:44 ` Philippe Mathieu-Daudé [this message]
-- strict thread matches above, loose matches on Subject: below --
2025-01-08 12:12 David Woodhouse
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f383b5c0-3bce-4c6c-bef7-ae34535083e8@linaro.org \
--to=philmd@linaro.org \
--cc=dwmw2@infradead.org \
--cc=eduardo@habkost.net \
--cc=marcel.apfelbaum@gmail.com \
--cc=mst@redhat.com \
--cc=paul@xen.org \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=richard.henderson@linaro.org \
--cc=shentey@gmail.com \
--cc=thuth@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).