Re: Qemu interrupt-remap fault support

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Michael S. Tsirkin" <mst@redhat.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Jason Wang <jasowang@redhat.com>, Peter Xu <peterx@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	yi.l.liu@intel.com, yi.y.sun@linux.intel.com,
	qemu-devel <qemu-devel@nongnu.org>
Subject: Re: Qemu interrupt-remap fault support
Date: Fri, 13 Jan 2023 06:31:24 -0500	[thread overview]
Message-ID: <20230113062525-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <aaef9961d210ac1873153bf3cf01d984708744fc.camel@infradead.org>

On Fri, Jan 13, 2023 at 09:08:38AM +0000, David Woodhouse wrote:
> I'm looking at interrupt remapping (because I need to hook into the
> translation somehow to add PIRQ support for Xen which translates guest
> MSIs directly to KVM_IRQ_ROUTING_XEN_EVTCHN).
> 
> Am I right in understanding that it doesn't report faults on interrupts
> which can't be translated? It attempts to translate interrupts at the
> time the table is modified (vtd_int_remap()) or when an APIC access
> actually triggers an MSI (vtd_mem_ir_write()) but in neither case does
> it actually raise a fault?
>
> The behaviour we want here is that we only raise a fault when the IRQ
> actually *happens*. But that's hard in our current model where it looks
> like we pretranslate *everything* in advance and just let it run.
> 
> Here's a proposal for a model which could make it work (using VFIO as
> the example since that's the more complex part but it works for
> emulated MSI sources too):
> 
> We consume the VFIO eventfd *both* in userspace and the kernel. (Since 
> https://lore.kernel.org/kvm/20201027143944.648769-1-dwmw2@infradead.org/
> we can just keep listening on the VFIO eventfd in userspace and the
> kernel will eat all the events so you never notice. On older kernels we
> have to manually stop listening in userspace.)
> 
> When a translation is valid and should be considered 'cached' in the
> IOMMU, that's when we actually hook it up to the irqfd. 
> 
> We can ditch the iec invalidate callbacks (vtd_iec_notify_all) because
> all an invalidation needs to do is KVM_IRQFD_FLAG_DEASSIGN for the
> corresponding GSI.
> 
> (
> You might consider abusing a spare field in the KVM routing table to
> hold a cookie like the IRTE# so that you know *which* entries to
> invalidate. I couldn't possibly comment.
> 
> 	/* 64-bit cookie for IOMMU to use for invalidation choices */
> 	#define ire_ir_cookie(ire) ((ire)->u.adapter.ind_offset)
> 
> 	/* Flags, to indicate a stale entry that needs retranslating
> */
> 	#define ire_user_flags(ire) ((ire)->u.adapter.summary_offset)
> 	#define IRE_USER_FLAG_STALE             1
> )
> 
> So when an interrupt happens and it's *untranslated*, that's when it
> gets raised to userspace to handle, e.g. in vfio_msi_interrupt(). That
> does the normal thing and attempts to deliver the guest MSI directly.
> We add a flag "bool delivering_now" to the X86IOMMUClass int_remap
> function, to allow it to distinguish between preemptive translations
> and actual delivery and to raise the fault in the latter case.
> 
> When the guest frobs a device's MSI table we can do the translation as
> we do at the moment, of course with the 'delivering_now' argument being
> false. And *if* the translation succeeds then we can install the IRQFD
> right away.
> 
> This model allows us to generate faults as the hardware would, and also
> improves the efficiency of invalidation by only invalidating what we
> need to. I haven't looked hard at how it works with an emulated AMD
> IOMMU, but I know that the Xen PIRQ support (which is where I came in)
> slots into it fairly trivially, using the PIRQ# as the 'cookie' for
> invalidation instead of the IRTE# that the Intel IOMMU uses.
> 

Don't see anything obviously wrong with this.
We really need Alex's input on VFIO though.

-- 
MST

next prev parent reply	other threads:[~2023-01-13 11:32 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-13  9:08 Qemu interrupt-remap fault support David Woodhouse
2023-01-13 11:31 ` Michael S. Tsirkin [this message]
2023-01-13 16:51 ` Alex Williamson
2023-01-13 17:08   ` David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230113062525-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=dwmw2@infradead.org \
    --cc=jasowang@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=yi.l.liu@intel.com \
    --cc=yi.y.sun@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.