qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Jason Wang <jasowang@redhat.com>, Peter Xu <peterx@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	yi.l.liu@intel.com, yi.y.sun@linux.intel.com,
	qemu-devel <qemu-devel@nongnu.org>
Subject: Re: Qemu interrupt-remap fault support
Date: Fri, 13 Jan 2023 06:31:24 -0500	[thread overview]
Message-ID: <20230113062525-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <aaef9961d210ac1873153bf3cf01d984708744fc.camel@infradead.org>

On Fri, Jan 13, 2023 at 09:08:38AM +0000, David Woodhouse wrote:
> I'm looking at interrupt remapping (because I need to hook into the
> translation somehow to add PIRQ support for Xen which translates guest
> MSIs directly to KVM_IRQ_ROUTING_XEN_EVTCHN).
> 
> Am I right in understanding that it doesn't report faults on interrupts
> which can't be translated? It attempts to translate interrupts at the
> time the table is modified (vtd_int_remap()) or when an APIC access
> actually triggers an MSI (vtd_mem_ir_write()) but in neither case does
> it actually raise a fault?
>
> The behaviour we want here is that we only raise a fault when the IRQ
> actually *happens*. But that's hard in our current model where it looks
> like we pretranslate *everything* in advance and just let it run.
> 
> Here's a proposal for a model which could make it work (using VFIO as
> the example since that's the more complex part but it works for
> emulated MSI sources too):
> 
> We consume the VFIO eventfd *both* in userspace and the kernel. (Since 
> https://lore.kernel.org/kvm/20201027143944.648769-1-dwmw2@infradead.org/
> we can just keep listening on the VFIO eventfd in userspace and the
> kernel will eat all the events so you never notice. On older kernels we
> have to manually stop listening in userspace.)
> 
> When a translation is valid and should be considered 'cached' in the
> IOMMU, that's when we actually hook it up to the irqfd. 
> 
> We can ditch the iec invalidate callbacks (vtd_iec_notify_all) because
> all an invalidation needs to do is KVM_IRQFD_FLAG_DEASSIGN for the
> corresponding GSI.
> 
> (
> You might consider abusing a spare field in the KVM routing table to
> hold a cookie like the IRTE# so that you know *which* entries to
> invalidate. I couldn't possibly comment.
> 
> 	/* 64-bit cookie for IOMMU to use for invalidation choices */
> 	#define ire_ir_cookie(ire) ((ire)->u.adapter.ind_offset)
> 
> 	/* Flags, to indicate a stale entry that needs retranslating
> */
> 	#define ire_user_flags(ire) ((ire)->u.adapter.summary_offset)
> 	#define IRE_USER_FLAG_STALE             1
> )
> 
> So when an interrupt happens and it's *untranslated*, that's when it
> gets raised to userspace to handle, e.g. in vfio_msi_interrupt(). That
> does the normal thing and attempts to deliver the guest MSI directly.
> We add a flag "bool delivering_now" to the X86IOMMUClass int_remap
> function, to allow it to distinguish between preemptive translations
> and actual delivery and to raise the fault in the latter case.
> 
> When the guest frobs a device's MSI table we can do the translation as
> we do at the moment, of course with the 'delivering_now' argument being
> false. And *if* the translation succeeds then we can install the IRQFD
> right away.
> 
> This model allows us to generate faults as the hardware would, and also
> improves the efficiency of invalidation by only invalidating what we
> need to. I haven't looked hard at how it works with an emulated AMD
> IOMMU, but I know that the Xen PIRQ support (which is where I came in)
> slots into it fairly trivially, using the PIRQ# as the 'cookie' for
> invalidation instead of the IRTE# that the Intel IOMMU uses.
> 

Don't see anything obviously wrong with this.
We really need Alex's input on VFIO though.

-- 
MST



  reply	other threads:[~2023-01-13 11:32 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-13  9:08 Qemu interrupt-remap fault support David Woodhouse
2023-01-13 11:31 ` Michael S. Tsirkin [this message]
2023-01-13 16:51 ` Alex Williamson
2023-01-13 17:08   ` David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230113062525-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=dwmw2@infradead.org \
    --cc=jasowang@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=yi.l.liu@intel.com \
    --cc=yi.y.sun@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).