qemu-devel.nongnu.org archive mirror
From: Peter Xu <peterx@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: mst@redhat.com, qemu-devel@nongnu.org, eric.auger@redhat.com,
	viktor@daynix.com
Subject: Re: [PATCH 3/3] intel-iommu: build iova tree during IOMMU translation
Date: Wed, 30 Nov 2022 10:17:50 -0500
Message-ID: <Y4d0HokcV/tg0wlk@x1n>
In-Reply-To: <CACGkMEuC41jFin3XAVSs3ra0tmxZD7L5NeDLn5OD6ziq7z1huA@mail.gmail.com>

On Wed, Nov 30, 2022 at 02:33:51PM +0800, Jason Wang wrote:
> On Tue, Nov 29, 2022 at 11:57 PM Peter Xu <peterx@redhat.com> wrote:
> >
> > On Tue, Nov 29, 2022 at 04:10:37PM +0800, Jason Wang wrote:
> > > The IOVA tree is only built during page walk, which breaks devices
> > > that use only the UNMAP notifier. One example is vhost-net: it uses
> > > the UNMAP notifier when the vIOMMU doesn't support the DEVIOTLB_UNMAP
> > > notifier (e.g. when dt mode is not enabled). The interesting part is
> > > that it doesn't use MAP, since it can query the IOMMU translation by
> > > itself upon an IOTLB miss.
> > >
> > > This doesn't work since QEMU doesn't build the IOVA tree in IOMMU
> > > translation, which means the UNMAP notifier won't be triggered during
> > > the page walk since QEMU thinks the range was never mapped. This can
> > > be noticed when vIOMMU is used with vhost_net but dt is disabled.
> > >
> > > Fix this by building the IOVA tree during IOMMU translation; this
> > > makes sure the UNMAP notifier event can be identified during page
> > > walk. We also need to walk the page table not only for the UNMAP
> > > notifier but also for the MAP notifier during PSI.
> > >
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > >  hw/i386/intel_iommu.c | 43 ++++++++++++++++++-------------------------
> > >  1 file changed, 18 insertions(+), 25 deletions(-)
> > >
> > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > index d025ef2873..edeb62f4b2 100644
> > > --- a/hw/i386/intel_iommu.c
> > > +++ b/hw/i386/intel_iommu.c
> > > @@ -1834,6 +1834,8 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
> > >      uint8_t access_flags;
> > >      bool rid2pasid = (pasid == PCI_NO_PASID) && s->root_scalable;
> > >      VTDIOTLBEntry *iotlb_entry;
> > > +    const DMAMap *mapped;
> > > +    DMAMap target;
> > >
> > >      /*
> > >       * We have standalone memory region for interrupt addresses, we
> > > @@ -1954,6 +1956,21 @@ out:
> > >      entry->translated_addr = vtd_get_slpte_addr(slpte, s->aw_bits) & page_mask;
> > >      entry->addr_mask = ~page_mask;
> > >      entry->perm = access_flags;
> > > +
> > > +    target.iova = entry->iova;
> > > +    target.size = entry->addr_mask;
> > > +    target.translated_addr = entry->translated_addr;
> > > +    target.perm = entry->perm;
> > > +
> > > +    mapped = iova_tree_find(vtd_as->iova_tree, &target);
> > > +    if (!mapped) {
> > > +        /* To make UNMAP notifier work, we need build iova tree here
> > > +         * in order to have the UNMAP iommu notifier to be triggered
> > > +         * during the page walk.
> > > +         */
> > > +        iova_tree_insert(vtd_as->iova_tree, &target);
> > > +    }
> > > +
> > >      return true;
> > >
> > >  error:
> > > @@ -2161,31 +2178,7 @@ static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s,
> > >          ret = vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
> > >                                         vtd_as->devfn, &ce);
> > >          if (!ret && domain_id == vtd_get_domain_id(s, &ce, vtd_as->pasid)) {
> > > -            if (vtd_as_has_map_notifier(vtd_as)) {
> > > -                /*
> > > -                 * As long as we have MAP notifications registered in
> > > -                 * any of our IOMMU notifiers, we need to sync the
> > > -                 * shadow page table.
> > > -                 */
> > > -                vtd_sync_shadow_page_table_range(vtd_as, &ce, addr, size);
> > > -            } else {
> > > -                /*
> > > -                 * For UNMAP-only notifiers, we don't need to walk the
> > > -                 * page tables.  We just deliver the PSI down to
> > > -                 * invalidate caches.
> > > -                 */
> > > -                IOMMUTLBEvent event = {
> > > -                    .type = IOMMU_NOTIFIER_UNMAP,
> > > -                    .entry = {
> > > -                        .target_as = &address_space_memory,
> > > -                        .iova = addr,
> > > -                        .translated_addr = 0,
> > > -                        .addr_mask = size - 1,
> > > -                        .perm = IOMMU_NONE,
> > > -                    },
> > > -                };
> > > -                memory_region_notify_iommu(&vtd_as->iommu, 0, event);
> >
> > Isn't this path the one responsible for passing the UNMAP events
> > through from guest to vhost when there's no MAP notifier requested?
> 
> Yes, but it doesn't remove anything from the iova tree. More below.
> 
> >
> > At least that's what I expected when introducing the iova tree, because for
> > an unmap-only device hierarchy I thought we didn't need the tree at all.
> 
> Then the problem is the UNMAP notifier won't be triggered at all during
> DSI page walk in vtd_page_walk_one() because there's no DMAMap stored
> in the iova tree:
> 
>         if (!mapped) {
>             /* Skip since we didn't map this range at all */
>             trace_vtd_page_walk_one_skip_unmap(entry->iova, entry->addr_mask);
>             return 0;
>         }
> 
> So I chose to build the iova tree in translate so that we won't hit
> the above condition.

That's also why it's weird, because IIUC we should never walk a page table
at all if there's no MAP notifier registered.
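
To make the skip quoted above concrete, here is a minimal, self-contained
sketch. It is a toy model, not QEMU code: ToyMap, ToyTree and the toy_*
functions are hypothetical stand-ins for DMAMap, the IOVA tree and the UNMAP
branch of vtd_page_walk_one(), and it assumes that branch drops the range
from the tree once the event is delivered.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TREE_MAX 8

/* Hypothetical stand-in for DMAMap; size is inclusive (length - 1). */
typedef struct {
    uint64_t iova;
    uint64_t size;
} ToyMap;

/* Hypothetical stand-in for the IOVA tree: a small unsorted array. */
typedef struct {
    ToyMap maps[TOY_TREE_MAX];
    size_t count;
} ToyTree;

/* Return the index of the first recorded range overlapping the query, or -1. */
static int toy_tree_find(const ToyTree *t, uint64_t iova, uint64_t size)
{
    for (size_t i = 0; i < t->count; i++) {
        const ToyMap *m = &t->maps[i];
        if (m->iova <= iova + size && iova <= m->iova + m->size) {
            return (int)i;
        }
    }
    return -1;
}

/* Record a range; returns false if the tree is full or the range overlaps. */
static bool toy_tree_insert(ToyTree *t, uint64_t iova, uint64_t size)
{
    if (t->count == TOY_TREE_MAX || toy_tree_find(t, iova, size) >= 0) {
        return false;
    }
    t->maps[t->count++] = (ToyMap){ .iova = iova, .size = size };
    return true;
}

/*
 * Model of the UNMAP branch of the walk hook: returns true if an UNMAP
 * event would be delivered, false if the range is skipped because it was
 * never recorded as mapped.
 */
static bool toy_walk_one_unmap(ToyTree *t, uint64_t iova, uint64_t size)
{
    int idx = toy_tree_find(t, iova, size);
    if (idx < 0) {
        return false; /* skipped: nothing recorded for this range */
    }
    /* Drop the record by moving the last entry into its slot. */
    t->maps[idx] = t->maps[--t->count];
    return true;
}
```

With an empty ToyTree, toy_walk_one_unmap() always returns false, which is
the situation of an UNMAP-only notifier whose ranges were never recorded.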

Looking at the walk callers, I found that there is indeed one path missing
the check, which can cause it to walk the page tables even when no MAP
notifier is registered. I also noticed commit f7701e2c7983b6, and I'm
wondering whether what we really want is something like this:

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index a08ee85edf..c46f3db992 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1536,7 +1536,7 @@ static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
     VTDContextEntry ce;
     IOMMUNotifier *n;

-    if (!(vtd_as->iommu.iommu_notify_flags & IOMMU_NOTIFIER_IOTLB_EVENTS)) {
+    if (!vtd_as_has_map_notifier(vtd_as)) {
         return 0;
     }

So I'm not sure whether this patch is the real fix; so far I feel
like it's patch 2 that does the real fix.  Then we can add the above
one-liner so that we stop any walks when there are no MAP notifiers.
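
A small sketch of how the two guards differ for an UNMAP-only notifier. The
TOY_* flags and toy_should_sync_*() helpers are hypothetical names; the bit
layout is modeled on IOMMUNotifierFlag and should be treated as an
assumption, as should the reading that vtd_as_has_map_notifier() reduces to
a test of the MAP bit.

```c
#include <stdbool.h>

/* Assumed bit layout, modeled on IOMMUNotifierFlag; not taken from QEMU headers. */
enum {
    TOY_NOTIFIER_NONE           = 0,
    TOY_NOTIFIER_UNMAP          = 1 << 0,
    TOY_NOTIFIER_MAP            = 1 << 1,
    TOY_NOTIFIER_DEVIOTLB_UNMAP = 1 << 2,
    TOY_NOTIFIER_IOTLB_EVENTS   = TOY_NOTIFIER_UNMAP | TOY_NOTIFIER_MAP,
};

/* Guard before the one-liner: sync if any IOTLB notifier (MAP or UNMAP) exists. */
static bool toy_should_sync_old(int flags)
{
    return (flags & TOY_NOTIFIER_IOTLB_EVENTS) != 0;
}

/* Guard after the one-liner: sync only if a MAP notifier exists. */
static bool toy_should_sync_new(int flags)
{
    return (flags & TOY_NOTIFIER_MAP) != 0;
}
```

For flags equal to TOY_NOTIFIER_UNMAP alone, the old guard lets the sync
(and thus the page walk) proceed while the new one returns early; for any
flag set containing TOY_NOTIFIER_MAP both guards agree.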

Thanks,

-- 
Peter Xu



Thread overview: 35+ messages
2022-11-29  8:10 [PATCH 0/3] Fix UNMAP notifier for intel-iommu Jason Wang
2022-11-29  8:10 ` [PATCH 1/3] intel-iommu: fail MAP notifier without caching mode Jason Wang
2022-11-29 15:35   ` Peter Xu
2022-11-30  6:23     ` Jason Wang
2022-11-30 15:06       ` Peter Xu
2022-12-01  8:46         ` Jason Wang
2022-12-06 13:23   ` Eric Auger
2022-11-29  8:10 ` [PATCH 2/3] intel-iommu: fail DEVIOTLB_UNMAP without dt mode Jason Wang
2022-11-29 15:38   ` Peter Xu
2022-12-01 16:03   ` Peter Xu
2023-02-23  3:19     ` Jason Wang
2022-12-06 13:33   ` Eric Auger
2023-02-03  9:08   ` Laurent Vivier
2023-02-07 16:17     ` Laurent Vivier
2023-02-07 16:35       ` Peter Xu
2022-11-29  8:10 ` [PATCH 3/3] intel-iommu: build iova tree during IOMMU translation Jason Wang
2022-11-29 15:57   ` Peter Xu
2022-11-30  6:33     ` Jason Wang
2022-11-30 15:17       ` Peter Xu [this message]
2022-12-01  8:35         ` Jason Wang
2022-12-01 14:58           ` Peter Xu
2022-12-05  4:12             ` Jason Wang
2022-12-05 23:18               ` Peter Xu
2022-12-06  3:18                 ` Jason Wang
2022-12-06 13:58                   ` Peter Xu
2022-12-23  8:02                     ` Jason Wang
2022-12-23 16:22                       ` Peter Xu
2022-11-30 16:37 ` [PATCH 0/3] Fix UNMAP notifier for intel-iommu Michael S. Tsirkin
2022-12-01  8:29   ` Jason Wang
2022-12-20 13:53 ` Michael S. Tsirkin
2022-12-21  3:17   ` Jason Wang
2023-01-15 23:30 ` Viktor Prutyanov
2023-01-16  7:06   ` Jason Wang
2023-01-27 13:17     ` Michael S. Tsirkin
2023-01-29  5:43       ` Jason Wang
