All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Laurent Vivier <lvivier@redhat.com>
Cc: thuth@redhat.com, qemu-devel@nongnu.org, aik@ozlabs.ru,
	mdroth@linux.vnet.ibm.com, abologna@redhat.com,
	alex.williamson@redhat.com, qemu-ppc@nongnu.org,
	pbonzini@redhat.com, gwshan@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [PATCHv3 5/7] memory: Allow replay of IOMMU mapping notifications
Date: Thu, 1 Oct 2015 09:51:11 +1000	[thread overview]
Message-ID: <20150930235111.GF23574@voom> (raw)
In-Reply-To: <560BA488.9080200@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 5426 bytes --]

On Wed, Sep 30, 2015 at 10:59:52AM +0200, Laurent Vivier wrote:
> 
> 
> On 30/09/2015 04:13, David Gibson wrote:
> > When we have guest visible IOMMUs, we allow notifiers to be registered
> > which will be informed of all changes to IOMMU mappings.  This is used by
> > vfio to keep the host IOMMU mappings in sync with guest IOMMU mappings.
> > 
> > However, unlike with a memory region listener, an iommu notifier won't be
> > told about any mappings which already exist in the (guest) IOMMU at the
> > time it is registered.  This can cause problems if hotplugging a VFIO
> > device onto a guest bus which had existing guest IOMMU mappings, but didn't
> > previously have an VFIO devices (and hence no host IOMMU mappings).
> > 
> > This adds a memory_region_iommu_replay() function to handle this case.  It
> > replays any existing mappings in an IOMMU memory region to a specified
> > notifier.  Because the IOMMU memory region doesn't internally remember the
> > granularity of the guest IOMMU it has a small hack where the caller must
> > specify a granularity at which to replay mappings.
> > 
> > If there are finer mappings in the guest IOMMU these will be reported in
> > the iotlb structures passed to the notifier which it must handle (probably
> > causing it to flag an error).  This isn't new - the VFIO iommu notifier
> > must already handle notifications about guest IOMMU mappings too short
> > for it to represent in the host IOMMU.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  include/exec/memory.h | 13 +++++++++++++
> >  memory.c              | 20 ++++++++++++++++++++
> >  2 files changed, 33 insertions(+)
> > 
> > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > index 5baaf48..0f07159 100644
> > --- a/include/exec/memory.h
> > +++ b/include/exec/memory.h
> > @@ -583,6 +583,19 @@ void memory_region_notify_iommu(MemoryRegion *mr,
> >  void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n);
> >  
> >  /**
> > + * memory_region_iommu_replay: replay existing IOMMU translations to
> > + * a notifier
> > + *
> > + * @mr: the memory region to observe
> > + * @n: the notifier to which to replay iommu mappings
> > + * @granularity: Minimum page granularity to replay notifications for
> > + * @is_write: Whether to treat the replay as a translate "write"
> > + *     through the iommu
> > + */
> > +void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n,
> > +                                hwaddr granularity, bool is_write);
> > +
> > +/**
> >   * memory_region_unregister_iommu_notifier: unregister a notifier for
> >   * changes to IOMMU translation entries.
> >   *
> > diff --git a/memory.c b/memory.c
> > index ef87363..1b03d22 100644
> > --- a/memory.c
> > +++ b/memory.c
> > @@ -1403,6 +1403,26 @@ void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n)
> >      notifier_list_add(&mr->iommu_notify, n);
> >  }
> >  
> > +void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n,
> > +                                hwaddr granularity, bool is_write)
> > +{
> > +    hwaddr addr;
> > +    IOMMUTLBEntry iotlb;
> > +
> > +    for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
> > +        iotlb = mr->iommu_ops->translate(mr, addr, is_write);
> 
> in iotlb, there is an "address_mask", on spapr, it is copied from
> "page_shift", which is SPAPR_TCE_PAGE_SHIFT (12 -> 4k).
> 
> At a first glance, we would like to use it to scan the memory region,
> but as granularity could be a greater value, I think it is a better choice.

Using address_mask doesn't quite work.  *If* you start with an
existing, valid translation, then you can use address mask to skip to
the end of it - that might be a useful optimization in future,
particularly if the guest IOMMU has variable page sizes.

But if you start on an address that doesn't have a current valid
translation in the IOMMU, then address_mask gets set to ~0, so it
doesn't give you any information on where to try next for a valid
mapping.  That's what the granularity parameter is needed for.

> But the question is: why the iotlb page_size is not equal to the
> granularity given by VFIO_IOMMU_GET_INFO _IO ?

Well, the iotlb page size is the page size from the *guest* iommu,
whereas VFIO_IOMMU_GET_INFO tells you the page size of the *host*
iommu.  In practice, they'll probably be the same, at least on setups
likely to work well with VFIO, but in theory they could be different.

> > +        if (iotlb.perm != IOMMU_NONE) {
> > +            n->notify(n, &iotlb);
> > +        }
> > +
> > +        /* if (2^64 - MR size) < granularity, it's possible to get an
> > +         * infinite loop here.  This should catch such a wraparound */
> > +        if ((addr + granularity) < addr) {
> > +            break;
> > +        }
> > +    }
> > +}
> > +
> >  void memory_region_unregister_iommu_notifier(Notifier *n)
> >  {
> >      notifier_remove(n);
> > 
> 
> As my question is not a bout this particular patch but on another
> existing part, I can say:
> 
> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

  reply	other threads:[~2015-09-30 23:58 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-30  2:13 [Qemu-devel] [PATCHv3 0/7] VFIO extensions to allow VFIO devices on spapr-pci-host-bridge David Gibson
2015-09-30  2:13 ` [Qemu-devel] [PATCHv3 1/7] vfio: Remove unneeded union from VFIOContainer David Gibson
2015-09-30  8:19   ` Laurent Vivier
2015-09-30  2:13 ` [Qemu-devel] [PATCHv3 2/7] vfio: Generalize vfio_listener_region_add failure path David Gibson
2015-09-30  8:20   ` Laurent Vivier
2015-09-30  2:13 ` [Qemu-devel] [PATCHv3 3/7] vfio: Check guest IOVA ranges against host IOMMU capabilities David Gibson
2015-09-30  8:25   ` Laurent Vivier
2015-09-30  2:13 ` [Qemu-devel] [PATCHv3 4/7] vfio: Record host IOMMU's available IO page sizes David Gibson
2015-09-30  8:27   ` Laurent Vivier
2015-09-30  2:13 ` [Qemu-devel] [PATCHv3 5/7] memory: Allow replay of IOMMU mapping notifications David Gibson
2015-09-30  8:32   ` Laurent Vivier
2015-09-30  8:59   ` Laurent Vivier
2015-09-30 23:51     ` David Gibson [this message]
2015-10-05 13:21   ` Paolo Bonzini
2015-09-30  2:13 ` [Qemu-devel] [PATCHv3 6/7] vfio: Allow hotplug of containers onto existing guest IOMMU mappings David Gibson
2015-09-30  9:09   ` Laurent Vivier
2015-09-30 23:56     ` David Gibson
2015-09-30  2:13 ` [Qemu-devel] [PATCHv3 7/7] vfio: Expose a VFIO PCI device's group for EEH David Gibson
2015-09-30  9:12   ` Laurent Vivier
2015-10-02 18:12 ` [Qemu-devel] [PATCHv3 0/7] VFIO extensions to allow VFIO devices on spapr-pci-host-bridge Alex Williamson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150930235111.GF23574@voom \
    --to=david@gibson.dropbear.id.au \
    --cc=abologna@redhat.com \
    --cc=aik@ozlabs.ru \
    --cc=alex.williamson@redhat.com \
    --cc=gwshan@linux.vnet.ibm.com \
    --cc=lvivier@redhat.com \
    --cc=mdroth@linux.vnet.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.