From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35860) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZfLoN-0005Hw-8E for qemu-devel@nongnu.org; Fri, 25 Sep 2015 01:44:48 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZfLoL-0003qq-Nk for qemu-devel@nongnu.org; Fri, 25 Sep 2015 01:44:47 -0400 Date: Fri, 25 Sep 2015 15:39:08 +1000 From: David Gibson Message-ID: <20150925053908.GE11620@voom.redhat.com> References: <1443069231-14856-1-git-send-email-david@gibson.dropbear.id.au> <1443069231-14856-6-git-send-email-david@gibson.dropbear.id.au> <5604201B.1030705@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="10jrOL3x2xqLmOsH" Content-Disposition: inline In-Reply-To: <5604201B.1030705@redhat.com> Subject: Re: [Qemu-devel] [PATCH 5/7] memory: Allow replay of IOMMU mapping notifications List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Laurent Vivier Cc: thuth@redhat.com, qemu-devel@nongnu.org, abologna@redhat.com, alex.williamson@redhat.com, qemu-ppc@nongnu.org, pbonzini@redhat.com --10jrOL3x2xqLmOsH Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Sep 24, 2015 at 06:08:59PM +0200, Laurent Vivier wrote: >=20 >=20 > On 24/09/2015 06:33, David Gibson wrote: > > When we have guest visible IOMMUs, we allow notifiers to be registered > > which will be informed of all changes to IOMMU mappings. This is used = by > > vfio to keep the host IOMMU mappings in sync with guest IOMMU mappings. > >=20 > > However, unlike with a memory region listener, an iommu notifier won't = be > > told about any mappings which already exist in the (guest) IOMMU at the > > time it is registered. This can cause problems if hotplugging a VFIO > > device onto a guest bus which had existing guest IOMMU mappings, but di= dn't > > previously have an VFIO devices (and hence no host IOMMU mappings). > >=20 > > This adds a memory_region_register_iommu_notifier_replay() function to > > handle this case. As well as registering the new notifier it replays > > existing mappings. Because the IOMMU memory region doesn't internally > > remember the granularity of the guest IOMMU it has a small hack where t= he > > caller must specify a granularity at which to replay mappings. > >=20 > > If there are finer mappings in the guest IOMMU these will be reported in > > the iotlb structures passed to the notifier which it must handle (proba= bly > > causing it to flag an error). This isn't new - the VFIO iommu notifier > > must already handle notifications about guest IOMMU mappings too short > > for it to represent in the host IOMMU. > >=20 > > Signed-off-by: David Gibson > > --- > > include/exec/memory.h | 17 +++++++++++++++++ > > memory.c | 18 ++++++++++++++++++ > > 2 files changed, 35 insertions(+) > >=20 > > diff --git a/include/exec/memory.h b/include/exec/memory.h > > index 5baaf48..304f985 100644 > > --- a/include/exec/memory.h > > +++ b/include/exec/memory.h > > @@ -583,6 +583,23 @@ void memory_region_notify_iommu(MemoryRegion > > *mr, [snip] > > +void memory_region_register_iommu_notifier_replay(MemoryRegion *mr, No= tifier *n, > > + hwaddr granularity, > > + bool is_write) > > +{ > > + hwaddr addr; > > + IOMMUTLBEntry iotlb; > > + > > + memory_region_register_iommu_notifier(mr, n); > > + > > + for (addr =3D 0; addr < memory_region_size(mr); addr +=3D granular= ity) { > > + > > + iotlb =3D mr->iommu_ops->translate(mr, addr, is_write); > > + if (iotlb.perm !=3D IOMMU_NONE) { > > + n->notify(n, &iotlb); > > + } > > + } > > +} >=20 > If mr->size > (UINT64_MAX + 1 - granularity), you run into an infinite > loop because hwaddr is a 64bit value and the stop condition is beyond > its max value. You can avoid this by using the power of 2 of the Ugh, yes, and I think my old version with more int128s was still wrong, too. > granularity, instead of the granularity: >=20 > int shift =3D ctz64(granularity); > hwaddr size =3D memory_region_size(mr) >> shift; > for (addr =3D 0; addr < size; addr++) > { > iotlb =3D mr->iommu_ops->translate(mr, addr << shift, is_write); > ... >=20 > so in patch 6, you should pass the power of 2 instead of the value of > the granularity. >=20 > Of course, it works if granularity is at least 2.... Hrm, rather clunky. I've instead gone for putting this at the end of the loop body: /* if (2^64 - MR size) < granularity, it's possible to get an * infinite loop here. This should catch such a wraparound */ if ((addr + granularity) < addr) { break; } Of course, unless granularity is huge, stepping through a whole 2^64 address space might be indistinguishable from an infinite loop in practice.. --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --10jrOL3x2xqLmOsH Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJWBN38AAoJEGw4ysog2bOSgU8QANrB81dlBIN9GwUeMo0ACJOC qckdlqFMwFalcFUjy9wBPEmkYZDZxkLgeLGXFvcOVPsbqn1kj9UeHAGnUsfMJ+zx as7wqmBM+LzoPGeTF9RJ2er5AHya+zzKaKaPi01ApsNsIkcO5kXAJ7f7L+WFpoO4 MhmefmcilJdHINPbRzeXWFpyp6l4SKHwlqfIgn3EbmWJdQ4bhaZ6mlOPNww+yTQv yab/xMfCSlQnYtfr0NI6oCqehDpZUcAnFF0XA1YGVOjAZEQlbQ01BVvVQgIhd5xc wrVHO5naystWRZSiYFhEV29MAHukVdCLM/tsu9oay2Wla32P+S/JW68Yn2ClJM3x d3hDzMOpQh9VuPGRt03CuM8urc32sNpnidEzyYSo1ubxSx+opaQJU4iPY0RD8ap6 x4KIPMoOaGS3iZW2BvUHMb+U9XUnSAlaY7x8OkZlSAuKiAyam6fhLY+QQojMxDwx qnw0XAge+8zve0E4qDuR81BdzHT4nkIDz/nCfmRpQr0DXpY0dv16FYOXqvINM/81 Zn+VbhsD29bDshC0ed3D2n1wus6kiV8aMPLLJumXVXzhY2I3IffZ8uY+ajBWwT53 zNOOwsI1GhMl+oG82Z03qqxyW67y9nwbamcCyKHOpDO/UpchISLtL8Ra7YpU3mu4 rhVwb72DuqtMo54oYNCp =tAHG -----END PGP SIGNATURE----- --10jrOL3x2xqLmOsH--