Date: Tue, 18 Oct 2016 16:52:04 +1100
From: David Gibson
To: Alex Williamson
Cc: "Aviv B.D", Jan Kiszka, qemu-devel@nongnu.org, Peter Xu,
 "Michael S. Tsirkin"
Subject: Re: [Qemu-devel] [PATCH v4 RESEND 0/3] IOMMU: intel_iommu support map and unmap notifications
Message-ID: <20161018055204.GH25390@umbus.fritz.box>
In-Reply-To: <20161017224702.53301858@t450s.home>
References: <1476719064-9242-1-git-send-email-bd.aviv@gmail.com>
 <20161017100736.68a56fd9@t450s.home>
 <20161018040655.GG25390@umbus.fritz.box>
 <20161017224702.53301858@t450s.home>

On Mon, Oct 17, 2016 at 10:47:02PM -0600, Alex Williamson wrote:
> On Tue, 18 Oct 2016 15:06:55 +1100
> David Gibson wrote:
>
> > On Mon, Oct 17, 2016 at 10:07:36AM -0600, Alex Williamson wrote:
> > > On Mon, 17 Oct 2016 18:44:21 +0300
> > > "Aviv B.D" wrote:
> > >
> > > > From: "Aviv Ben-David"
> > > >
> > > > * Advertise Cache Mode capability in iommu cap register.
> > > >   This capability is controlled by the "cache-mode" property of the
> > > >   intel-iommu device.
> > > >   To enable this option call QEMU with
> > > >   "-device intel-iommu,cache-mode=true".
> > > >
> > > > * On page cache invalidation in intel vIOMMU, check if the domain
> > > >   belongs to a registered notifier, and notify accordingly.
> > > >
> > > > Currently this patch still doesn't enable VFIO device support with a
> > > > vIOMMU present. Current problems:
> > > > * vfio_iommu_map_notify is not aware of the memory range belonging to
> > > >   a specific VFIOGuestIOMMU.
> > >
> > > Could you elaborate on why this is an issue?
> > >
> > > > * memory_region_iommu_replay hangs QEMU on start up while it iterates
> > > >   over the 64-bit address space. Commenting out the call to this
> > > >   function enables a workable VFIO device while a vIOMMU is present.
> > >
> > > This has been discussed previously, it would be incorrect for vfio not
> > > to call the replay function.  The solution is to add an iommu driver
> > > callback to efficiently walk the mappings within a MemoryRegion.
> >
> > Right, replay is a bit of a hack.  There are a couple of other
> > approaches that might be adequate without a new callback:
> >    - Make the VFIOGuestIOMMU aware of the guest address range mapped
> >      by the vIOMMU.  Intel currently advertises that as a full 64-bit
> >      address space, but I bet that's not actually true in practice.
> >    - Have the IOMMU MR advertise a (minimum) page size for vIOMMU
> >      mappings.  That may let you step through the range with greater
> >      strides
>
> Hmm, VT-d supports at least a 39-bit address width and always supports
> a minimum 4k page size, so yes that does reduce us from 2^52 steps down
> to 2^27,

Right, which is probably doable, if not ideal

> but it's still absurd to walk through the raw address space.

Well... it depends on the internal structure of the IOMMU.  For Power,
it's traditionally just a 1-level page table, so we can't actually do
any better than stepping through each IOMMU page.

> It does however seem correct to create the MemoryRegion with a width
> that actually matches the IOMMU capability, but I don't think that's a
> sufficient fix by itself.  Thanks,

I suspect it would actually make it workable in the short term.

But I don't disagree that a "traverse" or "replay" callback of some
sort in the iommu_ops is a better idea long term.  Having a fallback
to the current replay implementation if the callback isn't supplied
seems pretty reasonable though.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
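
For illustration only, here is a small Python sketch of the trade-off
discussed in this thread. It is not QEMU code: ToyIOMMU, replay_steps,
replay, walk_mappings and the "traverse" attribute are all hypothetical
names made up for the example. It shows (a) how many page-sized probes a
naive replay needs for a given address width and minimum page size, and
(b) a replay that uses a driver-supplied traverse callback when one
exists and otherwise falls back to stepping through the range page by
page.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def replay_steps(addr_width_bits: int, page_shift: int) -> int:
    """Probes needed to step through the whole range one page at a time."""
    return 1 << (addr_width_bits - page_shift)


class ToyIOMMU:
    """Hypothetical stand-in for an IOMMU region; not a real QEMU type."""

    def __init__(
        self,
        addr_width_bits: int,
        page_shift: int,
        mappings: Dict[int, int],
        traverse: Optional[
            Callable[["ToyIOMMU"], Iterable[Tuple[int, int]]]
        ] = None,
    ) -> None:
        self.addr_width_bits = addr_width_bits
        self.page_shift = page_shift
        self.mappings = dict(mappings)  # iova -> translated address
        self.traverse = traverse        # optional (assumed) driver callback
        self.probes = 0                 # counts translate() calls

    def translate(self, iova: int) -> Optional[int]:
        self.probes += 1
        return self.mappings.get(iova)


def walk_mappings(iommu: ToyIOMMU) -> Iterable[Tuple[int, int]]:
    """Example traverse callback: visit only the entries that exist."""
    return sorted(iommu.mappings.items())


def replay(iommu: ToyIOMMU, notify: Callable[[int, int], None]) -> None:
    """Report every existing mapping to notify(iova, target)."""
    if iommu.traverse is not None:
        # Preferred path: the driver walks its own mapping structure.
        for iova, target in iommu.traverse(iommu):
            notify(iova, target)
        return
    # Fallback: probe each page in the advertised address range.
    page = 1 << iommu.page_shift
    for iova in range(0, 1 << iommu.addr_width_bits, page):
        target = iommu.translate(iova)
        if target is not None:
            notify(iova, target)
```

With a 4k minimum page size (page_shift of 12), replay_steps(64, 12) is
2^52 and replay_steps(39, 12) is 2^27, matching the figures quoted
above. The fallback path costs that many probes no matter how few
mappings exist, whereas the callback path scales with the number of
mappings, at least in this toy model.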