From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:51824)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ZFbSF-0006fx-T7 for qemu-devel@nongnu.org;
	Thu, 16 Jul 2015 01:11:33 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1ZFbSC-0008OX-DC for qemu-devel@nongnu.org;
	Thu, 16 Jul 2015 01:11:31 -0400
Date: Thu, 16 Jul 2015 15:11:22 +1000
From: David Gibson
Message-ID: <20150716051122.GA25179@voom.redhat.com>
References: <1436876514-2946-1-git-send-email-aik@ozlabs.ru>
	<1436876514-2946-5-git-send-email-aik@ozlabs.ru>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="6TrnltStXW4iwmi0"
Content-Disposition: inline
In-Reply-To: <1436876514-2946-5-git-send-email-aik@ozlabs.ru>
Subject: Re: [Qemu-devel] [RFC PATCH qemu v3 4/4] vfio: spapr: Add SPAPR
	IOMMU v2 support (DMA memory preregistering)
To: Alexey Kardashevskiy
Cc: Peter Crosthwaite, qemu-devel@nongnu.org, Michael Roth,
	Alex Williamson, qemu-ppc@nongnu.org, Paolo Bonzini

--6TrnltStXW4iwmi0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Jul 14, 2015 at 10:21:54PM +1000, Alexey Kardashevskiy wrote:
> This makes use of the new "memory registering" feature. The idea is
> to provide the userspace ability to notify the host kernel about pages
> which are going to be used for DMA. Having this information, the host
> kernel can pin them all once per user process, do locked pages
> accounting (once) and not spent time on doing that in real time with
> possible failures which cannot be handled nicely in some cases.
>
> This adds a guest RAM memory listener which notifies a VFIO container
> about memory which needs to be pinned/unpinned. VFIO MMIO regions
> (i.e. "skip dump" regions) are skipped.
>
> The feature is only enabled for SPAPR IOMMU v2. The host kernel changes
> are required. Since v2 does not need/support VFIO_IOMMU_ENABLE, this does
> not call it when v2 is detected and enabled.
>
> This does not change the guest visible interface.
>
> Signed-off-by: Alexey Kardashevskiy

I've looked at this in more depth now, and attempting to unify the
pre-reg and mapping listeners like this can't work: they need to
listen on different address spaces. The mapping listener needs to
listen on the PCI address space, whereas the pre-reg listener needs to
listen on address_space_memory. For x86, for now, those end up being
the same thing, but on Power they're not.

We do need to be clear about which differences are due to the presence
of a guest IOMMU and which are due to the arch or the underlying IOMMU
type. For now Power has a guest IOMMU and x86 doesn't, but that could
well change: we could implement the guest side IOMMU for x86 in future
(or x86 could invent a paravirt IOMMU interface). On the other side,
BenH's experimental powernv machine type could introduce Power
machines without a guest side IOMMU (or at least with an optional
one).

The quick and dirty approach here is:
   1. Leave the main listener as is
   2. Add a new pre-reg listener to the spapr iommu specific code,
      which listens on address_space_memory, *not* the PCI space

The more generally correct approach, which allows for more complex
IOMMU arrangements and the possibility of new IOMMU types with pre-reg
is:
   1. Have the core implement both a mapping listener and a pre-reg
      listener (optionally enabled by a per-iommu-type flag).
      Basically the first one sees what *is* mapped, the second sees
      what *could* be mapped.

   2. As now, the mapping listener listens on the PCI address space.
      If RAM blocks are added, it immediately maps them into the host
      IOMMU; if guest IOMMU blocks appear, it registers a notifier
      which mirrors guest IOMMU mappings to the host IOMMU (this is
      what we do now).

   3. The pre-reg listener also listens on the PCI address space. RAM
      blocks added are pre-registered immediately. But, if guest
      IOMMU blocks are added, instead of registering a guest-iommu
      notifier, we register another listener on the *target* AS of the
      guest IOMMU, with the same callbacks as this one. In practice
      that target AS will almost always resolve to
      address_space_memory, but this can at least in theory handle
      crazy guest setups with multiple layers of IOMMU.

   4. We have to ensure that the pre-reg callbacks always happen
      before the mapping calls. For a system with an IOMMU backend
      which requires pre-registration, but doesn't have a guest IOMMU,
      we need to pre-reg, then host-iommu-map RAM blocks that appear
      in PCI address space.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

--6TrnltStXW4iwmi0--
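
To make the ordering requirement in point 4 of the general approach
concrete, here is a rough, self-contained toy model in C. It is not QEMU
code and does not use the QEMU memory API: every name in it (HostIommu,
prereg_region_add, mapping_add, pci_ram_block_added, the needs_prereg
flag) is made up for illustration. It only shows the invariant that a
backend requiring pre-registration should refuse a mapping whose target
RAM has not been pre-registered, and that a RAM block appearing directly
in the PCI address space is pre-registered first and mapped second.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: all types and functions below are hypothetical. */
#define MAX_RANGES 16

typedef struct Range {
    uint64_t start;
    uint64_t size;
} Range;

typedef struct Mapping {
    uint64_t iova;   /* address as seen on the PCI side */
    Range target;    /* RAM the mapping points at */
} Mapping;

typedef struct HostIommu {
    bool needs_prereg;            /* stands in for a per-IOMMU-type flag */
    Range prereg[MAX_RANGES];     /* what *could* be mapped */
    size_t n_prereg;
    Mapping mapped[MAX_RANGES];   /* what *is* mapped */
    size_t n_mapped;
} HostIommu;

/* True if r lies entirely inside one pre-registered range. */
static bool is_preregistered(const HostIommu *h, Range r)
{
    for (size_t i = 0; i < h->n_prereg; i++) {
        const Range *p = &h->prereg[i];
        if (r.start >= p->start &&
            r.start + r.size <= p->start + p->size) {
            return true;
        }
    }
    return false;
}

/* Pre-reg callback: record RAM that may later become a DMA target. */
static void prereg_region_add(HostIommu *h, Range ram)
{
    if (!h->needs_prereg) {
        return;                   /* backend has no pre-reg step */
    }
    assert(h->n_prereg < MAX_RANGES);
    h->prereg[h->n_prereg++] = ram;
}

/* Mapping callback: returns 0 on success, -1 if pre-reg was skipped. */
static int mapping_add(HostIommu *h, uint64_t iova, Range target)
{
    if (h->needs_prereg && !is_preregistered(h, target)) {
        return -1;
    }
    assert(h->n_mapped < MAX_RANGES);
    h->mapped[h->n_mapped].iova = iova;
    h->mapped[h->n_mapped].target = target;
    h->n_mapped++;
    return 0;
}

/* No guest IOMMU: RAM shows up in PCI space, so pre-reg, then map. */
static int pci_ram_block_added(HostIommu *h, uint64_t iova, Range ram)
{
    prereg_region_add(h, ram);
    return mapping_add(h, iova, ram);
}
```

With needs_prereg set, calling mapping_add() for RAM that has not gone
through prereg_region_add() returns -1, while the same call after
pre-registration succeeds. pci_ram_block_added() covers the "backend
needs pre-reg but there is no guest IOMMU" case by doing both steps in
the required order.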