From: Peter Xu
Date: Tue, 18 Apr 2017 12:21:09 +0800
Subject: Re: [Qemu-devel] [PATCH 7/7] intel_iommu: support passthrough (PT)
To: Jason Wang
Cc: qemu-devel@nongnu.org, David Gibson, Lan Tianyu, Marcel Apfelbaum, "Michael S.Tsirkin"
Message-ID: <20170418042109.GA22226@pxdev.xzpeter.org>
In-Reply-To: <18bd8627-bdca-b9e8-8d4e-c4aa0689d4aa@redhat.com>

On Tue, Apr 18, 2017 at 12:00:13PM +0800, Jason Wang wrote:
>
>
> On 2017-04-18 11:50, Peter Xu wrote:
> >On Tue, Apr 18, 2017 at 11:23:35AM +0800, Jason Wang wrote:
> >>On 2017-04-17 18:58, Peter Xu wrote:
> >[...]
> >
> >>>+static void vtd_switch_address_space(VTDAddressSpace *as)
> >>>+{
> >>>+    bool use_iommu;
> >>>+
> >>>+    assert(as);
> >>>+
> >>>+    use_iommu = as->iommu_state->dmar_enabled;
> >>>+    if (use_iommu) {
> >>>+        /* Further checks per-device configuration */
> >>>+        use_iommu &= !vtd_dev_pt_enabled(as);
> >>>+    }
> >>Looks like you can use as->iommu_state->dmar_enabled &&
> >>!vtd_dev_pt_enabled(as)
> >vtd_dev_pt_enabled() needs to read the guest memory (starting from
> >reading the root entry), which is slightly slow. I was trying to avoid
> >unnecessary reads.
> >
> >[...]
>
> I think the compiler won't go to vtd_dev_pt_enabled() if dmar_enabled is
> false.

You are right. I'll switch.

>
> >
> >>>@@ -991,6 +1058,18 @@ static void vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
> >>>         cc_entry->context_cache_gen = s->context_cache_gen;
> >>>     }
> >>>+    /*
> >>>+     * We don't need to translate for pass-through context entries.
> >>>+     * Also, let's ignore IOTLB caching as well for PT devices.
> >>>+     */
> >>>+    if (vtd_ce_get_type(&ce) == VTD_CONTEXT_TT_PASS_THROUGH) {
> >>>+        entry->translated_addr = entry->iova;
> >>>+        entry->addr_mask = VTD_PAGE_SIZE - 1;
> >>>+        entry->perm = IOMMU_RW;
> >>>+        trace_vtd_translate_pt(source_id, entry->iova);
> >>>+        return;
> >>>+    }
> >>Several questions here:
> >>
> >>1) Is this just for vhost?
> >No. When caching mode is not enabled, all passthroughed devices should
> >be using this path.
>
> Ok, then it looks better to switch the address space if we've found it was
> PT?

Do you mean to switch in that if() above? Then when invalidating the
context entry, we switch back if needed?

>
> >
> >>2) Since this is done after IOTLB querying, do we need to flush the IOTLB
> >>during address switching?
> >IMHO if the guest switches address space for a device, it is required to
> >send an IOTLB flush as well for that device/domain.
> >
> >[...]
>
> Ok.
>
> >
> >>> static void vtd_switch_address_space_all(IntelIOMMUState *s)
> >>> {
> >>>     GHashTableIter iter;
> >>>@@ -2849,6 +2914,10 @@ static void vtd_init(IntelIOMMUState *s)
> >>>         s->ecap |= VTD_ECAP_DT;
> >>>     }
> >>>+    if (x86_iommu->pt_supported) {
> >>>+        s->ecap |= VTD_ECAP_PT;
> >>>+    }
> >>Since we support migration now, we need compat for this for pre-2.10.
> >Oh yes. If I set pt=off by default, it should be okay then, right?
>
> Right, but I think it's better to keep this on by default for performance
> reasons.

Okay. Just to confirm, that'll need one entry for HW_COMPAT_2_9,
right? (though it is still not there)

-- 
Peter Xu
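
A note on the use_iommu exchange above: the point Jason makes follows from
C's short-circuit rule for &&, which does not evaluate the right-hand
operand when the left-hand one is false. The sketch below illustrates only
that rule. The names fake_as, fake_dev_pt_enabled(), pt_reads and
use_iommu() are hypothetical stand-ins invented for this example, not the
QEMU types or functions, and the counter stands in for the "slightly slow"
guest memory read.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-device state; not VTDAddressSpace. */
struct fake_as {
    bool dmar_enabled;  /* global "DMA remapping enabled" flag */
    bool pt_enabled;    /* per-device pass-through setting */
};

/* Counts how many times the (assumed expensive) per-device check ran. */
static int pt_reads;

/* Stand-in for the slow check that would read guest memory. */
static bool fake_dev_pt_enabled(const struct fake_as *as)
{
    pt_reads++;
    return as->pt_enabled;
}

/*
 * With &&, fake_dev_pt_enabled() is only called when dmar_enabled is
 * true, so the explicit "if (use_iommu)" guard is not needed to avoid
 * the extra read.
 */
static bool use_iommu(const struct fake_as *as)
{
    return as->dmar_enabled && !fake_dev_pt_enabled(as);
}
```

Calling use_iommu() on a state with dmar_enabled false leaves pt_reads
unchanged, while a state with dmar_enabled true bumps it once per call.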