From: Jan Kiszka
Date: Tue, 1 Mar 2016 15:35:22 +0100
Subject: Re: [Qemu-devel] [V6 0/4] AMD IOMMU
To: "Michael S. Tsirkin"
Cc: valentine.sinitsyn@gmail.com, marcel@redhat.com, David Kiarie,
 qemu-devel@nongnu.org
Message-ID: <56D5A8AA.5020409@siemens.com>
In-Reply-To: <20160301162142-mutt-send-email-mst@redhat.com>
References: <1456078260-6669-1-git-send-email-davidkiarie4@gmail.com>
 <20160301134419-mutt-send-email-mst@redhat.com>
 <56D59DA3.3040002@siemens.com>
 <20160301155010-mutt-send-email-mst@redhat.com>
 <56D5A358.6070800@siemens.com>
 <56D5A49D.7050704@siemens.com>
 <20160301162142-mutt-send-email-mst@redhat.com>

On 2016-03-01 15:30, Michael S. Tsirkin wrote:
> On Tue, Mar 01, 2016 at 03:18:05PM +0100, Jan Kiszka wrote:
>> On 2016-03-01 15:12, Jan Kiszka wrote:
>>> On 2016-03-01 14:55, Michael S. Tsirkin wrote:
>>>> On Tue, Mar 01, 2016 at 02:48:19PM +0100, Jan Kiszka wrote:
>>>>> On 2016-03-01 14:07, Michael S. Tsirkin wrote:
>>>>>> On Sun, Feb 21, 2016 at 09:10:56PM +0300, David Kiarie wrote:
>>>>>>> Hello there,
>>>>>>>
>>>>>>> Repost, AMD IOMMU patches version 6.
>>>>>>>
>>>>>>> Changes since version 5
>>>>>>> -Fixed macro formatting issues
>>>>>>> -changed occurrences of IO MMU to IOMMU for consistency
>>>>>>> -Fixed capability registers duplication
>>>>>>> -Rebased to current master
>>>>>>>
>>>>>>> David Kiarie (4):
>>>>>>> hw/i386: Introduce AMD IOMMU
>>>>>>> hw/core: Add AMD IOMMU to machine properties
>>>>>>> hw/i386: ACPI table for AMD IOMMU
>>>>>>> hw/pci-host: Emulate AMD IOMMU
>>>>>>
>>>>>> I went over AMD IOMMU spec.
>>>>>> I'm concerned that it appears that there's no chance for it to
>>>>>> work correctly if host caches invalid PTE entries.
>>>>>>
>>>>>> The spec vaguely discusses write-protecting such PTEs but
>>>>>> that would be very complex if it can be made to work at all.
>>>>>>
>>>>>> This means that this can't work with e.g. VFIO.
>>>>>> It can only work with emulated devices.
>>>>>
>>>>> You mean it can't work if we program a real IOMMU (for VFIO) with
>>>>> translated data from the emulated one but cannot track any updates of
>>>>> the related page tables because the guest is not required to issue
>>>>> traceable flush requests? Hmm, too bad.
>>>>>
>>>>>>
>>>>>> OTOH VTD can easily support PTE shadowing by setting a flag.
>>>>>
>>>>> Do you mean RWBF=1 in the CAP register? Given that "Newer hardware
>>>>> implementations are expected to NOT require explicit software flushing
>>>>> of write buffers and report RWBF=0 in the Capability register", we may
>>>>> eventually run into guests that no longer check that flag if we expose
>>>>> something that looks like a "newer" implementation.
>>>>
>>>> Hopefully not, if that happens we'll have to do a PV IOMMU :)
>>>
>>> Please not.
>>>
>>>>
>>>>> However, this flag is not set right now in our VT-d model.
>>>>>>
>>>>>> I'd like us to find some way to avoid possibility
>>>>>> of user error creating a configuration mixing e.g.
>>>>>> vfio with the amd iommu.
>>>>>>
>>>>>> I'm not sure how to do this.
>>>>>>
>>>>>> Any idea?
>>>>>
>>>>> There is likely no way around write-protecting the IOMMU page tables (in
>>>>> KVM mode) once we evaluated and cached them somewhere.
>>>>
>>>> Well for one, it's possible to use vt-d and not amd iommu.
>>>
>>> That would lead to nice combos of AMD CPUs with VT-d IOMMU. While it may
>>> be possible, I wouldn't rely on guests having tested that combination
>>> very well.
>>
>> To make the concern more concrete: I'm playing with code that will reuse
>> the MMU page tables for the IOMMU - the AMD architecture is designed for
>> that optimization (in contrast to Intel's). So, if the guest is not
>> foreseeing that artificial combo above (ours will definitely not)
>> because it is designed around the reuse, it will at least fail to run.
>>
>> Jan
>
> So if you have an AMD iommu on the host and that is capable
> of 2-level translation, then the flushing problem
> can be fixed by a kind of iommu pass-through
> where you point the host's iommu to guest's page tables.

Yes, right, that could be another approach - provided the tables have
compatible entries. I haven't looked at the details of either so far, but I
wouldn't be overly optimistic. Usually, hardware is not very well
designed for interesting nesting purposes.

>
> So maybe what you need to do is make it possible
> for device to query iommu and ask whether it
> supports devices caching invalid PTEs.
> If not, vfio could fail.

Makes sense.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux
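
For illustration only, the query proposed above (a device asks the IOMMU
whether caching invalid PTEs is safe, and a passthrough device fails if it
is not) might be sketched roughly as follows. Every name here
(EmulatedIommu, allows_caching_invalid_ptes, passthrough_attach) is
hypothetical and made up for this sketch; none of it is an existing QEMU or
VFIO interface, and the real plumbing would look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical description of an emulated IOMMU; not a QEMU type. */
typedef struct EmulatedIommu {
    const char *name;
    /*
     * True if the guest is required to issue a traceable flush even
     * after turning an invalid PTE into a valid one, so that a shadow
     * copy of the page tables can be kept in sync.
     */
    bool allows_caching_invalid_ptes;
} EmulatedIommu;

/* The query a device would make before relying on shadowed mappings. */
static bool iommu_allows_caching_invalid_ptes(const EmulatedIommu *iommu)
{
    assert(iommu != NULL);
    return iommu->allows_caching_invalid_ptes;
}

/*
 * Attach a passthrough device whose mappings are programmed into a
 * host IOMMU. Returns 0 on success, or -1 if the emulated IOMMU cannot
 * guarantee that guest page table updates are traceable.
 */
static int passthrough_attach(const EmulatedIommu *iommu, const char *dev)
{
    assert(dev != NULL);
    if (!iommu_allows_caching_invalid_ptes(iommu)) {
        fprintf(stderr, "%s: refusing to attach behind %s: "
                "guest updates to invalid PTEs are not traceable\n",
                dev, iommu->name);
        return -1;
    }
    return 0;
}
```

Under these assumptions, a model described as { "amd-iommu", false } would
make passthrough_attach() return -1, which is the "vfio could fail" case,
while purely emulated devices would simply never make the query.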