Date: Mon, 6 Jun 2016 11:02:11 -0600
From: Alex Williamson
Message-ID: <20160606110211.2c9bc8ef@ul30vt.home>
In-Reply-To: <20160606134317.GJ21254@pxdev.xzpeter.org>
References: <1463847590-22782-1-git-send-email-bd.aviv@gmail.com>
 <1463847590-22782-2-git-send-email-bd.aviv@gmail.com>
 <57408FDB.1010000@web.de>
 <20160602084439.GB3477@pxdev.xzpeter.org>
 <20160602070046.761be49c@ul30vt.home>
 <5750313C.4000709@web.de>
 <20160606050407.GB21254@pxdev.xzpeter.org>
 <20160606071141.31d2008e@ul30vt.home>
 <20160606134317.GJ21254@pxdev.xzpeter.org>
Subject: Re: [Qemu-devel] [PATCH v3 1/3] IOMMU: add VTD_CAP_CM to vIOMMU capability exposed to guest
To: Peter Xu
Cc: Jan Kiszka, "Aviv B.D", qemu-devel@nongnu.org, "Michael S. Tsirkin"

On Mon, 6 Jun 2016 21:43:17 +0800
Peter Xu wrote:

> On Mon, Jun 06, 2016 at 07:11:41AM -0600, Alex Williamson wrote:
> > On Mon, 6 Jun 2016 13:04:07 +0800
> > Peter Xu wrote:
> [...]
> > > Besides the reason that there might have guests that do not support
> > > CM=1, will there be performance considerations?
> > > When user's configuration does not require CM capability (e.g.,
> > > generic VM configuration, without VFIO), shall we allow user to
> > > disable the CM bit so that we can have better IOMMU performance
> > > (avoid extra and useless invalidations)?
> >
> > With Alexey's proposed patch to have callback ops when the iommu
> > notifier list adds its first entry and removes its last, any of the
> > additional overhead to generate notifies when nobody is listening can
> > be avoided.  These same callbacks would be the ones that need to
> > generate a hw_error if a notifier is added while running in CM=0.
>
> Not familiar with Alexey's patch
> (https://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg00079.html),
> but is that for VFIO only?

vfio is currently the only user of the iommu notifier, but the
interface is generic, which is how it should (must) be.

> I mean, if
> we configured CMbit=1, guest kernel will send invalidation request
> every time it creates new entries (context entries, or iotlb
> entries). Even without VFIO notifiers, guest need to trap into QEMU
> and process the invalidation requests. This is avoidable if we are not
> using VFIO devices at all (so no need to maintain any mappings),
> right?

CM=1 only defines that not-present and invalid entries can be cached;
any change to an existing entry requires an invalidation regardless of
CM.  What you're looking for sounds more like ECAP.C:

C: Page-walk Coherency
  This field indicates if hardware access to the root, context,
  extended-context and interrupt-remap tables, and second-level paging
  structures for requests-without-PASID, are coherent (snooped) or not.
    • 0: Indicates hardware accesses to remapping structures are non-coherent.
    • 1: Indicates hardware accesses to remapping structures are coherent.
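
To make the difference between the two bits concrete, here is a minimal
sketch (Python, purely illustrative, not QEMU code) that decodes them from
raw register values. The bit positions are taken from the VT-d
specification (CM is bit 7 of the Capability Register, C is bit 0 of the
Extended Capability Register); the constant names, the function name and
the dictionary keys are made up for this example.

```python
# Illustrative only: these names are hypothetical, not QEMU identifiers.
CAP_CM = 1 << 7   # Capability Register, bit 7: Caching Mode
ECAP_C = 1 << 0   # Extended Capability Register, bit 0: Page-walk Coherency


def describe_vtd_caps(cap: int, ecap: int) -> dict:
    """Summarise what the CM and C bits ask of software."""
    cm = bool(cap & CAP_CM)
    c = bool(ecap & ECAP_C)
    return {
        "caching_mode": cm,
        "page_walk_coherent": c,
        # CM=1: not-present/invalid entries may be cached, so software
        # must invalidate even when it creates a brand new entry.
        "invalidate_on_new_entry": cm,
        # Changing an existing entry needs an invalidation regardless of CM.
        "invalidate_on_change": True,
        # C=0: hardware does not snoop CPU caches, so software must flush
        # its writes to the remapping structures out to memory.
        "flush_cache_after_update": not c,
    }
```

For example, describe_vtd_caps(0x80, 0x1) reports caching mode on and
coherent page walks, while describe_vtd_caps(0, 0) reports no invalidation
on new entries but a cache flush after every structure update.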

Without both CM=0 and C=0, our only virtualization mechanism for
maintaining a hardware cache coherent with the guest view of the iommu
would be to shadow all of the VT-d structures.  For purely emulated
devices, maybe we can get away with that, but I doubt the current
ghashes used for the iotlb are prepared for it.

> If we allow user to specify cmbit={0|1}, user can decide whether
> he/she would like to take this benefit.

So long as the *default* gives us the ability to support an external
hardware cache, like vfio, and we generate a hw_error or equivalent to
avoid unsafe combinations, you're free to enable whatever other
shortcuts you want.  Thanks,

Alex
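
A minimal sketch of the callback idea discussed in this message, assuming
a notifier list that fires one hook when its first entry is added and
another when its last entry is removed. This is Python for illustration
only: NotifierList, UnsafeConfigError and every method name are
hypothetical and do not correspond to Alexey's patch or to any QEMU
interface. The first-entry hook is where an unsafe CM=0 combination would
be rejected (an exception stands in for hw_error), and notifies are only
generated while somebody is listening.

```python
class UnsafeConfigError(RuntimeError):
    """Stand-in for the hw_error mentioned above (hypothetical name)."""


class NotifierList:
    """Hypothetical notifier list with first-added/last-removed hooks."""

    def __init__(self, caching_mode: bool):
        self.caching_mode = caching_mode
        self.notifiers = []
        self.generating_notifies = False

    def _on_first_added(self):
        # An external cache (e.g. vfio) cannot be kept coherent with CM=0.
        if not self.caching_mode:
            raise UnsafeConfigError("notifier registered while CM=0")
        self.generating_notifies = True

    def _on_last_removed(self):
        # Nobody is listening any more: stop paying for notifies.
        self.generating_notifies = False

    def add(self, notifier):
        if not self.notifiers:
            self._on_first_added()  # raises before the list is modified
        self.notifiers.append(notifier)

    def remove(self, notifier):
        self.notifiers.remove(notifier)
        if not self.notifiers:
            self._on_last_removed()

    def invalidate(self, iova: int, size: int) -> int:
        """Deliver an invalidation; return how many notifiers were called."""
        if not self.generating_notifies:
            return 0
        for notifier in self.notifiers:
            notifier(iova, size)
        return len(self.notifiers)
```

With caching_mode=True, invalidate() does nothing until add() registers a
callback and goes quiet again after the last remove(); with
caching_mode=False, the first add() raises instead of silently running an
unsafe combination.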