Date: Fri, 14 Oct 2016 10:51:24 -0600
From: Alex Williamson
Message-ID: <20161014105124.42b438a6@t450s.home>
In-Reply-To: <20161014163545.GA6121@nvidia.com>
References: <20161010180140.GA27757@nvidia.com>
 <1259cdba-c137-c3da-abe2-ecf51aec6738@linux.intel.com>
 <523e1446-75f1-fe3a-d818-f7d238d57751@redhat.com>
 <5800B579.9000705@intel.com>
 <20161014084158.623087aa@t450s.home>
 <20161014084601.2a50ba87@t450s.home>
 <20161014163545.GA6121@nvidia.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 1/2] KVM: page track: add a new notifier type: track_flush_slot
To: Neo Jia
Cc: Jike Song, Paolo Bonzini, "Tian, Kevin", kvm@vger.kernel.org,
 guangrong.xiao@intel.com, Xiaoguang Chen, qemu-devel, Kirti Wankhede,
 Xiao Guangrong

On Fri, 14 Oct 2016 09:35:45 -0700
Neo Jia wrote:

> On Fri, Oct 14, 2016 at 08:46:01AM -0600, Alex Williamson wrote:
> > On Fri, 14 Oct 2016 08:41:58 -0600
> > Alex Williamson wrote:
> >
> > > On Fri, 14 Oct 2016 18:37:45 +0800
> > > Jike Song wrote:
> > >
> > > > On 10/11/2016 05:47 PM, Paolo Bonzini wrote:
> > > > >
> > > > >
> > > > > On 11/10/2016 11:21, Xiao Guangrong wrote:
> > > > >>
> > > > >>
> > > > >> On 10/11/2016 04:54 PM, Paolo Bonzini wrote:
> > > > >>>
> > > > >>>
> > > > >>> On 11/10/2016 04:39, Xiao Guangrong wrote:
> > > > >>>>
> > > > >>>>
> > > > >>>> On 10/11/2016 02:32 AM, Paolo Bonzini wrote:
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On 10/10/2016 20:01, Neo Jia wrote:
> > > > >>>>>>> Hi Neo,
> > > > >>>>>>>
> > > > >>>>>>> AFAIK this is needed because KVMGT doesn't paravirtualize the PPGTT,
> > > > >>>>>>> while nVidia does.
> > > > >>>>>>
> > > > >>>>>> Hi Paolo and Xiaoguang,
> > > > >>>>>>
> > > > >>>>>> I am just wondering how device driver can register a notifier so he
> > > > >>>>>> can be
> > > > >>>>>> notified for write-protected pages when writes are happening.
> > > > >>>>>
> > > > >>>>> It can't yet, but the API is ready for that. kvm_vfio_set_group is
> > > > >>>>> currently where a struct kvm_device* and struct vfio_group* touch.
> > > > >>>>> Given
> > > > >>>>> a struct kvm_device*, dev->kvm provides the struct kvm to be passed to
> > > > >>>>> kvm_page_track_register_notifier. So I guess you could add a callback
> > > > >>>>> that passes the struct kvm_device* to the mdev device.
> > > > >>>>>
> > > > >>>>> Xiaoguang and Guangrong, what were your plans? We discussed it briefly
> > > > >>>>> at KVM Forum but I don't remember the details.
> > > > >>>>
> > > > >>>> Your suggestion was that pass kvm fd to KVMGT via VFIO, so that we can
> > > > >>>> figure out the kvm instance based on the fd.
> > > > >>>>
> > > > >>>> We got a new idea, how about search the kvm instance by mm_struct, it
> > > > >>>> can work as KVMGT is running in the vcpu context and it is much more
> > > > >>>> straightforward.
> > > > >>>
> > > > >>> Perhaps I didn't understand your suggestion, but the same mm_struct can
> > > > >>> have more than 1 struct kvm so I'm not sure that it can work.
> > > > >>
> > > > >> vcpu->pid is valid during vcpu running so that it can be used to figure
> > > > >> out which kvm instance owns the vcpu whose pid is the one as current
> > > > >> thread, i think it can work. :)
> > > > >
> > > > > No, don't do that. There's no reason for a thread to run a single VCPU,
> > > > > and if you can have multiple VCPUs you can also have multiple VCPUs from
> > > > > multiple VMs.
> > > > >
> > > > > Passing file descriptors around are the right way to connect subsystems.
> > > >
> > > > [CC Alex, Kevin and Qemu-devel]
> > > >
> > > > Hi Paolo & Alex,
> > > >
> > > > IIUC, passing file descriptors means touching QEMU and the UAPI between
> > > > QEMU and VFIO. Would you guys have a look at below draft patch? If it's
> > > > on the correct direction, I'll send the split ones. Thanks!
> > > >
> > > > --
> > > > Thanks,
> > > > Jike
> > > >
> > > >
> > > > diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
> > > > index bec694c..f715d37 100644
> > > > --- a/hw/vfio/pci-quirks.c
> > > > +++ b/hw/vfio/pci-quirks.c
> > > > @@ -10,12 +10,14 @@
> > > >   * the COPYING file in the top-level directory.
> > > >   */
> > > >
> > > > +#include
> > > >  #include "qemu/osdep.h"
> > > >  #include "qemu/error-report.h"
> > > >  #include "qemu/range.h"
> > > >  #include "qapi/error.h"
> > > >  #include "hw/nvram/fw_cfg.h"
> > > >  #include "pci.h"
> > > > +#include "sysemu/kvm.h"
> > > >  #include "trace.h"
> > > >
> > > >  /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */
> > > > @@ -1844,3 +1846,15 @@ void vfio_setup_resetfn_quirk(VFIOPCIDevice *vdev)
> > > >          break;
> > > >      }
> > > >  }
> > > > +
> > > > +void vfio_quirk_kvmgt(VFIOPCIDevice *vdev)
> > > > +{
> > > > +    int vmfd;
> > > > +
> > > > +    if (!kvm_enabled() || !vdev->kvmgt)
> > > > +        return;
> > > > +
> > > > +    /* Tell the device what KVM it attached */
> > > > +    vmfd = kvm_get_vmfd(kvm_state);
> > > > +    ioctl(vdev->vbasedev.fd, VFIO_SET_KVMFD, vmfd);
> > > > +}
> > > > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > > > index a5a620a..8732552 100644
> > > > --- a/hw/vfio/pci.c
> > > > +++ b/hw/vfio/pci.c
> > > > @@ -2561,6 +2561,8 @@ static int vfio_initfn(PCIDevice *pdev)
> > > >          return ret;
> > > >      }
> > > >
> > > > +    vfio_quirk_kvmgt(vdev);
> > > > +
> > > >      /* Get a copy of config space */
> > > >      ret = pread(vdev->vbasedev.fd, vdev->pdev.config,
> > > >                  MIN(pci_config_size(&vdev->pdev), vdev->config_size),
> > > > @@ -2832,6 +2834,7 @@ static Property vfio_pci_dev_properties[] = {
> > > >      DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice,
> > > >                         sub_device_id, PCI_ANY_ID),
> > > >      DEFINE_PROP_UINT32("x-igd-gms", VFIOPCIDevice, igd_gms, 0),
> > > > +    DEFINE_PROP_BOOL("kvmgt", VFIOPCIDevice, kvmgt, false),
> > >
> > > Just a side note, device options are a headache, users are prone to get
> > > them wrong and minimally it requires an entire round to get libvirt
> > > support. We should be able to detect from the device or vfio API
> > > whether such a call is required. Obviously if we can use the existing
> > > kvm-vfio device, that's the better option anyway. Thanks,
> >
> > Also, vfio devices currently have no hard dependencies on KVM, if kvmgt
> > does, it needs to produce a device failure when unavailable. Thanks,
>
> Also, I would like to see this as an generic feature instead of
> kvmgt specific interface, so we don't have to add new options to QEMU and it is
> up to the vendor driver to proceed with or without it.

In general this should be decided by lack of some required feature
exclusively provided by KVM. I would not want to add a generic opt-out
for mdev vendor drivers to decide that they arbitrarily want to disable
that path. Thanks,

Alex
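
To make the policy argued for in this thread concrete, here is a minimal C sketch. It is only an illustration under stated assumptions, not QEMU or kernel code. It assumes the need for a KVM handle is reported by the device or the vfio API (modelled as a `needs_kvm` flag) rather than set through a user option, and that a device with that need fails outright when KVM is unavailable. Every name in it (`demo_vfio_device`, `demo_attach_kvm`, `demo_set_kvm_fn` and the two stubs) is hypothetical; the callback merely stands in for whatever call would eventually hand the VM to the device, such as the draft `VFIO_SET_KVMFD` ioctl quoted above, which is not an upstream interface.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical view of a vfio device; these names are not from QEMU. */
struct demo_vfio_device {
    int  fd;         /* device file descriptor */
    bool needs_kvm;  /* assumed to be detected from the device or vfio API,
                        not supplied as a user-visible device option */
};

/* Stand-in for the call that tells the device which VM it belongs to. */
typedef int (*demo_set_kvm_fn)(int device_fd, int vm_fd);

/*
 * Returns 0 when setup may continue, or a negative errno value when the
 * device must fail. vm_fd < 0 models "KVM is not available".
 */
int demo_attach_kvm(const struct demo_vfio_device *dev, int vm_fd,
                    demo_set_kvm_fn set_kvm)
{
    if (!dev->needs_kvm) {
        return 0;          /* no KVM dependency: nothing to do */
    }
    if (vm_fd < 0) {
        return -ENODEV;    /* hard dependency unmet: fail the device */
    }
    if (set_kvm(dev->fd, vm_fd) < 0) {
        return -EIO;       /* report the failure instead of ignoring it */
    }
    return 0;
}

/* Illustrative stubs so the sketch can be exercised without real hardware. */
int demo_set_kvm_ok(int device_fd, int vm_fd)
{
    (void)device_fd;
    (void)vm_fd;
    return 0;
}

int demo_set_kvm_fail(int device_fd, int vm_fd)
{
    (void)device_fd;
    (void)vm_fd;
    return -1;
}
```

In this sketch a device that does not report the requirement is never affected, which keeps vfio free of a hard KVM dependency in the common case, and there is no generic opt-out: the only way past the check is for the device not to need the feature.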