From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alex Williamson
Subject: Re: VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)
Date: Tue, 26 Jan 2016 09:37:29 -0700
Message-ID: <1453826249.26652.54.camel@redhat.com>
References: <569C5071.6080004@intel.com> <1453092476.32741.67.camel@redhat.com> <569CA8AD.6070200@intel.com> <1453143919.32741.169.camel@redhat.com> <569F4C86.2070501@intel.com> <56A6083E.10703@intel.com> <1453757426.32741.614.camel@redhat.com> <56A72313.9030009@intel.com> <56A77D2D.40109@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: "Tian, Kevin" , Gerd Hoffmann , Paolo Bonzini , "Lv, Zhiyuan" , "Ruan, Shuai" , "kvm@vger.kernel.org" , qemu-devel , "igvt-g@lists.01.org" , Neo Jia
To: Yang Zhang , Jike Song
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:42064 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S966712AbcAZQhb (ORCPT ); Tue, 26 Jan 2016 11:37:31 -0500
In-Reply-To: <56A77D2D.40109@gmail.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Tue, 2016-01-26 at 22:05 +0800, Yang Zhang wrote:
> On 2016/1/26 15:41, Jike Song wrote:
> > On 01/26/2016 05:30 AM, Alex Williamson wrote:
> > > [cc +Neo @Nvidia]
> > >
> > > Hi Jike,
> > >
> > > On Mon, 2016-01-25 at 19:34 +0800, Jike Song wrote:
> > > > On 01/20/2016 05:05 PM, Tian, Kevin wrote:
> > > > > I would expect we can spell out next level tasks toward above
> > > > > direction, upon which Alex can easily judge whether there are
> > > > > some common VFIO framework changes that he can help :-)
> > > >
> > > > Hi Alex,
> > > >
> > > > Here is a draft task list after a short discussion w/ Kevin,
> > > > would you please have a look?
> > > >
> > > > Bus Driver
> > > >
> > > > { in i915/vgt/xxx.c }
> > > >
> > > > - define a subset of vfio_pci interfaces
> > > > - selective pass-through (say aperture)
> > > > - trap MMIO: interface w/ QEMU
> > >
> > > What's included in the subset?  Certainly the bus reset ioctls really
> > > don't apply, but you'll need to support the full device interface,
> > > right?  That includes the region info ioctl and access through the vfio
> > > device file descriptor as well as the interrupt info and setup ioctls.
> > >
> >
> > [All interfaces I thought are via ioctl:)  For other stuff like file
> > descriptor we'll definitely keep it.]
> >
> > The list of ioctl commands provided by vfio_pci:
> >
> > - VFIO_DEVICE_GET_PCI_HOT_RESET_INFO
> > - VFIO_DEVICE_PCI_HOT_RESET
> >
> > As you said, above 2 don't apply. But for this:
> >
> > - VFIO_DEVICE_RESET
> >
> > In my opinion it should be kept, no matter what will be provided in
> > the bus driver.
> >
> > - VFIO_PCI_ROM_REGION_INDEX
> > - VFIO_PCI_VGA_REGION_INDEX
> >
> > I suppose above 2 don't apply neither? For a vgpu we don't provide a
> > ROM BAR or VGA region.
> >
> > - VFIO_DEVICE_GET_INFO
> > - VFIO_DEVICE_GET_REGION_INFO
> > - VFIO_DEVICE_GET_IRQ_INFO
> > - VFIO_DEVICE_SET_IRQS
> >
> > Above 4 are needed of course.
> >
> > We will need to extend:
> >
> > - VFIO_DEVICE_GET_REGION_INFO
> >
> >
> > a) adding a flag: DONT_MAP. For example, the MMIO of vgpu
> > should be trapped instead of being mmap-ed.
>
> I may not in the context, but i am curious how to handle the DONT_MAP in
> vfio driver? Since there are no real MMIO maps into the region and i
> suppose the access to the region should be handled by vgpu in i915
> driver, but currently most of the mmio accesses are handled by Qemu.

VFIO supports the following region attributes:

#define VFIO_REGION_INFO_FLAG_READ      (1 << 0) /* Region supports read */
#define VFIO_REGION_INFO_FLAG_WRITE     (1 << 1) /* Region supports write */
#define VFIO_REGION_INFO_FLAG_MMAP      (1 << 2) /* Region supports mmap */

If MMAP is not set, then the QEMU driver will do pread and/or pwrite to
the specified offsets of the device file descriptor, depending on what
accesses are supported.  This is all reported through the REGION_INFO
ioctl for a given index.  If mmap is supported, the VM will have direct
access to the area, without faulting to KVM other than to populate the
mapping.  Without mmap support, a VM MMIO access traps into KVM, which
returns out to QEMU to service the request, which then finds the
MemoryRegion serviced through vfio, which will then perform a
pread/pwrite through to the kernel vfio bus driver to handle the
access.  Thanks,

Alex
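
To illustrate the flag-driven choice described above, here is a minimal
userspace sketch in C.  It is not QEMU's code: struct region_desc,
choose_access_path() and region_read() are hypothetical names made up for
this example, and region_desc is a trimmed-down stand-in for the
information the REGION_INFO ioctl reports.  Only the three flag values
come from the message itself.

```c
#define _XOPEN_SOURCE 700   /* for pread() under strict C modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Flag values as quoted in the message above. */
#define VFIO_REGION_INFO_FLAG_READ   (1 << 0) /* Region supports read */
#define VFIO_REGION_INFO_FLAG_WRITE  (1 << 1) /* Region supports write */
#define VFIO_REGION_INFO_FLAG_MMAP   (1 << 2) /* Region supports mmap */

/* Hypothetical stand-in for what REGION_INFO reports for one index. */
struct region_desc {
    uint32_t flags;   /* VFIO_REGION_INFO_FLAG_* bits */
    uint64_t offset;  /* where the region starts within the device fd */
    uint64_t size;    /* region length in bytes */
};

enum access_path {
    ACCESS_NONE,  /* neither readable nor writable */
    ACCESS_MMAP,  /* guest gets direct access through a mapping */
    ACCESS_TRAP   /* each access is forwarded with pread/pwrite */
};

/* Pick the access path the way the text describes: mmap if offered,
 * otherwise fall back to trapped pread/pwrite on the device fd. */
static enum access_path choose_access_path(const struct region_desc *r)
{
    if (r->flags & VFIO_REGION_INFO_FLAG_MMAP)
        return ACCESS_MMAP;
    if (r->flags & (VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_WRITE))
        return ACCESS_TRAP;
    return ACCESS_NONE;
}

/* Trapped read: forward the access to the kernel bus driver by reading
 * the device fd at (region offset + offset inside the region).
 * Returns the pread() result, or -1 if the access is not permitted
 * or falls outside the region. */
static ssize_t region_read(int device_fd, const struct region_desc *r,
                           uint64_t addr, void *buf, size_t len)
{
    assert(buf != NULL);
    if (!(r->flags & VFIO_REGION_INFO_FLAG_READ))
        return -1;
    if (addr > r->size || len > r->size - addr)
        return -1;
    return pread(device_fd, buf, len, (off_t)(r->offset + addr));
}
```

In a real client the descriptor would be filled in from the REGION_INFO
ioctl for a given index; a vgpu MMIO region that must be trapped would
simply report READ and WRITE without MMAP, so the pread/pwrite path is
taken.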