From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alex Williamson
Subject: Re: [iGVT-g] [vfio-users] [PATCH v3 00/11] igd passthrough chipset tweaks
Date: Thu, 28 Jan 2016 19:54:54 -0700
Message-ID: <1454036094.23148.9.camel@redhat.com>
References: <1451994098-6972-1-git-send-email-kraxel@redhat.com>
 <1454009759.7183.7.camel@redhat.com>
 <003AAFE53969E14CB1F09B6FD68C3CD47BB669D2@ORSMSX106.amr.corp.intel.com>
In-Reply-To: <003AAFE53969E14CB1F09B6FD68C3CD47BB669D2@ORSMSX106.amr.corp.intel.com>
To: "Kay, Allen M", Gerd Hoffmann, "qemu-devel@nongnu.org"
Cc: "igvt-g@ml01.01.org", "xen-devel@lists.xensource.com", Eduardo Habkost,
 Stefano Stabellini, Cao jin, "vfio-users@redhat.com"
List-Id: xen-devel@lists.xenproject.org

On Fri, 2016-01-29 at 02:22 +0000, Kay, Allen M wrote:
>
> > -----Original Message-----
> > From: iGVT-g [mailto:igvt-g-bounces@lists.01.org] On Behalf Of Alex
> > Williamson
> > Sent: Thursday, January 28, 2016 11:36 AM
> > To: Gerd Hoffmann; qemu-devel@nongnu.org
> > Cc: igvt-g@ml01.01.org; xen-devel@lists.xensource.com; Eduardo Habkost;
> > Stefano Stabellini; Cao jin; vfio-users@redhat.com
> > Subject: Re: [iGVT-g] [vfio-users] [PATCH v3 00/11] igd passthrough chipset
> > tweaks
> >
> >
> > 1) The OpRegion MemoryRegion is mapped into system_memory through
> > programming of the 0xFC config space register.
> >  a) vfio-pci could pick an address to do this as it is realized.
> >  b) SeaBIOS/OVMF could program this.
> >
> > Discussion: 1.a) Avoids any BIOS dependency, but vfio-pci would need to pick
> > an address and mark it as e820 reserved.  I'm not sure how to pick that
> > address.  We'd probably want to make the 0xFC config register
> > read-only.  1.b) has the issue you mentioned where in most cases the OpRegion
> > will be 8k, but the BIOS won't know how much address space it's mapping
> > into system memory when it writes the 0xFC register.  I don't know how
> > much of a problem this is since the BIOS can easily determine the size once
> > mapped and re-map it somewhere there's sufficient space.
> > Practically, it seems like it's always going to be 8K.  This of course requires
> > modification to every BIOS.  It also leaves the 0xFC register as a mapping
> > control rather than a pointer to the OpRegion in RAM, which doesn't really
> > match real hardware.  The BIOS would need to pick an address in this case.
> >
> > 2) Read-only mappings version of 1)
> >
> > Discussion: Really nothing changes from the issues above, just prevents any
> > possibility of the guest modifying anything in the host.  Xen apparently allows
> > write access to the host page already.
> >
> > 3) Copy OpRegion contents into buffer and do either 1) or 2) above.
> >
> > Discussion: No benefit that I can see over above other than maybe allowing
> > write access that doesn't affect the host.
> >
> > 4) Copy contents into a guest RAM location, mark it reserved, point to it via
> > 0xFC config as scratch register.
> >  a) Done by QEMU (vfio-pci)
> >  b) Done by SeaBIOS/OVMF
> >
> > Discussion: This is the most like real hardware.  4.a) has the usual issue of
> > how to pick an address, but the benefit of not requiring BIOS changes (simply
> > mark the RAM reserved via existing methods).  4.b) would require passing a
> > buffer containing the contents of the OpRegion via fw_cfg and letting the
> > BIOS do the setup.  The latter of course requires modifying each BIOS for this
> > support.
> >
> > Of course none of these support hotplug nor really can they since reserved
> > memory regions are not dynamic in the architecture.
> >
> > In all cases, some piece of software needs to know where it can place the
> > OpRegion in guest memory.  It seems like there are advantages or
> > disadvantages whether that's done by QEMU or the BIOS, but we only need
> > to do it once if it's QEMU.  Suggestions, comments, preferences?
> >
>
> Hi Alex, another thing to consider is how to communicate to the guest driver
> that the address at 0xFC contains a valid GPA address that can be accessed by
> the driver without causing an EPT fault - since the same driver will be used
> on other hypervisors and they may not EPT map OpRegion memory.  One idea
> proposed by the display driver team is to set bit0 of the address to 1 for
> indicating OpRegion memory can be safely accessed by the guest driver.

Hi Allen,

Why is that any different than a guest accessing any other memory area
that it shouldn't?  The OpRegion starts with a 16-byte ID string, so if
the guest finds that, it should feel fairly confident the OpRegion data
is valid.  The published spec also seems to define all bits of 0xfc as
valid, not implying any sort of alignment requirements, and the i915
driver does a memremap directly on the value read from 0xfc.  So I'm
not sure whether there's really a need to or ability to define any of
those bits in an ad hoc way to indicate mapping.  If we do things right,
shouldn't the guest driver not even know it's running in a VM, at least
for the KVMGT-d case?  That means we need to be compatible with physical
hardware.  Thanks,

Alex
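
To make the ID-string argument above concrete, here is a minimal Python
sketch of the check a guest could make on the address it reads from 0xFC.
It is a model under stated assumptions, not driver code: guest memory is
represented as a plain byte buffer, `opregion_looks_valid` is a hypothetical
helper name, and the signature value "IntelGraphicsMem" is the 16-byte
string the i915 driver compares against as far as I know (the message only
says that a 16-byte ID string exists).

```python
# Assumption: the 16-byte ID string at the start of the OpRegion.
OPREGION_SIGNATURE = b"IntelGraphicsMem"


def opregion_looks_valid(guest_memory, asls):
    """Return True if the bytes at `asls` match the OpRegion ID string.

    `guest_memory` is a bytes-like stand-in for guest physical memory and
    `asls` is the raw value read from the 0xFC config register. Both names
    are illustrative only.
    """
    if asls == 0:
        # Nothing was programmed into the register.
        return False
    header = guest_memory[asls:asls + len(OPREGION_SIGNATURE)]
    return header == OPREGION_SIGNATURE


if __name__ == "__main__":
    # Fake 16K of guest memory with an OpRegion header placed at 0x2000.
    mem = bytearray(0x4000)
    base = 0x2000
    mem[base:base + len(OPREGION_SIGNATURE)] = OPREGION_SIGNATURE

    # Address used exactly as read from the register.
    print(opregion_looks_valid(mem, base))
    # Same address with bit0 set as a flag, used without masking.
    print(opregion_looks_valid(mem, base | 1))
```

The second call shows the compatibility concern: a driver that maps the raw
register value, as described above for i915, would look one byte past the
real header if bit0 were repurposed as a flag, so an unmodified driver would
no longer find the ID string.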