From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:37399) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1X1Y32-0007et-GQ for qemu-devel@nongnu.org; Mon, 30 Jun 2014 05:38:58 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1X1Y2w-0001oo-9o for qemu-devel@nongnu.org; Mon, 30 Jun 2014 05:38:52 -0400 Received: from mga09.intel.com ([134.134.136.24]:51332) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1X1Y2v-0001n2-SC for qemu-devel@nongnu.org; Mon, 30 Jun 2014 05:38:46 -0400 Message-ID: <53B1300D.10001@intel.com> Date: Mon, 30 Jun 2014 17:38:21 +0800 From: "Chen, Tiejun" MIME-Version: 1.0 References: <20140625090925.GH32652@redhat.com> <53AA9480.1010005@intel.com> <53AA96DF.6070501@redhat.com> <53AA9B58.6050803@intel.com> <53AA9C4E.9070506@redhat.com> <53ABE551.3080407@intel.com> <53ABF00E.6000309@redhat.com> <53B0D0C5.60000@intel.com> <20140630064822.GB14491@redhat.com> <53B110CA.6070606@intel.com> <20140630090511.GB15777@redhat.com> In-Reply-To: <20140630090511.GB15777@redhat.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [v5][PATCH 0/5] xen: add Intel IGD passthrough support List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Michael S. Tsirkin" Cc: peter.maydell@linaro.org, xen-devel@lists.xensource.com, stefano.stabellini@eu.citrix.com, allen.m.kay@intel.com, Kelly.Zytaruk@amd.com, qemu-devel@nongnu.org, yang.z.zhang@intel.com, anthony@codemonkey.ws, anthony.perard@citrix.com, Paolo Bonzini On 2014/6/30 17:05, Michael S. Tsirkin wrote: > On Mon, Jun 30, 2014 at 03:24:58PM +0800, Chen, Tiejun wrote: >> On 2014/6/30 14:48, Michael S. Tsirkin wrote: >>> On Mon, Jun 30, 2014 at 10:51:49AM +0800, Chen, Tiejun wrote: >>>> On 2014/6/26 18:03, Paolo Bonzini wrote: >>>>> Il 26/06/2014 11:18, Chen, Tiejun ha scritto: >>>>>> >>>>>>> >>>>>>> - offsets 0x0000..0x0fff map to configuration space of the host MCH >>>>>>> >>>>>> >>>>>> Are you saying the config space in the video device? >>>>> >>>>> No, I am saying in a new BAR, or at some magic offset of an existing >>>>> MMIO BAR. >>>>> >>>> >>>> As I mentioned previously, the IGD guy told me we have no any unused a >>>> offset or BAR in the config space. >>>> >>>> And guy who are responsible for the native driver seems not be accept to >>>> extend some magic offset of an existing MMIO BAR. >>>> >>>> In addition I think in a short time its not possible to migrate i440fx to >>>> q35 as a PCIe machine of xen. >>> >>> That seems like a weak motivation. I don't see a need to get something >>> merged upstream in a short time: this seems sure to miss 2.1, >>> so you have the time to make it architecturally sound. >>> "Making existing guests work" would be a better motivation. >> >> Yes. > > So focus on this then. Existing guests will probably work > fine on a newer chipset - likely better than on i440fx. > xen management tools need to do some work to support this? > That will just give everyone more long term benefits. > > If instead we create a hack that does not resemble > any existing hardware even remotely, what's the > chance that it will not break with any future > guest modification? > > >>> Isn't this possible with an mch chipset? >> >> If you're saying q35, I mean AFAIK we have no any plan to migrate to this >> MCH in xen case. > > q35 or a newer chipset that's closer to what guests expect. > >> Additionally, I think I should follow this feature after >> q35 can work for xen scenario. > > What's stopping you? I mean if you want create an new machine based on q35, actually this is equal to start making xen to migrate to q35 now. Right? I can't image how much effort should be done. So this is a reason why I'm saying I'd like to follow this feature after q35 can work with xen completely. > >>> >>> >>>> So could we do this step by step: >>>> >>>> #1 phase: We just cover current qemu-xen implementation based on i44fx, so >>>> still provide that pseudo ISA bridge at 00:1f.0 as we already did. >>> >>> By the way there is no reason to put it at 00:1f.0 specifically I think. >>> So it seems simple: create a dummy device that gets device and >>> vendor id as properties. If you really like, add an option to get values >> >> Yes, this is just what we did in [Xen-devel] [v5][PATCH 2/5] xen, gfx >> passthrough: create pseudo intel isa bridge. There, we fake this device just >> at 00:1f.0. >> But you guys don't like this, and shouldn't this be just this point we >> discussing now? >> >> If you guys agree that , everything is fine. > > Actually, this isn't what you did. > Don't tie it to xen, and don't tie it to 1f. > Just make it a simple stub pci device. > Whoever wants it, creates it. > > The thing I worry about, is the chance this will break going forward. > So you created a system with 2 isa bridges. I don't set class type to claim this is an ISA bridge, just a normal PCI device. +static int create_pseudo_pch_isa_bridge(PCIBus *bus, XenHostPCIDevice *hdev) +{ + struct PCIDevice *dev; + + char rid; + + /* We havt to use a simple PCI device to fake this ISA bridge + * to avoid making some confusion to BIOS and ACPI. + */ + dev = pci_create(bus, PCI_DEVFN(0x1f, 0), "pseudo-intel-pch-isa-bridge"); + + qdev_init_nofail(&dev->qdev); + + pci_config_set_vendor_id(dev->config, hdev->vendor_id); + pci_config_set_device_id(dev->config, hdev->device_id); + + xen_host_pci_get_block(hdev, PCI_REVISION_ID, (uint8_t *)&rid, 1); + + pci_config_set_revision(dev->config, rid); + + XEN_PT_LOG(dev, "The pseudo Intel PCH ISA bridge created.\n"); + return 0; +} > This is already not something that exists on real hardware. > So it might break some guests that will get confused. > Maybe we are lucky and most guests see an unfamiliar device > and ignore it. It seems believable. > > But your MCH hack overrides registers in the pci host. We just try to write *one* register we already confirm this is safe enough. Other register are read-only. > Are we lucky and there's nothing in these registers > of interest to guests? This seems much more fragile. > So please poke at the spec, and compile the list > of registers you want to touch, figure out why they are > safe to override, and put this all in code comments. > > And the same thing that applies to the isa bridge We just expose its own vendor/device ids here. We don't access to change anything in the isa bridge. > applies here too. It should work without QEMU touching > hosts' hardware. > > > >> >from sysfs: device and vendor id are world readable, so just get them >>> directly and not through xen wrappers, this way you can open the files >>> RO and not RW. >>> You seem to poke at revision as well, I don't see >>> driver looking at that - strictly necessary? >>> If yes please patch host kernel to expose that info in sysfs, >>> though we can fall back on pci config if not there. >>> >>> MCH (bridge_dev) hacks in i915 are nastier. >>> To clean them up, we really have to have an explicit driver for this >>> bridge, not a pass-through device. Long term, the right thing to do is >>> likely to extend host driver and expose the necessary information in >>> sysfs on host kernel. >>> >>> >> >> I'm a bit confused. Any sysfs should be filled by the associated PCIe >> device, shouldn't it? So qemu still need to emulate this PCIe device >> firstly, then set properties into sysfs. > > I am talking about getting host properties into qemu. > You don't want to give qemu R/W root access to host sysfs system > of the root bridge, that's not secure. igd_pci_read() | + xen_host_pci_get_block() | + xen_host_pci_config_read(() | + pread() So shouldn't we already expose these information via sysfs? > Avoiding read only access to filesystem is a good idea too, so it > should be possible to pass all parameters in as > device properties, and let whoever starts qemu > figure out what are reasonable values. > >>> >>> >>>> #2 phase: Now, we will choose a capability ID that won't be conflicting with >>>> others. To do this properly, we need to get one from PCI SIG group. To have >>>> this workable and consistently validated, this method shouldn't be virt >>>> specific. Then native driver should use the same method. >>> >>> You mean you will be able to talk sense into hardware guys? >>> I doubt that. If they could be convinced to make e.g. i915 base a >> >> We're negotiating this, so this is just our long term solution we figure out >> currently. >> >>> proper BAR, why didn't they? >> >> We already have no any free BAR as we mentioned previously. > > I thought you were talking about modifying hardware? Yes. Tiejun > >>> >>> >>>> So when xen work on >>>> q35 PCIe machine, we can walk this way. >>> >>> If you are emulating MCH anyway, pick one that is close >>> to what i915 driver expects. It would then work with existing >> >> Looks you guys prefer we create a new MCH anyway, right? But is it necessary >> to create a new based on i440fx just for a little change? >> >> Thanks >> Tiejun > > You can inherit it. Maybe you are lucky and this happens to > work without conflicting with whatever other guests want to do. > But if you ask me, you are really just piling up hacks. > If some guest does not work on i440, you should just work on > emulating whatever it does work on. > That would have real value. > >>> devices, without new capability IDs. >>> >>> >>>> Anthony, >>>> >>>> Any comments to address this in xen case? >>>> >>>> Thanks >>>> Tiejun >>> >>> > >