From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:34363) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XbJAx-0001EQ-EL for qemu-devel@nongnu.org; Mon, 06 Oct 2014 21:02:56 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XbJAs-00067L-Nj for qemu-devel@nongnu.org; Mon, 06 Oct 2014 21:02:51 -0400 Received: from mga09.intel.com ([134.134.136.24]:2506) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XbJAs-000676-7u for qemu-devel@nongnu.org; Mon, 06 Oct 2014 21:02:46 -0400 Message-ID: <54333BAF.9080905@intel.com> Date: Tue, 07 Oct 2014 09:02:39 +0800 From: "Chen, Tiejun" MIME-Version: 1.0 References: <53FAC70F.1080201@intel.com> <53FBF5BD.6060104@intel.com> <53FE7E3A.7050907@intel.com> <53FFD752.2040402@intel.com> <20140831085857.GC1514@redhat.com> <5403DEFD.1020704@intel.com> <20140901060506.GA20186@redhat.com> <54042514.6090206@intel.com> <003AAFE53969E14CB1F09B6FD68C3CD478F16D2A@ORSMSX101.amr.corp.intel.com> <54277979.7070408@intel.com> <20140929100111.GA32459@redhat.com> <542A18BD.2060008@intel.com> In-Reply-To: <542A18BD.2060008@intel.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [Xen-devel] [PATCH 2/2] xen:i386:pc_piix: create isa bridge specific to IGD passthrough List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Michael S. Tsirkin" Cc: "xen-devel@lists.xensource.com" , "Kay, Allen M" , "qemu-devel@nongnu.org" , Konrad Rzeszutek Wilk Ping... Thanks Tiejun On 2014/9/30 10:43, Chen, Tiejun wrote: > On 2014/9/29 18:01, Michael S. Tsirkin wrote: >> On Sun, Sep 28, 2014 at 10:59:05AM +0800, Chen, Tiejun wrote: >>> On 2014/9/3 9:40, Kay, Allen M wrote: >>>> >>>> >>>>> -----Original Message----- >>>>> From: Chen, Tiejun >>>>> Sent: Monday, September 01, 2014 12:50 AM >>>>> To: Michael S. Tsirkin >>>>> Cc: xen-devel@lists.xensource.com; Kay, Allen M; >>>>> qemu-devel@nongnu.org; >>>>> Konrad Rzeszutek Wilk >>>>> Subject: Re: [Qemu-devel] [Xen-devel] [PATCH 2/2] xen:i386:pc_piix: >>>>> create >>>>> isa bridge specific to IGD passthrough >>>>> >>>>> On 2014/9/1 14:05, Michael S. Tsirkin wrote: >>>>>> On Mon, Sep 01, 2014 at 10:50:37AM +0800, Chen, Tiejun wrote: >>>>>>> On 2014/8/31 16:58, Michael S. Tsirkin wrote: >>>>>>>> On Fri, Aug 29, 2014 at 09:28:50AM +0800, Chen, Tiejun wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2014/8/28 8:56, Chen, Tiejun wrote: >>>>>>>>>>>>>>>> + */ >>>>>>>>>>>>>>>> + dev = pci_create_simple(bus, PCI_DEVFN(0x1f, 0), >>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>> "xen-igd-passthrough-isa-bridge"); >>>>>>>>>>>>>>>> + if (dev) { >>>>>>>>>>>>>>>> + r = xen_host_pci_device_get(&hdev, 0, 0, >>>>>>>>>>>>>>>> + PCI_DEVFN(0x1f, >>>>>>>>>>>>>>>> 0), 0); >>>>>>>>>>>>>>>> + if (!r) { >>>>>>>>>>>>>>>> + pci_config_set_vendor_id(dev->config, >>>>> hdev.vendor_id); >>>>>>>>>>>>>>>> + pci_config_set_device_id(dev->config, >>>>>>>>>>>>>>>> + hdev.device_id); >>>>>>>>>>>>> >>>>>>>>>>>>> Can you, instead, implement the reverse logic, probing the >>>>>>>>>>>>> card >>>>>>>>>>>>> and supplying the correct device id for PCH? >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Here what is your so-called reverse logic as I already asked >>>>>>>>>>>> you >>>>>>>>>>>> previously? Do you mean I should list all PCHs with a combo >>>>>>>>>>>> illustrated with the vendor/device id in advance? Then look up >>>>>>>>>>>> if we can find a >>>>>>>>>>> >>>>>>>>>>> Michael, >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Ping. >>>>>>>>>> >>>>>>>>>> Thanks >>>>>>>>>> Tiejun >>>>>>>>>> >>>>>>>>>>> Could you explain this exactly? Then I can try follow-up your >>>>>>>>>>> idea ASAP if this is necessary and possible. >>>>>>>>> >>>>>>>>> Michel, >>>>>>>>> >>>>>>>>> Could you give us some explanation for your "reverse logic" when >>>>>>>>> you're free? >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> Tiejun >>>>>>>> >>>>>>>> So future drivers will look at device ID for the card and figure >>>>>>>> out >>>>>>>> how things should work from there. >>>>>>>> Old drivers still poke at device id of the chipset for this, but >>>>>>>> maybe qemu can do what newer drivers do: >>>>>>>> look at the card and figure out what guest should do, then present >>>>>>>> the appropriate chipset id. >>>>>>>> >>>>>>>> This is based on what Jesse said: >>>>>>>> Practically speaking, we could probably assume specific GPU/PCH >>>>> combos, >>>>>>>> as I don't think they're generally mixed across generations, >>>>>>>> though >>>>> SNB >>>>>>>> and IVB did have compatible sockets, so there is the >>>>>>>> possibility of >>>>>>>> mixing CPT and PPT PCHs, but those are register identical as >>>>>>>> far as the >>>>>>>> graphics driver is concerned, so even that should be safe. >>>>>>>> >>>>>>> >>>>>>> Michael, >>>>>>> >>>>>>> Thanks for your explanation. >>>>>>> >>>>>>>> so the idea is to have a reverse table mapping GPU back to PCH. >>>>>>>> Present to guest the ID that will let it assume the correct GPU. >>>>>>> >>>>>>> I guess you mean we should create to maintain such a table: >>>>>>> >>>>>>> [GPU Card: device_id(s), PCH: device_id] >>>>>>> >>>>>>> Then each time, instead of exposing that real PCH device id >>>>>>> directly, >>>>>>> qemu first can get the real GPU device id to lookup this table to >>>>>>> present a appropriate PCH's device id to the guest. >>>>>>> >>>>>>> And looks here that appropriate PCH's device id is not always a that >>>>>>> real PCH's device id. Right? If I'm wrong please correct me. >>>>>> >>>>>> Exactly: we don't really care what the PCH ID is - we only need it >>>>>> for >>>>>> the guest driver to do the right thing. >>>>> >>>>> Agreed. >>>>> >>>>> I need to ask Allen to check if one given GPU card device id is always >>>>> corresponding to one given PCH on both HSW and BDW currently. If >>>>> yes, I can >>>>> do this quickly. Otherwise I need some time to establish that sort >>>>> of connection. >>> >>> Michael, >>> >>> Sorry for this delay response but please keep in mind we really are >>> in this >>> process. >>> >>> You know this involve different GPU components we have to take some >>> time to >>> communicate or even discuss internal. >>> >>> Now I have a draft codes, could you take a look at this? I hope that >>> comment >>> can help us to understand something, but if you have any question, we >>> can >>> further respond inline. >>> >>> --- >>> typedef struct { >>> uint16_t gpu_device_id; >>> uint16_t pch_device_id; >>> } XenIGDDeviceIDInfo; >>> >>> /* In real world different GPU should have different PCH. But actually >>> * the different PCH DIDs likely map to different PCH SKUs. We do the >>> * same thing for the GPU. For PCH, the different SKUs are going to be >>> * all the same silicon design and implementation, just different >>> * features turn on and off with fuses. The SW interfaces should be >>> * consistent across all SKUs in a given family (eg LPT). But just same >>> * features may not be supported. >>> * >>> * Most of these different PCH features probably don't matter to the >>> * Gfx driver, but obviously any difference in display port connections >>> * will so it should be fine with any PCH in case of passthrough. >>> * >>> * So currently use one PCH version, 0x8c4e, to cover all HSW >>> scenarios, >>> * 0x9cc3 for BDW. >> >> Pls write Haswell and Broadwell in full in comment. > > Are you saying I should list all PCH device ids both of HSW and BDW here > like this format? > > HSWGT1D: 0xABCD > ... > >> >>> */ >>> static const XenIGDDeviceIDInfo xen_igd_combo_id_infos[] = { >>> /* HSW Classic */ >>> {0x0402, 0x8c4e}, /* HSWGT1D, HSWD_w7 */ >>> {0x0406, 0x8c4e}, /* HSWGT1M, HSWM_w7 */ >>> {0x0412, 0x8c4e}, /* HSWGT2D, HSWD_w7 */ >>> {0x0416, 0x8c4e}, /* HSWGT2M, HSWM_w7 */ >>> {0x041E, 0x8c4e}, /* HSWGT15D, HSWD_w7 */ >>> /* HSW ULT */ >>> {0x0A06, 0x8c4e}, /* HSWGT1UT, HSWM_w7 */ >>> {0x0A16, 0x8c4e}, /* HSWGT2UT, HSWM_w7 */ >>> {0x0A26, 0x8c4e}, /* HSWGT3UT, HSWM_w7 */ >>> {0x0A2E, 0x8c4e}, /* HSWGT3UT28W, HSWM_w7 */ >>> {0x0A1E, 0x8c4e}, /* HSWGT2UX, HSWM_w7 */ >>> {0x0A0E, 0x8c4e}, /* HSWGT1ULX, HSWM_w7 */ >>> /* HSW CRW */ >>> {0x0D26, 0x8c4e}, /* HSWGT3CW, HSWM_w7 */ >>> {0x0D22, 0x8c4e}, /* HSWGT3CWDT, HSWD_w7 */ >>> /* HSW Server */ >>> {0x041A, 0x8c4e}, /* HSWSVGT2, HSWD_w7 */ >>> /* HSW SRVR */ >>> {0x040A, 0x8c4e}, /* HSWSVGT1, HSWD_w7 */ >>> /* BSW */ >>> {0x1606, 0x9cc3}, /* BDWULTGT1, BDWM_w7 */ >>> {0x1616, 0x9cc3}, /* BDWULTGT2, BDWM_w7 */ >>> {0x1626, 0x9cc3}, /* BDWULTGT3, BDWM_w7 */ >>> {0x160E, 0x9cc3}, /* BDWULXGT1, BDWM_w7 */ >>> {0x161E, 0x9cc3}, /* BDWULXGT2, BDWM_w7 */ >>> {0x1602, 0x9cc3}, /* BDWHALOGT1, BDWM_w7 */ >>> {0x1612, 0x9cc3}, /* BDWHALOGT2, BDWM_w7 */ >>> {0x1622, 0x9cc3}, /* BDWHALOGT3, BDWM_w7 */ >>> {0x162B, 0x9cc3}, /* BDWHALO28W, BDWM_w7 */ >>> {0x162A, 0x9cc3}, /* BDWGT3WRKS, BDWM_w7 */ >>> {0x162D, 0x9cc3}, /* BDWGT3SRVR, BDWM_w7 */ >>> }; >>> >>> static void xen_igd_passthrough_pciisabridge_get_pci_device_id(Object >>> *obj, >>> >>> Visitor *v, >>> void >>> *opaque, >>> const >>> char >>> *name, >>> Error >>> **errp) >>> { >>> uint32_t value = 0; >>> XenHostPCIDevice hdev; >>> int r = 0, num; >>> >>> r = xen_host_pci_device_get(&hdev, 0, 0, 0x02, 0); >>> if (!r) { >>> value = hdev.device_id; >>> >>> num = sizeof(xen_igd_combo_id_infos)/sizeof(uint16_t); >>> for (r = 0; r < num; r++) >>> if (value == xen_igd_combo_id_infos[r].gpu_device_id) >>> value = xen_igd_combo_id_infos[r].pch_device_id; >> >> that's messy, I would use different variables for ID >> of GPU and PCH. > > for (r = 0; r < num; r++) > if (gpu_id == xen_igd_combo_id_infos[r].gpu_device_id) > pch_id = xen_igd_combo_id_infos[r].pch_device_id; > >> >>> } >>> >>> visit_type_uint32(v, &value, name, errp); >> >> >> what I was suggesting is to start with the GPU device ID >> (you can get it e.g. from the config array). > > I think current codes is doing this: > > r = xen_host_pci_device_get(&hdev, 0, 0, 0x02, 0); > if (!r) { > gpu_id = hdev.device_id; > > Here xen_host_pci_device_get() is a wrapper of reading pci sysfs. > >> Use that to look up the PCH ID in xen_igd_combo_id_infos. >> If there, override the PCH ID. >> >> If not there, this is a new device so its driver will >> not look at PCH at all, we can make it whatever or >> skip it completely. > > I think we should return one initialized value, 0xffff, since often this > represents an invalid PCI device. > >> >> This seems to almost do this, however >> - Why are you looking at host PCH device ID at all? >> - Why don't you look at the GPU device ID? > > We still fix this bridge at 1f.0, and current our implementation can > cover our requirement and safe. I means this bridge should not be used > for other use cases, so if its still be accessed we mightn't take care > of them, right? > >> >> >>> } >>> >>> static void xen_igd_passthrough_isa_bridge_initfn(Object *obj) >>> { >>> object_property_add(obj, "device-id", "int", >>> >>> xen_igd_passthrough_pciisabridge_get_pci_device_id, >>> NULL, NULL, NULL, NULL); >>> } >> >> OK and what reads this property? > > In sequent patch I will do something like this, > > @@ -464,6 +464,32 @@ static void pc_xen_hvm_init(MachineState *machine) > } > } > > +static void xen_igd_passthrough_isa_bridge_create(PCIBus *bus) > +{ > + struct PCIDevice *dev; > + Error *local_err = NULL; > + uint16_t device_id = 0xffff; > + > + /* Currently IGD drivers always need to access PCH by 1f.0. */ > + dev = pci_create_simple(bus, PCI_DEVFN(0x1f, 0), > + "xen-igd-passthrough-isa-bridge"); > + > + /* Identify PCH card with its own real vendor/device ids. > + * Here that vendor id is always PCI_VENDOR_ID_INTEL. > + */ > + if (dev) { > + device_id = (uint16_t)object_property_get_int(OBJECT(dev), > + "device-id", > + &local_err); > + if ((!local_err) && (device_id != 0xffff)) { > + pci_config_set_device_id(dev->config, device_id); > + return; > + } > + } > + > + fprintf(stderr, "xen set xen-igd-passthrough-isa-bridge failed\n"); > +} > + > static void xen_igd_passthrough_pc_hvm_init(MachineState *machine) > { > PCIBus *bus; > @@ -473,6 +499,7 @@ static void > xen_igd_passthrough_pc_hvm_init(MachineState *machine) > bus = pci_find_primary_bus(); > if (bus != NULL) { > pci_create_simple(bus, -1, "xen-platform"); > + xen_igd_passthrough_isa_bridge_create(bus); > } > } > #endif > > Tiejun > >> >>> Thanks >>> Tiejun >>> >>> >>>>> >>>> >>>> If I understand this correctly, the only difference is instead of >>>> reading PCH DevID/RevID from the host hardware, QEMU inserts those >>>> values into PCH virtual device by looking at the reverse mapping >>>> table it maintains. >>>> >>>> I agree the downside of doing this is the reverse mapping table may >>>> be hard to maintain. >>>> >>>> What is the advantage of doing this instead of having QEMU reading >>>> it from the host? Is it to test to make sure reverse mapping >>>> methods works before it is adopted in the new drivers? >>>> >>>>> Thanks >>>>> Tiejun >>>>> >>>>>> >>>>>>>> >>>>>>>> the problem with these tables is they are hard to keep up to date >>>>>>> >>>>>>> Yeah. But I think currently we can just start from some modern CPU >>>>>>> such as HSW and BDW, then things could be easy. >>>>>>> >>>>>>> Allen, >>>>>>> >>>>>>> Any idea to this suggestion? >>>>>>> >>>>>>>> as new hardware comes out, but as future hardware won't need these >>>>>>>> hacks, we shall be fine. >>>>>>> >>>>>>> Yeah. >>>>>>> >>>>>>> Thanks >>>>>>> Tiejun >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks >>>>>>>>>>> Tiejun >>>>>>>>>>> >>>>>>>>>>>> matched PCH? If yes, what is that benefit you expect in >>>>>>>>>>>> passthrough case? Shouldn't we pass these info to VM >>>>>>>>>>>> directly in >>>>> passthrough case? >>>>>>>>>>>> >>>>>>>>>>>> Thanks >>>>>>>>>>>> Tiejun >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>> >>>> Allen >>>> >>>> >> > > >