* Re: [PATCH v3 2/3] drivers/perf: add DesignWare PCIe PMU driver [not found] ` <7aec4dbd-91d5-2a14-8779-f239a58cbbae@linux.alibaba.com> @ 2023-05-16 15:03 ` Jonathan Cameron 2023-05-16 19:17 ` Bjorn Helgaas 0 siblings, 1 reply; 5+ messages in thread From: Jonathan Cameron @ 2023-05-16 15:03 UTC (permalink / raw) To: Shuai Xue Cc: Robin Murphy, helgaas, yangyicong, will, baolin.wang, linux-arm-kernel, linux-kernel, linux-pci, rdunlap, mark.rutland, zhuo.song, linux-cxl PCI folks, Question below directed at you. Please take a look. +CC linux-cxl because a similar question is going to bite us shortly if we want CXL PMUs to work well on RP or Switch ports. > >> +static int dwc_pcie_ras_des_discover(struct dwc_pcie_pmu_priv *priv) > >> +{ > >> + int index = 0; > >> + struct pci_dev *pdev = NULL; > >> + struct dwc_pcie_rp_info *rp_info; > >> + > >> + INIT_LIST_HEAD(&priv->rp_infos); > >> + > >> + /* Match the rootport with VSEC_RAS_DES_ID */ > >> + for_each_pci_dev(pdev) { > > > > Does the PCI layer not offer a more robust mechanism for this? (PCI fixups come to mind, but I don't actually know whether that would be a viable approach or not.) > > I am afraid not yet. Jonathan try to add a PMU service but it is not merged into mainline. I wouldn't read much into that 'failure'. We never persisted with that driver because it was for an old generation of hardware. Mostly the aim with that was to explore the area of PCIe PMU in general rather than to get the support upstream. Some of the counters on that hardware were too small to be of much use anyway :) Grabbing just relevant functions.. Bjorn, we need to figure out a way forwards for this sort of case and I'd appreciate your input on the broad brush question of 'how should it be done'? This is a case where a PCIe port (RP here) correctly has the PCIe class code so binds to the pcie_port driver, but has a VSEC (others examples use DOE, or DVSEC) that provides extended functionality. The referred to PCIe PMU from our older Hisilicon platforms did it by adding another service driver - that probably doesn't extend well. The approach used here is to separately walk the PCI topology and register the devices. It can 'maybe' get away with that because no interrupts and I assume resets have no nasty impacts on it because the device is fairly simple. In general that's not going to work. CXL does a similar trick (which I don't much like, but too late now), but we've also run into the problem of how to get interrupts if not the main driver. So what approach should we look at to solve this in general? Jonathan > +static int dwc_pcie_ras_des_discover(struct dwc_pcie_pmu_priv *priv) > +{ > + int index = 0; > + struct pci_dev *pdev = NULL; > + struct dwc_pcie_rp_info *rp_info; > + > + INIT_LIST_HEAD(&priv->rp_infos); > + > + /* Match the rootport with VSEC_RAS_DES_ID */ > + for_each_pci_dev(pdev) { > + u16 vsec; > + u32 val; > + > + if (!pci_dev_is_rootport(pdev)) > + continue; > + > + rp_info = devm_kzalloc(&pdev->dev, sizeof(*rp_info), GFP_KERNEL); > + if (!rp_info) > + return -ENOMEM; > + > + rp_info->bdf = PCI_DEVID(pdev->bus->number, pdev->devfn); > + rp_info->pdev = pdev; > + > + vsec = pci_find_vsec_capability(pdev, PCI_VENDOR_ID_ALIBABA, > + DWC_PCIE_VSEC_RAS_DES_ID); > + if (!vsec) > + continue; > + > + pci_read_config_dword(pdev, vsec + PCI_VNDR_HEADER, &val); > + if (PCI_VNDR_HEADER_REV(val) != 0x04 || > + PCI_VNDR_HEADER_LEN(val) != 0x100) > + continue; > + pci_dbg(pdev, > + "Detected PCIe Vendor-Specific Extended Capability RAS DES\n"); > + > + rp_info->ras_des = vsec; > + rp_info->num_lanes = pcie_get_width_cap(pdev); > + > + list_add(&rp_info->rp_node, &priv->rp_infos); > + index++; > + } > + > + if (!index) > + return -ENODEV; > + > + priv->pcie_ctrl_num = index; > + > + return 0; > +} > +static int dwc_pcie_pmu_probe(struct platform_device *pdev) > +{ > + int ret; > + struct dwc_pcie_pmu_priv *priv; > + struct dwc_pcie_rp_info *rp_info; > + > + priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL); > + if (!priv) > + return -ENOMEM; > + > + priv->dev = &pdev->dev; > + platform_set_drvdata(pdev, priv); > + > + /* If RAS_DES PMU is not supported on current platform, keep silent */ > + ret = dwc_pcie_ras_des_discover(priv); > + if (ret) > + return ret; > + > + list_for_each_entry(rp_info, &priv->rp_infos, rp_node) { > + struct pci_dev *rp = rp_info->pdev; > + > + ret = __dwc_pcie_pmu_probe(priv, rp_info); > + if (ret) { > + dev_err(&rp->dev, "PCIe PMU probe fail\n"); > + goto pmu_unregister; > + } > + } > + > + return 0; > + > +pmu_unregister: > + dwc_pcie_pmu_remove(pdev); > + > + return ret; > +} ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v3 2/3] drivers/perf: add DesignWare PCIe PMU driver 2023-05-16 15:03 ` [PATCH v3 2/3] drivers/perf: add DesignWare PCIe PMU driver Jonathan Cameron @ 2023-05-16 19:17 ` Bjorn Helgaas 2023-05-17 9:54 ` Jonathan Cameron 0 siblings, 1 reply; 5+ messages in thread From: Bjorn Helgaas @ 2023-05-16 19:17 UTC (permalink / raw) To: Jonathan Cameron Cc: Shuai Xue, Robin Murphy, yangyicong, will, baolin.wang, linux-arm-kernel, linux-kernel, linux-pci, rdunlap, mark.rutland, zhuo.song, linux-cxl On Tue, May 16, 2023 at 04:03:04PM +0100, Jonathan Cameron wrote: > > PCI folks, Question below directed at you. Please take a look. > +CC linux-cxl because a similar question is going to bite us shortly > if we want CXL PMUs to work well on RP or Switch ports. > > > >> +static int dwc_pcie_ras_des_discover(struct dwc_pcie_pmu_priv *priv) > > >> +{ > > >> + int index = 0; > > >> + struct pci_dev *pdev = NULL; > > >> + struct dwc_pcie_rp_info *rp_info; > > >> + > > >> + INIT_LIST_HEAD(&priv->rp_infos); > > >> + > > >> + /* Match the rootport with VSEC_RAS_DES_ID */ > > >> + for_each_pci_dev(pdev) { > > > > > > Does the PCI layer not offer a more robust mechanism for this? > > > (PCI fixups come to mind, but I don't actually know whether that > > > would be a viable approach or not.) > > > > I am afraid not yet. Jonathan try to add a PMU service but it is > > not merged into mainline. > > I wouldn't read much into that 'failure'. We never persisted with > that driver because it was for an old generation of hardware. > Mostly the aim with that was to explore the area of PCIe PMU in > general rather than to get the support upstream. Some of the > counters on that hardware were too small to be of much use anyway :) > > Grabbing just relevant functions.. > > Bjorn, we need to figure out a way forwards for this sort of case > and I'd appreciate your input on the broad brush question of 'how > should it be done'? > > This is a case where a PCIe port (RP here) correctly has the PCIe > class code so binds to the pcie_port driver, but has a VSEC (others > examples use DOE, or DVSEC) that provides extended functionality. > The referred to PCIe PMU from our older Hisilicon platforms did it > by adding another service driver - that probably doesn't extend > well. > > The approach used here is to separately walk the PCI topology and > register the devices. It can 'maybe' get away with that because no > interrupts and I assume resets have no nasty impacts on it because > the device is fairly simple. In general that's not going to work. > CXL does a similar trick (which I don't much like, but too late > now), but we've also run into the problem of how to get interrupts > if not the main driver. Yes, this is a real problem. I think the "walk all PCI devices looking for one we like" approach is terrible because it breaks a lot of driver model assumptions (no device ID to autoload module via udev, hotplug doesn't work, etc), but we don't have a good alternative right now. I think portdrv is slightly better because at least it claims the device in the usual way and gives a way for service drivers to register with it. But I don't really like that either because it created a new weird /sys/bus/pci_express hierarchy full of these sub-devices that aren't really devices, and it doesn't solve the module load and hotplug issues. I would like to have portdrv be completely built into the PCI core and not claim Root Ports or Switch Ports. Then those devices would be available via the usual driver model for driver loading and binding and for hotplug. Bjorn ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v3 2/3] drivers/perf: add DesignWare PCIe PMU driver 2023-05-16 19:17 ` Bjorn Helgaas @ 2023-05-17 9:54 ` Jonathan Cameron 2023-05-17 16:27 ` Bjorn Helgaas 0 siblings, 1 reply; 5+ messages in thread From: Jonathan Cameron @ 2023-05-17 9:54 UTC (permalink / raw) To: Bjorn Helgaas Cc: Shuai Xue, Robin Murphy, yangyicong, will, baolin.wang, linux-arm-kernel, linux-kernel, linux-pci, rdunlap, mark.rutland, zhuo.song, linux-cxl On Tue, 16 May 2023 14:17:52 -0500 Bjorn Helgaas <helgaas@kernel.org> wrote: > On Tue, May 16, 2023 at 04:03:04PM +0100, Jonathan Cameron wrote: > > > > PCI folks, Question below directed at you. Please take a look. > > +CC linux-cxl because a similar question is going to bite us shortly > > if we want CXL PMUs to work well on RP or Switch ports. > > > > > >> +static int dwc_pcie_ras_des_discover(struct dwc_pcie_pmu_priv *priv) > > > >> +{ > > > >> + int index = 0; > > > >> + struct pci_dev *pdev = NULL; > > > >> + struct dwc_pcie_rp_info *rp_info; > > > >> + > > > >> + INIT_LIST_HEAD(&priv->rp_infos); > > > >> + > > > >> + /* Match the rootport with VSEC_RAS_DES_ID */ > > > >> + for_each_pci_dev(pdev) { > > > > > > > > Does the PCI layer not offer a more robust mechanism for this? > > > > (PCI fixups come to mind, but I don't actually know whether that > > > > would be a viable approach or not.) > > > > > > I am afraid not yet. Jonathan try to add a PMU service but it is > > > not merged into mainline. > > > > I wouldn't read much into that 'failure'. We never persisted with > > that driver because it was for an old generation of hardware. > > Mostly the aim with that was to explore the area of PCIe PMU in > > general rather than to get the support upstream. Some of the > > counters on that hardware were too small to be of much use anyway :) > > > > Grabbing just relevant functions.. > > > > Bjorn, we need to figure out a way forwards for this sort of case > > and I'd appreciate your input on the broad brush question of 'how > > should it be done'? > > > > This is a case where a PCIe port (RP here) correctly has the PCIe > > class code so binds to the pcie_port driver, but has a VSEC (others > > examples use DOE, or DVSEC) that provides extended functionality. > > The referred to PCIe PMU from our older Hisilicon platforms did it > > by adding another service driver - that probably doesn't extend > > well. > > > > The approach used here is to separately walk the PCI topology and > > register the devices. It can 'maybe' get away with that because no > > interrupts and I assume resets have no nasty impacts on it because > > the device is fairly simple. In general that's not going to work. > > CXL does a similar trick (which I don't much like, but too late > > now), but we've also run into the problem of how to get interrupts > > if not the main driver. > > Yes, this is a real problem. I think the "walk all PCI devices > looking for one we like" approach is terrible because it breaks a lot > of driver model assumptions (no device ID to autoload module via udev, > hotplug doesn't work, etc), but we don't have a good alternative right > now. > > I think portdrv is slightly better because at least it claims the > device in the usual way and gives a way for service drivers to > register with it. But I don't really like that either because it > created a new weird /sys/bus/pci_express hierarchy full of these > sub-devices that aren't really devices, and it doesn't solve the > module load and hotplug issues. > > I would like to have portdrv be completely built into the PCI core and > not claim Root Ports or Switch Ports. Then those devices would be > available via the usual driver model for driver loading and binding > and for hotplug. Let me see if I understand this correctly as I can think of a few options that perhaps are inline with what you are thinking. 1) All the portdrv stuff converted to normal PCI core helper functions that a driver bound to the struct pci_dev can use. 2) Driver core itself provides a bunch of extra devices alongside the struct pci_dev one to which additional drivers can bind? - so kind of portdrv handling, but squashed into the PCI device topology? 3) Have portdrv operated under the hood, so all the services etc that it provides don't require a driver to be bound at all. Then allow usual VID/DID based driver binding. If 1 - we are going to run into class device restrictions and that will just move where we have to handle the potential vendor specific parts. We probably don't want that to be a hydra with all the functionality and lookups etc driven from there, so do we end up with sub devices of that new PCI port driver with a discover method based on either vsec + VID or DVSEC with devices created under the main pci_dev. That would have to include nastiness around interrupt discovery for those sub devices. So ends up roughly like port_drv. I don't think 2 solves anything. For 3 - interrupts and ownership of facilities is going to be tricky as initially those need to be owned by the PCI core (no device driver bound) and then I guess handed off to the driver once it shows up? Maybe that driver should call a pci_claim_port() that gives it control of everything and pci_release_port() that hands it all back to the core. That seems racey. As another similar proposal to 3 (and one Greg KH will hate :) can we do something similar to vfio and allow an unbind of a class driver followed by a bind of a more specific one? I think 1 is probably the easiest to implement, but it just moves the problem. If we had a way to reliably override the class driver if a more specific one exists, that might work around the problem but I don't think we can do that currently. Jonathan > > Bjorn ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v3 2/3] drivers/perf: add DesignWare PCIe PMU driver 2023-05-17 9:54 ` Jonathan Cameron @ 2023-05-17 16:27 ` Bjorn Helgaas 2023-05-19 10:08 ` Shuai Xue 0 siblings, 1 reply; 5+ messages in thread From: Bjorn Helgaas @ 2023-05-17 16:27 UTC (permalink / raw) To: Jonathan Cameron Cc: Shuai Xue, Robin Murphy, yangyicong, will, baolin.wang, linux-arm-kernel, linux-kernel, linux-pci, rdunlap, mark.rutland, zhuo.song, linux-cxl On Wed, May 17, 2023 at 10:54:21AM +0100, Jonathan Cameron wrote: > On Tue, 16 May 2023 14:17:52 -0500 > Bjorn Helgaas <helgaas@kernel.org> wrote: > > > On Tue, May 16, 2023 at 04:03:04PM +0100, Jonathan Cameron wrote: > ... > > > The approach used here is to separately walk the PCI topology and > > > register the devices. It can 'maybe' get away with that because no > > > interrupts and I assume resets have no nasty impacts on it because > > > the device is fairly simple. In general that's not going to work. > > > CXL does a similar trick (which I don't much like, but too late > > > now), but we've also run into the problem of how to get interrupts > > > if not the main driver. > > > > Yes, this is a real problem. I think the "walk all PCI devices > > looking for one we like" approach is terrible because it breaks a lot > > of driver model assumptions (no device ID to autoload module via udev, > > hotplug doesn't work, etc), but we don't have a good alternative right > > now. > > > > I think portdrv is slightly better because at least it claims the > > device in the usual way and gives a way for service drivers to > > register with it. But I don't really like that either because it > > created a new weird /sys/bus/pci_express hierarchy full of these > > sub-devices that aren't really devices, and it doesn't solve the > > module load and hotplug issues. > > > > I would like to have portdrv be completely built into the PCI core and > > not claim Root Ports or Switch Ports. Then those devices would be > > available via the usual driver model for driver loading and binding > > and for hotplug. > > Let me see if I understand this correctly as I can think of a few options > that perhaps are inline with what you are thinking. > > 1) All the portdrv stuff converted to normal PCI core helper functions > that a driver bound to the struct pci_dev can use. > 2) Driver core itself provides a bunch of extra devices alongside the > struct pci_dev one to which additional drivers can bind? - so kind > of portdrv handling, but squashed into the PCI device topology? > 3) Have portdrv operated under the hood, so all the services etc that > it provides don't require a driver to be bound at all. Then > allow usual VID/DID based driver binding. > > If 1 - we are going to run into class device restrictions and that will > just move where we have to handle the potential vendor specific parts. > We probably don't want that to be a hydra with all the functionality > and lookups etc driven from there, so do we end up with sub devices > of that new PCI port driver with a discover method based on either > vsec + VID or DVSEC with devices created under the main pci_dev. > That would have to include nastiness around interrupt discovery for > those sub devices. So ends up roughly like port_drv. > > I don't think 2 solves anything. > > For 3 - interrupts and ownership of facilities is going to be tricky > as initially those need to be owned by the PCI core (no device driver bound) > and then I guess handed off to the driver once it shows up? Maybe that > driver should call a pci_claim_port() that gives it control of everything > and pci_release_port() that hands it all back to the core. That seems > racey. Yes, 3 is the option I want to explore. That's what we already do for things like ASPM. Agreed, interrupts is a potential issue. I think the architected parts of config space should be implicitly owned by the PCI core, with interfaces à la pci_disable_link_state() if drivers need them. Bjorn ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v3 2/3] drivers/perf: add DesignWare PCIe PMU driver 2023-05-17 16:27 ` Bjorn Helgaas @ 2023-05-19 10:08 ` Shuai Xue 0 siblings, 0 replies; 5+ messages in thread From: Shuai Xue @ 2023-05-19 10:08 UTC (permalink / raw) To: Bjorn Helgaas, Jonathan Cameron Cc: Robin Murphy, yangyicong, will, baolin.wang, linux-arm-kernel, linux-kernel, linux-pci, rdunlap, mark.rutland, zhuo.song, linux-cxl On 2023/5/18 00:27, Bjorn Helgaas wrote: > On Wed, May 17, 2023 at 10:54:21AM +0100, Jonathan Cameron wrote: >> On Tue, 16 May 2023 14:17:52 -0500 >> Bjorn Helgaas <helgaas@kernel.org> wrote: >> >>> On Tue, May 16, 2023 at 04:03:04PM +0100, Jonathan Cameron wrote: >> ... > >>>> The approach used here is to separately walk the PCI topology and >>>> register the devices. It can 'maybe' get away with that because no >>>> interrupts and I assume resets have no nasty impacts on it because >>>> the device is fairly simple. In general that's not going to work. >>>> CXL does a similar trick (which I don't much like, but too late >>>> now), but we've also run into the problem of how to get interrupts >>>> if not the main driver. >>> >>> Yes, this is a real problem. I think the "walk all PCI devices >>> looking for one we like" approach is terrible because it breaks a lot >>> of driver model assumptions (no device ID to autoload module via udev, >>> hotplug doesn't work, etc), but we don't have a good alternative right >>> now. >>> >>> I think portdrv is slightly better because at least it claims the >>> device in the usual way and gives a way for service drivers to >>> register with it. But I don't really like that either because it >>> created a new weird /sys/bus/pci_express hierarchy full of these >>> sub-devices that aren't really devices, and it doesn't solve the >>> module load and hotplug issues. >>> >>> I would like to have portdrv be completely built into the PCI core and >>> not claim Root Ports or Switch Ports. Then those devices would be >>> available via the usual driver model for driver loading and binding >>> and for hotplug. >> >> Let me see if I understand this correctly as I can think of a few options >> that perhaps are inline with what you are thinking. >> >> 1) All the portdrv stuff converted to normal PCI core helper functions >> that a driver bound to the struct pci_dev can use. >> 2) Driver core itself provides a bunch of extra devices alongside the >> struct pci_dev one to which additional drivers can bind? - so kind >> of portdrv handling, but squashed into the PCI device topology? >> 3) Have portdrv operated under the hood, so all the services etc that >> it provides don't require a driver to be bound at all. Then >> allow usual VID/DID based driver binding. >> >> If 1 - we are going to run into class device restrictions and that will >> just move where we have to handle the potential vendor specific parts. >> We probably don't want that to be a hydra with all the functionality >> and lookups etc driven from there, so do we end up with sub devices >> of that new PCI port driver with a discover method based on either >> vsec + VID or DVSEC with devices created under the main pci_dev. >> That would have to include nastiness around interrupt discovery for >> those sub devices. So ends up roughly like port_drv. >> >> I don't think 2 solves anything. >> >> For 3 - interrupts and ownership of facilities is going to be tricky >> as initially those need to be owned by the PCI core (no device driver bound) >> and then I guess handed off to the driver once it shows up? Maybe that >> driver should call a pci_claim_port() that gives it control of everything >> and pci_release_port() that hands it all back to the core. That seems >> racey. > > Yes, 3 is the option I want to explore. That's what we already do for > things like ASPM. Agreed, interrupts is a potential issue. I think > the architected parts of config space should be implicitly owned by > the PCI core, with interfaces à la pci_disable_link_state() if drivers > need them. > I agree "walk all PCI devices looking for one we like" approach is terrible in general. And I am glad to modify my code to adapt to a more suitable solution when it comes. For now, I will collect comments from v3 and v4 and send a new version after addressed them. Any alternative option is welcomed, always :) > Bjorn Thank you. Best Regards, Shuai ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2023-05-19 10:08 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20220917121036.14864-1-xueshuai@linux.alibaba.com>
[not found] ` <20230417061729.84422-3-xueshuai@linux.alibaba.com>
[not found] ` <b1d9509c-8e8c-b05b-cbaf-576a7b2f99e7@arm.com>
[not found] ` <7aec4dbd-91d5-2a14-8779-f239a58cbbae@linux.alibaba.com>
2023-05-16 15:03 ` [PATCH v3 2/3] drivers/perf: add DesignWare PCIe PMU driver Jonathan Cameron
2023-05-16 19:17 ` Bjorn Helgaas
2023-05-17 9:54 ` Jonathan Cameron
2023-05-17 16:27 ` Bjorn Helgaas
2023-05-19 10:08 ` Shuai Xue
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox