* [PATCH 0/3] PCI-e Max Payload Size configuration @ 2015-07-29 22:18 Keith Busch 2015-07-29 22:18 ` [PATCH] pci: Default MPS tuning, match upstream port Keith Busch ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: Keith Busch @ 2015-07-29 22:18 UTC (permalink / raw) To: linux-pci, linux-kernel; +Cc: Bjorn Helgass, Dave Jiang, Keith Busch This patch series removes arch, driver, and hot-plug specific calls to configure a PCI-e device's MPS setting. The device is instead configured in the generic pci-core layer when the device is initially added, and before it is bound to a driver. The default policy is also changed from "do nothing" to update the down stream port to match the upstream port if it is capable. Dave Jiang (2): QIB: Removing usage of pcie_set_mps() PCIE: Remove symbol export for pcie_set_mps() Keith Busch (1): pci: Default MPS tuning to match upstream port arch/arm/kernel/bios32.c | 12 ------------ arch/powerpc/kernel/pci-common.c | 7 ------- arch/tile/kernel/pci_gx.c | 4 ---- arch/x86/pci/acpi.c | 9 --------- drivers/infiniband/hw/qib/qib_pcie.c | 27 +-------------------------- drivers/pci/bus.c | 4 ++++ drivers/pci/hotplug/acpiphp_glue.c | 1 - drivers/pci/hotplug/pciehp_pci.c | 1 - drivers/pci/hotplug/shpchp_pci.c | 1 - drivers/pci/pci.c | 1 - drivers/pci/probe.c | 22 ++++++++++++++++------ include/linux/pci.h | 2 -- 12 files changed, 21 insertions(+), 70 deletions(-) -- 1.7.10.4 ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH] pci: Default MPS tuning, match upstream port 2015-07-29 22:18 [PATCH 0/3] PCI-e Max Payload Size configuration Keith Busch @ 2015-07-29 22:18 ` Keith Busch 2015-08-17 22:28 ` Bjorn Helgaas 2015-07-29 22:18 ` [PATCH 2/3] QIB: Removing usage of pcie_set_mps() Keith Busch 2015-07-29 22:18 ` [PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps() Keith Busch 2 siblings, 1 reply; 12+ messages in thread From: Keith Busch @ 2015-07-29 22:18 UTC (permalink / raw) To: linux-pci, linux-kernel Cc: Bjorn Helgass, Dave Jiang, Keith Busch, Austin Bolen, Myron Stowe, Jon Mason A hot plugged PCI-e device max payload size (MPS) defaults to 0 for 128bytes. The device is not usable if the upstream port is configured to a higher setting. Bus configuration was previously done by arch specific and hot plug code after the root port or bridge was scanned, and default behavior logged a warning without changing the setting if there was a problem. This patch unifies tuning with a default policy that affects only misconfigured downstream devices, and preserves existing end result if a non-default bus tuning is used (ex: pci=pcie_bus_safe). The new pcie tuning will check the device's MPS against the parent bridge when it is initially added to the pci subsystem, prior to attaching to a driver. If MPS is mismatched, the downstream port is set to the parent bridge's if capable. This is safe to change here since the device being configured is not bound to a driver and does not affect previously configured devices in the domain hierarchy. A device incapable of matching the upstream bridge will log a warning message and not bind to a driver. This is the safe option since proper MPS configuration must consider the entire hierarchy between communicating end points, and we can't safely modify a root port's subtree MPS settings while it is in use. Since the bus is configured everytime a bridge is scanned, this potentially creates unnecessary re-walking of an already configured sub-tree, but the code only executes during root port initialization, hot adding a switch, or explicit request to rescan. Signed-off-by: Keith Busch <keith.busch@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Austin Bolen <Austin_Bolen@Dell.com> Cc: Myron Stowe <mstowe@redhat.com> Cc: Jon Mason <jdmason@kudzu.us> Cc: Bjorn Helgaas <bhelgaas@google.com> --- arch/arm/kernel/bios32.c | 12 ------------ arch/powerpc/kernel/pci-common.c | 7 ------- arch/tile/kernel/pci_gx.c | 4 ---- arch/x86/pci/acpi.c | 9 --------- drivers/pci/bus.c | 4 ++++ drivers/pci/hotplug/acpiphp_glue.c | 1 - drivers/pci/hotplug/pciehp_pci.c | 1 - drivers/pci/hotplug/shpchp_pci.c | 1 - drivers/pci/probe.c | 27 ++++++++++++++++++++++----- include/linux/pci.h | 2 -- 10 files changed, 26 insertions(+), 42 deletions(-) diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c index fcbbbb1..4fff58e 100644 --- a/arch/arm/kernel/bios32.c +++ b/arch/arm/kernel/bios32.c @@ -537,18 +537,6 @@ void pci_common_init_dev(struct device *parent, struct hw_pci *hw) */ pci_bus_add_devices(bus); } - - list_for_each_entry(sys, &head, node) { - struct pci_bus *bus = sys->bus; - - /* Configure PCI Express settings */ - if (bus && !pci_has_flag(PCI_PROBE_ONLY)) { - struct pci_bus *child; - - list_for_each_entry(child, &bus->children, node) - pcie_bus_configure_settings(child); - } - } } #ifndef CONFIG_PCI_HOST_ITE8152 diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index b9de34d..7f27ffe 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -1682,13 +1682,6 @@ void pcibios_scan_phb(struct pci_controller *hose) */ if (ppc_md.pcibios_fixup_phb) ppc_md.pcibios_fixup_phb(hose); - - /* Configure PCI Express settings */ - if (bus && !pci_has_flag(PCI_PROBE_ONLY)) { - struct pci_bus *child; - list_for_each_entry(child, &bus->children, node) - pcie_bus_configure_settings(child); - } } EXPORT_SYMBOL_GPL(pcibios_scan_phb); diff --git a/arch/tile/kernel/pci_gx.c b/arch/tile/kernel/pci_gx.c index b1df847..67492fb 100644 --- a/arch/tile/kernel/pci_gx.c +++ b/arch/tile/kernel/pci_gx.c @@ -599,10 +599,6 @@ static void fixup_read_and_payload_sizes(struct pci_controller *controller) __gxio_mmio_write32(trio_context->mmio_base_mac + reg_offset, rc_dev_cap.word); - /* Configure PCI Express MPS setting. */ - list_for_each_entry(child, &root_bus->children, node) - pcie_bus_configure_settings(child); - /* * Set the mac_config register in trio based on the MPS/MRS of the link. */ diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c index ff99117..ab5977a 100644 --- a/arch/x86/pci/acpi.c +++ b/arch/x86/pci/acpi.c @@ -478,15 +478,6 @@ struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root) } } - /* After the PCI-E bus has been walked and all devices discovered, - * configure any settings of the fabric that might be necessary. - */ - if (bus) { - struct pci_bus *child; - list_for_each_entry(child, &bus->children, node) - pcie_bus_configure_settings(child); - } - if (bus && node != NUMA_NO_NODE) dev_printk(KERN_DEBUG, &bus->dev, "on NUMA node %d\n", node); diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c index 6fbd3f2..8f8428a 100644 --- a/drivers/pci/bus.c +++ b/drivers/pci/bus.c @@ -277,6 +277,10 @@ void pci_bus_add_device(struct pci_dev *dev) { int retval; + if (dev->bus->self && + pcie_get_mps(dev) != pcie_get_mps(dev->bus->self)) + return; + /* * Can not put in pci_device_add yet because resources * are not assigned yet for some devices. diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c index ff53856..cd98649 100644 --- a/drivers/pci/hotplug/acpiphp_glue.c +++ b/drivers/pci/hotplug/acpiphp_glue.c @@ -509,7 +509,6 @@ static void enable_slot(struct acpiphp_slot *slot) __pci_bus_assign_resources(bus, &add_list, NULL); acpiphp_sanitize_bus(bus); - pcie_bus_configure_settings(bus); acpiphp_set_acpi_region(slot); list_for_each_entry(dev, &bus->devices, bus_list) { diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c index 9e69403..93bc7f6 100644 --- a/drivers/pci/hotplug/pciehp_pci.c +++ b/drivers/pci/hotplug/pciehp_pci.c @@ -65,7 +65,6 @@ int pciehp_configure_device(struct slot *p_slot) pci_hp_add_bridge(dev); pci_assign_unassigned_bridge_resources(bridge); - pcie_bus_configure_settings(parent); pci_bus_add_devices(parent); out: diff --git a/drivers/pci/hotplug/shpchp_pci.c b/drivers/pci/hotplug/shpchp_pci.c index f8cd3a2..ca3dc3c 100644 --- a/drivers/pci/hotplug/shpchp_pci.c +++ b/drivers/pci/hotplug/shpchp_pci.c @@ -69,7 +69,6 @@ int shpchp_configure_device(struct slot *p_slot) } pci_assign_unassigned_bridge_resources(bridge); - pcie_bus_configure_settings(parent); pci_bus_add_devices(parent); out: diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index cefd636..b469298 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -37,6 +37,9 @@ struct pci_domain_busn_res { int domain_nr; }; +static void pcie_bus_detect_mps(struct pci_dev *dev); +static void pcie_bus_configure_settings(struct pci_bus *bus); + static struct resource *get_pci_domain_busn_res(int domain_nr) { struct pci_domain_busn_res *r; @@ -929,6 +932,8 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass) (is_cardbus ? "PCI CardBus %04x:%02x" : "PCI Bus %04x:%02x"), pci_domain_nr(bus), child->number); + pcie_bus_configure_settings(bus); + /* Has only triggered on CardBus, fixup is in yenta_socket */ while (bus->parent) { if ((child->busn_res.end > bus->busn_res.end) || @@ -1589,6 +1594,9 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus) dev->match_driver = false; ret = device_add(&dev->dev); WARN_ON(ret < 0); + + if (pci_is_pcie(dev)) + pcie_bus_detect_mps(dev); } struct pci_dev *pci_scan_single_device(struct pci_bus *bus, int devfn) @@ -1802,9 +1810,17 @@ static void pcie_bus_detect_mps(struct pci_dev *dev) mps = pcie_get_mps(dev); p_mps = pcie_get_mps(bridge); - if (mps != p_mps) - dev_warn(&dev->dev, "Max Payload Size %d, but upstream %s set to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n", - mps, pci_name(bridge), p_mps); + if (mps != p_mps) { + if (128 << dev->pcie_mpss < p_mps) { + dev_warn(&dev->dev, + "Max Payload Size Supported %d, but upstream %s set to %d. If problems are experienced, try running with pci=pcie_bus_safe\n", + 128 << dev->pcie_mpss, pci_name(bridge), p_mps); + return; + } + pcie_write_mps(dev, p_mps); + dev_info(&dev->dev, "Max Payload Size set to %4d/%4d (was %4d)\n", + pcie_get_mps(dev), 128 << dev->pcie_mpss, mps); + } } static int pcie_bus_configure_set(struct pci_dev *dev, void *data) @@ -1836,7 +1852,7 @@ static int pcie_bus_configure_set(struct pci_dev *dev, void *data) * parents then children fashion. If this changes, then this code will not * work as designed. */ -void pcie_bus_configure_settings(struct pci_bus *bus) +static void pcie_bus_configure_settings(struct pci_bus *bus) { u8 smpss = 0; @@ -1863,7 +1879,6 @@ void pcie_bus_configure_settings(struct pci_bus *bus) pcie_bus_configure_set(bus->self, &smpss); pci_walk_bus(bus, pcie_bus_configure_set, &smpss); } -EXPORT_SYMBOL_GPL(pcie_bus_configure_settings); unsigned int pci_scan_child_bus(struct pci_bus *bus) { @@ -1895,6 +1910,8 @@ unsigned int pci_scan_child_bus(struct pci_bus *bus) max = pci_scan_bridge(bus, dev, max, pass); } + pcie_bus_configure_settings(bus); + /* * We've scanned the bus and so we know all about what's on * the other side of any bridges that may be on this bus plus diff --git a/include/linux/pci.h b/include/linux/pci.h index 8a0321a..e1df5f9 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -735,8 +735,6 @@ struct pci_driver { /* these external functions are only available when PCI support is enabled */ #ifdef CONFIG_PCI -void pcie_bus_configure_settings(struct pci_bus *bus); - enum pcie_bus_config_types { PCIE_BUS_TUNE_OFF, PCIE_BUS_SAFE, -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] pci: Default MPS tuning, match upstream port 2015-07-29 22:18 ` [PATCH] pci: Default MPS tuning, match upstream port Keith Busch @ 2015-08-17 22:28 ` Bjorn Helgaas 2015-08-17 23:03 ` Keith Busch 0 siblings, 1 reply; 12+ messages in thread From: Bjorn Helgaas @ 2015-08-17 22:28 UTC (permalink / raw) To: Keith Busch Cc: linux-pci, linux-kernel, Dave Jiang, Austin Bolen, Myron Stowe, Jon Mason On Wed, Jul 29, 2015 at 04:18:53PM -0600, Keith Busch wrote: > A hot plugged PCI-e device max payload size (MPS) defaults to 0 for > 128bytes. The device is not usable if the upstream port is configured > to a higher setting. > > Bus configuration was previously done by arch specific and hot plug code > after the root port or bridge was scanned, and default behavior logged a > warning without changing the setting if there was a problem. This patch > unifies tuning with a default policy that affects only misconfigured > downstream devices, and preserves existing end result if a non-default > bus tuning is used (ex: pci=pcie_bus_safe). > > The new pcie tuning will check the device's MPS against the parent bridge > when it is initially added to the pci subsystem, prior to attaching > to a driver. If MPS is mismatched, the downstream port is set to the > parent bridge's if capable. This is a little confusing (at least, *I'm* confused). You're talking about setting the MPS configuration of the "downstream port," but I think you are talking about either a Switch Upstream Port or an Endpoint. Such a port is indeed *downstream* from the parent bridge, but referring to it as a "downstream port" risks confusing it with the parent bridge, which is either a Switch Downstream Port or a Root Port. > This is safe to change here since the device > being configured is not bound to a driver and does not affect previously > configured devices in the domain hierarchy. > > A device incapable of matching the upstream bridge will log a > warning message and not bind to a driver. This is the safe option since > proper MPS configuration must consider the entire hierarchy between > communicating end points, and we can't safely modify a root port's > subtree MPS settings while it is in use. > > Since the bus is configured everytime a bridge is scanned, this > potentially creates unnecessary re-walking of an already configured > sub-tree, but the code only executes during root port initialization, > hot adding a switch, or explicit request to rescan. > > Signed-off-by: Keith Busch <keith.busch@intel.com> > Cc: Dave Jiang <dave.jiang@intel.com> > Cc: Austin Bolen <Austin_Bolen@Dell.com> > Cc: Myron Stowe <mstowe@redhat.com> > Cc: Jon Mason <jdmason@kudzu.us> > Cc: Bjorn Helgaas <bhelgaas@google.com> > --- > arch/arm/kernel/bios32.c | 12 ------------ > arch/powerpc/kernel/pci-common.c | 7 ------- > arch/tile/kernel/pci_gx.c | 4 ---- > arch/x86/pci/acpi.c | 9 --------- > drivers/pci/bus.c | 4 ++++ > drivers/pci/hotplug/acpiphp_glue.c | 1 - > drivers/pci/hotplug/pciehp_pci.c | 1 - > drivers/pci/hotplug/shpchp_pci.c | 1 - > drivers/pci/probe.c | 27 ++++++++++++++++++++++----- > include/linux/pci.h | 2 -- > 10 files changed, 26 insertions(+), 42 deletions(-) > > diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c > index fcbbbb1..4fff58e 100644 > --- a/arch/arm/kernel/bios32.c > +++ b/arch/arm/kernel/bios32.c > @@ -537,18 +537,6 @@ void pci_common_init_dev(struct device *parent, struct hw_pci *hw) > */ > pci_bus_add_devices(bus); > } > - > - list_for_each_entry(sys, &head, node) { > - struct pci_bus *bus = sys->bus; > - > - /* Configure PCI Express settings */ > - if (bus && !pci_has_flag(PCI_PROBE_ONLY)) { > - struct pci_bus *child; > - > - list_for_each_entry(child, &bus->children, node) > - pcie_bus_configure_settings(child); > - } > - } I think nothing terrible happens if we call pcie_bus_configure_settings() more than once (we might get some duplicate messages, but I don't consider that terrible). If that's true, maybe we could split the removal of these calls into a separate patch. It doesn't need to be a separate patch for every arch, but I think getting the "redundant code removal" out of this patch will make the interesting changes more obvious. > } > > #ifndef CONFIG_PCI_HOST_ITE8152 > diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c > index b9de34d..7f27ffe 100644 > --- a/arch/powerpc/kernel/pci-common.c > +++ b/arch/powerpc/kernel/pci-common.c > @@ -1682,13 +1682,6 @@ void pcibios_scan_phb(struct pci_controller *hose) > */ > if (ppc_md.pcibios_fixup_phb) > ppc_md.pcibios_fixup_phb(hose); > - > - /* Configure PCI Express settings */ > - if (bus && !pci_has_flag(PCI_PROBE_ONLY)) { > - struct pci_bus *child; > - list_for_each_entry(child, &bus->children, node) > - pcie_bus_configure_settings(child); > - } > } > EXPORT_SYMBOL_GPL(pcibios_scan_phb); > > diff --git a/arch/tile/kernel/pci_gx.c b/arch/tile/kernel/pci_gx.c > index b1df847..67492fb 100644 > --- a/arch/tile/kernel/pci_gx.c > +++ b/arch/tile/kernel/pci_gx.c > @@ -599,10 +599,6 @@ static void fixup_read_and_payload_sizes(struct pci_controller *controller) > __gxio_mmio_write32(trio_context->mmio_base_mac + reg_offset, > rc_dev_cap.word); > > - /* Configure PCI Express MPS setting. */ > - list_for_each_entry(child, &root_bus->children, node) > - pcie_bus_configure_settings(child); > - > /* > * Set the mac_config register in trio based on the MPS/MRS of the link. > */ > diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c > index ff99117..ab5977a 100644 > --- a/arch/x86/pci/acpi.c > +++ b/arch/x86/pci/acpi.c > @@ -478,15 +478,6 @@ struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root) > } > } > > - /* After the PCI-E bus has been walked and all devices discovered, > - * configure any settings of the fabric that might be necessary. > - */ > - if (bus) { > - struct pci_bus *child; > - list_for_each_entry(child, &bus->children, node) > - pcie_bus_configure_settings(child); > - } > - > if (bus && node != NUMA_NO_NODE) > dev_printk(KERN_DEBUG, &bus->dev, "on NUMA node %d\n", node); > > diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c > index 6fbd3f2..8f8428a 100644 > --- a/drivers/pci/bus.c > +++ b/drivers/pci/bus.c > @@ -277,6 +277,10 @@ void pci_bus_add_device(struct pci_dev *dev) > { > int retval; > > + if (dev->bus->self && > + pcie_get_mps(dev) != pcie_get_mps(dev->bus->self)) > + return; This part looks like it could be in its own separate patch that basically enforces the "if we can't safely configure this device's MPS setting, prevent drivers from using the device" idea. What happens in this case? Does the device still appear in sysfs and lspci, even if we can't configure its MPS safely? This seems like good place for a dev_warn(). > /* > * Can not put in pci_device_add yet because resources > * are not assigned yet for some devices. > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c > index ff53856..cd98649 100644 > --- a/drivers/pci/hotplug/acpiphp_glue.c > +++ b/drivers/pci/hotplug/acpiphp_glue.c > @@ -509,7 +509,6 @@ static void enable_slot(struct acpiphp_slot *slot) > __pci_bus_assign_resources(bus, &add_list, NULL); > > acpiphp_sanitize_bus(bus); > - pcie_bus_configure_settings(bus); > acpiphp_set_acpi_region(slot); > > list_for_each_entry(dev, &bus->devices, bus_list) { > diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c > index 9e69403..93bc7f6 100644 > --- a/drivers/pci/hotplug/pciehp_pci.c > +++ b/drivers/pci/hotplug/pciehp_pci.c > @@ -65,7 +65,6 @@ int pciehp_configure_device(struct slot *p_slot) > pci_hp_add_bridge(dev); > > pci_assign_unassigned_bridge_resources(bridge); > - pcie_bus_configure_settings(parent); > pci_bus_add_devices(parent); > > out: > diff --git a/drivers/pci/hotplug/shpchp_pci.c b/drivers/pci/hotplug/shpchp_pci.c > index f8cd3a2..ca3dc3c 100644 > --- a/drivers/pci/hotplug/shpchp_pci.c > +++ b/drivers/pci/hotplug/shpchp_pci.c > @@ -69,7 +69,6 @@ int shpchp_configure_device(struct slot *p_slot) > } > > pci_assign_unassigned_bridge_resources(bridge); > - pcie_bus_configure_settings(parent); > pci_bus_add_devices(parent); > > out: > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > index cefd636..b469298 100644 > --- a/drivers/pci/probe.c > +++ b/drivers/pci/probe.c > @@ -37,6 +37,9 @@ struct pci_domain_busn_res { > int domain_nr; > }; > > +static void pcie_bus_detect_mps(struct pci_dev *dev); > +static void pcie_bus_configure_settings(struct pci_bus *bus); > + > static struct resource *get_pci_domain_busn_res(int domain_nr) > { > struct pci_domain_busn_res *r; > @@ -929,6 +932,8 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass) > (is_cardbus ? "PCI CardBus %04x:%02x" : "PCI Bus %04x:%02x"), > pci_domain_nr(bus), child->number); > > + pcie_bus_configure_settings(bus); > + > /* Has only triggered on CardBus, fixup is in yenta_socket */ > while (bus->parent) { > if ((child->busn_res.end > bus->busn_res.end) || > @@ -1589,6 +1594,9 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus) > dev->match_driver = false; > ret = device_add(&dev->dev); > WARN_ON(ret < 0); > + > + if (pci_is_pcie(dev)) > + pcie_bus_detect_mps(dev); > } > > struct pci_dev *pci_scan_single_device(struct pci_bus *bus, int devfn) > @@ -1802,9 +1810,17 @@ static void pcie_bus_detect_mps(struct pci_dev *dev) > mps = pcie_get_mps(dev); > p_mps = pcie_get_mps(bridge); > > - if (mps != p_mps) > - dev_warn(&dev->dev, "Max Payload Size %d, but upstream %s set to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n", > - mps, pci_name(bridge), p_mps); > + if (mps != p_mps) { > + if (128 << dev->pcie_mpss < p_mps) { > + dev_warn(&dev->dev, > + "Max Payload Size Supported %d, but upstream %s set to %d. If problems are experienced, try running with pci=pcie_bus_safe\n", > + 128 << dev->pcie_mpss, pci_name(bridge), p_mps); > + return; Isn't this the same case where the above change to pci_bus_add_device() will now decline to add the device? So I think there are really two policy changes: 1) If an device can be configured to match the upstream bridge's MPS setting, do it, and 2) Don't add a device when its MPS setting doesn't match the upstream bridge I'd like those to be separate patches. > + } > + pcie_write_mps(dev, p_mps); > + dev_info(&dev->dev, "Max Payload Size set to %4d/%4d (was %4d)\n", > + pcie_get_mps(dev), 128 << dev->pcie_mpss, mps); > + } I assume this hunk is related to the policy change (from "do nothing" to "update the downstream port"). Can you split that policy change into its own separate minimal patch? Yes, I'm looking for minimal and specific bisect targets :) I think this will be clearer if you write it as: if (mps == p_mps) return; if (128 << dev->pcie_mpss < p_mps) { dev_warn(...); return; } pcie_write_mps(...); dev_info(...); Now that this actively updates the MPS setting, it's starting to grow beyond the original "pcie_bus_detect_mps()" function name. > } > > static int pcie_bus_configure_set(struct pci_dev *dev, void *data) > @@ -1836,7 +1852,7 @@ static int pcie_bus_configure_set(struct pci_dev *dev, void *data) > * parents then children fashion. If this changes, then this code will not > * work as designed. > */ > -void pcie_bus_configure_settings(struct pci_bus *bus) > +static void pcie_bus_configure_settings(struct pci_bus *bus) > { > u8 smpss = 0; > > @@ -1863,7 +1879,6 @@ void pcie_bus_configure_settings(struct pci_bus *bus) > pcie_bus_configure_set(bus->self, &smpss); > pci_walk_bus(bus, pcie_bus_configure_set, &smpss); > } > -EXPORT_SYMBOL_GPL(pcie_bus_configure_settings); > > unsigned int pci_scan_child_bus(struct pci_bus *bus) > { > @@ -1895,6 +1910,8 @@ unsigned int pci_scan_child_bus(struct pci_bus *bus) > max = pci_scan_bridge(bus, dev, max, pass); > } > > + pcie_bus_configure_settings(bus); > + > /* > * We've scanned the bus and so we know all about what's on > * the other side of any bridges that may be on this bus plus > diff --git a/include/linux/pci.h b/include/linux/pci.h > index 8a0321a..e1df5f9 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -735,8 +735,6 @@ struct pci_driver { > /* these external functions are only available when PCI support is enabled */ > #ifdef CONFIG_PCI > > -void pcie_bus_configure_settings(struct pci_bus *bus); > - > enum pcie_bus_config_types { > PCIE_BUS_TUNE_OFF, > PCIE_BUS_SAFE, > -- > 1.7.10.4 > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] pci: Default MPS tuning, match upstream port 2015-08-17 22:28 ` Bjorn Helgaas @ 2015-08-17 23:03 ` Keith Busch 0 siblings, 0 replies; 12+ messages in thread From: Keith Busch @ 2015-08-17 23:03 UTC (permalink / raw) To: Bjorn Helgaas Cc: Keith Busch, linux-pci, linux-kernel, Dave Jiang, Austin Bolen, Myron Stowe, Jon Mason On Mon, 17 Aug 2015, Bjorn Helgaas wrote: > On Wed, Jul 29, 2015 at 04:18:53PM -0600, Keith Busch wrote: >> The new pcie tuning will check the device's MPS against the parent bridge >> when it is initially added to the pci subsystem, prior to attaching >> to a driver. If MPS is mismatched, the downstream port is set to the >> parent bridge's if capable. > > This is a little confusing (at least, *I'm* confused). You're talking > about setting the MPS configuration of the "downstream port," but I > think you are talking about either a Switch Upstream Port or an > Endpoint. Such a port is indeed *downstream* from the parent bridge, > but referring to it as a "downstream port" risks confusing it with the > parent bridge, which is either a Switch Downstream Port or a Root > Port. Yes, the wording is confusing, yet you've managed to describe my intention, though. :) >> { >> int retval; >> >> + if (dev->bus->self && >> + pcie_get_mps(dev) != pcie_get_mps(dev->bus->self)) >> + return; > > This part looks like it could be in its own separate patch that > basically enforces the "if we can't safely configure this device's MPS > setting, prevent drivers from using the device" idea. What happens in > this case? Does the device still appear in sysfs and lspci, even if > we can't configure its MPS safely? This seems like good place for a > dev_warn(). It will remain in the pci topology and all the associated sysfs artifacts and lspic capabilities will be there. You could retune the whole topology from user space with "setpci" and do a rescan if you were so inclined, but that could cause problems if transactions are in-flight while re-tuning. A dev_warn will be produced when the device is initially scanned as you noted below, but another at this point might be useful to make it clear a driver will not be bound. The above is broken, though, for PCI device's that are not PCI-e, so need to fix that at the very least if we will enforce this. >> - if (mps != p_mps) >> - dev_warn(&dev->dev, "Max Payload Size %d, but upstream %s set to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n", >> - mps, pci_name(bridge), p_mps); >> + if (mps != p_mps) { >> + if (128 << dev->pcie_mpss < p_mps) { >> + dev_warn(&dev->dev, >> + "Max Payload Size Supported %d, but upstream %s set to %d. If problems are experienced, try running with pci=pcie_bus_safe\n", >> + 128 << dev->pcie_mpss, pci_name(bridge), p_mps); >> + return; > > Isn't this the same case where the above change to > pci_bus_add_device() will now decline to add the device? So I think > there are really two policy changes: > > 1) If an device can be configured to match the upstream bridge's MPS > setting, do it, and > > 2) Don't add a device when its MPS setting doesn't match the > upstream bridge > > I'd like those to be separate patches. Ok. >> + } >> + pcie_write_mps(dev, p_mps); >> + dev_info(&dev->dev, "Max Payload Size set to %4d/%4d (was %4d)\n", >> + pcie_get_mps(dev), 128 << dev->pcie_mpss, mps); >> + } > > I assume this hunk is related to the policy change (from "do nothing" > to "update the downstream port"). Can you split that policy change > into its own separate minimal patch? Yes, I'm looking for minimal and > specific bisect targets :) > > I think this will be clearer if you write it as: > > if (mps == p_mps) > return; > > if (128 << dev->pcie_mpss < p_mps) { > dev_warn(...); > return; > } > > pcie_write_mps(...); > dev_info(...); > > Now that this actively updates the MPS setting, it's starting to grow > beyond the original "pcie_bus_detect_mps()" function name. Agreed that's a cleaner flow. Thanks for all the feedback. Will work on a revision this week. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 2/3] QIB: Removing usage of pcie_set_mps() 2015-07-29 22:18 [PATCH 0/3] PCI-e Max Payload Size configuration Keith Busch 2015-07-29 22:18 ` [PATCH] pci: Default MPS tuning, match upstream port Keith Busch @ 2015-07-29 22:18 ` Keith Busch 2015-08-17 22:30 ` Bjorn Helgaas 2015-07-29 22:18 ` [PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps() Keith Busch 2 siblings, 1 reply; 12+ messages in thread From: Keith Busch @ 2015-07-29 22:18 UTC (permalink / raw) To: linux-pci, linux-kernel; +Cc: Bjorn Helgass, Dave Jiang From: Dave Jiang <dave.jiang@intel.com> This is in perperation of un-exporting the pcie_set_mps() function symbol. A driver should not be changing the MPS as that is the responsibility of the PCI subsystem. Signed-off-by: Dave Jiang <dave.jiang@intel.com> --- drivers/infiniband/hw/qib/qib_pcie.c | 27 +-------------------------- 1 file changed, 1 insertion(+), 26 deletions(-) diff --git a/drivers/infiniband/hw/qib/qib_pcie.c b/drivers/infiniband/hw/qib/qib_pcie.c index 4758a38..b8a2dcd 100644 --- a/drivers/infiniband/hw/qib/qib_pcie.c +++ b/drivers/infiniband/hw/qib/qib_pcie.c @@ -557,12 +557,11 @@ static void qib_tune_pcie_coalesce(struct qib_devdata *dd) */ static int qib_pcie_caps; module_param_named(pcie_caps, qib_pcie_caps, int, S_IRUGO); -MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: Payload (0..3), ReadReq (4..7)"); +MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: ReadReq (4..7)"); static void qib_tune_pcie_caps(struct qib_devdata *dd) { struct pci_dev *parent; - u16 rc_mpss, rc_mps, ep_mpss, ep_mps; u16 rc_mrrs, ep_mrrs, max_mrrs; /* Find out supported and configured values for parent (root) */ @@ -575,30 +574,6 @@ static void qib_tune_pcie_caps(struct qib_devdata *dd) if (!pci_is_pcie(parent) || !pci_is_pcie(dd->pcidev)) return; - rc_mpss = parent->pcie_mpss; - rc_mps = ffs(pcie_get_mps(parent)) - 8; - /* Find out supported and configured values for endpoint (us) */ - ep_mpss = dd->pcidev->pcie_mpss; - ep_mps = ffs(pcie_get_mps(dd->pcidev)) - 8; - - /* Find max payload supported by root, endpoint */ - if (rc_mpss > ep_mpss) - rc_mpss = ep_mpss; - - /* If Supported greater than limit in module param, limit it */ - if (rc_mpss > (qib_pcie_caps & 7)) - rc_mpss = qib_pcie_caps & 7; - /* If less than (allowed, supported), bump root payload */ - if (rc_mpss > rc_mps) { - rc_mps = rc_mpss; - pcie_set_mps(parent, 128 << rc_mps); - } - /* If less than (allowed, supported), bump endpoint payload */ - if (rc_mpss > ep_mps) { - ep_mps = rc_mpss; - pcie_set_mps(dd->pcidev, 128 << ep_mps); - } - /* * Now the Read Request size. * No field for max supported, but PCIe spec limits it to 4096, -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] QIB: Removing usage of pcie_set_mps() 2015-07-29 22:18 ` [PATCH 2/3] QIB: Removing usage of pcie_set_mps() Keith Busch @ 2015-08-17 22:30 ` Bjorn Helgaas 2015-08-17 22:50 ` Jiang, Dave 0 siblings, 1 reply; 12+ messages in thread From: Bjorn Helgaas @ 2015-08-17 22:30 UTC (permalink / raw) To: Keith Busch; +Cc: linux-pci, linux-kernel, Dave Jiang [+cc Mike, linux-rdma] On Wed, Jul 29, 2015 at 04:18:54PM -0600, Keith Busch wrote: > From: Dave Jiang <dave.jiang@intel.com> > > This is in perperation of un-exporting the pcie_set_mps() function > symbol. A driver should not be changing the MPS as that is the > responsibility of the PCI subsystem. Please explain the implications of removing this code. Does this affect performance of the device? If so, how do we get that performance back? I also cc'd the QIB maintainers for you: QIB DRIVER M: Mike Marciniszyn <infinipath@intel.com> L: linux-rdma@vger.kernel.org F: drivers/infiniband/hw/qib/ > Signed-off-by: Dave Jiang <dave.jiang@intel.com> > --- > drivers/infiniband/hw/qib/qib_pcie.c | 27 +-------------------------- > 1 file changed, 1 insertion(+), 26 deletions(-) > > diff --git a/drivers/infiniband/hw/qib/qib_pcie.c b/drivers/infiniband/hw/qib/qib_pcie.c > index 4758a38..b8a2dcd 100644 > --- a/drivers/infiniband/hw/qib/qib_pcie.c > +++ b/drivers/infiniband/hw/qib/qib_pcie.c > @@ -557,12 +557,11 @@ static void qib_tune_pcie_coalesce(struct qib_devdata *dd) > */ > static int qib_pcie_caps; > module_param_named(pcie_caps, qib_pcie_caps, int, S_IRUGO); > -MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: Payload (0..3), ReadReq (4..7)"); > +MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: ReadReq (4..7)"); > > static void qib_tune_pcie_caps(struct qib_devdata *dd) > { > struct pci_dev *parent; > - u16 rc_mpss, rc_mps, ep_mpss, ep_mps; > u16 rc_mrrs, ep_mrrs, max_mrrs; > > /* Find out supported and configured values for parent (root) */ > @@ -575,30 +574,6 @@ static void qib_tune_pcie_caps(struct qib_devdata *dd) > if (!pci_is_pcie(parent) || !pci_is_pcie(dd->pcidev)) > return; > > - rc_mpss = parent->pcie_mpss; > - rc_mps = ffs(pcie_get_mps(parent)) - 8; > - /* Find out supported and configured values for endpoint (us) */ > - ep_mpss = dd->pcidev->pcie_mpss; > - ep_mps = ffs(pcie_get_mps(dd->pcidev)) - 8; > - > - /* Find max payload supported by root, endpoint */ > - if (rc_mpss > ep_mpss) > - rc_mpss = ep_mpss; > - > - /* If Supported greater than limit in module param, limit it */ > - if (rc_mpss > (qib_pcie_caps & 7)) > - rc_mpss = qib_pcie_caps & 7; > - /* If less than (allowed, supported), bump root payload */ > - if (rc_mpss > rc_mps) { > - rc_mps = rc_mpss; > - pcie_set_mps(parent, 128 << rc_mps); > - } > - /* If less than (allowed, supported), bump endpoint payload */ > - if (rc_mpss > ep_mps) { > - ep_mps = rc_mpss; > - pcie_set_mps(dd->pcidev, 128 << ep_mps); > - } > - > /* > * Now the Read Request size. > * No field for max supported, but PCIe spec limits it to 4096, > -- > 1.7.10.4 > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] QIB: Removing usage of pcie_set_mps() 2015-08-17 22:30 ` Bjorn Helgaas @ 2015-08-17 22:50 ` Jiang, Dave 2015-08-18 0:06 ` Bjorn Helgaas 0 siblings, 1 reply; 12+ messages in thread From: Jiang, Dave @ 2015-08-17 22:50 UTC (permalink / raw) To: Busch, Keith, bhelgaas@google.com Cc: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, linux-rdma@vger.kernel.org, infinipath T24gTW9uLCAyMDE1LTA4LTE3IGF0IDE3OjMwIC0wNTAwLCBCam9ybiBIZWxnYWFzIHdyb3RlOg0K PiBbK2NjIE1pa2UsIGxpbnV4LXJkbWFdDQo+IA0KPiBPbiBXZWQsIEp1bCAyOSwgMjAxNSBhdCAw NDoxODo1NFBNIC0wNjAwLCBLZWl0aCBCdXNjaCB3cm90ZToNCj4gPiBGcm9tOiBEYXZlIEppYW5n IDxkYXZlLmppYW5nQGludGVsLmNvbT4NCj4gPiANCj4gPiBUaGlzIGlzIGluIHBlcnBlcmF0aW9u IG9mIHVuLWV4cG9ydGluZyB0aGUgcGNpZV9zZXRfbXBzKCkgZnVuY3Rpb24NCj4gPiBzeW1ib2wu IEEgZHJpdmVyIHNob3VsZCBub3QgYmUgY2hhbmdpbmcgdGhlIE1QUyBhcyB0aGF0IGlzIHRoZQ0K PiA+IHJlc3BvbnNpYmlsaXR5IG9mIHRoZSBQQ0kgc3Vic3lzdGVtLg0KPiANCj4gUGxlYXNlIGV4 cGxhaW4gdGhlIGltcGxpY2F0aW9ucyBvZiByZW1vdmluZyB0aGlzIGNvZGUuICBEb2VzIHRoaXMg DQo+IGFmZmVjdA0KPiBwZXJmb3JtYW5jZSBvZiB0aGUgZGV2aWNlPyAgSWYgc28sIGhvdyBkbyB3 ZSBnZXQgdGhhdCBwZXJmb3JtYW5jZSANCj4gYmFjaz8NCg0KSG9uZXN0bHkgSSBkb24ndCBrbm93 LiBCdXQgYXQgdGhlIHNhbWUgdGltZSBJIHRoaW5rIHRoZSBkcml2ZXINCnNob3VsZG4ndCBiZSB0 b3VjaGluZyB0aGUgTVBTIGF0IGFsbC4gU2hvdWxkbid0IHRoYXQgYmUgbGVmdCB0byB0aGUNClBD SWUgc3Vic3lzdGVtIGFuZCByZWx5IG9uIHRoZSBQQ0llIHN1YnN5c3RlbSB0byBzZXQgdGhpcyB0 byBhIHNhbmUNCnZhbHVlPyANCg0KPiANCj4gSSBhbHNvIGNjJ2QgdGhlIFFJQiBtYWludGFpbmVy cyBmb3IgeW91Og0KPiANCj4gICBRSUIgRFJJVkVSDQo+ICAgTTogICAgICBNaWtlIE1hcmNpbmlz enluIDxpbmZpbmlwYXRoQGludGVsLmNvbT4NCj4gICBMOiAgICAgIGxpbnV4LXJkbWFAdmdlci5r ZXJuZWwub3JnDQo+ICAgRjogICAgICBkcml2ZXJzL2luZmluaWJhbmQvaHcvcWliLw0KPiANCj4g PiBTaWduZWQtb2ZmLWJ5OiBEYXZlIEppYW5nIDxkYXZlLmppYW5nQGludGVsLmNvbT4NCj4gPiAt LS0NCj4gPiAgZHJpdmVycy9pbmZpbmliYW5kL2h3L3FpYi9xaWJfcGNpZS5jIHwgICAyNyArLS0t LS0tLS0tLS0tLS0tLS0tLS0tDQo+ID4gLS0tLS0NCj4gPiAgMSBmaWxlIGNoYW5nZWQsIDEgaW5z ZXJ0aW9uKCspLCAyNiBkZWxldGlvbnMoLSkNCj4gPiANCj4gPiBkaWZmIC0tZ2l0IGEvZHJpdmVy cy9pbmZpbmliYW5kL2h3L3FpYi9xaWJfcGNpZS5jIA0KPiA+IGIvZHJpdmVycy9pbmZpbmliYW5k L2h3L3FpYi9xaWJfcGNpZS5jDQo+ID4gaW5kZXggNDc1OGEzOC4uYjhhMmRjZCAxMDA2NDQNCj4g PiAtLS0gYS9kcml2ZXJzL2luZmluaWJhbmQvaHcvcWliL3FpYl9wY2llLmMNCj4gPiArKysgYi9k cml2ZXJzL2luZmluaWJhbmQvaHcvcWliL3FpYl9wY2llLmMNCj4gPiBAQCAtNTU3LDEyICs1NTcs MTEgQEAgc3RhdGljIHZvaWQgcWliX3R1bmVfcGNpZV9jb2FsZXNjZShzdHJ1Y3QgDQo+ID4gcWli X2RldmRhdGEgKmRkKQ0KPiA+ICAgKi8NCj4gPiAgc3RhdGljIGludCBxaWJfcGNpZV9jYXBzOw0K PiA+ICBtb2R1bGVfcGFyYW1fbmFtZWQocGNpZV9jYXBzLCBxaWJfcGNpZV9jYXBzLCBpbnQsIFNf SVJVR08pOw0KPiA+IC1NT0RVTEVfUEFSTV9ERVNDKHBjaWVfY2FwcywgIk1heCBQQ0llIHR1bmlu ZzogUGF5bG9hZCAoMC4uMyksIA0KPiA+IFJlYWRSZXEgKDQuLjcpIik7DQo+ID4gK01PRFVMRV9Q QVJNX0RFU0MocGNpZV9jYXBzLCAiTWF4IFBDSWUgdHVuaW5nOiBSZWFkUmVxICg0Li43KSIpOw0K PiA+ICANCj4gPiAgc3RhdGljIHZvaWQgcWliX3R1bmVfcGNpZV9jYXBzKHN0cnVjdCBxaWJfZGV2 ZGF0YSAqZGQpDQo+ID4gIHsNCj4gPiAgCXN0cnVjdCBwY2lfZGV2ICpwYXJlbnQ7DQo+ID4gLQl1 MTYgcmNfbXBzcywgcmNfbXBzLCBlcF9tcHNzLCBlcF9tcHM7DQo+ID4gIAl1MTYgcmNfbXJycywg ZXBfbXJycywgbWF4X21ycnM7DQo+ID4gIA0KPiA+ICAJLyogRmluZCBvdXQgc3VwcG9ydGVkIGFu ZCBjb25maWd1cmVkIHZhbHVlcyBmb3IgcGFyZW50IA0KPiA+IChyb290KSAqLw0KPiA+IEBAIC01 NzUsMzAgKzU3NCw2IEBAIHN0YXRpYyB2b2lkIHFpYl90dW5lX3BjaWVfY2FwcyhzdHJ1Y3QgDQo+ ID4gcWliX2RldmRhdGEgKmRkKQ0KPiA+ICAJaWYgKCFwY2lfaXNfcGNpZShwYXJlbnQpIHx8ICFw Y2lfaXNfcGNpZShkZC0+cGNpZGV2KSkNCj4gPiAgCQlyZXR1cm47DQo+ID4gIA0KPiA+IC0JcmNf bXBzcyA9IHBhcmVudC0+cGNpZV9tcHNzOw0KPiA+IC0JcmNfbXBzID0gZmZzKHBjaWVfZ2V0X21w cyhwYXJlbnQpKSAtIDg7DQo+ID4gLQkvKiBGaW5kIG91dCBzdXBwb3J0ZWQgYW5kIGNvbmZpZ3Vy ZWQgdmFsdWVzIGZvciBlbmRwb2ludCANCj4gPiAodXMpICovDQo+ID4gLQllcF9tcHNzID0gZGQt PnBjaWRldi0+cGNpZV9tcHNzOw0KPiA+IC0JZXBfbXBzID0gZmZzKHBjaWVfZ2V0X21wcyhkZC0+ cGNpZGV2KSkgLSA4Ow0KPiA+IC0NCj4gPiAtCS8qIEZpbmQgbWF4IHBheWxvYWQgc3VwcG9ydGVk IGJ5IHJvb3QsIGVuZHBvaW50ICovDQo+ID4gLQlpZiAocmNfbXBzcyA+IGVwX21wc3MpDQo+ID4g LQkJcmNfbXBzcyA9IGVwX21wc3M7DQo+ID4gLQ0KPiA+IC0JLyogSWYgU3VwcG9ydGVkIGdyZWF0 ZXIgdGhhbiBsaW1pdCBpbiBtb2R1bGUgcGFyYW0sIGxpbWl0IA0KPiA+IGl0ICovDQo+ID4gLQlp ZiAocmNfbXBzcyA+IChxaWJfcGNpZV9jYXBzICYgNykpDQo+ID4gLQkJcmNfbXBzcyA9IHFpYl9w Y2llX2NhcHMgJiA3Ow0KPiA+IC0JLyogSWYgbGVzcyB0aGFuIChhbGxvd2VkLCBzdXBwb3J0ZWQp LCBidW1wIHJvb3QgcGF5bG9hZCAqLw0KPiA+IC0JaWYgKHJjX21wc3MgPiByY19tcHMpIHsNCj4g PiAtCQlyY19tcHMgPSByY19tcHNzOw0KPiA+IC0JCXBjaWVfc2V0X21wcyhwYXJlbnQsIDEyOCA8 PCByY19tcHMpOw0KPiA+IC0JfQ0KPiA+IC0JLyogSWYgbGVzcyB0aGFuIChhbGxvd2VkLCBzdXBw b3J0ZWQpLCBidW1wIGVuZHBvaW50IA0KPiA+IHBheWxvYWQgKi8NCj4gPiAtCWlmIChyY19tcHNz ID4gZXBfbXBzKSB7DQo+ID4gLQkJZXBfbXBzID0gcmNfbXBzczsNCj4gPiAtCQlwY2llX3NldF9t cHMoZGQtPnBjaWRldiwgMTI4IDw8IGVwX21wcyk7DQo+ID4gLQl9DQo+ID4gLQ0KPiA+ICAJLyoN Cj4gPiAgCSAqIE5vdyB0aGUgUmVhZCBSZXF1ZXN0IHNpemUuDQo+ID4gIAkgKiBObyBmaWVsZCBm b3IgbWF4IHN1cHBvcnRlZCwgYnV0IFBDSWUgc3BlYyBsaW1pdHMgaXQgdG8gDQo+ID4gNDA5Niw= ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] QIB: Removing usage of pcie_set_mps() 2015-08-17 22:50 ` Jiang, Dave @ 2015-08-18 0:06 ` Bjorn Helgaas 2015-08-18 1:49 ` Jason Gunthorpe 0 siblings, 1 reply; 12+ messages in thread From: Bjorn Helgaas @ 2015-08-18 0:06 UTC (permalink / raw) To: Jiang, Dave Cc: Busch, Keith, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, linux-rdma@vger.kernel.org, infinipath On Mon, Aug 17, 2015 at 5:50 PM, Jiang, Dave <dave.jiang@intel.com> wrote: > On Mon, 2015-08-17 at 17:30 -0500, Bjorn Helgaas wrote: >> [+cc Mike, linux-rdma] >> >> On Wed, Jul 29, 2015 at 04:18:54PM -0600, Keith Busch wrote: >> > From: Dave Jiang <dave.jiang@intel.com> >> > >> > This is in perperation of un-exporting the pcie_set_mps() function >> > symbol. A driver should not be changing the MPS as that is the >> > responsibility of the PCI subsystem. >> >> Please explain the implications of removing this code. Does this >> affect >> performance of the device? If so, how do we get that performance >> back? > > Honestly I don't know. But at the same time I think the driver > shouldn't be touching the MPS at all. Shouldn't that be left to the > PCIe subsystem and rely on the PCIe subsystem to set this to a sane > value? Yes, I think in principle the PCI core should own this, but I also don't want to introduce a performance regression, so I think we need to understand whether there's a problem, and if there is, fix it. >> I also cc'd the QIB maintainers for you: >> >> QIB DRIVER >> M: Mike Marciniszyn <infinipath@intel.com> >> L: linux-rdma@vger.kernel.org >> F: drivers/infiniband/hw/qib/ >> >> > Signed-off-by: Dave Jiang <dave.jiang@intel.com> >> > --- >> > drivers/infiniband/hw/qib/qib_pcie.c | 27 +--------------------- >> > ----- >> > 1 file changed, 1 insertion(+), 26 deletions(-) >> > >> > diff --git a/drivers/infiniband/hw/qib/qib_pcie.c >> > b/drivers/infiniband/hw/qib/qib_pcie.c >> > index 4758a38..b8a2dcd 100644 >> > --- a/drivers/infiniband/hw/qib/qib_pcie.c >> > +++ b/drivers/infiniband/hw/qib/qib_pcie.c >> > @@ -557,12 +557,11 @@ static void qib_tune_pcie_coalesce(struct >> > qib_devdata *dd) >> > */ >> > static int qib_pcie_caps; >> > module_param_named(pcie_caps, qib_pcie_caps, int, S_IRUGO); >> > -MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: Payload (0..3), >> > ReadReq (4..7)"); >> > +MODULE_PARM_DESC(pcie_caps, "Max PCIe tuning: ReadReq (4..7)"); >> > >> > static void qib_tune_pcie_caps(struct qib_devdata *dd) >> > { >> > struct pci_dev *parent; >> > - u16 rc_mpss, rc_mps, ep_mpss, ep_mps; >> > u16 rc_mrrs, ep_mrrs, max_mrrs; >> > >> > /* Find out supported and configured values for parent >> > (root) */ >> > @@ -575,30 +574,6 @@ static void qib_tune_pcie_caps(struct >> > qib_devdata *dd) >> > if (!pci_is_pcie(parent) || !pci_is_pcie(dd->pcidev)) >> > return; >> > >> > - rc_mpss = parent->pcie_mpss; >> > - rc_mps = ffs(pcie_get_mps(parent)) - 8; >> > - /* Find out supported and configured values for endpoint >> > (us) */ >> > - ep_mpss = dd->pcidev->pcie_mpss; >> > - ep_mps = ffs(pcie_get_mps(dd->pcidev)) - 8; >> > - >> > - /* Find max payload supported by root, endpoint */ >> > - if (rc_mpss > ep_mpss) >> > - rc_mpss = ep_mpss; >> > - >> > - /* If Supported greater than limit in module param, limit >> > it */ >> > - if (rc_mpss > (qib_pcie_caps & 7)) >> > - rc_mpss = qib_pcie_caps & 7; >> > - /* If less than (allowed, supported), bump root payload */ >> > - if (rc_mpss > rc_mps) { >> > - rc_mps = rc_mpss; >> > - pcie_set_mps(parent, 128 << rc_mps); >> > - } >> > - /* If less than (allowed, supported), bump endpoint >> > payload */ >> > - if (rc_mpss > ep_mps) { >> > - ep_mps = rc_mpss; >> > - pcie_set_mps(dd->pcidev, 128 << ep_mps); >> > - } >> > - >> > /* >> > * Now the Read Request size. >> > * No field for max supported, but PCIe spec limits it to >> > 4096, ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] QIB: Removing usage of pcie_set_mps() 2015-08-18 0:06 ` Bjorn Helgaas @ 2015-08-18 1:49 ` Jason Gunthorpe 0 siblings, 0 replies; 12+ messages in thread From: Jason Gunthorpe @ 2015-08-18 1:49 UTC (permalink / raw) To: Mike Marciniszyn, Bjorn Helgaas Cc: Jiang, Dave, Busch, Keith, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, linux-rdma@vger.kernel.org, infinipath On Mon, Aug 17, 2015 at 07:06:10PM -0500, Bjorn Helgaas wrote: > On Mon, Aug 17, 2015 at 5:50 PM, Jiang, Dave <dave.jiang@intel.com> wrote: > > On Mon, 2015-08-17 at 17:30 -0500, Bjorn Helgaas wrote: > >> [+cc Mike, linux-rdma] > >> > >> On Wed, Jul 29, 2015 at 04:18:54PM -0600, Keith Busch wrote: > >> > From: Dave Jiang <dave.jiang@intel.com> > >> > > >> > This is in perperation of un-exporting the pcie_set_mps() function > >> > symbol. A driver should not be changing the MPS as that is the > >> > responsibility of the PCI subsystem. > >> > >> Please explain the implications of removing this code. Does this > >> affect > >> performance of the device? If so, how do we get that performance > >> back? > > > > Honestly I don't know. But at the same time I think the driver > > shouldn't be touching the MPS at all. Shouldn't that be left to the > > PCIe subsystem and rely on the PCIe subsystem to set this to a sane > > value? > > Yes, I think in principle the PCI core should own this, but I also > don't want to introduce a performance regression, so I think we need > to understand whether there's a problem, and if there is, fix it. Making sure Mike is CC'd directly.. Mike: I see this has been cut and pasted to HFI1 too, I would be disappointed if HFI needs it as well. :( FWIW, I totally agree with the above. MPS/MRS and related have to do with root port capability, switches in the path and any platform bugs.. I'm not sure why a driver would ever want to mess with this, and since this code doesn't walk the bus toward the root port, it is technically wrong, right? I also find it strange that qib_pcie_caps defaults to zero which means the 'tuning' reduces the payload size to the minimums (??) > >> > /* > >> > * Now the Read Request size. > >> > * No field for max supported, but PCIe spec limits it to > >> > 4096, .. it has been a bit since I looked at this, but IIRC, MRRS is also something that should not be touched by a driver? Jason ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps() 2015-07-29 22:18 [PATCH 0/3] PCI-e Max Payload Size configuration Keith Busch 2015-07-29 22:18 ` [PATCH] pci: Default MPS tuning, match upstream port Keith Busch 2015-07-29 22:18 ` [PATCH 2/3] QIB: Removing usage of pcie_set_mps() Keith Busch @ 2015-07-29 22:18 ` Keith Busch 2015-08-17 22:31 ` Bjorn Helgaas 2 siblings, 1 reply; 12+ messages in thread From: Keith Busch @ 2015-07-29 22:18 UTC (permalink / raw) To: linux-pci, linux-kernel; +Cc: Bjorn Helgass, Dave Jiang From: Dave Jiang <dave.jiang@intel.com> The setting of PCIe MPS should be left to the PCI subsystem and not the driver. An ill configured MPS by the driver could cause the device to not function or unstablize the entire system. Removing the exported symbol. Signed-off-by: Dave Jiang <dave.jiang@intel.com> --- drivers/pci/pci.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 0008c95..92349ee 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -4121,7 +4121,6 @@ int pcie_set_mps(struct pci_dev *dev, int mps) return pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_PAYLOAD, v); } -EXPORT_SYMBOL(pcie_set_mps); /** * pcie_get_minimum_link - determine minimum link settings of a PCI device -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps() 2015-07-29 22:18 ` [PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps() Keith Busch @ 2015-08-17 22:31 ` Bjorn Helgaas 2015-08-17 23:32 ` Jiang, Dave 0 siblings, 1 reply; 12+ messages in thread From: Bjorn Helgaas @ 2015-08-17 22:31 UTC (permalink / raw) To: Keith Busch; +Cc: linux-pci, linux-kernel, Dave Jiang On Wed, Jul 29, 2015 at 04:18:55PM -0600, Keith Busch wrote: > From: Dave Jiang <dave.jiang@intel.com> > > The setting of PCIe MPS should be left to the PCI subsystem and not > the driver. An ill configured MPS by the driver could cause the device > to not function or unstablize the entire system. Removing the exported > symbol. > > Signed-off-by: Dave Jiang <dave.jiang@intel.com> > --- > drivers/pci/pci.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 0008c95..92349ee 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -4121,7 +4121,6 @@ int pcie_set_mps(struct pci_dev *dev, int mps) > return pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL, > PCI_EXP_DEVCTL_PAYLOAD, v); > } > -EXPORT_SYMBOL(pcie_set_mps); I think the pcie_set_mps() declaration could be moved from include/linux/pci.h to drivers/pci/pci.h, couldn't it? ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps() 2015-08-17 22:31 ` Bjorn Helgaas @ 2015-08-17 23:32 ` Jiang, Dave 0 siblings, 0 replies; 12+ messages in thread From: Jiang, Dave @ 2015-08-17 23:32 UTC (permalink / raw) To: Busch, Keith, bhelgaas@google.com Cc: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org T24gTW9uLCAyMDE1LTA4LTE3IGF0IDE3OjMxIC0wNTAwLCBCam9ybiBIZWxnYWFzIHdyb3RlOg0K PiBPbiBXZWQsIEp1bCAyOSwgMjAxNSBhdCAwNDoxODo1NVBNIC0wNjAwLCBLZWl0aCBCdXNjaCB3 cm90ZToNCj4gPiBGcm9tOiBEYXZlIEppYW5nIDxkYXZlLmppYW5nQGludGVsLmNvbT4NCj4gPiAN Cj4gPiBUaGUgc2V0dGluZyBvZiBQQ0llIE1QUyBzaG91bGQgYmUgbGVmdCB0byB0aGUgUENJIHN1 YnN5c3RlbSBhbmQgbm90DQo+ID4gdGhlIGRyaXZlci4gQW4gaWxsIGNvbmZpZ3VyZWQgTVBTIGJ5 IHRoZSBkcml2ZXIgY291bGQgY2F1c2UgdGhlIA0KPiA+IGRldmljZQ0KPiA+IHRvIG5vdCBmdW5j dGlvbiBvciB1bnN0YWJsaXplIHRoZSBlbnRpcmUgc3lzdGVtLiBSZW1vdmluZyB0aGUgDQo+ID4g ZXhwb3J0ZWQNCj4gPiBzeW1ib2wuDQo+ID4gDQo+ID4gU2lnbmVkLW9mZi1ieTogRGF2ZSBKaWFu ZyA8ZGF2ZS5qaWFuZ0BpbnRlbC5jb20+DQo+ID4gLS0tDQo+ID4gIGRyaXZlcnMvcGNpL3BjaS5j IHwgICAgMSAtDQo+ID4gIDEgZmlsZSBjaGFuZ2VkLCAxIGRlbGV0aW9uKC0pDQo+ID4gDQo+ID4g ZGlmZiAtLWdpdCBhL2RyaXZlcnMvcGNpL3BjaS5jIGIvZHJpdmVycy9wY2kvcGNpLmMNCj4gPiBp bmRleCAwMDA4Yzk1Li45MjM0OWVlIDEwMDY0NA0KPiA+IC0tLSBhL2RyaXZlcnMvcGNpL3BjaS5j DQo+ID4gKysrIGIvZHJpdmVycy9wY2kvcGNpLmMNCj4gPiBAQCAtNDEyMSw3ICs0MTIxLDYgQEAg aW50IHBjaWVfc2V0X21wcyhzdHJ1Y3QgcGNpX2RldiAqZGV2LCBpbnQgDQo+ID4gbXBzKQ0KPiA+ ICAJcmV0dXJuIHBjaWVfY2FwYWJpbGl0eV9jbGVhcl9hbmRfc2V0X3dvcmQoZGV2LCANCj4gPiBQ Q0lfRVhQX0RFVkNUTCwNCj4gPiAgCQkJCQkJIA0KPiA+ICBQQ0lfRVhQX0RFVkNUTF9QQVlMT0FE LCB2KTsNCj4gPiAgfQ0KPiA+IC1FWFBPUlRfU1lNQk9MKHBjaWVfc2V0X21wcyk7DQo+IA0KPiBJ IHRoaW5rIHRoZSBwY2llX3NldF9tcHMoKSBkZWNsYXJhdGlvbiBjb3VsZCBiZSBtb3ZlZCBmcm9t DQo+IGluY2x1ZGUvbGludXgvcGNpLmggdG8gZHJpdmVycy9wY2kvcGNpLmgsIGNvdWxkbid0IGl0 Pw0KDQpPay4= ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2015-08-18 1:49 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-07-29 22:18 [PATCH 0/3] PCI-e Max Payload Size configuration Keith Busch 2015-07-29 22:18 ` [PATCH] pci: Default MPS tuning, match upstream port Keith Busch 2015-08-17 22:28 ` Bjorn Helgaas 2015-08-17 23:03 ` Keith Busch 2015-07-29 22:18 ` [PATCH 2/3] QIB: Removing usage of pcie_set_mps() Keith Busch 2015-08-17 22:30 ` Bjorn Helgaas 2015-08-17 22:50 ` Jiang, Dave 2015-08-18 0:06 ` Bjorn Helgaas 2015-08-18 1:49 ` Jason Gunthorpe 2015-07-29 22:18 ` [PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps() Keith Busch 2015-08-17 22:31 ` Bjorn Helgaas 2015-08-17 23:32 ` Jiang, Dave
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).