* [PATCH v2 0/2] PCI: Init RCB from pci_configure_device and fix program_hpx_type2 @ 2026-01-21 11:35 Håkon Bugge 2026-01-21 11:35 ` [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device Håkon Bugge 2026-01-21 11:35 ` [PATCH v2 2/2] PCI/ACPI: Confine program_hpx_type2 to the AER bits Håkon Bugge 0 siblings, 2 replies; 14+ messages in thread From: Håkon Bugge @ 2026-01-21 11:35 UTC (permalink / raw) To: Bjorn Helgaas; +Cc: Johannes Thumshirn, linux-pci, linux-kernel, linux-acpi This series add the initialization of the Link Control register's RCB to pci_configure_rcb() called from pci_configure_device() and also cleans up the incorrect program_hpx_type2(): 1. It should only be called when we own the PCIe native hotplug and not the AER ownership 2. It should only manipulate the AER-bits In addition, the second commit adds a warning if the _HPX type2 record attempts to modify the Link Control register. Håkon Bugge (2): PCI: Initialize RCB from pci_configure_device PCI/ACPI: Confine program_hpx_type2 to the AER bits drivers/pci/pci-acpi.c | 61 +++++++++++++++++++----------------------- drivers/pci/pci.h | 3 +++ drivers/pci/pcie/aer.c | 3 --- drivers/pci/probe.c | 27 +++++++++++++++++++ 4 files changed, 57 insertions(+), 37 deletions(-) -- 2.43.5 ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device 2026-01-21 11:35 [PATCH v2 0/2] PCI: Init RCB from pci_configure_device and fix program_hpx_type2 Håkon Bugge @ 2026-01-21 11:35 ` Håkon Bugge 2026-01-21 17:41 ` Lukas Wunner 2026-01-21 22:40 ` Bjorn Helgaas 2026-01-21 11:35 ` [PATCH v2 2/2] PCI/ACPI: Confine program_hpx_type2 to the AER bits Håkon Bugge 1 sibling, 2 replies; 14+ messages in thread From: Håkon Bugge @ 2026-01-21 11:35 UTC (permalink / raw) To: Bjorn Helgaas Cc: Johannes Thumshirn, linux-pci, linux-kernel, linux-acpi, Håkon Bugge Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)") fixed a bogus _HPX type 2 record, which instructed program_hpx_type2() to set the RCB in an endpoint, although it's RC did not have the RCB bit set. e42010d8207f fixed that by qualifying the setting of the RCB in the endpoint with the RC supporting an 128 byte RCB. In retrospect, the program_hpx_type2() should only modify the AER bits, and stay away from fiddling with the Link Control Register. Hence, we explicitly program the RCB from pci_configure_device(). RCB is RO in Root Ports, and in VFs, the bit is RsvdP, so for these two cases we skip programming it. Then, if the Root Port has RCB set and it is not set in the EP, we set it. Fixes: Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)") Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> --- Note, that the current duplication of pcie_root_rcb_set() will be removed in the next commit. --- drivers/pci/probe.c | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 41183aed8f5d9..347af29868124 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2410,6 +2410,41 @@ static void pci_configure_serr(struct pci_dev *dev) } } +static bool pcie_root_rcb_set(struct pci_dev *dev) +{ + struct pci_dev *rp = pcie_find_root_port(dev); + u16 lnkctl; + + if (!rp) + return false; + + pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl); + + return !!(lnkctl & PCI_EXP_LNKCTL_RCB); +} + +static void pci_configure_rcb(struct pci_dev *dev) +{ + /* + * Obviously, we need a Link Control register. The RCB is RO + * in Root Ports, so no need to attempt to set it for + * them. For VFs, the RCB is RsvdP, so, no need to set it. + * Then, if the Root Port has RCB set, then we set for the EP + * unless already set. + */ + if (pcie_cap_has_lnkctl(dev) && + (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) && + !dev->is_virtfn && pcie_root_rcb_set(dev)) { + u16 lnkctl; + + pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); + if (lnkctl & PCI_EXP_LNKCTL_RCB) + return; + + pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl | PCI_EXP_LNKCTL_RCB); + } +} + static void pci_configure_device(struct pci_dev *dev) { pci_configure_mps(dev); @@ -2419,6 +2454,7 @@ static void pci_configure_device(struct pci_dev *dev) pci_configure_aspm_l1ss(dev); pci_configure_eetlp_prefix(dev); pci_configure_serr(dev); + pci_configure_rcb(dev); pci_acpi_program_hp_params(dev); } -- 2.43.5 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device 2026-01-21 11:35 ` [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device Håkon Bugge @ 2026-01-21 17:41 ` Lukas Wunner 2026-01-21 19:13 ` Haakon Bugge 2026-01-21 22:40 ` Bjorn Helgaas 1 sibling, 1 reply; 14+ messages in thread From: Lukas Wunner @ 2026-01-21 17:41 UTC (permalink / raw) To: Håkon Bugge Cc: Bjorn Helgaas, Johannes Thumshirn, linux-pci, linux-kernel, linux-acpi On Wed, Jan 21, 2026 at 12:35:40PM +0100, Håkon Bugge wrote: > + if (pcie_cap_has_lnkctl(dev) && > + (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) && > + !dev->is_virtfn && pcie_root_rcb_set(dev)) { > + u16 lnkctl; > + > + pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); > + if (lnkctl & PCI_EXP_LNKCTL_RCB) > + return; > + > + pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl | PCI_EXP_LNKCTL_RCB); You may want to use pcie_capability_set_word() for brevity. Thanks, Lukas ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device 2026-01-21 17:41 ` Lukas Wunner @ 2026-01-21 19:13 ` Haakon Bugge 0 siblings, 0 replies; 14+ messages in thread From: Haakon Bugge @ 2026-01-21 19:13 UTC (permalink / raw) To: Lukas Wunner Cc: Bjorn Helgaas, Johannes Thumshirn, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org > On Wed, Jan 21, 2026 at 12:35:40PM +0100, Håkon Bugge wrote: > + if (pcie_cap_has_lnkctl(dev) && > + (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) && > + !dev->is_virtfn && pcie_root_rcb_set(dev)) { > + u16 lnkctl; > + > + pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); > + if (lnkctl & PCI_EXP_LNKCTL_RCB) > + return; > + > + pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl | PCI_EXP_LNKCTL_RCB); > > You may want to use pcie_capability_set_word() for brevity. Well, that would incur an additional pcie_capability_read_word(), so between brevity and performance, I chose performance. Another, probably better reason to use pcie_capability_set_word() is that it calls (for PCI_EXP_LNKCTL) pcie_capability_clear_and_set_word_locked(). However, I assume that during device probing, the locking is not needed. Thxs, Håkon ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device 2026-01-21 11:35 ` [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device Håkon Bugge 2026-01-21 17:41 ` Lukas Wunner @ 2026-01-21 22:40 ` Bjorn Helgaas 2026-01-22 9:42 ` Haakon Bugge 2026-01-22 10:36 ` Bjorn Helgaas 1 sibling, 2 replies; 14+ messages in thread From: Bjorn Helgaas @ 2026-01-21 22:40 UTC (permalink / raw) To: Håkon Bugge Cc: Bjorn Helgaas, Johannes Thumshirn, linux-pci, linux-kernel, linux-acpi On Wed, Jan 21, 2026 at 12:35:40PM +0100, Håkon Bugge wrote: > Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff > Root Port supports it (_HPX)") fixed a bogus _HPX type 2 record, which > instructed program_hpx_type2() to set the RCB in an endpoint, > although it's RC did not have the RCB bit set. > > e42010d8207f fixed that by qualifying the setting of the RCB in the > endpoint with the RC supporting an 128 byte RCB. > > In retrospect, the program_hpx_type2() should only modify the AER > bits, and stay away from fiddling with the Link Control Register. > > Hence, we explicitly program the RCB from pci_configure_device(). RCB > is RO in Root Ports, and in VFs, the bit is RsvdP, so for these two > cases we skip programming it. Then, if the Root Port has RCB set and > it is not set in the EP, we set it. > > Fixes: Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)") > Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> > > --- > > Note, that the current duplication of pcie_root_rcb_set() will be > removed in the next commit. > --- > drivers/pci/probe.c | 36 ++++++++++++++++++++++++++++++++++++ > 1 file changed, 36 insertions(+) > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > index 41183aed8f5d9..347af29868124 100644 > --- a/drivers/pci/probe.c > +++ b/drivers/pci/probe.c > @@ -2410,6 +2410,41 @@ static void pci_configure_serr(struct pci_dev *dev) > } > } > > +static bool pcie_root_rcb_set(struct pci_dev *dev) > +{ > + struct pci_dev *rp = pcie_find_root_port(dev); > + u16 lnkctl; > + > + if (!rp) > + return false; > + > + pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl); > + > + return !!(lnkctl & PCI_EXP_LNKCTL_RCB); > +} > + > +static void pci_configure_rcb(struct pci_dev *dev) > +{ > + /* > + * Obviously, we need a Link Control register. The RCB is RO > + * in Root Ports, so no need to attempt to set it for > + * them. For VFs, the RCB is RsvdP, so, no need to set it. > + * Then, if the Root Port has RCB set, then we set for the EP > + * unless already set. > + */ > + if (pcie_cap_has_lnkctl(dev) && > + (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) && > + !dev->is_virtfn && pcie_root_rcb_set(dev)) { > + u16 lnkctl; > + > + pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); > + if (lnkctl & PCI_EXP_LNKCTL_RCB) > + return; > + > + pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl | PCI_EXP_LNKCTL_RCB); > + } RCB isn't meaningful for switches, so we'll read their LNKCTL unnecessarily. I propose something like this, which also clears RCB if it's set when it shouldn't be (I think this would indicate a firmware defect): /* * Per PCIe r7.0, sec 7.5.3.7, RCB is only meaningful in Root Ports * (where it is read-only), Endpoints, and Bridges. It may only be * set for Endpoints and Bridges if it is set in the Root Port. */ if (!pci_is_pcie(dev) || pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT || pci_pcie_type(dev) == PCI_EXP_TYPE_UPSTREAM || pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM || dev->is_virtfn) return; rcb = pcie_root_rcb_set(dev); pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); if (rcb) { if (lnkctl & PCI_EXP_LNKCTL_RCB) return; lnkctl |= PCI_EXP_LNKCTL_RCB; } else { if (!(lnkctl & PCI_EXP_LNKCTL_RCB)) return; pci_info(FW_INFO "clearing RCB (RCB not set in Root Port)\n"); lnkctl &= ~PCI_EXP_LNKCTL_RCB; } pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl); > +} > + > static void pci_configure_device(struct pci_dev *dev) > { > pci_configure_mps(dev); > @@ -2419,6 +2454,7 @@ static void pci_configure_device(struct pci_dev *dev) > pci_configure_aspm_l1ss(dev); > pci_configure_eetlp_prefix(dev); > pci_configure_serr(dev); > + pci_configure_rcb(dev); > > pci_acpi_program_hp_params(dev); > } > -- > 2.43.5 > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device 2026-01-21 22:40 ` Bjorn Helgaas @ 2026-01-22 9:42 ` Haakon Bugge 2026-01-22 10:22 ` Bjorn Helgaas 2026-01-22 10:36 ` Bjorn Helgaas 1 sibling, 1 reply; 14+ messages in thread From: Haakon Bugge @ 2026-01-22 9:42 UTC (permalink / raw) To: Bjorn Helgaas Cc: Bjorn Helgaas, Johannes Thumshirn, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org > On Wed, Jan 21, 2026 at 12:35:40PM +0100, Håkon Bugge wrote: > [snip] > RCB isn't meaningful for switches, so we'll read their LNKCTL > unnecessarily. I propose something like this, which also clears RCB > if it's set when it shouldn't be (I think this would indicate a > firmware defect): > > /* > * Per PCIe r7.0, sec 7.5.3.7, RCB is only meaningful in Root Ports > * (where it is read-only), Endpoints, and Bridges. It may only be > * set for Endpoints and Bridges if it is set in the Root Port. > */ > if (!pci_is_pcie(dev) || > pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT || > pci_pcie_type(dev) == PCI_EXP_TYPE_UPSTREAM || > pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM || > dev->is_virtfn) I see that sec 1.3.2 defines "Endpoints are classified as either legacy, PCI Express, or Root Complex Integrated Endpoints (RCiEPs)." But, shouldn't we also exclude Root Complex Event Collectors (PCI_EXP_TYPE_RC_EC)? > return; > > rcb = pcie_root_rcb_set(dev); > > pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); > if (rcb) { > if (lnkctl & PCI_EXP_LNKCTL_RCB) > return; > > lnkctl |= PCI_EXP_LNKCTL_RCB; > } else { I was thinking that since sec 7.5.3.7 states (for Endpoints and Bridges): "Default value of this bit is 0b.", that this case didn't happen. But of course, a defect BIOS should be considered. A v3 is on its way. I really appreciate your diligent review, Bjorn! Thxs, Håkon > if (!(lnkctl & PCI_EXP_LNKCTL_RCB)) > return; > > pci_info(FW_INFO "clearing RCB (RCB not set in Root Port)\n"); > lnkctl &= ~PCI_EXP_LNKCTL_RCB; > } > > pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl); > >> +} >> + >> static void pci_configure_device(struct pci_dev *dev) >> { >> pci_configure_mps(dev); >> @@ -2419,6 +2454,7 @@ static void pci_configure_device(struct pci_dev *dev) >> pci_configure_aspm_l1ss(dev); >> pci_configure_eetlp_prefix(dev); >> pci_configure_serr(dev); >> + pci_configure_rcb(dev); >> >> pci_acpi_program_hp_params(dev); >> } >> -- >> 2.43.5 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device 2026-01-22 9:42 ` Haakon Bugge @ 2026-01-22 10:22 ` Bjorn Helgaas 0 siblings, 0 replies; 14+ messages in thread From: Bjorn Helgaas @ 2026-01-22 10:22 UTC (permalink / raw) To: Haakon Bugge Cc: Bjorn Helgaas, Johannes Thumshirn, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org On Thu, Jan 22, 2026 at 09:42:46AM +0000, Haakon Bugge wrote: > > On Wed, Jan 21, 2026 at 12:35:40PM +0100, Håkon Bugge wrote: > > [snip] > > > RCB isn't meaningful for switches, so we'll read their LNKCTL > > unnecessarily. I propose something like this, which also clears RCB > > if it's set when it shouldn't be (I think this would indicate a > > firmware defect): > > > > /* > > * Per PCIe r7.0, sec 7.5.3.7, RCB is only meaningful in Root Ports > > * (where it is read-only), Endpoints, and Bridges. It may only be > > * set for Endpoints and Bridges if it is set in the Root Port. > > */ > > if (!pci_is_pcie(dev) || > > pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT || > > pci_pcie_type(dev) == PCI_EXP_TYPE_UPSTREAM || > > pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM || > > dev->is_virtfn) > > I see that sec 1.3.2 defines "Endpoints are classified as either > legacy, PCI Express, or Root Complex Integrated Endpoints (RCiEPs)." > But, shouldn't we also exclude Root Complex Event Collectors > (PCI_EXP_TYPE_RC_EC)? Yes, probably so. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device 2026-01-21 22:40 ` Bjorn Helgaas 2026-01-22 9:42 ` Haakon Bugge @ 2026-01-22 10:36 ` Bjorn Helgaas 2026-01-22 10:49 ` Haakon Bugge 2026-01-22 11:35 ` Niklas Schnelle 1 sibling, 2 replies; 14+ messages in thread From: Bjorn Helgaas @ 2026-01-22 10:36 UTC (permalink / raw) To: Håkon Bugge Cc: Bjorn Helgaas, Johannes Thumshirn, linux-pci, linux-kernel, linux-acpi, Alex Williamson, Niklas Schnelle [+cc Alex, Niklas] On Wed, Jan 21, 2026 at 04:40:10PM -0600, Bjorn Helgaas wrote: > On Wed, Jan 21, 2026 at 12:35:40PM +0100, Håkon Bugge wrote: > > Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff > > Root Port supports it (_HPX)") fixed a bogus _HPX type 2 record, which > > instructed program_hpx_type2() to set the RCB in an endpoint, > > although it's RC did not have the RCB bit set. > > > > e42010d8207f fixed that by qualifying the setting of the RCB in the > > endpoint with the RC supporting an 128 byte RCB. > > > > In retrospect, the program_hpx_type2() should only modify the AER > > bits, and stay away from fiddling with the Link Control Register. > > > > Hence, we explicitly program the RCB from pci_configure_device(). RCB > > is RO in Root Ports, and in VFs, the bit is RsvdP, so for these two > > cases we skip programming it. Then, if the Root Port has RCB set and > > it is not set in the EP, we set it. > > > > Fixes: Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)") > > Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> > > > > --- > > > > Note, that the current duplication of pcie_root_rcb_set() will be > > removed in the next commit. > > --- > > drivers/pci/probe.c | 36 ++++++++++++++++++++++++++++++++++++ > > 1 file changed, 36 insertions(+) > > > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > > index 41183aed8f5d9..347af29868124 100644 > > --- a/drivers/pci/probe.c > > +++ b/drivers/pci/probe.c > > @@ -2410,6 +2410,41 @@ static void pci_configure_serr(struct pci_dev *dev) > > } > > } > > > > +static bool pcie_root_rcb_set(struct pci_dev *dev) > > +{ > > + struct pci_dev *rp = pcie_find_root_port(dev); > > + u16 lnkctl; > > + > > + if (!rp) > > + return false; > > + > > + pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl); > > + > > + return !!(lnkctl & PCI_EXP_LNKCTL_RCB); > > +} > > + > > +static void pci_configure_rcb(struct pci_dev *dev) > > +{ > > + /* > > + * Obviously, we need a Link Control register. The RCB is RO > > + * in Root Ports, so no need to attempt to set it for > > + * them. For VFs, the RCB is RsvdP, so, no need to set it. > > + * Then, if the Root Port has RCB set, then we set for the EP > > + * unless already set. > > + */ > > + if (pcie_cap_has_lnkctl(dev) && > > + (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) && > > + !dev->is_virtfn && pcie_root_rcb_set(dev)) { > > + u16 lnkctl; > > + > > + pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); > > + if (lnkctl & PCI_EXP_LNKCTL_RCB) > > + return; > > + > > + pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl | PCI_EXP_LNKCTL_RCB); > > + } > > RCB isn't meaningful for switches, so we'll read their LNKCTL > unnecessarily. I propose something like this, which also clears RCB > if it's set when it shouldn't be (I think this would indicate a > firmware defect): > > /* > * Per PCIe r7.0, sec 7.5.3.7, RCB is only meaningful in Root Ports > * (where it is read-only), Endpoints, and Bridges. It may only be > * set for Endpoints and Bridges if it is set in the Root Port. > */ > if (!pci_is_pcie(dev) || > pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT || > pci_pcie_type(dev) == PCI_EXP_TYPE_UPSTREAM || > pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM || > dev->is_virtfn) > return; > > rcb = pcie_root_rcb_set(dev); > > pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); > if (rcb) { > if (lnkctl & PCI_EXP_LNKCTL_RCB) > return; > > lnkctl |= PCI_EXP_LNKCTL_RCB; > } else { > if (!(lnkctl & PCI_EXP_LNKCTL_RCB)) > return; > > pci_info(FW_INFO "clearing RCB (RCB not set in Root Port)\n"); > lnkctl &= ~PCI_EXP_LNKCTL_RCB; On second thought, I think this is too aggressive. I think VM guests will often see endpoints but not the Root Port. In that case, pcie_root_rcb_set() would return false because it couldn't find the RP, but the RP might actually have RCB set. Then we would clear the endpoint RCB unnecessarily, which should be safe but would reduce performance and will result in annoying misleading warnings. Could either ignore this case (as in your original patch) or bring pcie_root_rcb_set() inline here and return early if we can't find the RP. > } > > pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl); > > > +} > > + > > static void pci_configure_device(struct pci_dev *dev) > > { > > pci_configure_mps(dev); > > @@ -2419,6 +2454,7 @@ static void pci_configure_device(struct pci_dev *dev) > > pci_configure_aspm_l1ss(dev); > > pci_configure_eetlp_prefix(dev); > > pci_configure_serr(dev); > > + pci_configure_rcb(dev); > > > > pci_acpi_program_hp_params(dev); > > } > > -- > > 2.43.5 > > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device 2026-01-22 10:36 ` Bjorn Helgaas @ 2026-01-22 10:49 ` Haakon Bugge 2026-01-22 11:04 ` Bjorn Helgaas 2026-01-22 11:35 ` Niklas Schnelle 1 sibling, 1 reply; 14+ messages in thread From: Haakon Bugge @ 2026-01-22 10:49 UTC (permalink / raw) To: Bjorn Helgaas Cc: Bjorn Helgaas, Johannes Thumshirn, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org, Alex Williamson, Niklas Schnelle > On 22 Jan 2026, at 11:36, Bjorn Helgaas <helgaas@kernel.org> wrote: > > [+cc Alex, Niklas] > > On Wed, Jan 21, 2026 at 04:40:10PM -0600, Bjorn Helgaas wrote: >> On Wed, Jan 21, 2026 at 12:35:40PM +0100, Håkon Bugge wrote: >>> Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff >>> Root Port supports it (_HPX)") fixed a bogus _HPX type 2 record, which >>> instructed program_hpx_type2() to set the RCB in an endpoint, >>> although it's RC did not have the RCB bit set. >>> >>> e42010d8207f fixed that by qualifying the setting of the RCB in the >>> endpoint with the RC supporting an 128 byte RCB. >>> >>> In retrospect, the program_hpx_type2() should only modify the AER >>> bits, and stay away from fiddling with the Link Control Register. >>> >>> Hence, we explicitly program the RCB from pci_configure_device(). RCB >>> is RO in Root Ports, and in VFs, the bit is RsvdP, so for these two >>> cases we skip programming it. Then, if the Root Port has RCB set and >>> it is not set in the EP, we set it. >>> >>> Fixes: Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)") >>> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> >>> >>> --- >>> >>> Note, that the current duplication of pcie_root_rcb_set() will be >>> removed in the next commit. >>> --- >>> drivers/pci/probe.c | 36 ++++++++++++++++++++++++++++++++++++ >>> 1 file changed, 36 insertions(+) >>> >>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c >>> index 41183aed8f5d9..347af29868124 100644 >>> --- a/drivers/pci/probe.c >>> +++ b/drivers/pci/probe.c >>> @@ -2410,6 +2410,41 @@ static void pci_configure_serr(struct pci_dev *dev) >>> } >>> } >>> >>> +static bool pcie_root_rcb_set(struct pci_dev *dev) >>> +{ >>> + struct pci_dev *rp = pcie_find_root_port(dev); >>> + u16 lnkctl; >>> + >>> + if (!rp) >>> + return false; >>> + >>> + pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl); >>> + >>> + return !!(lnkctl & PCI_EXP_LNKCTL_RCB); >>> +} >>> + >>> +static void pci_configure_rcb(struct pci_dev *dev) >>> +{ >>> + /* >>> + * Obviously, we need a Link Control register. The RCB is RO >>> + * in Root Ports, so no need to attempt to set it for >>> + * them. For VFs, the RCB is RsvdP, so, no need to set it. >>> + * Then, if the Root Port has RCB set, then we set for the EP >>> + * unless already set. >>> + */ >>> + if (pcie_cap_has_lnkctl(dev) && >>> + (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) && >>> + !dev->is_virtfn && pcie_root_rcb_set(dev)) { >>> + u16 lnkctl; >>> + >>> + pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); >>> + if (lnkctl & PCI_EXP_LNKCTL_RCB) >>> + return; >>> + >>> + pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl | PCI_EXP_LNKCTL_RCB); >>> + } >> >> RCB isn't meaningful for switches, so we'll read their LNKCTL >> unnecessarily. I propose something like this, which also clears RCB >> if it's set when it shouldn't be (I think this would indicate a >> firmware defect): >> >> /* >> * Per PCIe r7.0, sec 7.5.3.7, RCB is only meaningful in Root Ports >> * (where it is read-only), Endpoints, and Bridges. It may only be >> * set for Endpoints and Bridges if it is set in the Root Port. >> */ >> if (!pci_is_pcie(dev) || >> pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT || >> pci_pcie_type(dev) == PCI_EXP_TYPE_UPSTREAM || >> pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM || >> dev->is_virtfn) >> return; >> >> rcb = pcie_root_rcb_set(dev); >> >> pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); >> if (rcb) { >> if (lnkctl & PCI_EXP_LNKCTL_RCB) >> return; >> >> lnkctl |= PCI_EXP_LNKCTL_RCB; >> } else { >> if (!(lnkctl & PCI_EXP_LNKCTL_RCB)) >> return; >> >> pci_info(FW_INFO "clearing RCB (RCB not set in Root Port)\n"); >> lnkctl &= ~PCI_EXP_LNKCTL_RCB; > > On second thought, I think this is too aggressive. I think VM guests > will often see endpoints but not the Root Port. In that case, > pcie_root_rcb_set() would return false because it couldn't find the > RP, but the RP might actually have RCB set. Then we would clear the > endpoint RCB unnecessarily, which should be safe but would reduce > performance and will result in annoying misleading warnings. If the VM has a PF in pass-through and the RP is not there, you're right. If it is a VF, we do not program it anyway. > Could either ignore this case (as in your original patch) or bring > pcie_root_rcb_set() inline here and return early if we can't find the > RP. I think pcie_root_rcb_set() should return true when it was able to retrieve the RP's RCB value, and if not, we bail out. Thxs, Håkon > >> } >> >> pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl); >> >>> +} >>> + >>> static void pci_configure_device(struct pci_dev *dev) >>> { >>> pci_configure_mps(dev); >>> @@ -2419,6 +2454,7 @@ static void pci_configure_device(struct pci_dev *dev) >>> pci_configure_aspm_l1ss(dev); >>> pci_configure_eetlp_prefix(dev); >>> pci_configure_serr(dev); >>> + pci_configure_rcb(dev); >>> >>> pci_acpi_program_hp_params(dev); >>> } >>> -- >>> 2.43.5 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device 2026-01-22 10:49 ` Haakon Bugge @ 2026-01-22 11:04 ` Bjorn Helgaas 2026-01-22 11:12 ` Haakon Bugge 0 siblings, 1 reply; 14+ messages in thread From: Bjorn Helgaas @ 2026-01-22 11:04 UTC (permalink / raw) To: Haakon Bugge Cc: Bjorn Helgaas, Johannes Thumshirn, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org, Alex Williamson, Niklas Schnelle On Thu, Jan 22, 2026 at 10:49:45AM +0000, Haakon Bugge wrote: > > On 22 Jan 2026, at 11:36, Bjorn Helgaas <helgaas@kernel.org> wrote: > > On Wed, Jan 21, 2026 at 04:40:10PM -0600, Bjorn Helgaas wrote: > >> On Wed, Jan 21, 2026 at 12:35:40PM +0100, Håkon Bugge wrote: > >>> Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff > >>> Root Port supports it (_HPX)") fixed a bogus _HPX type 2 record, which > >>> instructed program_hpx_type2() to set the RCB in an endpoint, > >>> although it's RC did not have the RCB bit set. > >>> > >>> e42010d8207f fixed that by qualifying the setting of the RCB in the > >>> endpoint with the RC supporting an 128 byte RCB. > >>> > >>> In retrospect, the program_hpx_type2() should only modify the AER > >>> bits, and stay away from fiddling with the Link Control Register. > >>> > >>> Hence, we explicitly program the RCB from pci_configure_device(). RCB > >>> is RO in Root Ports, and in VFs, the bit is RsvdP, so for these two > >>> cases we skip programming it. Then, if the Root Port has RCB set and > >>> it is not set in the EP, we set it. > >>> > >>> Fixes: Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)") > >>> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> > >>> > >>> --- > >>> > >>> Note, that the current duplication of pcie_root_rcb_set() will be > >>> removed in the next commit. > >>> --- > >>> drivers/pci/probe.c | 36 ++++++++++++++++++++++++++++++++++++ > >>> 1 file changed, 36 insertions(+) > >>> > >>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > >>> index 41183aed8f5d9..347af29868124 100644 > >>> --- a/drivers/pci/probe.c > >>> +++ b/drivers/pci/probe.c > >>> @@ -2410,6 +2410,41 @@ static void pci_configure_serr(struct pci_dev *dev) > >>> } > >>> } > >>> > >>> +static bool pcie_root_rcb_set(struct pci_dev *dev) > >>> +{ > >>> + struct pci_dev *rp = pcie_find_root_port(dev); > >>> + u16 lnkctl; > >>> + > >>> + if (!rp) > >>> + return false; > >>> + > >>> + pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl); > >>> + > >>> + return !!(lnkctl & PCI_EXP_LNKCTL_RCB); > >>> +} > >>> + > >>> +static void pci_configure_rcb(struct pci_dev *dev) > >>> +{ > >>> + /* > >>> + * Obviously, we need a Link Control register. The RCB is RO > >>> + * in Root Ports, so no need to attempt to set it for > >>> + * them. For VFs, the RCB is RsvdP, so, no need to set it. > >>> + * Then, if the Root Port has RCB set, then we set for the EP > >>> + * unless already set. > >>> + */ > >>> + if (pcie_cap_has_lnkctl(dev) && > >>> + (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) && > >>> + !dev->is_virtfn && pcie_root_rcb_set(dev)) { > >>> + u16 lnkctl; > >>> + > >>> + pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); > >>> + if (lnkctl & PCI_EXP_LNKCTL_RCB) > >>> + return; > >>> + > >>> + pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl | PCI_EXP_LNKCTL_RCB); > >>> + } > >> > >> RCB isn't meaningful for switches, so we'll read their LNKCTL > >> unnecessarily. I propose something like this, which also clears RCB > >> if it's set when it shouldn't be (I think this would indicate a > >> firmware defect): > >> > >> /* > >> * Per PCIe r7.0, sec 7.5.3.7, RCB is only meaningful in Root Ports > >> * (where it is read-only), Endpoints, and Bridges. It may only be > >> * set for Endpoints and Bridges if it is set in the Root Port. > >> */ > >> if (!pci_is_pcie(dev) || > >> pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT || > >> pci_pcie_type(dev) == PCI_EXP_TYPE_UPSTREAM || > >> pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM || > >> dev->is_virtfn) > >> return; > >> > >> rcb = pcie_root_rcb_set(dev); > >> > >> pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); > >> if (rcb) { > >> if (lnkctl & PCI_EXP_LNKCTL_RCB) > >> return; > >> > >> lnkctl |= PCI_EXP_LNKCTL_RCB; > >> } else { > >> if (!(lnkctl & PCI_EXP_LNKCTL_RCB)) > >> return; > >> > >> pci_info(FW_INFO "clearing RCB (RCB not set in Root Port)\n"); > >> lnkctl &= ~PCI_EXP_LNKCTL_RCB; > > > > On second thought, I think this is too aggressive. I think VM > > guests will often see endpoints but not the Root Port. In that > > case, pcie_root_rcb_set() would return false because it couldn't > > find the RP, but the RP might actually have RCB set. Then we > > would clear the endpoint RCB unnecessarily, which should be safe > > but would reduce performance and will result in annoying > > misleading warnings. > > If the VM has a PF in pass-through and the RP is not there, you're > right. If it is a VF, we do not program it anyway. > > > Could either ignore this case (as in your original patch) or bring > > pcie_root_rcb_set() inline here and return early if we can't find > > the RP. > > I think pcie_root_rcb_set() should return true when it was able to > retrieve the RP's RCB value, and if not, we bail out. Currently it returns: - true if we found the RP and it had RCB set - false if (a) we couldn't find the RP or (b) we found the RP and it had RCB cleared I don't think it's worth complicating the signature to make (a) and (b) distinguishable. Inlining pcie_root_rcb_set() would also remove any ambiguity about whether pcie_root_rcb_set() actually *sets* RCB or just tests it. I think any readability advantage of using a separate function is minimal. > >> } > >> > >> pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl); > >> > >>> +} > >>> + > >>> static void pci_configure_device(struct pci_dev *dev) > >>> { > >>> pci_configure_mps(dev); > >>> @@ -2419,6 +2454,7 @@ static void pci_configure_device(struct pci_dev *dev) > >>> pci_configure_aspm_l1ss(dev); > >>> pci_configure_eetlp_prefix(dev); > >>> pci_configure_serr(dev); > >>> + pci_configure_rcb(dev); > >>> > >>> pci_acpi_program_hp_params(dev); > >>> } > >>> -- > >>> 2.43.5 > > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device 2026-01-22 11:04 ` Bjorn Helgaas @ 2026-01-22 11:12 ` Haakon Bugge 0 siblings, 0 replies; 14+ messages in thread From: Haakon Bugge @ 2026-01-22 11:12 UTC (permalink / raw) To: Bjorn Helgaas Cc: Bjorn Helgaas, Johannes Thumshirn, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org, Alex Williamson, Niklas Schnelle > On 22 Jan 2026, at 12:04, Bjorn Helgaas <helgaas@kernel.org> wrote: > > On Thu, Jan 22, 2026 at 10:49:45AM +0000, Haakon Bugge wrote: >>> On 22 Jan 2026, at 11:36, Bjorn Helgaas <helgaas@kernel.org> wrote: >>> On Wed, Jan 21, 2026 at 04:40:10PM -0600, Bjorn Helgaas wrote: >>>> On Wed, Jan 21, 2026 at 12:35:40PM +0100, Håkon Bugge wrote: >>>>> Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff >>>>> Root Port supports it (_HPX)") fixed a bogus _HPX type 2 record, which >>>>> instructed program_hpx_type2() to set the RCB in an endpoint, >>>>> although it's RC did not have the RCB bit set. >>>>> >>>>> e42010d8207f fixed that by qualifying the setting of the RCB in the >>>>> endpoint with the RC supporting an 128 byte RCB. >>>>> >>>>> In retrospect, the program_hpx_type2() should only modify the AER >>>>> bits, and stay away from fiddling with the Link Control Register. >>>>> >>>>> Hence, we explicitly program the RCB from pci_configure_device(). RCB >>>>> is RO in Root Ports, and in VFs, the bit is RsvdP, so for these two >>>>> cases we skip programming it. Then, if the Root Port has RCB set and >>>>> it is not set in the EP, we set it. >>>>> >>>>> Fixes: Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)") >>>>> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> >>>>> >>>>> --- >>>>> >>>>> Note, that the current duplication of pcie_root_rcb_set() will be >>>>> removed in the next commit. >>>>> --- >>>>> drivers/pci/probe.c | 36 ++++++++++++++++++++++++++++++++++++ >>>>> 1 file changed, 36 insertions(+) >>>>> >>>>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c >>>>> index 41183aed8f5d9..347af29868124 100644 >>>>> --- a/drivers/pci/probe.c >>>>> +++ b/drivers/pci/probe.c >>>>> @@ -2410,6 +2410,41 @@ static void pci_configure_serr(struct pci_dev *dev) >>>>> } >>>>> } >>>>> >>>>> +static bool pcie_root_rcb_set(struct pci_dev *dev) >>>>> +{ >>>>> + struct pci_dev *rp = pcie_find_root_port(dev); >>>>> + u16 lnkctl; >>>>> + >>>>> + if (!rp) >>>>> + return false; >>>>> + >>>>> + pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl); >>>>> + >>>>> + return !!(lnkctl & PCI_EXP_LNKCTL_RCB); >>>>> +} >>>>> + >>>>> +static void pci_configure_rcb(struct pci_dev *dev) >>>>> +{ >>>>> + /* >>>>> + * Obviously, we need a Link Control register. The RCB is RO >>>>> + * in Root Ports, so no need to attempt to set it for >>>>> + * them. For VFs, the RCB is RsvdP, so, no need to set it. >>>>> + * Then, if the Root Port has RCB set, then we set for the EP >>>>> + * unless already set. >>>>> + */ >>>>> + if (pcie_cap_has_lnkctl(dev) && >>>>> + (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) && >>>>> + !dev->is_virtfn && pcie_root_rcb_set(dev)) { >>>>> + u16 lnkctl; >>>>> + >>>>> + pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); >>>>> + if (lnkctl & PCI_EXP_LNKCTL_RCB) >>>>> + return; >>>>> + >>>>> + pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl | PCI_EXP_LNKCTL_RCB); >>>>> + } >>>> >>>> RCB isn't meaningful for switches, so we'll read their LNKCTL >>>> unnecessarily. I propose something like this, which also clears RCB >>>> if it's set when it shouldn't be (I think this would indicate a >>>> firmware defect): >>>> >>>> /* >>>> * Per PCIe r7.0, sec 7.5.3.7, RCB is only meaningful in Root Ports >>>> * (where it is read-only), Endpoints, and Bridges. It may only be >>>> * set for Endpoints and Bridges if it is set in the Root Port. >>>> */ >>>> if (!pci_is_pcie(dev) || >>>> pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT || >>>> pci_pcie_type(dev) == PCI_EXP_TYPE_UPSTREAM || >>>> pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM || >>>> dev->is_virtfn) >>>> return; >>>> >>>> rcb = pcie_root_rcb_set(dev); >>>> >>>> pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); >>>> if (rcb) { >>>> if (lnkctl & PCI_EXP_LNKCTL_RCB) >>>> return; >>>> >>>> lnkctl |= PCI_EXP_LNKCTL_RCB; >>>> } else { >>>> if (!(lnkctl & PCI_EXP_LNKCTL_RCB)) >>>> return; >>>> >>>> pci_info(FW_INFO "clearing RCB (RCB not set in Root Port)\n"); >>>> lnkctl &= ~PCI_EXP_LNKCTL_RCB; >>> >>> On second thought, I think this is too aggressive. I think VM >>> guests will often see endpoints but not the Root Port. In that >>> case, pcie_root_rcb_set() would return false because it couldn't >>> find the RP, but the RP might actually have RCB set. Then we >>> would clear the endpoint RCB unnecessarily, which should be safe >>> but would reduce performance and will result in annoying >>> misleading warnings. >> >> If the VM has a PF in pass-through and the RP is not there, you're >> right. If it is a VF, we do not program it anyway. >> >>> Could either ignore this case (as in your original patch) or bring >>> pcie_root_rcb_set() inline here and return early if we can't find >>> the RP. >> >> I think pcie_root_rcb_set() should return true when it was able to >> retrieve the RP's RCB value, and if not, we bail out. > > Currently it returns: > > - true if we found the RP and it had RCB set > > - false if (a) we couldn't find the RP or (b) we found the RP and it > had RCB cleared > > I don't think it's worth complicating the signature to make (a) and > (b) distinguishable. Case (b) will not trustfully detect the case where the EP's RCB is set and the RP is (b.1) not found or (b.2) RP is found but it's RCB is not set. In case (b.2), we _shall_ reset the EP's RCB, IIUC. Thxs, Håkon > > Inlining pcie_root_rcb_set() would also remove any ambiguity about > whether pcie_root_rcb_set() actually *sets* RCB or just tests it. I > think any readability advantage of using a separate function is > minimal. > >>>> } >>>> >>>> pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnkctl); >>>> >>>>> +} >>>>> + >>>>> static void pci_configure_device(struct pci_dev *dev) >>>>> { >>>>> pci_configure_mps(dev); >>>>> @@ -2419,6 +2454,7 @@ static void pci_configure_device(struct pci_dev *dev) >>>>> pci_configure_aspm_l1ss(dev); >>>>> pci_configure_eetlp_prefix(dev); >>>>> pci_configure_serr(dev); >>>>> + pci_configure_rcb(dev); >>>>> >>>>> pci_acpi_program_hp_params(dev); >>>>> } >>>>> -- >>>>> 2.43.5 >> >> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device 2026-01-22 10:36 ` Bjorn Helgaas 2026-01-22 10:49 ` Haakon Bugge @ 2026-01-22 11:35 ` Niklas Schnelle 2026-01-22 13:25 ` Haakon Bugge 1 sibling, 1 reply; 14+ messages in thread From: Niklas Schnelle @ 2026-01-22 11:35 UTC (permalink / raw) To: Bjorn Helgaas, Håkon Bugge Cc: Bjorn Helgaas, Johannes Thumshirn, linux-pci, linux-kernel, linux-acpi, Alex Williamson On Thu, 2026-01-22 at 04:36 -0600, Bjorn Helgaas wrote: > [+cc Alex, Niklas] > > On Wed, Jan 21, 2026 at 04:40:10PM -0600, Bjorn Helgaas wrote: > > On Wed, Jan 21, 2026 at 12:35:40PM +0100, Håkon Bugge wrote: > > > Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff > > > Root Port supports it (_HPX)") fixed a bogus _HPX type 2 record, which > > > instructed program_hpx_type2() to set the RCB in an endpoint, > > > although it's RC did not have the RCB bit set. > > > > > > e42010d8207f fixed that by qualifying the setting of the RCB in the > > > endpoint with the RC supporting an 128 byte RCB. > > > > > > In retrospect, the program_hpx_type2() should only modify the AER > > > bits, and stay away from fiddling with the Link Control Register. > > > > > > Hence, we explicitly program the RCB from pci_configure_device(). RCB > > > is RO in Root Ports, and in VFs, the bit is RsvdP, so for these two > > > cases we skip programming it. Then, if the Root Port has RCB set and > > > it is not set in the EP, we set it. > > > > > > Fixes: Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)") > > > Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> > > > > > > --- > > > > > > Note, that the current duplication of pcie_root_rcb_set() will be > > > removed in the next commit. > > > --- > > > drivers/pci/probe.c | 36 ++++++++++++++++++++++++++++++++++++ > > > 1 file changed, 36 insertions(+) > > > > > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > > > index 41183aed8f5d9..347af29868124 100644 > > > --- a/drivers/pci/probe.c > > > +++ b/drivers/pci/probe.c > > > @@ -2410,6 +2410,41 @@ static void pci_configure_serr(struct pci_dev *dev) > > > } > > > } > > > > > > +static bool pcie_root_rcb_set(struct pci_dev *dev) > > > +{ > > > + struct pci_dev *rp = pcie_find_root_port(dev); > > > + u16 lnkctl; > > > + > > > + if (!rp) > > > + return false; > > > + > > > + pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl); > > > + > > > + return !!(lnkctl & PCI_EXP_LNKCTL_RCB); > > > +} > > > + > > > +static void pci_configure_rcb(struct pci_dev *dev) > > > +{ > > > --- snip --- > > > > pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); > > if (rcb) { > > if (lnkctl & PCI_EXP_LNKCTL_RCB) > > return; > > > > lnkctl |= PCI_EXP_LNKCTL_RCB; > > } else { > > if (!(lnkctl & PCI_EXP_LNKCTL_RCB)) > > return; > > > > pci_info(FW_INFO "clearing RCB (RCB not set in Root Port)\n"); > > lnkctl &= ~PCI_EXP_LNKCTL_RCB; > > On second thought, I think this is too aggressive. I think VM guests > will often see endpoints but not the Root Port. In that case, > pcie_root_rcb_set() would return false because it couldn't find the > RP, but the RP might actually have RCB set. Then we would clear the > endpoint RCB unnecessarily, which should be safe but would reduce > performance and will result in annoying misleading warnings. > > Could either ignore this case (as in your original patch) or bring > pcie_root_rcb_set() inline here and return early if we can't find the > RP. > > Thanks Bjorn for looping me in. If I'm reading later comments correctly we're already in agreement that if the root port isn't found the function should bail and leave the setting as is which sounds good to me. Still, feel free to directly add me in To on the next version and I'll be happy to test it and take a look at the code. Nevertheless I'd like to confirm that yes on s390 we definitely have the case where PFs are passed-through to guests without the guest having access to / seeing the root port as a PCIe device. This is true on both our machine hypervisor guests (LPARs) and in KVM guests. And yes I think this would potentially incorrectly clear the RCB which could have been set by firmware or platform PCI code based on its knowledge of the actual root port. That said from a quick look we currently seem to keep the RCB at 64 bytes in endpoints. Thanks, Niklas ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device 2026-01-22 11:35 ` Niklas Schnelle @ 2026-01-22 13:25 ` Haakon Bugge 0 siblings, 0 replies; 14+ messages in thread From: Haakon Bugge @ 2026-01-22 13:25 UTC (permalink / raw) To: Niklas Schnelle Cc: Bjorn Helgaas, Bjorn Helgaas, Johannes Thumshirn, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org, Alex Williamson > On 22 Jan 2026, at 12:35, Niklas Schnelle <schnelle@linux.ibm.com> wrote: > > On Thu, 2026-01-22 at 04:36 -0600, Bjorn Helgaas wrote: >> [+cc Alex, Niklas] >> >> On Wed, Jan 21, 2026 at 04:40:10PM -0600, Bjorn Helgaas wrote: >>> On Wed, Jan 21, 2026 at 12:35:40PM +0100, Håkon Bugge wrote: >>>> Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff >>>> Root Port supports it (_HPX)") fixed a bogus _HPX type 2 record, which >>>> instructed program_hpx_type2() to set the RCB in an endpoint, >>>> although it's RC did not have the RCB bit set. >>>> >>>> e42010d8207f fixed that by qualifying the setting of the RCB in the >>>> endpoint with the RC supporting an 128 byte RCB. >>>> >>>> In retrospect, the program_hpx_type2() should only modify the AER >>>> bits, and stay away from fiddling with the Link Control Register. >>>> >>>> Hence, we explicitly program the RCB from pci_configure_device(). RCB >>>> is RO in Root Ports, and in VFs, the bit is RsvdP, so for these two >>>> cases we skip programming it. Then, if the Root Port has RCB set and >>>> it is not set in the EP, we set it. >>>> >>>> Fixes: Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)") >>>> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> >>>> >>>> --- >>>> >>>> Note, that the current duplication of pcie_root_rcb_set() will be >>>> removed in the next commit. >>>> --- >>>> drivers/pci/probe.c | 36 ++++++++++++++++++++++++++++++++++++ >>>> 1 file changed, 36 insertions(+) >>>> >>>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c >>>> index 41183aed8f5d9..347af29868124 100644 >>>> --- a/drivers/pci/probe.c >>>> +++ b/drivers/pci/probe.c >>>> @@ -2410,6 +2410,41 @@ static void pci_configure_serr(struct pci_dev *dev) >>>> } >>>> } >>>> >>>> +static bool pcie_root_rcb_set(struct pci_dev *dev) >>>> +{ >>>> + struct pci_dev *rp = pcie_find_root_port(dev); >>>> + u16 lnkctl; >>>> + >>>> + if (!rp) >>>> + return false; >>>> + >>>> + pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl); >>>> + >>>> + return !!(lnkctl & PCI_EXP_LNKCTL_RCB); >>>> +} >>>> + >>>> +static void pci_configure_rcb(struct pci_dev *dev) >>>> +{ >>>> > --- snip --- >>> >>> pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); >>> if (rcb) { >>> if (lnkctl & PCI_EXP_LNKCTL_RCB) >>> return; >>> >>> lnkctl |= PCI_EXP_LNKCTL_RCB; >>> } else { >>> if (!(lnkctl & PCI_EXP_LNKCTL_RCB)) >>> return; >>> >>> pci_info(FW_INFO "clearing RCB (RCB not set in Root Port)\n"); >>> lnkctl &= ~PCI_EXP_LNKCTL_RCB; >> >> On second thought, I think this is too aggressive. I think VM guests >> will often see endpoints but not the Root Port. In that case, >> pcie_root_rcb_set() would return false because it couldn't find the >> RP, but the RP might actually have RCB set. Then we would clear the >> endpoint RCB unnecessarily, which should be safe but would reduce >> performance and will result in annoying misleading warnings. >> >> Could either ignore this case (as in your original patch) or bring >> pcie_root_rcb_set() inline here and return early if we can't find the >> RP. >>> > > Thanks Bjorn for looping me in. If I'm reading later comments correctly > we're already in agreement that if the root port isn't found the > function should bail and leave the setting as is which sounds good to > me. Still, feel free to directly add me in To on the next version and > I'll be happy to test it and take a look at the code. > > Nevertheless I'd like to confirm that yes on s390 we definitely have > the case where PFs are passed-through to guests without the guest > having access to / seeing the root port as a PCIe device. This is true > on both our machine hypervisor guests (LPARs) and in KVM guests. And > yes I think this would potentially incorrectly clear the RCB which > could have been set by firmware or platform PCI code based on its > knowledge of the actual root port. That said from a quick look we > currently seem to keep the RCB at 64 bytes in endpoints. Hi Niklas, Thanks for chiming in. Just out a v3 [1]. Thxs, Håkon [1] https://lore.kernel.org/linux-pci/20260122130957.68757-1-haakon.bugge@oracle.com/ > > Thanks, > Niklas ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 2/2] PCI/ACPI: Confine program_hpx_type2 to the AER bits 2026-01-21 11:35 [PATCH v2 0/2] PCI: Init RCB from pci_configure_device and fix program_hpx_type2 Håkon Bugge 2026-01-21 11:35 ` [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device Håkon Bugge @ 2026-01-21 11:35 ` Håkon Bugge 1 sibling, 0 replies; 14+ messages in thread From: Håkon Bugge @ 2026-01-21 11:35 UTC (permalink / raw) To: Bjorn Helgaas, Rafael J. Wysocki, Len Brown, Mahesh J Salgaonkar, Oliver O'Halloran, Greg Kroah-Hartman, Kenji Kaneshige Cc: Johannes Thumshirn, linux-pci, linux-kernel, linux-acpi, Håkon Bugge, linuxppc-dev program_hpx_type2() is today unconditionally called, despite the fact that when the _HPX was added to the ACPI spec. v3.0, the description stated: OSPM [1] will only evaluate _HPX with Setting Record – Type 2 if OSPM is not controlling the PCI Express Advanced Error Reporting capability. Hence, we only call program_hpx_type2() when the OSPM owns the PCIe hotplug capability but not the AER. The Advanced Configuration and Power Interface (ACPI) Specification version 6.6 has a provision that gives the OSPM the ability to control the other PCIe Device Control bits any way. In a note in section 6.2.9, it is stated: "OSPM may override the settings provided by the _HPX object's Type2 record (PCI Express Settings) or Type3 record (PCI Express Descriptor Settings) when OSPM has assumed native control of the corresponding feature." So, in order to preserve the non-AER bits in PCIe Device Control, in particular the performance sensitive ExtTag and RO, we make sure program_hpx_type2() if called, doesn't modify any non-AER bits. Also, when program_hpx_type2() is called, we completely avoid modifying any bits in the Link Control register. However, if the _HPX type 2 records contains bits indicating such modifications, we print an info message. [1] Operating System-directed configuration and Power Management Fixes: 40abb96c51bb ("[PATCH] pciehp: Fix programming hotplug parameters") Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> --- v1 -> v2: * Fixed comment style * Simplified the and/or logic when programming the Device Control register * Fixed the incorrect and brutal warning about Link Control register bits set and changed it to an info message about _HPX attempting to set/reset bits therein. * Removed the RCB programming from program_hpx_type2() * Moved the PCI_EXP_AER_FLAGS definition from drivers/pci/pcie/aer.c to drivers/pci/pci.h --- drivers/pci/pci-acpi.c | 61 +++++++++++++++++++----------------------- drivers/pci/pci.h | 3 +++ drivers/pci/pcie/aer.c | 3 --- 3 files changed, 30 insertions(+), 37 deletions(-) diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c index 9369377725fa0..34ea22f65a410 100644 --- a/drivers/pci/pci-acpi.c +++ b/drivers/pci/pci-acpi.c @@ -271,21 +271,6 @@ static acpi_status decode_type1_hpx_record(union acpi_object *record, return AE_OK; } -static bool pcie_root_rcb_set(struct pci_dev *dev) -{ - struct pci_dev *rp = pcie_find_root_port(dev); - u16 lnkctl; - - if (!rp) - return false; - - pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl); - if (lnkctl & PCI_EXP_LNKCTL_RCB) - return true; - - return false; -} - /* _HPX PCI Express Setting Record (Type 2) */ struct hpx_type2 { u32 revision; @@ -311,6 +296,7 @@ static void program_hpx_type2(struct pci_dev *dev, struct hpx_type2 *hpx) { int pos; u32 reg32; + const struct pci_host_bridge *host; if (!hpx) return; @@ -318,6 +304,15 @@ static void program_hpx_type2(struct pci_dev *dev, struct hpx_type2 *hpx) if (!pci_is_pcie(dev)) return; + host = pci_find_host_bridge(dev->bus); + + /* + * We only do the HP programming if we own the PCIe native + * hotplug and not the AER ownership + */ + if (!host->native_pcie_hotplug || host->native_aer) + return; + if (hpx->revision > 1) { pci_warn(dev, "PCIe settings rev %d not supported\n", hpx->revision); @@ -325,33 +320,31 @@ static void program_hpx_type2(struct pci_dev *dev, struct hpx_type2 *hpx) } /* - * Don't allow _HPX to change MPS or MRRS settings. We manage - * those to make sure they're consistent with the rest of the + * We only allow _HPX to program the AER registers, namely + * PCI_EXP_DEVCTL_CERE, PCI_EXP_DEVCTL_NFERE, + * PCI_EXP_DEVCTL_FERE, and PCI_EXP_DEVCTL_URRE. + * + * The other settings in PCIe DEVCTL are managed by OS in + * order to make sure they're consistent with the rest of the * platform. */ - hpx->pci_exp_devctl_and |= PCI_EXP_DEVCTL_PAYLOAD | - PCI_EXP_DEVCTL_READRQ; - hpx->pci_exp_devctl_or &= ~(PCI_EXP_DEVCTL_PAYLOAD | - PCI_EXP_DEVCTL_READRQ); + hpx->pci_exp_devctl_and |= ~PCI_EXP_AER_FLAGS; + hpx->pci_exp_devctl_or &= PCI_EXP_AER_FLAGS; /* Initialize Device Control Register */ pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL, ~hpx->pci_exp_devctl_and, hpx->pci_exp_devctl_or); - /* Initialize Link Control Register */ + /* Log if _HPX attempts to modify PCIe Link Control register */ if (pcie_cap_has_lnkctl(dev)) { - - /* - * If the Root Port supports Read Completion Boundary of - * 128, set RCB to 128. Otherwise, clear it. - */ - hpx->pci_exp_lnkctl_and |= PCI_EXP_LNKCTL_RCB; - hpx->pci_exp_lnkctl_or &= ~PCI_EXP_LNKCTL_RCB; - if (pcie_root_rcb_set(dev)) - hpx->pci_exp_lnkctl_or |= PCI_EXP_LNKCTL_RCB; - - pcie_capability_clear_and_set_word(dev, PCI_EXP_LNKCTL, - ~hpx->pci_exp_lnkctl_and, hpx->pci_exp_lnkctl_or); + if (hpx->pci_exp_lnkctl_and) + pci_info(dev, + "_HPX attempts to reset the following bits in PCIe Link Control: 0x%04x\n", + hpx->pci_exp_lnkctl_and); + if (hpx->pci_exp_lnkctl_or) + pci_info(dev, + "_HPX attempts to set the following bits in PCIe Link Control: 0x%04x\n", + hpx->pci_exp_lnkctl_or); } /* Find Advanced Error Reporting Enhanced Capability */ diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index 0e67014aa0013..f388d4414dd3a 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -88,6 +88,9 @@ struct pcie_tlp_log; #define PCI_BUS_BRIDGE_MEM_WINDOW 1 #define PCI_BUS_BRIDGE_PREF_MEM_WINDOW 2 +#define PCI_EXP_AER_FLAGS (PCI_EXP_DEVCTL_CERE | PCI_EXP_DEVCTL_NFERE | \ + PCI_EXP_DEVCTL_FERE | PCI_EXP_DEVCTL_URRE) + extern const unsigned char pcie_link_speed[]; extern bool pci_early_dump; diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c index e0bcaa896803c..9472d86cef552 100644 --- a/drivers/pci/pcie/aer.c +++ b/drivers/pci/pcie/aer.c @@ -239,9 +239,6 @@ void pcie_ecrc_get_policy(char *str) } #endif /* CONFIG_PCIE_ECRC */ -#define PCI_EXP_AER_FLAGS (PCI_EXP_DEVCTL_CERE | PCI_EXP_DEVCTL_NFERE | \ - PCI_EXP_DEVCTL_FERE | PCI_EXP_DEVCTL_URRE) - int pcie_aer_is_native(struct pci_dev *dev) { struct pci_host_bridge *host = pci_find_host_bridge(dev->bus); -- 2.43.5 ^ permalink raw reply related [flat|nested] 14+ messages in thread
end of thread, other threads:[~2026-01-22 13:25 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-01-21 11:35 [PATCH v2 0/2] PCI: Init RCB from pci_configure_device and fix program_hpx_type2 Håkon Bugge 2026-01-21 11:35 ` [PATCH v2 1/2] PCI: Initialize RCB from pci_configure_device Håkon Bugge 2026-01-21 17:41 ` Lukas Wunner 2026-01-21 19:13 ` Haakon Bugge 2026-01-21 22:40 ` Bjorn Helgaas 2026-01-22 9:42 ` Haakon Bugge 2026-01-22 10:22 ` Bjorn Helgaas 2026-01-22 10:36 ` Bjorn Helgaas 2026-01-22 10:49 ` Haakon Bugge 2026-01-22 11:04 ` Bjorn Helgaas 2026-01-22 11:12 ` Haakon Bugge 2026-01-22 11:35 ` Niklas Schnelle 2026-01-22 13:25 ` Haakon Bugge 2026-01-21 11:35 ` [PATCH v2 2/2] PCI/ACPI: Confine program_hpx_type2 to the AER bits Håkon Bugge
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox