* [PATCH v4 1/2] PCI: Fix isolated PCI function probing with ARI and SR-IOV
2025-10-23 15:20 [PATCH v4 0/2] PCI: Fix isolated function probing and enable ARI for s390 Niklas Schnelle
@ 2025-10-23 15:20 ` Niklas Schnelle
2025-10-27 8:28 ` Niklas Schnelle
2025-10-23 15:20 ` [PATCH v4 2/2] PCI: s390: Handle ARI on bus without associated struct pci_dev Niklas Schnelle
1 sibling, 1 reply; 4+ messages in thread
From: Niklas Schnelle @ 2025-10-23 15:20 UTC
To: Bjorn Helgaas
Cc: Jan Kiszka, Huacai Chen, linux-s390, loongarch, Farhan Ali,
Matthew Rosato, Tianrui Zhao, Gerald Schaefer, Heiko Carstens,
Vasily Gorbik, Alexander Gordeev, Sven Schnelle,
Christian Borntraeger, Gerd Bayer, linux-s390, linux-kernel,
linux-pci, jailhouse-dev, Niklas Schnelle
When the isolated PCI function probing mechanism is used in conjunction
with ARI or SR-IOV, it may not find all available PCI functions. In the
case of ARI, the problem is that next_ari_fn() always returns -ENODEV if
dev is NULL, so the scan stops if fn 0 is missing.
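As an illustration only (not part of the patch), the ARI case can be
modelled in standalone C. The bitmask of present functions, MODEL_NR_FNS
and all *_model/count_* names are assumptions made for this sketch; the
only behaviour taken from the description above is that the lookup
returns -ENODEV when there is no device at the current fn.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_NR_FNS 32	/* model size only, chosen to fit one unsigned int */

/*
 * Hypothetical model of the ARI "next function" lookup. Bit n of
 * @present says whether function n was passed through. As described
 * for next_ari_fn(), a missing device at @fn ends the walk at once.
 */
static int next_ari_fn_model(unsigned int present, int fn)
{
	int next;

	if (!(present & (1u << fn)))
		return -ENODEV;		/* dev == NULL: nothing to read */

	for (next = fn + 1; next < MODEL_NR_FNS; next++)
		if (present & (1u << next))
			return next;

	return -ENODEV;			/* end of the list */
}

/* Count the functions reached by following the list from fn 0. */
int count_ari_walk(unsigned int present)
{
	int fn = 0, nr = 0;

	do {
		assert(fn < MODEL_NR_FNS);
		if (present & (1u << fn))
			nr++;
		fn = next_ari_fn_model(present, fn);
	} while (fn >= 0);

	return nr;
}

/* Count the functions found by simply trying every function number. */
int count_every_fn(unsigned int present)
{
	int fn, nr = 0;

	for (fn = 0; fn < MODEL_NR_FNS; fn++)
		if (present & (1u << fn))
			nr++;

	return nr;
}
```

With functions 1 and 2 present but fn 0 missing (mask 0x6), the list
walk in this model finds nothing, while trying every function number
finds both.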
For SR-IOV things are more complex. Here the problem is that the check
for multifunction may fail. One example where this can occur is if the
first passed-through function is a VF with devfn 8. In pci_scan_slot()
this VF is fn 0 of its slot and thus multifunction doesn't get set by
the scan. Since VFs also don't get multifunction set via
PCI_HEADER_TYPE_MFD, it remains unset and probing stops even if there
is a devfn 9.
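The devfn 8/9 example can be replayed with a small standalone model. This
is a sketch under stated assumptions, not kernel code: the present-mask,
the have_dev/multifunction booleans and scan_slot_model() are made up
for illustration, ARI is left out, and isolated function probing is
assumed to be enabled throughout. The two next_fn_*() helpers follow the
non-ARI logic before and after the change in the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Non-ARI logic of next_fn() before the patch. */
static int next_fn_old(bool have_dev, bool multifunction, int fn)
{
	if (fn >= 7)
		return -ENODEV;
	/* only multifunction devices may have more functions */
	if (have_dev && !multifunction)
		return -ENODEV;
	return fn + 1;
}

/* Non-ARI logic of next_fn() after the patch. */
static int next_fn_new(bool have_dev, bool multifunction, int fn,
		       bool isolated)
{
	if (!isolated) {
		if (have_dev && !multifunction)
			return -ENODEV;
	}
	if (fn >= 7)
		return -ENODEV;
	return fn + 1;
}

/*
 * Hypothetical model of scanning one slot. Bit n of @present says
 * whether function n of the slot exists. All functions are modelled
 * as VFs: multifunction is never set from the header and only gets
 * set by the scan for functions found at fn > 0.
 */
int scan_slot_model(unsigned int present, bool patched)
{
	int fn = 0, nr = 0;

	do {
		bool have_dev, multifunction;

		assert(fn >= 0 && fn <= 7);
		have_dev = present & (1u << fn);
		multifunction = have_dev && fn > 0;
		if (have_dev)
			nr++;
		fn = patched ?
			next_fn_new(have_dev, multifunction, fn, true) :
			next_fn_old(have_dev, multifunction, fn);
	} while (fn >= 0);

	return nr;
}
```

For a slot holding two VFs at fn 0 and fn 1 (mask 0x3, that is devfn 8
and 9), the old logic in this model stops after the first function and
the patched logic finds both.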
At the moment both of these issues are hidden on s390. The first one is
hidden because ARI is detected as disabled, since struct pci_bus's self
is NULL, even though firmware does enable and use ARI. The second issue
is hidden as a side effect of commit 25f39d3dcb48 ("s390/pci: Ignore RID
for isolated VFs"). This is because VFs are either put on their own
virtual bus, if the parent PF is not passed through to the same
instance, or are hotplugged once SR-IOV is enabled on the parent PF, in
which case pci_scan_single_device() is used.
Still, the first issue in particular prevents correct detection of ARI,
and the second might be a problem for other users of isolated function
probing. Fix both issues by keeping things as simple as possible: if
isolated function probing is enabled, simply scan every possible devfn.
Fixes: 189c6c33ff42 ("PCI: Extend isolated function probing to s390")
Link: https://lore.kernel.org/linux-pci/d3f11e8562f589ddb2c1c83e74161bd8948084c3.camel@linux.ibm.com/
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
---
drivers/pci/probe.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 0ce98e18b5a876afe72af35a9f4a44d598e8d500..13495b12fbcfae4b890bbd4b2f913742adf6dfed 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2808,16 +2808,18 @@ static int next_ari_fn(struct pci_bus *bus, struct pci_dev *dev, int fn)
return next_fn;
}
-static int next_fn(struct pci_bus *bus, struct pci_dev *dev, int fn)
+static int next_fn(struct pci_bus *bus, struct pci_dev *dev, int fn, bool isolated)
{
- if (pci_ari_enabled(bus))
- return next_ari_fn(bus, dev, fn);
+ if (!isolated) {
+ if (pci_ari_enabled(bus))
+ return next_ari_fn(bus, dev, fn);
+ /* only multifunction devices may have more functions */
+ if (dev && !dev->multifunction)
+ return -ENODEV;
+ }
if (fn >= 7)
return -ENODEV;
- /* only multifunction devices may have more functions */
- if (dev && !dev->multifunction)
- return -ENODEV;
return fn + 1;
}
@@ -2857,12 +2859,14 @@ static int only_one_child(struct pci_bus *bus)
*/
int pci_scan_slot(struct pci_bus *bus, int devfn)
{
+ bool isolated_functions;
struct pci_dev *dev;
int fn = 0, nr = 0;
if (only_one_child(bus) && (devfn > 0))
return 0; /* Already scanned the entire slot */
+ isolated_functions = hypervisor_isolated_pci_functions();
do {
dev = pci_scan_single_device(bus, devfn + fn);
if (dev) {
@@ -2876,10 +2880,10 @@ int pci_scan_slot(struct pci_bus *bus, int devfn)
* a hypervisor that passes through individual PCI
* functions.
*/
- if (!hypervisor_isolated_pci_functions())
+ if (!isolated_functions)
break;
}
- fn = next_fn(bus, dev, fn);
+ fn = next_fn(bus, dev, fn, isolated_functions);
} while (fn >= 0);
/* Only one slot has PCIe device */
--
2.48.1
* [PATCH v4 2/2] PCI: s390: Handle ARI on bus without associated struct pci_dev
From: Niklas Schnelle @ 2025-10-23 15:20 UTC
To: Bjorn Helgaas
Cc: Jan Kiszka, Huacai Chen, linux-s390, loongarch, Farhan Ali,
Matthew Rosato, Tianrui Zhao, Gerald Schaefer, Heiko Carstens,
Vasily Gorbik, Alexander Gordeev, Sven Schnelle,
Christian Borntraeger, Gerd Bayer, linux-s390, linux-kernel,
linux-pci, jailhouse-dev, Niklas Schnelle
On s390 PCI busses are virtualized and the downstream ports are
invisible to the OS, so self in struct pci_bus is NULL. pci_ari_enabled()
however relies on this associated struct pci_dev to check whether ARI
is enabled for the bus. ARI is therefore always detected as disabled.
At the same time, firmware on s390 always enables and relies upon ARI,
thus causing a mismatch.
Beyond simply being a mismatch, this causes problems as some PCI devices
present a different SR-IOV topology depending on PCI_SRIOV_CTRL_ARI.
A similar mismatch may occur with SR-IOV when virtfn_add_bus() creates new
busses with no associated struct pci_dev. Here too pci_ari_enabled()
on these busses returns false even if ARI is actually used.
Prevent both mismatches by moving the ari_enabled flag from struct
pci_dev to struct pci_bus, making it independent of self in struct
pci_bus. Let the bus inherit the ari_enabled state from its parent bus
when there is no bridge device, such that busses added by
virtfn_add_bus() match their parent. For s390, set ari_enabled when the
bus is created and clear it if an added device does not support ARI,
given that all PCIe ports on s390 systems are ARI capable.
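The effect of moving the flag can be sketched with reduced stand-in
structures. Everything named model_* or *_model below is hypothetical
and exists only for this illustration; the two checks mirror
pci_ari_enabled() before and after the change, and the child allocation
mirrors the inheritance added for busses without a bridge device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for struct pci_dev and struct pci_bus. */
struct model_dev {
	bool ari_enabled;		/* old location of the flag */
};

struct model_bus {
	const struct model_dev *self;	/* bridge device, NULL when hidden */
	bool ari_enabled;		/* new location of the flag */
};

/* Check before the patch: needs a bridge device to look at. */
static bool ari_enabled_old(const struct model_bus *bus)
{
	return bus->self && bus->self->ari_enabled;
}

/* Check after the patch: the bus carries the state itself. */
static bool ari_enabled_new(const struct model_bus *bus)
{
	return bus->ari_enabled;
}

/* Bridge-less child bus: copy the parent's state, as the patch does. */
static struct model_bus alloc_child_model(const struct model_bus *parent)
{
	struct model_bus child = { .self = NULL, .ari_enabled = false };

	assert(parent != NULL);
	child.ari_enabled = parent->ari_enabled;
	return child;
}

/* Old check on a bus whose bridge is invisible to the OS. */
bool old_check_without_bridge(void)
{
	struct model_bus bus = { .self = NULL, .ari_enabled = false };

	return ari_enabled_old(&bus);
}

/* New check on a bridge-less child of a parent with the given state. */
bool new_check_on_child(bool parent_ari)
{
	struct model_bus parent = { .self = NULL, .ari_enabled = parent_ari };
	struct model_bus child = alloc_child_model(&parent);

	return ari_enabled_new(&child);
}
```

In this model the old check can never report ARI on a bus without a
bridge device, whereas the new check follows whatever state the parent
bus carries.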
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
---
arch/s390/pci/pci.c | 7 +++++++
arch/s390/pci/pci_bus.c | 10 ++++++++++
drivers/pci/pci.c | 4 ++--
drivers/pci/probe.c | 1 +
include/linux/pci.h | 4 ++--
5 files changed, 22 insertions(+), 4 deletions(-)
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index c82c577db2bcd2143476cb8189fd89b9a4dc9836..773c0cbfc313ea1a6419a44d6158397dd13f6e76 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -609,6 +609,13 @@ int pcibios_device_add(struct pci_dev *pdev)
continue;
pci_claim_resource(pdev, i);
}
+ /*
+ * The below is the s390 equivalent of pci_configure_ari()
+ * which we can't use directly because the bridge devices
+ * are hidden in firmware.
+ */
+ if (!pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ARI))
+ zdev->zbus->bus->ari_enabled = 0;
return 0;
}
diff --git a/arch/s390/pci/pci_bus.c b/arch/s390/pci/pci_bus.c
index 45a1c36c5a54e3a841e61cc365d3f36e9a94ba50..c887e61eb384ca98ff27d4f8af69e58c715b5002 100644
--- a/arch/s390/pci/pci_bus.c
+++ b/arch/s390/pci/pci_bus.c
@@ -207,6 +207,16 @@ static int zpci_bus_create_pci_bus(struct zpci_bus *zbus, struct zpci_dev *fr, s
return -EFAULT;
}
+ /*
+ * On s390 PCI busses are virtualized and the bridge
+ * devices are invisible to the OS. Furthermore busses
+ * may exist without a devfn 0 function. Thus the normal
+ * ARI detection does not work. At the same time fw/hw
+ * has always enabled ARI when possible. Reflect the actual
+ * state by setting ari_enabled whenever a device on the bus
+ * supports it.
+ */
+ bus->ari_enabled = 1;
zbus->bus = bus;
return 0;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b14dd064006cca80ec5275e45a35d6dc2b4d0bbc..8ef3c68280a629449e0a2176d938bf987c68dddf 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3532,11 +3532,11 @@ void pci_configure_ari(struct pci_dev *dev)
if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ARI)) {
pcie_capability_set_word(bridge, PCI_EXP_DEVCTL2,
PCI_EXP_DEVCTL2_ARI);
- bridge->ari_enabled = 1;
+ dev->bus->ari_enabled = 1;
} else {
pcie_capability_clear_word(bridge, PCI_EXP_DEVCTL2,
PCI_EXP_DEVCTL2_ARI);
- bridge->ari_enabled = 0;
+ dev->bus->ari_enabled = 0;
}
}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 13495b12fbcfae4b890bbd4b2f913742adf6dfed..338bb7e6738d27865e3d50aa3094ca5ab29a6a47 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1216,6 +1216,7 @@ static struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent,
if (!bridge) {
child->dev.parent = parent->bridge;
+ child->ari_enabled = parent->ari_enabled;
goto add_dev;
}
diff --git a/include/linux/pci.h b/include/linux/pci.h
index d1fdf81fbe1e427aecbc951fa3fdf65c20450b05..a9c3dbf17339e523362bd179ad3c7c8c91293cf0 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -445,7 +445,6 @@ struct pci_dev {
unsigned int irq_reroute_variant:2; /* Needs IRQ rerouting variant */
unsigned int msi_enabled:1;
unsigned int msix_enabled:1;
- unsigned int ari_enabled:1; /* ARI forwarding */
unsigned int ats_enabled:1; /* Address Translation Svc */
unsigned int pasid_enabled:1; /* Process Address Space ID */
unsigned int pri_enabled:1; /* Page Request Interface */
@@ -691,6 +690,7 @@ struct pci_bus {
unsigned int is_added:1;
unsigned int unsafe_warn:1; /* warned about RW1C config write */
unsigned int flit_mode:1; /* Link in Flit mode */
+ unsigned int ari_enabled:1; /* ARI forwarding enabled */
};
#define to_pci_bus(n) container_of(n, struct pci_bus, dev)
@@ -2740,7 +2740,7 @@ static inline bool pci_is_dev_assigned(struct pci_dev *pdev)
*/
static inline bool pci_ari_enabled(struct pci_bus *bus)
{
- return bus->self && bus->self->ari_enabled;
+ return bus->ari_enabled;
}
/**
--
2.48.1