* [RFC 01/12] PCI: Add helpers to identify SR-IOV PFs/VFs.
2026-06-04 15:01 [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Dimitri Daskalakis
@ 2026-06-04 15:01 ` Dimitri Daskalakis
2026-06-04 15:01 ` [RFC 02/12] PCI: Convert iov.c to pci_is_sriov_* helpers Dimitri Daskalakis
` (12 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
From: Dimitri Daskalakis <daskald@meta.com>
Throughout the PCI core, the pci_dev attribute is_physfn is used to
determine whether a PF has SR-IOV active, and is_virtfn is used to check
whether a device is an SR-IOV VF.
These attributes should be generalized to represent PFs/VFs for any type
of virtualization. In preparation, wrap the existing usage in helpers so
that subsequent patches can convert callers and ease the transition.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
include/linux/pci.h | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2c4454583c11..28892243f49f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -604,6 +604,16 @@ static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
return dev;
}
+static inline bool pci_is_sriov_physfn(const struct pci_dev *dev)
+{
+ return dev->is_physfn;
+}
+
+static inline bool pci_is_sriov_virtfn(const struct pci_dev *dev)
+{
+ return dev->is_virtfn;
+}
+
struct pci_dev *pci_alloc_dev(struct pci_bus *bus);
#define to_pci_dev(n) container_of(n, struct pci_dev, dev)
@@ -1277,6 +1287,7 @@ void pcibios_setup_bridge(struct pci_bus *bus, unsigned long type);
void pci_sort_breadthfirst(void);
#define dev_is_pci(d) ((d)->bus == &pci_bus_type)
#define dev_is_pf(d) ((dev_is_pci(d) ? to_pci_dev(d)->is_physfn : false))
+#define dev_is_sriov_pf(d) ((dev_is_pci(d) ? pci_is_sriov_physfn(to_pci_dev(d)) : false))
/* Generic PCI functions exported to card drivers */
@@ -2207,6 +2218,7 @@ static inline struct pci_dev *pci_dev_get(struct pci_dev *dev) { return NULL; }
#define dev_is_pci(d) (false)
#define dev_is_pf(d) (false)
+#define dev_is_sriov_pf(d) (false)
static inline bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags)
{ return false; }
static inline int pci_irqd_intx_xlate(struct irq_domain *d,
--
2.52.0
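The helper pattern introduced by this patch can be sketched in userspace as follows. This is a minimal model rather than kernel code: struct mock_pci_dev, the mock_* helpers and mock_num_vf() are hypothetical stand-ins for struct pci_dev, pci_is_sriov_physfn() / pci_is_sriov_virtfn() and a pci_num_vf()-style caller.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two struct pci_dev bits the patch wraps. */
struct mock_pci_dev {
	unsigned int is_physfn:1;	/* device is an SR-IOV PF */
	unsigned int is_virtfn:1;	/* device is an SR-IOV VF */
};

/* Same shape as pci_is_sriov_physfn(): a thin wrapper around the bit. */
static inline bool mock_is_sriov_physfn(const struct mock_pci_dev *dev)
{
	return dev->is_physfn;
}

/* Same shape as pci_is_sriov_virtfn(). */
static inline bool mock_is_sriov_virtfn(const struct mock_pci_dev *dev)
{
	return dev->is_virtfn;
}

/*
 * Caller in the style of pci_num_vf(): report no VFs unless the device
 * is an SR-IOV PF. num_vfs stands in for dev->sriov->num_VFs.
 */
int mock_num_vf(const struct mock_pci_dev *dev, int num_vfs)
{
	if (!mock_is_sriov_physfn(dev))
		return 0;
	return num_vfs;
}
```

Because callers only go through the wrappers, the meaning of the underlying bits can presumably change later without touching each call site again, which is the stated motivation of the series.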
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [RFC 02/12] PCI: Convert iov.c to pci_is_sriov_* helpers
2026-06-04 15:01 [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Dimitri Daskalakis
2026-06-04 15:01 ` [RFC 01/12] PCI: Add helpers to identify SR-IOV PFs/VFs Dimitri Daskalakis
@ 2026-06-04 15:01 ` Dimitri Daskalakis
2026-06-04 15:15 ` sashiko-bot
2026-06-04 15:01 ` [RFC 03/12] PCI: Convert pci.h " Dimitri Daskalakis
` (11 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
From: Dimitri Daskalakis <daskald@meta.com>
No functional changes.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
drivers/pci/iov.c | 46 +++++++++++++++++++++++-----------------------
1 file changed, 23 insertions(+), 23 deletions(-)
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 91ac4e37ecb9..5de26057b99a 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -23,7 +23,7 @@
int pci_iov_virtfn_bus(struct pci_dev *dev, int vf_id)
{
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return -EINVAL;
return dev->bus->number + ((dev->devfn + dev->sriov->offset +
dev->sriov->stride * vf_id) >> 8);
@@ -31,7 +31,7 @@ int pci_iov_virtfn_bus(struct pci_dev *dev, int vf_id)
int pci_iov_virtfn_devfn(struct pci_dev *dev, int vf_id)
{
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return -EINVAL;
return (dev->devfn + dev->sriov->offset +
dev->sriov->stride * vf_id) & 0xff;
@@ -42,7 +42,7 @@ int pci_iov_vf_id(struct pci_dev *dev)
{
struct pci_dev *pf;
- if (!dev->is_virtfn)
+ if (!pci_is_sriov_virtfn(dev))
return -EINVAL;
pf = pci_physfn(dev);
@@ -71,7 +71,7 @@ void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver)
{
struct pci_dev *pf_dev;
- if (!dev->is_virtfn)
+ if (!pci_is_sriov_virtfn(dev))
return ERR_PTR(-EINVAL);
pf_dev = dev->physfn;
if (pf_dev->driver != pf_driver)
@@ -152,7 +152,7 @@ static void virtfn_remove_bus(struct pci_bus *physbus, struct pci_bus *virtbus)
resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
{
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return 0;
return dev->sriov->barsz[pci_resource_num_to_vf_bar(resno)];
@@ -300,7 +300,7 @@ static umode_t sriov_vf_attrs_are_visible(struct kobject *kobj,
struct device *dev = kobj_to_dev(kobj);
struct pci_dev *pdev = to_pci_dev(dev);
- if (!pdev->is_virtfn)
+ if (!pci_is_sriov_virtfn(pdev))
return 0;
return a->mode;
@@ -604,7 +604,7 @@ static umode_t sriov_pf_attrs_are_visible(struct kobject *kobj,
{
struct device *dev = kobj_to_dev(kobj);
- if (!dev_is_pf(dev))
+ if (!dev_is_sriov_pf(dev))
return 0;
return a->mode;
@@ -707,7 +707,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
if (!pdev)
return -ENODEV;
- if (!pdev->is_physfn) {
+ if (!pci_is_sriov_physfn(pdev)) {
pci_dev_put(pdev);
return -ENOSYS;
}
@@ -814,7 +814,7 @@ static int sriov_init(struct pci_dev *dev, int pos)
ctrl = 0;
list_for_each_entry(pdev, &dev->bus->devices, bus_list)
- if (pdev->is_physfn)
+ if (pci_is_sriov_physfn(pdev))
goto found;
pdev = NULL;
@@ -1006,7 +1006,7 @@ int pci_iov_init(struct pci_dev *dev)
*/
void pci_iov_release(struct pci_dev *dev)
{
- if (dev->is_physfn)
+ if (pci_is_sriov_physfn(dev))
sriov_release(dev);
}
@@ -1018,7 +1018,7 @@ void pci_iov_remove(struct pci_dev *dev)
{
struct pci_sriov *iov = dev->sriov;
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return;
iov->driver_max_VFs = iov->total_VFs;
@@ -1035,7 +1035,7 @@ void pci_iov_remove(struct pci_dev *dev)
*/
void pci_iov_update_resource(struct pci_dev *dev, int resno)
{
- struct pci_sriov *iov = dev->is_physfn ? dev->sriov : NULL;
+ struct pci_sriov *iov = pci_is_sriov_physfn(dev) ? dev->sriov : NULL;
struct resource *res = pci_resource_n(dev, resno);
int vf_bar = pci_resource_num_to_vf_bar(resno);
struct pci_bus_region region;
@@ -1111,7 +1111,7 @@ resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno)
*/
void pci_restore_iov_state(struct pci_dev *dev)
{
- if (dev->is_physfn) {
+ if (pci_is_sriov_physfn(dev)) {
sriov_restore_vf_rebar_state(dev);
sriov_restore_state(dev);
}
@@ -1124,7 +1124,7 @@ void pci_restore_iov_state(struct pci_dev *dev)
*/
void pci_vf_drivers_autoprobe(struct pci_dev *dev, bool auto_probe)
{
- if (dev->is_physfn)
+ if (pci_is_sriov_physfn(dev))
dev->sriov->drivers_autoprobe = auto_probe;
}
@@ -1141,7 +1141,7 @@ int pci_iov_bus_range(struct pci_bus *bus)
struct pci_dev *dev;
list_for_each_entry(dev, &bus->devices, bus_list) {
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
continue;
if (dev->sriov->max_VF_buses > max)
max = dev->sriov->max_VF_buses;
@@ -1161,7 +1161,7 @@ int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
{
might_sleep();
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return -ENOSYS;
return sriov_enable(dev, nr_virtfn);
@@ -1176,7 +1176,7 @@ void pci_disable_sriov(struct pci_dev *dev)
{
might_sleep();
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return;
sriov_disable(dev);
@@ -1191,7 +1191,7 @@ EXPORT_SYMBOL_GPL(pci_disable_sriov);
*/
int pci_num_vf(struct pci_dev *dev)
{
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return 0;
return dev->sriov->num_VFs;
@@ -1212,7 +1212,7 @@ int pci_vfs_assigned(struct pci_dev *dev)
unsigned short dev_id;
/* only search if we are a PF */
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return 0;
/*
@@ -1228,7 +1228,7 @@ int pci_vfs_assigned(struct pci_dev *dev)
* It is considered assigned if it is a virtual function with
* our dev as the physical function and the assigned bit is set
*/
- if (vfdev->is_virtfn && (vfdev->physfn == dev) &&
+ if (pci_is_sriov_virtfn(vfdev) && (vfdev->physfn == dev) &&
pci_is_dev_assigned(vfdev))
vfs_assigned++;
@@ -1254,7 +1254,7 @@ EXPORT_SYMBOL_GPL(pci_vfs_assigned);
*/
int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs)
{
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return -ENOSYS;
if (numvfs > dev->sriov->total_VFs)
@@ -1279,7 +1279,7 @@ EXPORT_SYMBOL_GPL(pci_sriov_set_totalvfs);
*/
int pci_sriov_get_totalvfs(struct pci_dev *dev)
{
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return 0;
return dev->sriov->driver_max_VFs;
@@ -1301,7 +1301,7 @@ int pci_sriov_configure_simple(struct pci_dev *dev, int nr_virtfn)
might_sleep();
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return -ENODEV;
if (pci_vfs_assigned(dev)) {
--
2.52.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [RFC 02/12] PCI: Convert iov.c to pci_is_sriov_* helpers
2026-06-04 15:01 ` [RFC 02/12] PCI: Convert iov.c to pci_is_sriov_* helpers Dimitri Daskalakis
@ 2026-06-04 15:15 ` sashiko-bot
0 siblings, 0 replies; 25+ messages in thread
From: sashiko-bot @ 2026-06-04 15:15 UTC (permalink / raw)
To: Dimitri Daskalakis; +Cc: kvm, linux-pci
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] Division by zero in pci_iov_vf_id() if the VF Stride is 0.
--
commit ef3219de5f4e200d2b4b6b2b9d4bac7ee1daf6b4
Author: Dimitri Daskalakis <daskald@meta.com>
PCI: Convert iov.c to pci_is_sriov_* helpers
This commit converts the dev->is_physfn and dev->is_virtfn checks in
drivers/pci/iov.c to use the pci_is_sriov_physfn() and
pci_is_sriov_virtfn() helpers instead.
> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index 91ac4e37ecb9c..5de26057b99aa 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
[ ... ]
> @@ -42,7 +42,7 @@ int pci_iov_vf_id(struct pci_dev *dev)
> {
> struct pci_dev *pf;
>
> - if (!dev->is_virtfn)
> + if (!pci_is_sriov_virtfn(dev))
> return -EINVAL;
>
> pf = pci_physfn(dev);
[Severity: High]
This is a pre-existing issue, but is there a potential divide-by-zero
risk in pci_iov_vf_id() when a device has only a single virtual function?
According to the PCI SR-IOV specification, VF Stride is unused when NumVFs
is 0 or 1, so a device exposing a single VF may report a stride of 0. While
compute_max_vf_buses() correctly permits a 0 stride when the number of VFs
is 1 or fewer, the end of pci_iov_vf_id() does an unconditional division:
int pci_iov_vf_id(struct pci_dev *dev)
{
...
pf = pci_physfn(dev);
return (pci_dev_id(dev) - (pci_dev_id(pf) + pf->sriov->offset)) /
pf->sriov->stride;
}
Could this cause a kernel panic during the driver probe path when
enumerating fully compliant PCIe hardware with a 0 stride?
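The concern above can be illustrated with a small userspace sketch. It is only one possible way to guard the division, not a proposed kernel fix: struct mock_sriov and mock_vf_id() are hypothetical, and the routing IDs are passed in directly instead of being read via pci_dev_id().

```c
#include <errno.h>

/* Hypothetical model of the two pci_sriov fields pci_iov_vf_id() reads. */
struct mock_sriov {
	unsigned int offset;	/* First VF Offset */
	unsigned int stride;	/* VF Stride, may be 0 with a single VF */
};

/*
 * Return the VF index, or -EINVAL when it cannot be derived.
 * With a zero stride only the first VF (PF routing ID + offset) is
 * meaningful, so it maps to index 0 and anything else is rejected
 * instead of dividing by zero.
 */
int mock_vf_id(unsigned int vf_rid, unsigned int pf_rid,
	       const struct mock_sriov *iov)
{
	unsigned int first = pf_rid + iov->offset;

	if (vf_rid < first)
		return -EINVAL;
	if (!iov->stride)
		return vf_rid == first ? 0 : -EINVAL;
	return (vf_rid - first) / iov->stride;
}
```

For example, with offset 4 and stride 2, a VF whose routing ID is 6 above the PF resolves to index 3, while a stride of 0 resolves only the first VF.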
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260604150153.3619662-1-dimitri.daskalakis1@gmail.com?part=2
^ permalink raw reply [flat|nested] 25+ messages in thread
* [RFC 03/12] PCI: Convert pci.h to pci_is_sriov_* helpers
2026-06-04 15:01 [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Dimitri Daskalakis
2026-06-04 15:01 ` [RFC 01/12] PCI: Add helpers to identify SR-IOV PFs/VFs Dimitri Daskalakis
2026-06-04 15:01 ` [RFC 02/12] PCI: Convert iov.c to pci_is_sriov_* helpers Dimitri Daskalakis
@ 2026-06-04 15:01 ` Dimitri Daskalakis
2026-06-04 15:01 ` [RFC 04/12] PCI: Convert arch/powerpc " Dimitri Daskalakis
` (10 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
From: Dimitri Daskalakis <daskald@meta.com>
No functional changes.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
drivers/pci/pci.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 4a14f88e543a..73b913bcb87a 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -954,7 +954,7 @@ void pci_iov_resource_set_size(struct pci_dev *dev, int resno, int size);
bool pci_iov_is_memory_decoding_enabled(struct pci_dev *dev);
static inline u16 pci_iov_vf_rebar_cap(struct pci_dev *dev)
{
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return 0;
return dev->sriov->vf_rebar_cap;
--
2.52.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [RFC 04/12] PCI: Convert arch/powerpc to pci_is_sriov_* helpers
2026-06-04 15:01 [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Dimitri Daskalakis
` (2 preceding siblings ...)
2026-06-04 15:01 ` [RFC 03/12] PCI: Convert pci.h " Dimitri Daskalakis
@ 2026-06-04 15:01 ` Dimitri Daskalakis
2026-06-04 17:26 ` sashiko-bot
2026-06-04 15:01 ` [RFC 05/12] PCI: Convert s390/pci/pci.c " Dimitri Daskalakis
` (9 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
From: Dimitri Daskalakis <daskald@meta.com>
Convert SR-IOV-specific is_physfn / is_virtfn reads in the PowerPC
PCI code to use pci_is_sriov_physfn() / pci_is_sriov_virtfn(). These
call sites are all SR-IOV-specific: they guard SR-IOV state
dereferences, VF PE management, or sit inside #ifdef CONFIG_PCI_IOV
blocks. Converting them keeps SR-IOV semantics intact once is_physfn
and is_virtfn widen to cover any virtualization type.
Files touched:
arch/powerpc/kernel/pci_dn.c
arch/powerpc/platforms/powernv/pci-ioda.c
arch/powerpc/platforms/powernv/pci-sriov.c
arch/powerpc/platforms/pseries/eeh_pseries.c
arch/powerpc/platforms/pseries/setup.c
No functional changes.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
arch/powerpc/kernel/pci_dn.c | 4 ++--
arch/powerpc/platforms/powernv/pci-ioda.c | 6 +++---
arch/powerpc/platforms/powernv/pci-sriov.c | 10 +++++-----
arch/powerpc/platforms/pseries/eeh_pseries.c | 8 ++++----
arch/powerpc/platforms/pseries/setup.c | 4 ++--
5 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index a7b664befed2..cf44ec368a36 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -176,7 +176,7 @@ struct pci_dn *add_sriov_vf_pdns(struct pci_dev *pdev)
int i;
/* Only support IOV for now */
- if (WARN_ON(!pdev->is_physfn))
+ if (WARN_ON(!pci_is_sriov_physfn(pdev)))
return NULL;
/* Check if VFs have been populated */
@@ -221,7 +221,7 @@ void remove_sriov_vf_pdns(struct pci_dev *pdev)
int i;
/* Only support IOV PF for now */
- if (WARN_ON(!pdev->is_physfn))
+ if (WARN_ON(!pci_is_sriov_physfn(pdev)))
return;
/* Check if VFs have been populated */
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 32ecbc46e74b..63eacc8001fe 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -987,7 +987,7 @@ static void pnv_pci_ioda_dma_dev_setup(struct pci_dev *pdev)
pe = pnv_pci_bdfn_to_pe(phb, pci_dev_id(pdev));
if (!pe) {
/* VF PEs should be pre-configured in pnv_pci_sriov_enable() */
- if (WARN_ON(pdev->is_virtfn))
+ if (WARN_ON(pci_is_sriov_virtfn(pdev)))
return;
pnv_pci_configure_bus(pdev->bus);
@@ -2379,7 +2379,7 @@ static void pnv_pci_release_device(struct pci_dev *pdev)
struct pnv_ioda_pe *pe;
/* The VF PE state is torn down when sriov_disable() is called */
- if (pdev->is_virtfn)
+ if (pci_is_sriov_virtfn(pdev))
return;
if (!pdn || pdn->pe_number == IODA_INVALID_PE)
@@ -2391,7 +2391,7 @@ static void pnv_pci_release_device(struct pci_dev *pdev)
* the iov state at probe time since we need to fiddle with the IOV
* resources.
*/
- if (pdev->is_physfn)
+ if (pci_is_sriov_physfn(pdev))
kfree(pdev->dev.archdata.iov_data);
#endif
diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c b/arch/powerpc/platforms/powernv/pci-sriov.c
index 7105a573aec4..1113488f4372 100644
--- a/arch/powerpc/platforms/powernv/pci-sriov.c
+++ b/arch/powerpc/platforms/powernv/pci-sriov.c
@@ -225,7 +225,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
{
- if (pdev->is_virtfn) {
+ if (pci_is_sriov_virtfn(pdev)) {
struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
/*
@@ -235,7 +235,7 @@ void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
*/
pe->pdev = pdev;
WARN_ON(!(pe->flags & PNV_IODA_PE_VF));
- } else if (pdev->is_physfn) {
+ } else if (pci_is_sriov_physfn(pdev)) {
/*
* For PFs adjust their allocated IOV resources to match what
* the PHB can support using its M64 BAR table.
@@ -479,7 +479,7 @@ static void pnv_ioda_release_vf_PE(struct pci_dev *pdev)
phb = pci_bus_to_pnvhb(pdev->bus);
- if (!pdev->is_physfn)
+ if (!pci_is_sriov_physfn(pdev))
return;
/* FIXME: Use pnv_ioda_release_pe()? */
@@ -508,7 +508,7 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
u16 num_vfs;
int i;
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return -EINVAL;
iov = pnv_iov_get(dev);
@@ -620,7 +620,7 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
struct pnv_iov_data *iov;
struct pci_dn *pdn;
- if (!pdev->is_physfn)
+ if (!pci_is_sriov_physfn(pdev))
return;
phb = pci_bus_to_pnvhb(pdev->bus);
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index b12ef382fec7..32030ac9be51 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -54,7 +54,7 @@ static void pseries_pcibios_bus_add_device(struct pci_dev *pdev)
dev_dbg(&pdev->dev, "EEH: Setting up device\n");
#ifdef CONFIG_PCI_IOV
- if (pdev->is_virtfn) {
+ if (pci_is_sriov_virtfn(pdev)) {
pdn->device_id = pdev->device;
pdn->vendor_id = pdev->vendor;
pdn->class_code = pdev->class;
@@ -68,7 +68,7 @@ static void pseries_pcibios_bus_add_device(struct pci_dev *pdev)
#endif
pseries_eeh_init_edev(pdn);
#ifdef CONFIG_PCI_IOV
- if (pdev->is_virtfn) {
+ if (pci_is_sriov_virtfn(pdev)) {
/*
* FIXME: This really should be handled by choosing the right
* parent PE in pseries_eeh_init_edev().
@@ -731,7 +731,7 @@ static int pseries_call_allow_unfreeze(struct eeh_dev *edev)
if (!vf_pe_array)
return -ENOMEM;
if (pci_num_vf(edev->physfn ? edev->physfn : edev->pdev)) {
- if (edev->pdev->is_physfn) {
+ if (pci_is_sriov_physfn(edev->pdev)) {
cur_vfs = pci_num_vf(edev->pdev);
pdn = eeh_dev_to_pdn(edev);
parent = pdn->parent;
@@ -779,7 +779,7 @@ static int pseries_notify_resume(struct eeh_dev *edev)
if (rtas_function_token(RTAS_FN_IBM_OPEN_SRIOV_ALLOW_UNFREEZE) == RTAS_UNKNOWN_SERVICE)
return -EINVAL;
- if (edev->pdev->is_physfn || edev->pdev->is_virtfn)
+ if (pci_is_sriov_physfn(edev->pdev) || pci_is_sriov_virtfn(edev->pdev))
return pseries_call_allow_unfreeze(edev);
return 0;
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 50b26ed8432d..8165ae9adbd6 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -778,7 +778,7 @@ static void pseries_pci_fixup_iov_resources(struct pci_dev *pdev)
const int *indexes;
struct device_node *dn = pci_device_to_OF_node(pdev);
- if (!pdev->is_physfn)
+ if (!pci_is_sriov_physfn(pdev))
return;
/*Firmware must support open sriov otherwise don't configure*/
indexes = of_get_property(dn, "ibm,open-sriov-vf-bar-info", NULL);
@@ -799,7 +799,7 @@ static resource_size_t pseries_pci_iov_resource_alignment(struct pci_dev *pdev,
if (!reg)
return pci_iov_resource_size(pdev, resno);
- if (!pdev->is_physfn)
+ if (!pci_is_sriov_physfn(pdev))
return 0;
return pseries_get_iov_fw_value(pdev,
resno - PCI_IOV_RESOURCES,
--
2.52.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [RFC 04/12] PCI: Convert arch/powerpc to pci_is_sriov_* helpers
2026-06-04 15:01 ` [RFC 04/12] PCI: Convert arch/powerpc " Dimitri Daskalakis
@ 2026-06-04 17:26 ` sashiko-bot
0 siblings, 0 replies; 25+ messages in thread
From: sashiko-bot @ 2026-06-04 17:26 UTC (permalink / raw)
To: Dimitri Daskalakis; +Cc: kvm, linux-pci
Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
New issues:
- [High] Missed conversions of SR-IOV specific is_virtfn and is_physfn checks.
Pre-existing issues:
- [High] Unconditional NULL pointer dereference in pnv_pci_sriov_disable() if PE allocation fails.
- [High] Unchecked NULL pointer dereference in pnv_pci_ioda_fixup_iov() for VFs with failed configuration.
- [High] Failure to unshift IOV BAR resources in pnv_pci_sriov_disable() if the first VF's PE failed to configure.
- [High] Lockless list traversal of phb->ioda.pe_list exposes traversal to concurrent modifications.
--
commit 5e49eb3e31cb4ac469a301b263f891c4ba0713e3
Author: Dimitri Daskalakis <daskald@meta.com>
PCI: Convert arch/powerpc to pci_is_sriov_* helpers
This patch converts SR-IOV specific is_physfn and is_virtfn reads in the
PowerPC PCI code to use pci_is_sriov_physfn() and pci_is_sriov_virtfn().
This is intended to keep SR-IOV semantics intact once those flags are
widened to cover other virtualization types.
> Convert SR-IOV-specific is_physfn / is_virtfn reads in the PowerPC
> PCI code to use pci_is_sriov_physfn() / pci_is_sriov_virtfn(). These
> call sites are all SR-IOV-specific: they guard SR-IOV state
> dereferences, VF PE management, or sit inside #ifdef CONFIG_PCI_IOV
> blocks.
[Severity: High]
The commit message states that all SR-IOV specific is_physfn and is_virtfn
reads are converted, but did we miss some checks in other PowerPC PCI files?
For instance, in arch/powerpc/kernel/eeh.c:eeh_break_device():
if (pdev->is_virtfn) {
...
pos = pci_find_ext_capability(pdev->physfn, PCI_EXT_CAP_ID_SRIOV);
...
If a non-SR-IOV VF (like SIOV) has is_virtfn set, wouldn't this read from
pdev->physfn (which might be NULL) or fail to find the SR-IOV capability
(pos = 0) and corrupt the PCI config space at offset 8 (PCI_SRIOV_CTRL)?
Similarly, there appear to be missed conversions guarding SR-IOV logic in:
- arch/powerpc/platforms/powernv/eeh-powernv.c:pnv_eeh_get_upstream_pe()
- arch/powerpc/kernel/eeh_sysfs.c:eeh_notify_resume_add()
- arch/powerpc/kernel/eeh_sysfs.c:eeh_notify_resume_remove()
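The "offset 8" concern raised for eeh_break_device() can be sketched as below. This is a userspace model based only on the description above, not on the actual function body: mock_sriov_ctrl_offset() is hypothetical, and MOCK_SRIOV_CTRL mirrors the kernel's PCI_SRIOV_CTRL value of 0x08. A capability position of 0 stands for "capability not found", as returned by pci_find_ext_capability().

```c
#include <stdint.h>

#define MOCK_SRIOV_CTRL	0x08	/* mirrors PCI_SRIOV_CTRL */

/*
 * Return the config-space offset of the SR-IOV Control register, or -1
 * when the capability lookup failed (cap_pos == 0). Without the check,
 * a failed lookup would yield offset 0x08 in the standard header.
 */
int mock_sriov_ctrl_offset(uint16_t cap_pos)
{
	if (!cap_pos)
		return -1;
	return cap_pos + MOCK_SRIOV_CTRL;
}
```

A caller that checks for a negative return never touches config space when the device has no SR-IOV capability, which is the case a non-SR-IOV VF would hit.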
> diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c b/arch/powerpc/platforms/powernv/pci-sriov.c
> index 7105a573aec42..1113488f43720 100644
> --- a/arch/powerpc/platforms/powernv/pci-sriov.c
> +++ b/arch/powerpc/platforms/powernv/pci-sriov.c
> @@ -225,7 +225,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
>
> void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
> {
> - if (pdev->is_virtfn) {
> + if (pci_is_sriov_virtfn(pdev)) {
> struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
>
> /*
> @@ -235,7 +235,7 @@ void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
> */
> pe->pdev = pdev;
[Severity: High]
This is a pre-existing issue, but can pnv_ioda_get_pe() return NULL here,
leading to a NULL pointer dereference when assigning pe->pdev = pdev?
During SR-IOV enablement, if pnv_ioda_configure_pe() fails for a specific VF
in pnv_ioda_setup_vf_PE(), the error is ignored and the VF's PE number
remains IODA_INVALID_PE. When the PCI core subsequently adds the VF and calls
this function, pnv_ioda_get_pe() will return NULL because the PE number is
invalid.
[ ... ]
> @@ -479,7 +479,7 @@ static void pnv_ioda_release_vf_PE(struct pci_dev *pdev)
>
> phb = pci_bus_to_pnvhb(pdev->bus);
>
> - if (!pdev->is_physfn)
> + if (!pci_is_sriov_physfn(pdev))
> return;
>
> /* FIXME: Use pnv_ioda_release_pe()? */
> list_for_each_entry_safe(pe, pe_n, &phb->ioda.pe_list, list) {
[Severity: High]
This is a pre-existing issue, but does this lockless traversal of
phb->ioda.pe_list expose us to concurrent modifications?
The lock phb->ioda.pe_list_mutex is only held briefly during list_del()
inside this loop. However, other PCI hotplug paths (such as
pnv_ioda_setup_PE()) add entries to this list while holding the mutex.
If a concurrent hotplug event mutates the list during this traversal,
could it corrupt the cached pe_n pointer or list linkages, leading to
a use-after-free?
[ ... ]
[Severity: High]
This is a pre-existing issue, but I noticed a potential NULL pointer
dereference in pnv_pci_sriov_disable() in the same file if PE allocation
fails:
arch/powerpc/platforms/powernv/pci-sriov.c:pnv_pci_sriov_disable() {
...
base_pe = iov->vf_pe_arr[0].pe_number;
...
}
In pnv_pci_sriov_enable(), if pnv_ioda_alloc_pe() fails, it returns -EBUSY
and iov->vf_pe_arr remains NULL. The generic PCI core handles this failure
by calling pcibios_sriov_disable(), which invokes pnv_pci_sriov_disable().
This unconditionally accesses iov->vf_pe_arr[0], which would crash.
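A defensive read of the base PE number might look like the following sketch. The types and mock_get_base_pe() are hypothetical simplifications of struct pnv_iov_data and its vf_pe_arr member, and the choice of -ENOENT is an assumption made for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced versions of the powernv PE bookkeeping. */
struct mock_pe {
	int pe_number;
};

struct mock_iov_data {
	struct mock_pe *vf_pe_arr;	/* NULL if PE allocation failed */
};

/*
 * Fetch the base PE number only when the PE array exists. On failure
 * *base_pe is left untouched so the caller can skip the teardown steps
 * that depend on it.
 */
int mock_get_base_pe(const struct mock_iov_data *iov, int *base_pe)
{
	if (!iov || !iov->vf_pe_arr)
		return -ENOENT;
	*base_pe = iov->vf_pe_arr[0].pe_number;
	return 0;
}
```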
[Severity: High]
This is another pre-existing issue in pnv_pci_sriov_disable(). Does the
error teardown path fail to unshift IOV BAR resources if the first VF's
PE failed to configure?
arch/powerpc/platforms/powernv/pci-sriov.c:pnv_pci_sriov_disable() {
...
base_pe = iov->vf_pe_arr[0].pe_number;
...
pnv_pci_vf_resource_shift(pdev, -base_pe);
...
}
During SR-IOV enablement, pnv_pci_vf_resource_shift() shifts the IOV BARs
based on the allocated base PE number. If pnv_ioda_configure_pe() fails
for the first VF (VF 0), it zero-initializes the PE structure. During
error teardown, base_pe is read as 0, resulting in a call to
pnv_pci_vf_resource_shift(pdev, -0). This fails to unshift the IOV BARs
back to their original addresses, which could leak the shifted MMIO windows.
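One way to avoid depending on vf_pe_arr[0] during teardown is to remember the shift that was actually applied. The sketch below is a userspace illustration of that idea under stated assumptions: struct mock_iov_bar, its applied_shift field and both functions are hypothetical and do not exist in pci-sriov.c.

```c
/* Hypothetical model of one shifted IOV BAR. */
struct mock_iov_bar {
	long start;		/* current IOV BAR start address */
	int applied_shift;	/* net shift applied so far, in VF-sized units */
};

/* Move the BAR by offset VF-sized slots and record the net shift. */
void mock_resource_shift(struct mock_iov_bar *bar, int offset, long vf_size)
{
	bar->start += (long)offset * vf_size;
	bar->applied_shift += offset;
}

/*
 * Undo whatever shift is currently applied, independent of any PE state
 * that may have been zeroed by a failed configuration.
 */
void mock_unshift(struct mock_iov_bar *bar, long vf_size)
{
	mock_resource_shift(bar, -bar->applied_shift, vf_size);
}
```

With a BAR at 0x1000 shifted by 3 slots of 0x100 bytes, mock_unshift() restores the start to 0x1000 even if the PE number originally used for the shift is no longer available.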
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260604150153.3619662-1-dimitri.daskalakis1@gmail.com?part=4
^ permalink raw reply [flat|nested] 25+ messages in thread
* [RFC 05/12] PCI: Convert s390/pci/pci.c to pci_is_sriov_* helpers
2026-06-04 15:01 [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Dimitri Daskalakis
` (3 preceding siblings ...)
2026-06-04 15:01 ` [RFC 04/12] PCI: Convert arch/powerpc " Dimitri Daskalakis
@ 2026-06-04 15:01 ` Dimitri Daskalakis
2026-06-04 15:01 ` [RFC 06/12] PCI: Convert vfio_pci_core.c " Dimitri Daskalakis
` (8 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
From: Dimitri Daskalakis <daskald@meta.com>
No functional changes.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
arch/s390/pci/pci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 39bd2adfc240..5e6f600bf60b 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -611,7 +611,7 @@ int pcibios_device_add(struct pci_dev *pdev)
/* The pdev has a reference to the zdev via its bus */
zpci_zdev_get(zdev);
- if (pdev->is_physfn)
+ if (pci_is_sriov_physfn(pdev))
pdev->no_vf_scan = 1;
zpci_map_resources(pdev);
--
2.52.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [RFC 06/12] PCI: Convert vfio_pci_core.c to pci_is_sriov_* helpers
2026-06-04 15:01 [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Dimitri Daskalakis
` (4 preceding siblings ...)
2026-06-04 15:01 ` [RFC 05/12] PCI: Convert s390/pci/pci.c " Dimitri Daskalakis
@ 2026-06-04 15:01 ` Dimitri Daskalakis
2026-06-04 15:01 ` [RFC 07/12] PCI: Convert xen-pciback and pci-driver " Dimitri Daskalakis
` (7 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
From: Dimitri Daskalakis <daskald@meta.com>
No functional changes.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
drivers/vfio/pci/vfio_pci_core.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 3f8d093aacf8..ad8069612cb2 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1856,7 +1856,7 @@ int vfio_pci_core_match_token_uuid(struct vfio_device *core_vdev,
*
* If the VF token is provided but unused, an error is generated.
*/
- if (vdev->pdev->is_virtfn) {
+ if (pci_is_sriov_virtfn(vdev->pdev)) {
struct vfio_pci_core_device *pf_vdev = vdev->sriov_pf_core_dev;
bool match;
@@ -1979,13 +1979,13 @@ static int vfio_pci_bus_notifier(struct notifier_block *nb,
struct pci_dev *physfn = pci_physfn(pdev);
if (action == BUS_NOTIFY_ADD_DEVICE &&
- pdev->is_virtfn && physfn == vdev->pdev) {
+ pci_is_sriov_virtfn(pdev) && physfn == vdev->pdev) {
pci_info(vdev->pdev, "Captured SR-IOV VF %s driver_override\n",
pci_name(pdev));
WARN_ON(device_set_driver_override(&pdev->dev,
vdev->vdev.ops->name));
} else if (action == BUS_NOTIFY_BOUND_DRIVER &&
- pdev->is_virtfn && physfn == vdev->pdev) {
+ pci_is_sriov_virtfn(pdev) && physfn == vdev->pdev) {
struct pci_driver *drv = pci_dev_driver(pdev);
if (drv && drv != pci_dev_driver(vdev->pdev))
@@ -2005,7 +2005,7 @@ static int vfio_pci_vf_init(struct vfio_pci_core_device *vdev)
struct pci_dev *physfn;
int ret;
- if (pdev->is_virtfn) {
+ if (pci_is_sriov_virtfn(pdev)) {
/*
* If this VF was created by our vfio_pci_core_sriov_configure()
* then we can find the PF vfio_pci_core_device now, and due to
@@ -2025,7 +2025,7 @@ static int vfio_pci_vf_init(struct vfio_pci_core_device *vdev)
}
/* Not a SRIOV PF */
- if (!pdev->is_physfn)
+ if (!pci_is_sriov_physfn(pdev))
return 0;
vdev->vf_token = kzalloc_obj(*vdev->vf_token);
@@ -2166,7 +2166,7 @@ int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev)
return -EBUSY;
}
- if (pci_is_root_bus(pdev->bus) || pdev->is_virtfn) {
+ if (pci_is_root_bus(pdev->bus) || pci_is_sriov_virtfn(pdev)) {
ret = vfio_assign_device_set(&vdev->vdev, vdev);
} else if (!pci_probe_reset_slot(pdev->slot)) {
ret = vfio_assign_device_set(&vdev->vdev, pdev->slot);
--
2.52.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [RFC 07/12] PCI: Convert xen-pciback and pci-driver to pci_is_sriov_* helpers
2026-06-04 15:01 [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Dimitri Daskalakis
` (5 preceding siblings ...)
2026-06-04 15:01 ` [RFC 06/12] PCI: Convert vfio_pci_core.c " Dimitri Daskalakis
@ 2026-06-04 15:01 ` Dimitri Daskalakis
2026-06-04 15:11 ` Juergen Gross
2026-06-04 15:24 ` sashiko-bot
2026-06-04 15:01 ` [RFC 08/12] PCI: Add is_sriov bit to struct pci_dev Dimitri Daskalakis
` (6 subsequent siblings)
13 siblings, 2 replies; 25+ messages in thread
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
From: Dimitri Daskalakis <daskald@meta.com>
No functional changes.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
drivers/pci/pci-driver.c | 4 ++--
drivers/xen/xen-pciback/pci_stub.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index d10ece0889f0..926f80bccd70 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -357,7 +357,7 @@ static void local_pci_probe_callback(struct work_struct *work)
static bool pci_physfn_is_probed(struct pci_dev *dev)
{
#ifdef CONFIG_PCI_IOV
- return dev->is_virtfn && dev->physfn->is_probed;
+ return pci_is_sriov_virtfn(dev) && dev->physfn->is_probed;
#else
return false;
#endif
@@ -453,7 +453,7 @@ static int __pci_device_probe(struct pci_driver *drv, struct pci_dev *pci_dev)
#ifdef CONFIG_PCI_IOV
static inline bool pci_device_can_probe(struct pci_dev *pdev)
{
- return (!pdev->is_virtfn || pdev->physfn->sriov->drivers_autoprobe ||
+ return (!pci_is_sriov_virtfn(pdev) || pdev->physfn->sriov->drivers_autoprobe ||
device_has_driver_override(&pdev->dev));
}
#else
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 79a2b5dfd694..fd5a13f79fcf 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -1695,7 +1695,7 @@ static int pci_stub_notifier(struct notifier_block *nb,
if (action != BUS_NOTIFY_UNBIND_DRIVER)
return NOTIFY_DONE;
- if (!pdev->is_physfn)
+ if (!pci_is_sriov_physfn(pdev))
return NOTIFY_DONE;
for (;;) {
--
2.52.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [RFC 07/12] PCI: Convert xen-pciback and pci-driver to pci_is_sriov_* helpers
2026-06-04 15:01 ` [RFC 07/12] PCI: Convert xen-pciback and pci-driver " Dimitri Daskalakis
@ 2026-06-04 15:11 ` Juergen Gross
2026-06-04 15:24 ` sashiko-bot
1 sibling, 0 replies; 25+ messages in thread
From: Juergen Gross @ 2026-06-04 15:11 UTC (permalink / raw)
To: Dimitri Daskalakis, Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Stefano Stabellini, Oleksandr Tyshchenko,
Keith Busch, Alexander Duyck, Jakub Kicinski, Dimitri Daskalakis,
linuxppc-dev, linux-s390, kvm, xen-devel
[-- Attachment #1.1.1: Type: text/plain, Size: 283 bytes --]
On 04.06.26 17:01, Dimitri Daskalakis wrote:
> From: Dimitri Daskalakis <daskald@meta.com>
>
> No functional changes.
>
> Assisted-by: Claude:claude-opus-4.7
> Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Juergen
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC 07/12] PCI: Convert xen-pciback and pci-driver to pci_is_sriov_* helpers
2026-06-04 15:01 ` [RFC 07/12] PCI: Convert xen-pciback and pci-driver " Dimitri Daskalakis
2026-06-04 15:11 ` Juergen Gross
@ 2026-06-04 15:24 ` sashiko-bot
1 sibling, 0 replies; 25+ messages in thread
From: sashiko-bot @ 2026-06-04 15:24 UTC (permalink / raw)
To: Dimitri Daskalakis; +Cc: kvm, linux-pci
Thank you for your contribution! Sashiko AI review found 2 potential issues to consider:
New issues:
- [Medium] Incomplete implementation: missed conversion of dev->is_virtfn in vpci.c.
Pre-existing issues:
- [High] Concurrent driver unbind can cause a use-after-free of `pcistub_device` in `pci_stub_notifier`.
--
commit 88d02306e6f1bfe4b75d5df018b4e5de4f2ff35a
Author: Dimitri Daskalakis <daskald@meta.com>
PCI: Convert xen-pciback and pci-driver to pci_is_sriov_* helpers
No functional changes.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
[Severity: Medium]
The commit message explicitly states its scope is to convert xen-pciback and
pci-driver to the new pci_is_sriov_* helpers. Did we miss converting
dev->is_virtfn in drivers/xen/xen-pciback/vpci.c?
For example, in __xen_pcibk_add_pci_dev() there are still direct accesses:
drivers/xen/xen-pciback/vpci.c:__xen_pcibk_add_pci_dev() {
...
if (!dev->is_virtfn || func) {
...
if (t->dev->is_virtfn && !PCI_FUNC(t->dev->devfn))
...
}
Leaving these direct accesses unconverted means they will not distinguish
SR-IOV VFs from SIOV VFs once the helpers are updated to check dev->is_sriov
in later patches.
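To illustrate the concern, here is a minimal userspace sketch, not kernel
code. It assumes the tightened helper semantics from the later patch in this
series (is_virtfn && is_sriov) and that an SIOV SDI would set is_virtfn and
is_siov but not is_sriov. struct mock_pci_dev and both function names are
hypothetical stand-ins for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three struct pci_dev bits discussed here. */
struct mock_pci_dev {
	unsigned int is_virtfn:1;
	unsigned int is_sriov:1;
	unsigned int is_siov:1;
};

/* What an unconverted call site evaluates: the raw bit only. */
static bool mock_raw_is_virtfn(const struct mock_pci_dev *dev)
{
	return dev->is_virtfn;
}

/* Mirrors the tightened helper: the device must also be marked SR-IOV. */
static bool mock_is_sriov_virtfn(const struct mock_pci_dev *dev)
{
	return dev->is_virtfn && dev->is_sriov;
}
```

Under these assumptions, a device with is_virtfn and is_siov set passes the
raw check but not the helper, so an unconverted call site would treat it as
an SR-IOV VF.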
> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
> index 79a2b5dfd6941..fd5a13f79fcf4 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -1695,7 +1695,7 @@ static int pci_stub_notifier(struct notifier_block *nb,
> if (action != BUS_NOTIFY_UNBIND_DRIVER)
> return NOTIFY_DONE;
>
> - if (!pdev->is_physfn)
> + if (!pci_is_sriov_physfn(pdev))
> return NOTIFY_DONE;
>
> for (;;) {
[Severity: High]
This is a pre-existing issue, but is there a use-after-free risk here if a
concurrent unbind occurs?
find_vfs() searches the pcistub_devices list and returns a pcistub_device
pointer without taking a reference count via pcistub_device_get():
drivers/xen/xen-pciback/pci_stub.c:pci_stub_notifier() {
...
for (;;) {
struct pcistub_device *psdev = find_vfs(pdev);
if (!psdev)
break;
device_release_driver(&psdev->dev->dev);
}
...
}
If a concurrent thread unbinds the VF (e.g., via sysfs unbind),
pcistub_remove() executing under the VF's device lock will remove the device
from the list and drop the final reference using pcistub_device_put(), freeing
psdev before the notifier has a chance to execute.
The notifier thread would then dereference the freed psdev memory when
evaluating &psdev->dev->dev in device_release_driver().
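The ordering described above can be modelled in a small single-threaded
userspace sketch. This is only an illustration of the reference-counting
pattern, not the xen-pciback code: struct mock_psdev and every function here
are hypothetical, a "freed" flag stands in for the actual free, and the
concurrent unbind is simulated by calling mock_put() between the lookup and
the use.

```c
#include <assert.h>

/* Hypothetical stand-in for a refcounted list entry. */
struct mock_psdev {
	int refcount;
	int freed;	/* set when the last reference is dropped */
};

static void mock_get(struct mock_psdev *p)
{
	p->refcount++;
}

static void mock_put(struct mock_psdev *p)
{
	if (--p->refcount == 0)
		p->freed = 1;
}

/* Lookup that hands back the entry without pinning it. */
static struct mock_psdev *mock_lookup_unreferenced(struct mock_psdev *entry)
{
	return entry;
}

/* Lookup that takes a reference the caller must drop with mock_put(). */
static struct mock_psdev *mock_lookup_referenced(struct mock_psdev *entry)
{
	mock_get(entry);
	return entry;
}
```

In this model, an entry returned by the unreferenced lookup is already marked
freed once the simulated remove path drops its reference, while the referenced
lookup keeps it alive until the caller's own mock_put(). Whether the real
notifier needs such a change depends on locking details not shown in this
thread.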
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260604150153.3619662-1-dimitri.daskalakis1@gmail.com?part=7
^ permalink raw reply [flat|nested] 25+ messages in thread
* [RFC 08/12] PCI: Add is_sriov bit to struct pci_dev
2026-06-04 15:01 [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Dimitri Daskalakis
` (6 preceding siblings ...)
2026-06-04 15:01 ` [RFC 07/12] PCI: Convert xen-pciback and pci-driver " Dimitri Daskalakis
@ 2026-06-04 15:01 ` Dimitri Daskalakis
2026-06-04 15:01 ` [RFC 09/12] PCI: Add helper to compute VF Routing ID to pci.h Dimitri Daskalakis
` (5 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
From: Dimitri Daskalakis <daskald@meta.com>
We need a way to disambiguate the virtualization type of a PF/VF.
PFs may support multiple types of virtualization, while a VF should
only support one.
Tighten pci_is_sriov_physfn() / pci_is_sriov_virtfn() to ensure the
is_sriov bit is set. This allows the existing is_physfn/is_virtfn
bits to be agnostic of virtualization type.
No functional changes for SR-IOV.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
arch/s390/pci/pci_iov.c | 1 +
drivers/pci/iov.c | 4 ++++
include/linux/pci.h | 5 +++--
3 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/arch/s390/pci/pci_iov.c b/arch/s390/pci/pci_iov.c
index 13050ce5c3e9..82e9ef1f132f 100644
--- a/arch/s390/pci/pci_iov.c
+++ b/arch/s390/pci/pci_iov.c
@@ -53,6 +53,7 @@ static int zpci_iov_link_virtfn(struct pci_dev *pdev, struct pci_dev *virtfn, in
return rc;
virtfn->is_virtfn = 1;
+ virtfn->is_sriov = 1;
virtfn->multifunction = 0;
virtfn->physfn = pci_dev_get(pdev);
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 5de26057b99a..4aed4f6a42c3 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -326,6 +326,7 @@ static struct pci_dev *pci_iov_scan_device(struct pci_dev *dev, int id,
virtfn->vendor = dev->vendor;
virtfn->device = iov->vf_device;
virtfn->is_virtfn = 1;
+ virtfn->is_sriov = 1;
virtfn->physfn = pci_dev_get(dev);
virtfn->no_command_memory = 1;
@@ -897,6 +898,7 @@ static int sriov_init(struct pci_dev *dev, int pos)
iov->dev = dev;
dev->sriov = iov;
+ dev->is_sriov = 1;
dev->is_physfn = 1;
rc = compute_max_vf_buses(dev);
if (rc)
@@ -906,6 +908,7 @@ static int sriov_init(struct pci_dev *dev, int pos)
fail_max_buses:
dev->sriov = NULL;
+ dev->is_sriov = 0;
dev->is_physfn = 0;
failed:
for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
@@ -926,6 +929,7 @@ static void sriov_release(struct pci_dev *dev)
kfree(dev->sriov);
dev->sriov = NULL;
+ dev->is_sriov = 0;
}
static void sriov_restore_vf_rebar_state(struct pci_dev *dev)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 28892243f49f..ca84f66425b2 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -479,6 +479,7 @@ struct pci_dev {
unsigned int state_saved:1;
unsigned int is_physfn:1;
unsigned int is_virtfn:1;
+ unsigned int is_sriov:1; /* SR-IOV is enabled on this device (PF or VF) */
unsigned int is_hotplug_bridge:1;
unsigned int is_pciehp:1;
unsigned int shpc_managed:1; /* SHPC owned by shpchp */
@@ -606,12 +607,12 @@ static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
static inline bool pci_is_sriov_physfn(const struct pci_dev *dev)
{
- return dev->is_physfn;
+ return dev->is_physfn && dev->is_sriov;
}
static inline bool pci_is_sriov_virtfn(const struct pci_dev *dev)
{
- return dev->is_virtfn;
+ return dev->is_virtfn && dev->is_sriov;
}
struct pci_dev *pci_alloc_dev(struct pci_bus *bus);
--
2.52.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [RFC 09/12] PCI: Add helper to compute VF Routing ID to pci.h
2026-06-04 15:01 [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Dimitri Daskalakis
` (7 preceding siblings ...)
2026-06-04 15:01 ` [RFC 08/12] PCI: Add is_sriov bit to struct pci_dev Dimitri Daskalakis
@ 2026-06-04 15:01 ` Dimitri Daskalakis
2026-06-04 15:01 ` [RFC 10/12] PCI: Add Scalable I/O Virtualization data structure definitions Dimitri Daskalakis
` (4 subsequent siblings)
13 siblings, 0 replies; 25+ messages in thread
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
From: Dimitri Daskalakis <daskald@meta.com>
The VF RID computation is identical for SR-IOV and SIOV. Add a common
helper so we can share the logic across both.
No functional changes.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
drivers/pci/iov.c | 8 ++++----
drivers/pci/pci.h | 12 ++++++++++++
2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 4aed4f6a42c3..5d65413ce98d 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -25,16 +25,16 @@ int pci_iov_virtfn_bus(struct pci_dev *dev, int vf_id)
{
if (!pci_is_sriov_physfn(dev))
return -EINVAL;
- return dev->bus->number + ((dev->devfn + dev->sriov->offset +
- dev->sriov->stride * vf_id) >> 8);
+ return pci_virtfn_routing_id(dev, dev->sriov->offset,
+ dev->sriov->stride, vf_id) >> 8;
}
int pci_iov_virtfn_devfn(struct pci_dev *dev, int vf_id)
{
if (!pci_is_sriov_physfn(dev))
return -EINVAL;
- return (dev->devfn + dev->sriov->offset +
- dev->sriov->stride * vf_id) & 0xff;
+ return pci_virtfn_routing_id(dev, dev->sriov->offset,
+ dev->sriov->stride, vf_id) & 0xff;
}
EXPORT_SYMBOL_GPL(pci_iov_virtfn_devfn);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 73b913bcb87a..45411960fd2e 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -1017,6 +1017,18 @@ static inline int pci_resource_num_to_vf_bar(int resno)
}
#endif /* CONFIG_PCI_IOV */
+#if defined(CONFIG_PCI_IOV) || defined(CONFIG_PCI_SIOV)
+/*
+ * Compute the Routing ID (bus/devfn) for a VF or SDI under @pf, given the
+ * capability's offset and stride.
+ */
+static inline u16 pci_virtfn_routing_id(struct pci_dev *pf, u16 offset,
+ u16 stride, int id)
+{
+ return (pf->bus->number << 8) + pf->devfn + offset + stride * id;
+}
+#endif
+
#ifdef CONFIG_PCIE_TPH
void pci_restore_tph_state(struct pci_dev *dev);
void pci_save_tph_state(struct pci_dev *dev);
--
2.52.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [RFC 10/12] PCI: Add Scalable I/O Virtualization data structure definitions
2026-06-04 15:01 [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Dimitri Daskalakis
` (8 preceding siblings ...)
2026-06-04 15:01 ` [RFC 09/12] PCI: Add helper to compute VF Routing ID to pci.h Dimitri Daskalakis
@ 2026-06-04 15:01 ` Dimitri Daskalakis
2026-06-04 15:23 ` sashiko-bot
2026-06-04 15:01 ` [RFC 11/12] PCI: Initialize and release SIOV capability Dimitri Daskalakis
` (3 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
From: Dimitri Daskalakis <daskald@meta.com>
Define the PCIe SIOV extended capability registers per the PCIe 7.0
spec, and introduce the kernel-internal data structures needed to track
SIOV state on a Physical Function. PCI-SIG members can access the spec
here https://members.pcisig.com/wg/PCI-SIG/document/previewpdf/22464.
The PCI_SIOV kconfig selects PCI_ATS rather than attempting to decouple
the sriov/physfn union within struct pci_dev from CONFIG_PCI_ATS. If
desired this can be done in the future, since ATS is optional
for SR-IOV and SIOV.
Inspired by struct pci_sriov, struct pci_siov records the
capability position, total SDI count, routing ID offset/stride, and
driver-configurable limits.
Add an is_siov bit to struct pci_dev along with helpers to identify
SIOV PFs/VFs.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
drivers/pci/Kconfig | 11 +++++++++++
drivers/pci/pci.h | 13 +++++++++++++
include/linux/pci.h | 16 +++++++++++++++-
include/uapi/linux/pci_regs.h | 12 +++++++++++-
4 files changed, 50 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 33c88432b728..930231835c40 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -164,6 +164,17 @@ config PCI_IOV
If unsure, say N.
+config PCI_SIOV
+ bool "PCI Scalable IOV support"
+ select PCI_ATS
+ help
+ Scalable I/O Virtualization is a PCIe feature that allows devices
+ to expose lightweight Scalable Device Interfaces (SDIs). Unlike
+ SR-IOV Virtual Functions, SDIs have no config space or BARs and
+ rely on software to compose the control path.
+
+ If unsure, say N.
+
config PCI_NPEM
bool "Native PCIe Enclosure Management"
depends on LEDS_CLASS=y
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 45411960fd2e..fd7c04e26c16 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -683,6 +683,19 @@ struct pci_sriov {
bool drivers_autoprobe; /* Auto probing of VFs by driver */
};
+/* Scalable I/O Virtualization */
+struct pci_siov {
+ struct pci_dev *self; /* This PF */
+ u32 cap; /* SIOV Capabilities */
+ u16 pos; /* Capability position */
+ u16 total_SDIs; /* Total SDIs associated with the PF */
+ u16 num_SDIs; /* Number of SDIs currently enabled */
+ u16 offset; /* First SDI Routing ID offset */
+ u16 stride; /* Following SDI stride */
+ u16 driver_max_SDIs;/* Max num SDIs driver supports */
+ u8 max_SDI_buses; /* Max buses consumed by SDIs */
+};
+
#ifdef CONFIG_PCI_DOE
void pci_doe_init(struct pci_dev *pdev);
void pci_doe_destroy(struct pci_dev *pdev);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index ca84f66425b2..eba562474017 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -480,6 +480,7 @@ struct pci_dev {
unsigned int is_physfn:1;
unsigned int is_virtfn:1;
unsigned int is_sriov:1; /* SR-IOV is enabled on this device (PF or VF) */
+ unsigned int is_siov:1; /* SIOV is enabled on this device (PF or VF/SDI) */
unsigned int is_hotplug_bridge:1;
unsigned int is_pciehp:1;
unsigned int shpc_managed:1; /* SHPC owned by shpchp */
@@ -549,6 +550,9 @@ struct pci_dev {
u16 ats_cap; /* ATS Capability offset */
u8 ats_stu; /* ATS Smallest Translation Unit */
#endif
+#ifdef CONFIG_PCI_SIOV
+ struct pci_siov *siov; /* PF: Scalable IOV info */
+#endif
#ifdef CONFIG_PCI_PRI
u16 pri_cap; /* PRI Capability offset */
u32 pri_reqs_alloc; /* Number of PRI requests allocated */
@@ -598,7 +602,7 @@ struct pci_dev {
static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
{
-#ifdef CONFIG_PCI_IOV
+#if defined(CONFIG_PCI_IOV) || defined(CONFIG_PCI_SIOV)
if (dev->is_virtfn)
dev = dev->physfn;
#endif
@@ -615,6 +619,16 @@ static inline bool pci_is_sriov_virtfn(const struct pci_dev *dev)
return dev->is_virtfn && dev->is_sriov;
}
+static inline bool pci_is_siov_physfn(const struct pci_dev *dev)
+{
+ return dev->is_physfn && dev->is_siov;
+}
+
+static inline bool pci_is_siov_virtfn(const struct pci_dev *dev)
+{
+ return dev->is_virtfn && dev->is_siov;
+}
+
struct pci_dev *pci_alloc_dev(struct pci_bus *bus);
#define to_pci_dev(n) container_of(n, struct pci_dev, dev)
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 14f634ab9350..0f81c8c72b05 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -763,7 +763,8 @@
#define PCI_EXT_CAP_ID_DEV3 0x2F /* Device 3 Capability/Control/Status */
#define PCI_EXT_CAP_ID_IDE 0x30 /* Integrity and Data Encryption */
#define PCI_EXT_CAP_ID_PL_64GT 0x31 /* Physical Layer 64.0 GT/s */
-#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_PL_64GT
+#define PCI_EXT_CAP_ID_SIOV 0x38 /* Scalable I/O Virtualization */
+#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_SIOV
#define PCI_EXT_CAP_DSN_SIZEOF 12
#define PCI_EXT_CAP_MCAST_ENDPOINT_SIZEOF 40
@@ -1005,6 +1006,15 @@
#define PCI_SRIOV_VFM_AV 0x3 /* Active.Available */
#define PCI_EXT_CAP_SRIOV_SIZEOF 0x40
+/* Scalable I/O Virtualization */
+#define PCI_SIOV_CAP 0x04 /* SIOV Capabilities */
+#define PCI_SIOV_TOTAL_SDI 0x08 /* Total SDIs */
+#define PCI_SIOV_STATUS 0x0B /* SIOV Status */
+#define PCI_SIOV_STATUS_ENABLED 0x01 /* At least one SDI is enabled */
+#define PCI_SIOV_SDI_OFFSET 0x0C /* First SDI Offset */
+#define PCI_SIOV_SDI_STRIDE 0x0E /* SDI Stride */
+#define PCI_EXT_CAP_SIOV_SIZEOF 0x10
+
#define PCI_LTR_MAX_SNOOP_LAT 0x4
#define PCI_LTR_MAX_NOSNOOP_LAT 0x6
#define PCI_LTR_VALUE_MASK 0x000003ff
--
2.52.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [RFC 10/12] PCI: Add Scalable I/O Virtualization data structure definitions
2026-06-04 15:01 ` [RFC 10/12] PCI: Add Scalable I/O Virtualization data structure definitions Dimitri Daskalakis
@ 2026-06-04 15:23 ` sashiko-bot
0 siblings, 0 replies; 25+ messages in thread
From: sashiko-bot @ 2026-06-04 15:23 UTC (permalink / raw)
To: Dimitri Daskalakis; +Cc: linux-pci, kvm
Thank you for your contribution! Sashiko AI review found 1 potential issue to consider:
- [High] Refactoring of VF routing ID calculation introduces a silent 16-bit truncation that bypasses the maximum bus number bounds check.
--
commit ca9b97e7dff7c499701ce4b55ae9cf14d51bbed7
Author: Dimitri Daskalakis <daskald@meta.com>
PCI: Add Scalable I/O Virtualization data structure definitions
Define the PCIe SIOV extended capability registers per the PCIe 7.0
spec, and introduce the kernel-internal data structures needed to track
SIOV state on a Physical Function. Add an is_siov bit to struct pci_dev
along with helpers to identify SIOV PFs/VFs.
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 45411960fd2ef..fd7c04e26c164 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -683,6 +683,19 @@ struct pci_sriov {
> bool drivers_autoprobe; /* Auto probing of VFs by driver */
> };
>
> +/* Scalable I/O Virtualization */
> +struct pci_siov {
> + struct pci_dev *self; /* This PF */
> + u32 cap; /* SIOV Capabilities */
> + u16 pos; /* Capability position */
> + u16 total_SDIs; /* Total SDIs associated with the PF */
> + u16 num_SDIs; /* Number of SDIs currently enabled */
> + u16 offset; /* First SDI Routing ID offset */
> + u16 stride; /* Following SDI stride */
> + u16 driver_max_SDIs;/* Max num SDIs driver supports */
> + u8 max_SDI_buses; /* Max buses consumed by SDIs */
> +};
> +
[Severity: High]
This patch series relies on the pci_virtfn_routing_id() helper introduced in
an earlier commit ("PCI: Add helper to compute VF Routing ID to pci.h"), but
doesn't this helper introduce a silent truncation issue?
In drivers/pci/pci.h:
static inline u16 pci_virtfn_routing_id(struct pci_dev *pf, u16 offset,
u16 stride, int id)
{
return (pf->bus->number << 8) + pf->devfn + offset + stride * id;
}
If the calculated bus number exceeds 255, the internal calculation exceeds
0xFFFF and is truncated by the u16 return type.
When this is used to determine the bus number in drivers/pci/iov.c:
int pci_iov_virtfn_bus(struct pci_dev *dev, int vf_id)
{
...
return pci_virtfn_routing_id(dev, dev->sriov->offset,
dev->sriov->stride, vf_id) >> 8;
}
The right shift is performed on the truncated 16-bit value, resulting in a
wrapped-around bus number.
Consequently, does this allow the bounds check in sriov_enable() to be
bypassed?
drivers/pci/iov.c:sriov_enable() {
...
bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
if (bus > dev->bus->busn_res.end) {
...
}
Because the wrapped-around bus value appears to be within bounds, this could
allow VFs to be enabled on invalid, overlapping bus numbers.
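The wrap-around can be reproduced outside the kernel with a short sketch.
It uses the same arithmetic as the quoted helper, but with plain integer
parameters in place of the struct pci_dev fields. The function names and the
wider "int" variant are hypothetical and not part of this series; the input
values are chosen only to hit the boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Same sum as the quoted helper, truncated to 16 bits on return. */
static uint16_t rid_truncated(uint8_t bus, uint8_t devfn,
			      uint16_t offset, uint16_t stride, int id)
{
	return (uint16_t)((bus << 8) + devfn + offset + stride * id);
}

/* Hypothetical wider variant: the sum stays in an int, so a result
 * above 0xffff remains visible to the caller. */
static int rid_wide(uint8_t bus, uint8_t devfn,
		    uint16_t offset, uint16_t stride, int id)
{
	return (bus << 8) + devfn + offset + stride * id;
}

/* Bus number derived the way pci_iov_virtfn_bus() does: shift right by 8. */
static int bus_from_truncated(uint8_t bus, uint8_t devfn,
			      uint16_t offset, uint16_t stride, int id)
{
	return rid_truncated(bus, devfn, offset, stride, id) >> 8;
}

static int bus_from_wide(uint8_t bus, uint8_t devfn,
			 uint16_t offset, uint16_t stride, int id)
{
	return rid_wide(bus, devfn, offset, stride, id) >> 8;
}
```

For a PF at bus 0xff, devfn 0, with offset 1, stride 1 and index 255, the sum
is 0x10000. The truncated path yields bus 0, while the wide path yields 256,
which a comparison against busn_res.end would reject.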
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260604150153.3619662-1-dimitri.daskalakis1@gmail.com?part=10
^ permalink raw reply [flat|nested] 25+ messages in thread
* [RFC 11/12] PCI: Initialize and release SIOV capability
2026-06-04 15:01 [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Dimitri Daskalakis
` (9 preceding siblings ...)
2026-06-04 15:01 ` [RFC 10/12] PCI: Add Scalable I/O Virtualization data structure definitions Dimitri Daskalakis
@ 2026-06-04 15:01 ` Dimitri Daskalakis
2026-06-04 15:23 ` sashiko-bot
2026-06-04 15:01 ` [RFC 12/12] PCI: Reserve bus range for SIOV devices Dimitri Daskalakis
` (2 subsequent siblings)
13 siblings, 1 reply; 25+ messages in thread
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
From: Dimitri Daskalakis <daskald@meta.com>
Modify pci_init_capabilities() to discover the SIOV extended capability
(cap ID 0x38). When present, allocate struct pci_siov that records the
capability position, total SDI count, routing ID offset and stride, and
the maximum bus range the SDIs can span.
The init path mirrors sriov_init(): read the capability registers,
compute the worst-case bus consumption from total_SDIs, and stash the
result in the PF's pci_dev. Release frees the structure on teardown.
If is_physfn was already set (by sriov_init), it will not be cleared if
siov_init() fails. This prevents clobbering the flag for devices that
enable both virtualization types.
The SR-IOV code does not unset the is_physfn bit of a pci device
when disabled, and the SIOV code follows that pattern.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
drivers/pci/Makefile | 1 +
drivers/pci/pci.h | 16 ++++++
drivers/pci/probe.c | 2 +
drivers/pci/siov.c | 113 +++++++++++++++++++++++++++++++++++++++++++
4 files changed, 132 insertions(+)
create mode 100644 drivers/pci/siov.c
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 41ebc3b9a518..a584cd1bf08a 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_PCI_QUIRKS) += quirks.o
obj-$(CONFIG_HOTPLUG_PCI) += hotplug/
obj-$(CONFIG_PCI_ATS) += ats.o
obj-$(CONFIG_PCI_IOV) += iov.o
+obj-$(CONFIG_PCI_SIOV) += siov.o
obj-$(CONFIG_PCI_BRIDGE_EMUL) += pci-bridge-emul.o
obj-$(CONFIG_PCI_LABEL) += pci-label.o
obj-$(CONFIG_X86_INTEL_MID) += pci-mid.o
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index fd7c04e26c16..a516db996aab 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -1042,6 +1042,22 @@ static inline u16 pci_virtfn_routing_id(struct pci_dev *pf, u16 offset,
}
#endif
+#ifdef CONFIG_PCI_SIOV
+int pci_siov_init(struct pci_dev *dev);
+void pci_siov_release(struct pci_dev *dev);
+int pci_siov_bus_range(struct pci_bus *bus);
+#else
+static inline int pci_siov_init(struct pci_dev *dev)
+{
+ return -ENODEV;
+}
+static inline void pci_siov_release(struct pci_dev *dev) { }
+static inline int pci_siov_bus_range(struct pci_bus *bus)
+{
+ return 0;
+}
+#endif
+
#ifdef CONFIG_PCIE_TPH
void pci_restore_tph_state(struct pci_dev *dev);
void pci_save_tph_state(struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index b63cd0c310bc..bebc32c8d374 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2473,6 +2473,7 @@ static void pci_release_capabilities(struct pci_dev *dev)
pci_aer_exit(dev);
pci_rcec_exit(dev);
pci_iov_release(dev);
+ pci_siov_release(dev);
pci_free_cap_save_buffers(dev);
}
@@ -2666,6 +2667,7 @@ static void pci_init_capabilities(struct pci_dev *dev)
pci_vpd_init(dev); /* Vital Product Data */
pci_configure_ari(dev); /* Alternative Routing-ID Forwarding */
pci_iov_init(dev); /* Single Root I/O Virtualization */
+ pci_siov_init(dev); /* Scalable I/O Virtualization */
pci_ats_init(dev); /* Address Translation Services */
pci_pri_init(dev); /* Page Request Interface */
pci_pasid_init(dev); /* Process Address Space ID */
diff --git a/drivers/pci/siov.c b/drivers/pci/siov.c
new file mode 100644
index 000000000000..7372ce95714b
--- /dev/null
+++ b/drivers/pci/siov.c
@@ -0,0 +1,113 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * PCI Express Scalable I/O Virtualization (SIOV) support
+ */
+
+#include <linux/pci.h>
+#include <linux/slab.h>
+#include <linux/export.h>
+#include "pci.h"
+
+static int pci_siov_sdi_bus(struct pci_dev *dev, int sdi_id)
+{
+ if (!dev->siov)
+ return -EINVAL;
+ return pci_virtfn_routing_id(dev, dev->siov->offset,
+ dev->siov->stride, sdi_id) >> 8;
+}
+
+static int compute_max_sdi_buses(struct pci_dev *dev)
+{
+ struct pci_siov *siov = dev->siov;
+
+ if (!siov->offset || (siov->total_SDIs > 1 && !siov->stride))
+ return -EIO;
+
+ siov->max_SDI_buses = pci_siov_sdi_bus(dev, siov->total_SDIs - 1);
+ return 0;
+}
+
+static int siov_init(struct pci_dev *dev, int pos)
+{
+ struct pci_siov *siov;
+ bool was_physfn;
+ u16 total;
+ u8 status;
+ int rc;
+
+ pci_read_config_byte(dev, pos + PCI_SIOV_STATUS, &status);
+ if (status & PCI_SIOV_STATUS_ENABLED)
+ pci_warn(dev, "SIOV: SDIs active at init, FLR may be required\n");
+
+ pci_read_config_word(dev, pos + PCI_SIOV_TOTAL_SDI, &total);
+ if (!total)
+ return 0;
+
+ siov = kzalloc_obj(*siov);
+ if (!siov)
+ return -ENOMEM;
+
+ siov->pos = pos;
+ siov->total_SDIs = total;
+ siov->driver_max_SDIs = total;
+ siov->self = dev;
+ pci_read_config_dword(dev, pos + PCI_SIOV_CAP, &siov->cap);
+ pci_read_config_word(dev, pos + PCI_SIOV_SDI_OFFSET, &siov->offset);
+ pci_read_config_word(dev, pos + PCI_SIOV_SDI_STRIDE, &siov->stride);
+
+ was_physfn = dev->is_physfn;
+
+ dev->siov = siov;
+ dev->is_physfn = 1;
+ dev->is_siov = 1;
+ rc = compute_max_sdi_buses(dev);
+ if (rc) {
+ dev->siov = NULL;
+ dev->is_siov = 0;
+ if (!was_physfn)
+ dev->is_physfn = 0;
+ kfree(siov);
+ return rc;
+ }
+
+ return 0;
+}
+
+static void siov_release(struct pci_dev *dev)
+{
+ WARN_ON_ONCE(dev->siov->num_SDIs);
+
+ kfree(dev->siov);
+ dev->siov = NULL;
+ dev->is_siov = 0;
+}
+
+/**
+ * pci_siov_init - initialize the Scalable IOV capability
+ * @dev: the PCI device
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_siov_init(struct pci_dev *dev)
+{
+ int pos;
+
+ if (!pci_is_pcie(dev))
+ return -ENODEV;
+
+ pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SIOV);
+ if (pos)
+ return siov_init(dev, pos);
+
+ return -ENODEV;
+}
+
+/**
+ * pci_siov_release - release resources used by the SIOV capability
+ * @dev: the PCI device
+ */
+void pci_siov_release(struct pci_dev *dev)
+{
+ if (dev->siov)
+ siov_release(dev);
+}
--
2.52.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [RFC 11/12] PCI: Initialize and release SIOV capability
2026-06-04 15:01 ` [RFC 11/12] PCI: Initialize and release SIOV capability Dimitri Daskalakis
@ 2026-06-04 15:23 ` sashiko-bot
0 siblings, 0 replies; 25+ messages in thread
From: sashiko-bot @ 2026-06-04 15:23 UTC (permalink / raw)
To: Dimitri Daskalakis; +Cc: linux-pci, kvm
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Calculation of maximum SIOV bus consumption silently wraps around due to 16-bit truncation, underreporting the maximum bus number and breaking IOMMU isolation.
--
commit 72b51539a88b3ec026656cdb5f408780a8f58cd2
Author: Dimitri Daskalakis <daskald@meta.com>
PCI: Initialize and release SIOV capability
This commit modifies pci_init_capabilities() to discover the SIOV extended
capability and allocate a struct pci_siov. It computes the worst-case bus
consumption from total_SDIs and mirrors SR-IOV initialization patterns.
> diff --git a/drivers/pci/siov.c b/drivers/pci/siov.c
> new file mode 100644
> index 0000000000000..7372ce95714b2
> --- /dev/null
> +++ b/drivers/pci/siov.c
[ ... ]
> +static int pci_siov_sdi_bus(struct pci_dev *dev, int sdi_id)
> +{
> + if (!dev->siov)
> + return -EINVAL;
> + return pci_virtfn_routing_id(dev, dev->siov->offset,
> + dev->siov->stride, sdi_id) >> 8;
> +}
> +
> +static int compute_max_sdi_buses(struct pci_dev *dev)
> +{
> + struct pci_siov *siov = dev->siov;
> +
> + if (!siov->offset || (siov->total_SDIs > 1 && !siov->stride))
> + return -EIO;
> +
> + siov->max_SDI_buses = pci_siov_sdi_bus(dev, siov->total_SDIs - 1);
[Severity: High]
Does this calculation properly handle 16-bit truncation wrap-around?
If a PCIe endpoint advertises a total_SDIs and stride combination that exceeds
the 16-bit routing ID space (i.e. requesting a bus > 255), the result from
pci_virtfn_routing_id() will mathematically wrap around modulo 65536 since it
returns a u16:
drivers/pci/pci.h:pci_virtfn_routing_id() {
return (pf->bus->number << 8) + pf->devfn + offset + stride * id;
}
Because this only evaluates the final SDI index (siov->total_SDIs - 1),
siov->max_SDI_buses could be set to a lower, wrapped value.
Later when reserving buses, this could cause an insufficient number of buses to
be reserved, potentially assigning legitimate SDI bus numbers to other
downstream devices and breaking IOMMU isolation.
> + return 0;
> +}
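The wraparound described above can be reproduced outside the kernel. The following is a minimal userspace sketch, not kernel code: routing_id_u16() copies the arithmetic quoted from pci_virtfn_routing_id() and truncates it to 16 bits, while routing_id_checked() is a hypothetical helper (not part of this series) that performs the same sum in 64 bits and rejects results that do not fit in a routing ID. The bus, offset and stride values in the note below are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same sum as the quoted helper, truncated to 16 bits like a u16 return. */
static uint16_t routing_id_u16(uint8_t bus, uint8_t devfn, uint16_t offset,
			       uint16_t stride, uint32_t id)
{
	uint32_t sum = ((uint32_t)bus << 8) + devfn + offset +
		       (uint32_t)stride * id;

	return (uint16_t)sum;	/* silently drops bits 16 and above */
}

/*
 * Hypothetical checked variant: computes the sum in 64 bits and reports
 * failure instead of wrapping when the result exceeds the 16-bit RID space.
 */
static bool routing_id_checked(uint8_t bus, uint8_t devfn, uint16_t offset,
			       uint16_t stride, uint32_t id, uint16_t *rid)
{
	uint64_t wide = ((uint64_t)bus << 8) + devfn + offset +
			(uint64_t)stride * id;

	assert(rid != NULL);
	if (wide > UINT16_MAX)
		return false;

	*rid = (uint16_t)wide;
	return true;
}
```

With a PF at bus 0x10, devfn 0, offset 1 and stride 0x100, SDI index 255 gives 0x1000 + 0x1 + 0xFF00 = 0x10F01. That truncates to 0x0F01, i.e. bus 0x0F, which is below the PF's own bus, while the checked variant reports the overflow.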
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260604150153.3619662-1-dimitri.daskalakis1@gmail.com?part=11
^ permalink raw reply [flat|nested] 25+ messages in thread
* [RFC 12/12] PCI: Reserve bus range for SIOV devices
2026-06-04 15:01 [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Dimitri Daskalakis
` (10 preceding siblings ...)
2026-06-04 15:01 ` [RFC 11/12] PCI: Initialize and release SIOV capability Dimitri Daskalakis
@ 2026-06-04 15:01 ` Dimitri Daskalakis
2026-06-04 15:25 ` sashiko-bot
2026-06-04 18:20 ` [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Jason Gunthorpe
2026-06-05 4:14 ` Christoph Hellwig
13 siblings, 1 reply; 25+ messages in thread
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
From: Dimitri Daskalakis <daskald@meta.com>
SDI routing IDs are computed from the PF's devfn plus an offset and
stride, exactly like SR-IOV VFs. When the stride pushes routing IDs
past the current bus number, additional bus numbers must be reserved
during PCI bus scanning to ensure the SDIs can be enumerated.
Add pci_siov_bus_range(), which walks all SIOV-capable PFs on a bus
and returns the maximum number of additional buses required. This
parallels pci_iov_bus_range() for SR-IOV.
The bus range is pre-computed during siov_init() by computing the bus
number of the last valid SDI.
Note: The PCIe 7.0 spec outlines an alternative RID assignment
algorithm for SDIs. The spec states a Virtualization Intermediary
(likely a hypervisor) after boot can compute the set of RIDs that are
valid for SDIs. There is a six step algorithm to compute this
RID allowlist. To keep things simple, we are only adding support for
strided RID assignments.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
drivers/pci/probe.c | 4 ++--
drivers/pci/siov.c | 21 +++++++++++++++++++++
2 files changed, 23 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index bebc32c8d374..9ef6827ab345 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3100,8 +3100,8 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
for (devnr = 0; devnr < PCI_MAX_NR_DEVS; devnr++)
pci_scan_slot(bus, PCI_DEVFN(devnr, 0));
- /* Reserve buses for SR-IOV capability */
- used_buses = pci_iov_bus_range(bus);
+ /* Reserve buses for SR-IOV and SIOV capability */
+ used_buses = max(pci_iov_bus_range(bus), pci_siov_bus_range(bus));
max += used_buses;
/*
diff --git a/drivers/pci/siov.c b/drivers/pci/siov.c
index 7372ce95714b..6405a8830052 100644
--- a/drivers/pci/siov.c
+++ b/drivers/pci/siov.c
@@ -111,3 +111,24 @@ void pci_siov_release(struct pci_dev *dev)
if (dev->siov)
siov_release(dev);
}
+
+/**
+ * pci_siov_bus_range - find the additional buses consumed by SDIs
+ * @bus: the PCI bus
+ *
+ * Returns max additional buses consumed across all SIOV PFs on this bus.
+ */
+int pci_siov_bus_range(struct pci_bus *bus)
+{
+ int max = 0;
+ struct pci_dev *dev;
+
+ list_for_each_entry(dev, &bus->devices, bus_list) {
+ if (!dev->siov)
+ continue;
+ if (dev->siov->max_SDI_buses > max)
+ max = dev->siov->max_SDI_buses;
+ }
+
+ return max ? max - bus->number : 0;
+}
--
2.52.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [RFC 12/12] PCI: Reserve bus range for SIOV devices
2026-06-04 15:01 ` [RFC 12/12] PCI: Reserve bus range for SIOV devices Dimitri Daskalakis
@ 2026-06-04 15:25 ` sashiko-bot
0 siblings, 0 replies; 25+ messages in thread
From: sashiko-bot @ 2026-06-04 15:25 UTC (permalink / raw)
To: Dimitri Daskalakis; +Cc: linux-pci, kvm
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] Malicious PCIe configuration can cause an integer underflow in bus numbering reservation, leading to broken bus allocation logic.
--
commit b3162f0225b56986a66cccbae6221dcaf9c28ae6
Author: Dimitri Daskalakis <daskald@meta.com>
Subject: PCI: Reserve bus range for SIOV devices
SDI routing IDs are computed from the PF's devfn plus an offset and
stride. When the stride pushes routing IDs past the current bus number,
additional bus numbers must be reserved during PCI bus scanning. This
commit adds pci_siov_bus_range() to compute the maximum number of
additional buses required, parallel to SR-IOV.
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index bebc32c8d374f..9ef6827ab3453 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -3100,8 +3100,8 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
> for (devnr = 0; devnr < PCI_MAX_NR_DEVS; devnr++)
> pci_scan_slot(bus, PCI_DEVFN(devnr, 0));
>
> - /* Reserve buses for SR-IOV capability */
> - used_buses = pci_iov_bus_range(bus);
> + /* Reserve buses for SR-IOV and SIOV capability */
> + used_buses = max(pci_iov_bus_range(bus), pci_siov_bus_range(bus));
[Severity: High]
This isn't a bug introduced by this patch, but if pci_siov_bus_range()
returns a negative integer due to a 16-bit capability wraparound, won't
assigning it to the unsigned int used_buses result in an underflow?
If used_buses becomes a massive positive value, the subsequent
max += used_buses would wrap max backwards. This could corrupt the PCI
enumeration state, cause overlapping bus assignments, and potentially
bypass IOMMU isolation if RIDs collide.
> max += used_buses;
>
> /*
> diff --git a/drivers/pci/siov.c b/drivers/pci/siov.c
> index 7372ce95714b2..6405a88300524 100644
> --- a/drivers/pci/siov.c
> +++ b/drivers/pci/siov.c
[ ... ]
> @@ -111,3 +111,24 @@ void pci_siov_release(struct pci_dev *dev)
> +int pci_siov_bus_range(struct pci_bus *bus)
> +{
> + int max = 0;
> + struct pci_dev *dev;
> +
> + list_for_each_entry(dev, &bus->devices, bus_list) {
> + if (!dev->siov)
> + continue;
> + if (dev->siov->max_SDI_buses > max)
> + max = dev->siov->max_SDI_buses;
> + }
> +
> + return max ? max - bus->number : 0;
[Severity: High]
This is a pre-existing issue inherited from SR-IOV, but could a malicious
PCIe device provide offset and stride values that cause the 16-bit
capability to wrap around?
If max_SDI_buses wraps around and ends up strictly less than bus->number
(but non-zero), this subtraction yields a negative integer, leading to
the underflow in pci_scan_child_bus_extend() mentioned above.
> +}
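The underflow path can be illustrated with a small userspace sketch. It assumes, as the review does, that a negative int ends up in an unsigned int bus counter. Both function names are hypothetical and are not taken from probe.c; the clamped variant is only one possible mitigation, shown for comparison.

```c
#include <assert.h>

/*
 * Models "used_buses = <int result>; max += used_buses;" with an
 * unsigned counter. A negative input converts to a huge unsigned value.
 */
static unsigned int add_reserved_buses(unsigned int max, int extra)
{
	unsigned int used_buses = extra;	/* implicit int -> unsigned */

	return max + used_buses;		/* wraps modulo 2^N */
}

/* Hypothetical mitigation: treat a negative range as "no extra buses". */
static unsigned int add_reserved_buses_clamped(unsigned int max, int extra)
{
	if (extra < 0)
		extra = 0;

	assert(extra >= 0);
	return max + (unsigned int)extra;
}
```

For example, with max at 0x10 and a range of -1, the unclamped version yields 0x0F, so max moves backwards by one bus. The clamped version leaves max at 0x10.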
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260604150153.3619662-1-dimitri.daskalakis1@gmail.com?part=12
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC 00/12] PCI: Add support for Scalable I/O Virtualization
2026-06-04 15:01 [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Dimitri Daskalakis
` (11 preceding siblings ...)
2026-06-04 15:01 ` [RFC 12/12] PCI: Reserve bus range for SIOV devices Dimitri Daskalakis
@ 2026-06-04 18:20 ` Jason Gunthorpe
2026-06-04 23:49 ` Dimitri Daskalakis
2026-06-05 4:14 ` Christoph Hellwig
13 siblings, 1 reply; 25+ messages in thread
From: Jason Gunthorpe @ 2026-06-04 18:20 UTC (permalink / raw)
To: Dimitri Daskalakis
Cc: Bjorn Helgaas, linux-pci, Madhavan Srinivasan, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Mahesh J Salgaonkar,
Oliver O'Halloran, Niklas Schnelle, Gerald Schaefer,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Christian Borntraeger, Sven Schnelle, Alex Williamson, Kevin Tian,
Ankit Agrawal, Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
On Thu, Jun 04, 2026 at 08:01:41AM -0700, Dimitri Daskalakis wrote:
> With this patchset core enumerates the SIOV capability and can identify
> SIOV PFs. But there is no central mechanism to allocate/manage SIOV VFs.
> To support device pass through, devices will need to add a vfio-mdev
> driver with IOMMUFD support (or something similar).
There is an enormous amount of missing work to do something useful
with the SIOVr2 stuff. IIRC there is even supposed to be BIOS
components in this plan and there are some missing PCI SIG topics too
IIRC.
So, I'm not sure how much value there is in merging just the cap
discovery without a roadmap for the missing parts..
Also, I'm quite surprised to see this out of the blue, there is an OCP
workstream that was building out a standard that outlines how all the
different components have to act to successfully implement it. What
is in PCI SIG was just some minor foundational adjustments without any
context on how to form them into a solution.
I think it is extremely premature to merge anything related to SIOV to
the kernel. Join the OCP work stream if you are interested. I think
the general feeling was there is not sufficient interest in the
industry to do this and it has gone quiet.
Jason
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC 00/12] PCI: Add support for Scalable I/O Virtualization
2026-06-04 18:20 ` [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Jason Gunthorpe
@ 2026-06-04 23:49 ` Dimitri Daskalakis
2026-06-04 23:53 ` Jason Gunthorpe
2026-06-05 0:59 ` Jakub Kicinski
0 siblings, 2 replies; 25+ messages in thread
From: Dimitri Daskalakis @ 2026-06-04 23:49 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Bjorn Helgaas, linux-pci, Madhavan Srinivasan, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Mahesh J Salgaonkar,
Oliver O'Halloran, Niklas Schnelle, Gerald Schaefer,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Christian Borntraeger, Sven Schnelle, Alex Williamson, Kevin Tian,
Ankit Agrawal, Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
On 6/4/26 11:20 AM, Jason Gunthorpe wrote:
> On Thu, Jun 04, 2026 at 08:01:41AM -0700, Dimitri Daskalakis wrote:
>> With this patchset core enumerates the SIOV capability and can identify
>> SIOV PFs. But there is no central mechanism to allocate/manage SIOV VFs.
>> To support device pass through, devices will need to add a vfio-mdev
>> driver with IOMMUFD support (or something similar).
>
> There is an enormous amount of missing work to do something useful
> with the SIOVr2 stuff. IIRC there is even supposed to be BIOS
> components in this plan and there are some missing PCI SIG topics too
> IIRC.
>
> So, I'm not sure how much value there is in merging just the cap
> discovery without a roadmap for the missing parts..
>
> Also, I'm quite surprised to see this out of the blue, there is an OCP
> workstream that was building out a standard that outlines how all the
> different components have to act to successfully implement it. What
> is in PCI SIG was just some minor foundational adjustments without any
> context on how to form them into a solution.
>
> I think it is extremely premature to merge anything related to SIOV to
> the kernel. Join the OCP work stream if you are interested. I think
> the general feeling was there is not sufficient interest in the
> industry to do this and it has gone quiet.
>
> Jason
Hey Jason, thanks for the feedback. We (at Meta) are definitely
interested in SIOV-like capabilities for device passthrough to containers.
For those scenarios, having PCIe transactions per RID plus IOMMU
isolation is enough, but I can imagine hypervisors/VMs requiring more
platform support.
I hear you on the broader support story being premature. But on the
other hand, this series unblocks experimentation at the driver level
for basic data path validation.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC 00/12] PCI: Add support for Scalable I/O Virtualization
2026-06-04 23:49 ` Dimitri Daskalakis
@ 2026-06-04 23:53 ` Jason Gunthorpe
2026-06-05 0:59 ` Jakub Kicinski
1 sibling, 0 replies; 25+ messages in thread
From: Jason Gunthorpe @ 2026-06-04 23:53 UTC (permalink / raw)
To: Dimitri Daskalakis
Cc: Bjorn Helgaas, linux-pci, Madhavan Srinivasan, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Mahesh J Salgaonkar,
Oliver O'Halloran, Niklas Schnelle, Gerald Schaefer,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Christian Borntraeger, Sven Schnelle, Alex Williamson, Kevin Tian,
Ankit Agrawal, Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
On Thu, Jun 04, 2026 at 04:49:05PM -0700, Dimitri Daskalakis wrote:
>
> On 6/4/26 11:20 AM, Jason Gunthorpe wrote:
> > On Thu, Jun 04, 2026 at 08:01:41AM -0700, Dimitri Daskalakis wrote:
> >> With this patchset core enumerates the SIOV capability and can identify
> >> SIOV PFs. But there is no central mechanism to allocate/manage SIOV VFs.
> >> To support device pass through, devices will need to add a vfio-mdev
> >> driver with IOMMUFD support (or something similar).
> >
> > There is an enormous amount of missing work to do something useful
> > with the SIOVr2 stuff. IIRC there is even supposed to be BIOS
> > components in this plan and there are some missing PCI SIG topics too
> > IIRC.
> >
> > So, I'm not sure how much value there is in merging just the cap
> > discovery without a roadmap for the missing parts..
> >
> > Also, I'm quite surprised to see this out of the blue, there is an OCP
> > workstream that was building out a standard that outlines how all the
> > different components have to act to successfully implement it. What
> > is in PCI SIG was just some minor foundational adjustments without any
> > context on how to form them into a solution.
> >
> > I think it is extremely premature to merge anything related to SIOV to
> > the kernel. Join the OCP work stream if you are interested. I think
> > the general feeling was there is not sufficient interest in the
> > industry to do this and it has gone quiet.
> >
> > Jason
>
> Hey Jason, thanks for the feedback. We (at Meta) are definitely
> interested in SIOV-like capabilities for device passthrough to containers.
>
> For those scenarios, having PCIe transactions per RID plus IOMMU
> isolation is enough, but I can imagine hypervisors/VMs requiring more
> platform support.
>
> I hear you on the broader support story being premature. But on the
> other hand, this series unblocks experimentation at the driver level
> for basic data path validation.
And you can do your experimentation OOT :)
Go to the OCP workstream and finish the standard if you want to get
something merged.
There is zero need for SIOV if all you want is containers. Containers
do not require IOMMU isolation or unique RIDs.
Jason
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC 00/12] PCI: Add support for Scalable I/O Virtualization
2026-06-04 23:49 ` Dimitri Daskalakis
2026-06-04 23:53 ` Jason Gunthorpe
@ 2026-06-05 0:59 ` Jakub Kicinski
1 sibling, 0 replies; 25+ messages in thread
From: Jakub Kicinski @ 2026-06-05 0:59 UTC (permalink / raw)
To: Dimitri Daskalakis
Cc: Jason Gunthorpe, Bjorn Helgaas, linux-pci, Madhavan Srinivasan,
Michael Ellerman, Nicholas Piggin, Christophe Leroy,
Mahesh J Salgaonkar, Oliver O'Halloran, Niklas Schnelle,
Gerald Schaefer, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Christian Borntraeger, Sven Schnelle, Alex Williamson, Kevin Tian,
Ankit Agrawal, Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm, xen-devel
On Thu, 4 Jun 2026 16:49:05 -0700 Dimitri Daskalakis wrote:
> Hey Jason, thanks for the feedback. We (at Meta) are definitely
> interested in SIOV-like capabilities for device passthrough to containers.
I think "definitely interested" is quite an exaggeration.
More importantly, which big company is interested in what
tech should be irrelevant upstream. Please don't make this
sort of argument. The code has to stand on its own.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC 00/12] PCI: Add support for Scalable I/O Virtualization
2026-06-04 15:01 [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Dimitri Daskalakis
` (12 preceding siblings ...)
2026-06-04 18:20 ` [RFC 00/12] PCI: Add support for Scalable I/O Virtualization Jason Gunthorpe
@ 2026-06-05 4:14 ` Christoph Hellwig
13 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2026-06-05 4:14 UTC (permalink / raw)
To: Dimitri Daskalakis
Cc: Bjorn Helgaas, linux-pci, Madhavan Srinivasan, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Mahesh J Salgaonkar,
Oliver O'Halloran, Niklas Schnelle, Gerald Schaefer,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Christian Borntraeger, Sven Schnelle, Alex Williamson,
Jason Gunthorpe, Kevin Tian, Ankit Agrawal, Leon Romanovsky,
Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
Keith Busch, Alexander Duyck, Jakub Kicinski, Dimitri Daskalakis,
linuxppc-dev, linux-s390, kvm, xen-devel
On Thu, Jun 04, 2026 at 08:01:41AM -0700, Dimitri Daskalakis wrote:
> From: Dimitri Daskalakis <daskald@meta.com>
>
> Scalable I/O Virtualization (SIOV) is the next-generation alternative
> to SR-IOV.
It's not. It is yet another, very different technology.
^ permalink raw reply [flat|nested] 25+ messages in thread