Linux PCI subsystem development
 help / color / mirror / Atom feed
* [PATCH v4 0/7] Add Devres managed IRQ vectors allocation
@ 2026-04-24  7:57 Shawn Lin
  2026-04-24  7:57 ` [PATCH v4 1/7] PCI/MSI: Split __pci_enable_msi_range() for reuse Shawn Lin
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: Shawn Lin @ 2026-04-24  7:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Nirmal Patel, Jonathan Derrick, Kurt Schwemmer, Logan Gunthorpe,
	Philipp Stanner, linux-pci, Shawn Lin


There is a long-standing design ambiguity in the PCI/MSI subsystem regarding
the management of IRQ vectors. Currently, the management operates in a "hybrid"
mode. Sometimes it's handled by devres but sometimes not, creating potential
for resource management bugs.

Historically, pcim_enable_device() implicitly triggers automatic IRQ vector
management when calling pci_alloc_irq_vectors[_affinity], as pcim_enable_device()
sets the is_managed flag, thus pcim_msi_release() will register a cleanup action.
However, using pci_enable_device() will not make pci_alloc_irq_vectors[_affinity]
register a cleanup action.

This creates an ambiguous ownership model. Many drivers follow a pattern of:
1. Calling pci_alloc_irq_vectors() to allocate interrupts.
2. Also calling pci_free_irq_vectors() in their error paths or remove routines.

When such a driver also uses pcim_enable_device(), the devres framework may
attempt to free the IRQ vectors a second time upon device release, leading to
a double-free. Analysis of the tree shows this hazardous pattern exists widely,
while 37 other drivers correctly rely solely on the implicit cleanup.

To identify the scope of this issue, I used the following shell commands to
search for drivers using pcim_enable_device and allocation functions but missing
the explicit free call:

grep -rl "pcim_enable_device" drivers/ \
  | xargs grep -l -E "pci_alloc_irq_vectors|pci_alloc_irq_vectors_affinity" 2>/dev/null \
  | xargs grep -L "pci_free_irq_vectors" 2>/dev/null

drivers/dma/hsu/pci.c
drivers/dma/hisi_dma.c
drivers/hid/intel-thc-hidintel-quicki2c/pci-quicki2c.
drivers/hid/intel-thc-hid/intel-quickspi/pci-quickspi.c
drivers/hid/intel-ish-hid/ipc/pci-ish.c
drivers/accel/ivpu/ivpu_drv.c
drivers/accel/qaic/qaic_drv.c
drivers/accel/amdxdna/aie2_pci.c
drivers/platform/x86/intel/ehl_pse_io.c
drivers/media/pci/intel/ipu6/ipu6.c
drivers/mfd/intel-lpss-pci.c
drivers/i3c/master/mipi-i3c-hci/mipi-i3c-hci-pci.c
drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c
drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c
drivers/crypto/inside-secure/safexcel.c
drivers/thunderbolt/nhi.c
drivers/i2c/busses/i2c-mchp-pci1xxxx.c
drivers/i2c/busses/i2c-amd-mp2-pci.c
drivers/i2c/busses/i2c-thunderx-pcidrv.c
drivers/i2c/busses/i2c-designware-pcidrv.c
drivers/tty/serial/8250/8250_exar.c
drivers/tty/serial/8250/8250_pci.c
drivers/tty/serial/8250/8250_mid.c
drivers/gpio/gpio-merrifield.c
drivers/iommu/riscv/iommu-pci.c
drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
drivers/pci/switch/switchtec.c
drivers/pci/controller/vmd.c
drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c
drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c
drivers/net/ethernet/cavium/thunder/thunder_bgx.c
drivers/net/ethernet/realtek/r8169_main.c
drivers/mmc/host/cavium-thunderx.c
drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c
drivers/cxl/pci.c
drivers/spi/spi-pxa2xx-pci.c
drivers/spi/spi-pci1xxxx.c

To count how many drivers, including using lagacy APIs, have the double-free bugs, the
following shell commands find out 60 drivers.
grep -rl "pcim_enable_device" drivers/ --include="*.c" \
  | xargs grep -l -E "pci_alloc_irq_vectors|pci_alloc_irq_vectors_affinity" 2>/dev/null \
  | xargs grep -l "pci_free_irq_vectors" 2>/dev/null \
  | tee /dev/tty \
  | wc -l

drivers/dma/dw-edma/dw-edma-pcie.c
drivers/dma/plx_dma.c
drivers/dma/amd/ae4dma/ae4dma-pci.c
drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c
drivers/perf/hisilicon/hisi_pcie_pmu.c
drivers/perf/hisilicon/hns3_pmu.c
drivers/platform/x86/intel_ips.c
....
drivers/hwtracing/intel_th/pci.c
drivers/bus/mhi/host/pci_generic.c
drivers/fpga/dfl-pci.c
51

grep -rl "pcim_enable_device" drivers/ --include="*.c" \
  | xargs grep -l -E "pci_enable_msi|pci_enable_msix" 2>/dev/null \
  | xargs grep -l "pci_disable_msi" 2>/dev/null \
  | tee /dev/tty \
  | wc -l

drivers/dma/ioat/init.c
drivers/dma/amd/ptdma/ptdma-pci.c
drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci_legacy.c
drivers/crypto/ccp/sp-pci.c
drivers/i2c/busses/i2c-ismt.c
drivers/pci/msi/api.c
drivers/misc/mei/pci-me.c
drivers/misc/mei/pci-txe.c
drivers/block/mtip32xx/mtip32xx.c
9

This series introduces new managed APIs: pcim_alloc_irq_vectors() and
pcim_alloc_irq_vectors_affinity(). Drivers that wish to have devres-managed IRQ
vectors should use these functions to ensure proper automatic cleanup without
ambiguity.

Regarding the removal strategy discussed in v3 [1], two approaches were considered:
a) Introducing another hybrid function temporarily.
b) Duplicating code temporarily.

Following the discussion with Philipp, I agree that adding another hybrid function
does not guarantee the completion of the final cleanup work. Therefore, starting
from v4, I have chosen to duplicate the necessary code temporarily. Helper functions
have been factored out to minimize code duplication.

In the short term, the series converts two drivers within the PCI subsystem to
use the new APIs. The long-term goal is to convert all other drivers which wish to
use these managed functions. For the legacy APIs, like pci_enable_msix_range_*() and
pci_enable_msi(), we won't consider to provide devres-managed version for them but to
fix the only 9 double-free drivers by using the new APIs provided by this series. And
once ready, we could remove the problematic hybrid management pattern from
pcim_setup_msi_release() entirely.

[1]: https://lore.kernel.org/linux-pci/9dc66010d61670dc5182062ef5f1a932f7374323.camel@mailbox.org/T/#t


Changes in v4:
- Rework to create dedicated devres-managed function (Philipp)

Changes in v3:
- Rework the commit message and function doc (Philipp)
- Remove setting is_msi_managed flag from new APIs (Philipp)

Changes in v2:
- Rebase
- Introduce patches only for PCI subsystem to convert the API

Shawn Lin (7):
  PCI/MSI: Split __pci_enable_msi_range() for reuse
  PCI/MSI: Split __pci_enable_msix_range() for reuse
  PCI/MSI: Introduce __pcim_enable_msi_range()
  PCI/MSI: Introduce __pcim_enable_msix_range()
  PCI/MSI: Add Devres managed IRQ vectors allocation
  PCI: switchtec: Replace pci_alloc_irq_vectors() with
    pcim_alloc_irq_vectors()
  PCI: vmd: Replace pci_alloc_irq_vectors() with
    pcim_alloc_irq_vectors()

 drivers/pci/controller/vmd.c   |   4 +-
 drivers/pci/msi/api.c          |  84 +++++++++++++++++++++++++++++++
 drivers/pci/msi/msi.c          | 110 +++++++++++++++++++++++++++++++++++------
 drivers/pci/msi/msi.h          |   3 ++
 drivers/pci/switch/switchtec.c |   6 +--
 include/linux/pci.h            |  22 +++++++++
 6 files changed, 210 insertions(+), 19 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v4 1/7] PCI/MSI: Split __pci_enable_msi_range() for reuse
  2026-04-24  7:57 [PATCH v4 0/7] Add Devres managed IRQ vectors allocation Shawn Lin
@ 2026-04-24  7:57 ` Shawn Lin
  2026-05-12 13:08   ` Philipp Stanner
  2026-04-24  7:57 ` [PATCH v4 2/7] PCI/MSI: Split __pci_enable_msix_range() " Shawn Lin
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Shawn Lin @ 2026-04-24  7:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Nirmal Patel, Jonathan Derrick, Kurt Schwemmer, Logan Gunthorpe,
	Philipp Stanner, linux-pci, Shawn Lin

Splits the __pci_enable_msi_range() function into two helper
functions without changing the original behavior. The purpose is to allow
future functions (particularly managed devres variants) to reuse these
components.

The split is as follows:

1. pci_msi_range_alloc(): Handles the allocation logic, including
   parameter validation and vector number calculation. This function returns
   the calculated number of vectors or an error code.

2. pci_msi_range_init(): Handles the initialization of MSI context
   and the actual MSI capability setup. This function takes the pre-calculated
   number of vectors as input.

3. The original __pci_enable_msi_range() is now a wrapper that calls these
   two helper functions in sequence, maintaining exact same behavior.

This is a preparatory step for introducing devres-managed MSI allocation
API that will share the allocation logic while providing automatic cleanup
via devres.

No functional changes intended.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
---

Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/pci/msi/msi.c | 30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index 81d24a2..748dba6 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -420,11 +420,9 @@ static int msi_capability_init(struct pci_dev *dev, int nvec,
 	return __msi_capability_init(dev, nvec, masks);
 }
 
-int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
-			   struct irq_affinity *affd)
+static int pci_msi_range_alloc(struct pci_dev *dev, int minvec, int maxvec)
 {
 	int nvec;
-	int rc;
 
 	if (!pci_msi_supported(dev, minvec) || dev->current_state != PCI_D0)
 		return -EINVAL;
@@ -451,9 +449,13 @@ int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
 	if (nvec < minvec)
 		return -ENOSPC;
 
-	rc = pci_setup_msi_context(dev);
-	if (rc)
-		return rc;
+	return nvec;
+}
+
+static int pci_msi_range_init(struct pci_dev *dev, int minvec, int maxvec,
+			      int nvec, struct irq_affinity *affd)
+{
+	int rc;
 
 	if (!pci_setup_msi_device_domain(dev, nvec))
 		return -ENODEV;
@@ -481,6 +483,22 @@ int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
 	}
 }
 
+int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
+			   struct irq_affinity *affd)
+{
+	int nvec, rc;
+
+	nvec = pci_msi_range_alloc(dev, minvec, maxvec);
+	if (nvec < 0)
+		return nvec;
+
+	rc = pci_setup_msi_context(dev);
+	if (rc)
+		return rc;
+
+	return pci_msi_range_init(dev, minvec, maxvec, nvec, affd);
+}
+
 /**
  * pci_msi_vec_count - Return the number of MSI vectors a device can send
  * @dev: device to report about
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 2/7] PCI/MSI: Split __pci_enable_msix_range() for reuse
  2026-04-24  7:57 [PATCH v4 0/7] Add Devres managed IRQ vectors allocation Shawn Lin
  2026-04-24  7:57 ` [PATCH v4 1/7] PCI/MSI: Split __pci_enable_msi_range() for reuse Shawn Lin
@ 2026-04-24  7:57 ` Shawn Lin
  2026-05-12 13:20   ` Philipp Stanner
  2026-04-24  7:57 ` [PATCH v4 3/7] PCI/MSI: Introduce __pcim_enable_msi_range() Shawn Lin
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Shawn Lin @ 2026-04-24  7:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Nirmal Patel, Jonathan Derrick, Kurt Schwemmer, Logan Gunthorpe,
	Philipp Stanner, linux-pci, Shawn Lin

Splits the __pci_enable_msix_range() function into two helper
functions without changing the original behavior. The purpose is to allow
future functions (particularly managed devres variants) to reuse these
components.

The split is as follows:

1. pci_msix_range_alloc(): Handles the allocation logic, including
   parameter validation, hardware vector count verification, and entry
   validation. This function calculates the actual number of vectors
   to allocate and returns the hardware-supported vector count.

2. pci_msix_range_init(): Handles the initialization of MSI-X context
   and the actual MSI-X capability setup. This function takes the
   pre-determined number of vectors and hardware vector count as inputs.

3. The original __pci_enable_msix_range() is now a wrapper that calls
   these two helper functions in sequence, maintaining exact same
   behavior.

This is a preparatory step for introducing devres-managed MSI-X allocation
API that will share the allocation logic while providing automatic cleanup
via devres.

No functional changes intended.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
---

Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/pci/msi/msi.c | 38 ++++++++++++++++++++++++++++++--------
 1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index 748dba6..5c196c2 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -820,10 +820,10 @@ static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *en
 	return true;
 }
 
-int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int minvec,
-			    int maxvec, struct irq_affinity *affd, int flags)
+static int pci_msix_range_alloc(struct pci_dev *dev, struct msix_entry *entries,
+				int minvec, int maxvec, int flags, int *nvec_ret)
 {
-	int hwsize, rc, nvec = maxvec;
+	int hwsize, nvec = maxvec;
 
 	if (maxvec < minvec)
 		return -ERANGE;
@@ -858,12 +858,16 @@ int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int
 			nvec = hwsize;
 	}
 
-	if (nvec < minvec)
-		return -ENOSPC;
+	*nvec_ret = nvec;
 
-	rc = pci_setup_msi_context(dev);
-	if (rc)
-		return rc;
+	return hwsize;
+}
+
+static int pci_msix_range_init(struct pci_dev *dev, struct msix_entry *entries,
+			       int minvec, int nvec, int hwsize,
+			       struct irq_affinity *affd)
+{
+	int rc;
 
 	if (!pci_setup_msix_device_domain(dev, hwsize))
 		return -ENODEV;
@@ -888,6 +892,24 @@ int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int
 	}
 }
 
+int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
+			    int minvec, int maxvec, struct irq_affinity *affd,
+			    int flags)
+{
+	int hwsize, nvec, rc;
+
+	hwsize = pci_msix_range_alloc(dev, entries, minvec,
+				      maxvec, flags, &nvec);
+	if (hwsize < 0)
+		return hwsize;
+
+	rc = pci_setup_msi_context(dev);
+	if (rc)
+		return rc;
+
+	return pci_msix_range_init(dev, entries, minvec, nvec, hwsize, affd);
+}
+
 void __pci_restore_msix_state(struct pci_dev *dev)
 {
 	struct msi_desc *entry;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 3/7] PCI/MSI: Introduce __pcim_enable_msi_range()
  2026-04-24  7:57 [PATCH v4 0/7] Add Devres managed IRQ vectors allocation Shawn Lin
  2026-04-24  7:57 ` [PATCH v4 1/7] PCI/MSI: Split __pci_enable_msi_range() for reuse Shawn Lin
  2026-04-24  7:57 ` [PATCH v4 2/7] PCI/MSI: Split __pci_enable_msix_range() " Shawn Lin
@ 2026-04-24  7:57 ` Shawn Lin
  2026-05-12 13:22   ` Philipp Stanner
  2026-04-24  7:57 ` [PATCH v4 4/7] PCI/MSI: Introduce __pcim_enable_msix_range() Shawn Lin
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Shawn Lin @ 2026-04-24  7:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Nirmal Patel, Jonathan Derrick, Kurt Schwemmer, Logan Gunthorpe,
	Philipp Stanner, linux-pci, Shawn Lin

Introduce __pcim_enable_msi_range(), a devres-managed variant of
__pci_enable_msi_range(). The new function provides automatic cleanup
of MSI interrupts via devres, reducing the risk of resource leaks and
simplifying driver error handling.

This function is particularly useful for drivers that already use
pcim_enable_device() and want consistent devres management for all
PCI resources, including MSI interrupts.

Drivers can replace calls to __pci_enable_msi_range() with
__pcim_enable_msi_range() to benefit from automatic cleanup without
changing their core logic.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
---

Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/pci/msi/msi.c | 20 ++++++++++++++++++++
 drivers/pci/msi/msi.h |  1 +
 2 files changed, 21 insertions(+)

diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index 5c196c2..0aaff57 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -499,6 +499,26 @@ int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
 	return pci_msi_range_init(dev, minvec, maxvec, nvec, affd);
 }
 
+int __pcim_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
+			    struct irq_affinity *affd)
+{
+	int nvec, rc;
+
+	nvec = pci_msi_range_alloc(dev, minvec, maxvec);
+	if (nvec < 0)
+		return nvec;
+
+	rc = msi_setup_device_data(&dev->dev);
+	if (rc)
+		return rc;
+
+	rc = devm_add_action(&dev->dev, pcim_msi_release, dev);
+	if (rc)
+		return rc;
+
+	return pci_msi_range_init(dev, minvec, maxvec, nvec, affd);
+}
+
 /**
  * pci_msi_vec_count - Return the number of MSI vectors a device can send
  * @dev: device to report about
diff --git a/drivers/pci/msi/msi.h b/drivers/pci/msi/msi.h
index 0b420b3..81c6b099 100644
--- a/drivers/pci/msi/msi.h
+++ b/drivers/pci/msi/msi.h
@@ -94,6 +94,7 @@ void pci_msi_shutdown(struct pci_dev *dev);
 void pci_msix_shutdown(struct pci_dev *dev);
 void pci_free_msi_irqs(struct pci_dev *dev);
 int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec, struct irq_affinity *affd);
+int __pcim_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec, struct irq_affinity *affd);
 int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int minvec,
 			    int maxvec,  struct irq_affinity *affd, int flags);
 void __pci_restore_msi_state(struct pci_dev *dev);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 4/7] PCI/MSI: Introduce __pcim_enable_msix_range()
  2026-04-24  7:57 [PATCH v4 0/7] Add Devres managed IRQ vectors allocation Shawn Lin
                   ` (2 preceding siblings ...)
  2026-04-24  7:57 ` [PATCH v4 3/7] PCI/MSI: Introduce __pcim_enable_msi_range() Shawn Lin
@ 2026-04-24  7:57 ` Shawn Lin
  2026-05-12 13:26   ` Philipp Stanner
  2026-04-24  7:57 ` [PATCH v4 5/7] PCI/MSI: Add Devres managed IRQ vectors allocation Shawn Lin
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Shawn Lin @ 2026-04-24  7:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Nirmal Patel, Jonathan Derrick, Kurt Schwemmer, Logan Gunthorpe,
	Philipp Stanner, linux-pci, Shawn Lin

Introduce __pcim_enable_msix_range(), a devres-managed variant of
__pci_enable_msix_range(). Similar to the previously added MSI variant,
this function provides automatic cleanup of MSI-X interrupts via devres,
reducing the risk of resource leaks and simplifying driver error handling.

This function is particularly useful for drivers that already use
pcim_enable_device() and want consistent devres management for all
PCI resources, including MSI-X interrupts.

Drivers can replace calls to __pci_enable_msix_range() with
__pcim_enable_msix_range() to benefit from automatic cleanup without
changing their core logic. The flags parameter (e.g., PCI_IRQ_VIRTUAL)
is fully supported and passed through to the underlying functions.

This completes the set of devres-managed MSI/MSI-X allocation functions,
providing a consistent API for driver authors who prefer automatic
resource management.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
---

Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/pci/msi/msi.c | 22 ++++++++++++++++++++++
 drivers/pci/msi/msi.h |  2 ++
 2 files changed, 24 insertions(+)

diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index 0aaff57..8fdf40d 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -930,6 +930,28 @@ int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
 	return pci_msix_range_init(dev, entries, minvec, nvec, hwsize, affd);
 }
 
+int __pcim_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
+			     int minvec, int maxvec, struct irq_affinity *affd,
+			     int flags)
+{
+	int hwsize, nvec, rc;
+
+	hwsize = pci_msix_range_alloc(dev, entries, minvec,
+				      maxvec, flags, &nvec);
+	if (hwsize < 0)
+		return hwsize;
+
+	rc = msi_setup_device_data(&dev->dev);
+	if (rc)
+		return rc;
+
+	rc = devm_add_action(&dev->dev, pcim_msi_release, dev);
+	if (rc)
+		return rc;
+
+	return pci_msix_range_init(dev, entries, minvec, nvec, hwsize, affd);
+}
+
 void __pci_restore_msix_state(struct pci_dev *dev)
 {
 	struct msi_desc *entry;
diff --git a/drivers/pci/msi/msi.h b/drivers/pci/msi/msi.h
index 81c6b099..e6364a8 100644
--- a/drivers/pci/msi/msi.h
+++ b/drivers/pci/msi/msi.h
@@ -97,6 +97,8 @@ int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec, struct i
 int __pcim_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec, struct irq_affinity *affd);
 int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int minvec,
 			    int maxvec,  struct irq_affinity *affd, int flags);
+int __pcim_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int minvec,
+			     int maxvec,  struct irq_affinity *affd, int flags);
 void __pci_restore_msi_state(struct pci_dev *dev);
 void __pci_restore_msix_state(struct pci_dev *dev);
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 5/7] PCI/MSI: Add Devres managed IRQ vectors allocation
  2026-04-24  7:57 [PATCH v4 0/7] Add Devres managed IRQ vectors allocation Shawn Lin
                   ` (3 preceding siblings ...)
  2026-04-24  7:57 ` [PATCH v4 4/7] PCI/MSI: Introduce __pcim_enable_msix_range() Shawn Lin
@ 2026-04-24  7:57 ` Shawn Lin
  2026-05-12 13:28   ` Philipp Stanner
  2026-04-24  7:57 ` [PATCH v4 6/7] PCI: switchtec: Replace pci_alloc_irq_vectors() with pcim_alloc_irq_vectors() Shawn Lin
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Shawn Lin @ 2026-04-24  7:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Nirmal Patel, Jonathan Derrick, Kurt Schwemmer, Logan Gunthorpe,
	Philipp Stanner, linux-pci, Shawn Lin

There is a long-standing design ambiguity in the PCI/MSI subsystem
regarding the management of IRQ vectors. Currently, the management
operates in a "hybrid" mode. Sometimes it's handled by devres but
sometimes not, creating potential for resource management bugs.

Historically, pcim_enable_device() implicitly triggers automatic IRQ
vector management when calling pci_alloc_irq_vectors[_affinity], as
pcim_enable_device() sets the is_managed flag, thus pcim_msi_release()
will register a cleanup action. However, using pci_enable_device()
will not make pci_alloc_irq_vectors[_affinity] register a cleanup
action.

This ambiguity causes problems when drivers follow a common but
hazardous pattern:
1. Using pcim_enable_device() (which implicitly marks IRQs as managed)
2. Explicitly calling pci_alloc_irq_vectors() for IRQ allocation
3. Also calling pci_free_irq_vectors() in error paths or remove routines

In this scenario, the devres framework may attempt to free the IRQ
vectors a second time upon device release, leading to double-free
issues. Analysis of the tree shows this hazardous pattern exists
in multiple drivers, while 37 other drivers correctly rely solely
on the implicit cleanup.

To resolve this ambiguity, introduce explicit managed APIs for IRQ
vector allocation:
- pcim_alloc_irq_vectors()
- pcim_alloc_irq_vectors_affinity()

These functions are the devres-managed counterparts to
pci_alloc_irq_vectors[_affinity](). Drivers that wish to have
devres-managed IRQ vectors should use these new APIs instead.

The long-term goal is to convert all drivers which use pcim_enable_device()
in combination with pci_alloc_irq_vectors[_affinity]() to use these managed
functions, and eventually remove the problematic hybrid management
pattern from pcim_enable_device() and pcim_setup_msi_release() entirely.
This patch lays the foundation by introducing the APIs.

Suggested-by: Philipp Stanner <phasta@kernel.org>
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>

---

Changes in v4:
- Rework to create dedicated devres-managed function (Philipp)

Changes in v3:
- Rework the commit message and function doc (Philipp)
- Remove setting is_msi_managed flag from new APIs (Philipp)

Changes in v2:
- Rebase
- Introduce patches only for PCI subsystem to convert the API

 drivers/pci/msi/api.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h   | 22 ++++++++++++++
 2 files changed, 106 insertions(+)

diff --git a/drivers/pci/msi/api.c b/drivers/pci/msi/api.c
index c18559b..405da7b 100644
--- a/drivers/pci/msi/api.c
+++ b/drivers/pci/msi/api.c
@@ -297,6 +297,90 @@ int pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
 EXPORT_SYMBOL(pci_alloc_irq_vectors_affinity);
 
 /**
+ * pcim_alloc_irq_vectors - devres managed pci_alloc_irq_vectors()
+ * @dev: the PCI device to operate on
+ * @min_vecs: minimum number of vectors required (must be >= 1)
+ * @max_vecs: maximum (desired) number of vectors
+ * @flags: flags for this allocation, see pci_alloc_irq_vectors()
+ *
+ * This is a device resource managed version of pci_alloc_irq_vectors().
+ * Interrupt vectors are automatically freed on driver detach by devres.
+ * Drivers MUST NOT call pci_free_irq_vectors().
+ *
+ * Returns number of vectors allocated (which might be smaller than
+ * @max_vecs) on success, or a negative error code on failure.
+ */
+int pcim_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
+			   unsigned int max_vecs, unsigned int flags)
+{
+	return pcim_alloc_irq_vectors_affinity(dev, min_vecs, max_vecs,
+					       flags, NULL);
+}
+EXPORT_SYMBOL(pcim_alloc_irq_vectors);
+
+/**
+ * pcim_alloc_irq_vectors_affinity - devres managed pci_alloc_irq_vectors_affinity()
+ * @dev: the PCI device to operate on
+ * @min_vecs: minimum number of vectors required (must be >= 1)
+ * @max_vecs: maximum (desired) number of vectors
+ * @flags: flags for this allocation, see pci_alloc_irq_vectors()
+ * @affd: optional description of the affinity requirements
+ *
+ * This is a device resource managed version of pci_alloc_irq_vectors_affinity().
+ * Interrupt vectors are automatically freed on driver detach by devres.
+ * Drivers MUST NOT call pci_free_irq_vectors().
+ *
+ * Returns number of vectors allocated (which might be smaller than
+ * @max_vecs) on success, or a negative error code on failure.
+ */
+int pcim_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
+				    unsigned int max_vecs, unsigned int flags,
+				    struct irq_affinity *affd)
+{
+	struct irq_affinity msi_default_affd = {0};
+	int nvecs = -ENOSPC;
+
+	if (flags & PCI_IRQ_AFFINITY) {
+		if (!affd)
+			affd = &msi_default_affd;
+	} else {
+		if (WARN_ON(affd))
+			affd = NULL;
+	}
+
+	if (flags & PCI_IRQ_MSIX) {
+		nvecs = __pcim_enable_msix_range(dev, NULL, min_vecs, max_vecs,
+						 affd, flags);
+		if (nvecs > 0)
+			return nvecs;
+	}
+
+	if (flags & PCI_IRQ_MSI) {
+		nvecs = __pcim_enable_msi_range(dev, min_vecs, max_vecs, affd);
+		if (nvecs > 0)
+			return nvecs;
+	}
+
+	/* use INTx IRQ if allowed */
+	if (flags & PCI_IRQ_INTX) {
+		if (min_vecs == 1 && dev->irq) {
+			/*
+			 * Invoke the affinity spreading logic to ensure that
+			 * the device driver can adjust queue configuration
+			 * for the single interrupt case.
+			 */
+			if (affd)
+				kfree(irq_create_affinity_masks(1, affd));
+			pci_intx(dev, 1);
+			return 1;
+		}
+	}
+
+	return nvecs;
+}
+EXPORT_SYMBOL(pcim_alloc_irq_vectors_affinity);
+
+/**
  * pci_irq_vector() - Get Linux IRQ number of a device interrupt vector
  * @dev: the PCI device to operate on
  * @nr:  device-relative interrupt vector index (0-based); has different
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2c44545..6eab553 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1770,6 +1770,12 @@ int pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
 				   unsigned int max_vecs, unsigned int flags,
 				   struct irq_affinity *affd);
 
+int pcim_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
+			   unsigned int max_vecs, unsigned int flags);
+int pcim_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
+				    unsigned int max_vecs, unsigned int flags,
+				    struct irq_affinity *affd);
+
 bool pci_msix_can_alloc_dyn(struct pci_dev *dev);
 struct msi_map pci_msix_alloc_irq_at(struct pci_dev *dev, unsigned int index,
 				     const struct irq_affinity_desc *affdesc);
@@ -1812,6 +1818,22 @@ pci_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
 					      flags, NULL);
 }
 
+static inline int
+pcim_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
+			       unsigned int max_vecs, unsigned int flags,
+			       struct irq_affinity *aff_desc)
+{
+	return pci_alloc_irq_vectors_affinity(dev, min_vecs, max_vecs,
+					      flags, aff_desc);
+}
+static inline int
+pcim_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
+		      unsigned int max_vecs, unsigned int flags)
+{
+	return pcim_alloc_irq_vectors_affinity(dev, min_vecs, max_vecs,
+					      flags, NULL);
+}
+
 static inline bool pci_msix_can_alloc_dyn(struct pci_dev *dev)
 { return false; }
 static inline struct msi_map pci_msix_alloc_irq_at(struct pci_dev *dev, unsigned int index,
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 6/7] PCI: switchtec: Replace pci_alloc_irq_vectors() with pcim_alloc_irq_vectors()
  2026-04-24  7:57 [PATCH v4 0/7] Add Devres managed IRQ vectors allocation Shawn Lin
                   ` (4 preceding siblings ...)
  2026-04-24  7:57 ` [PATCH v4 5/7] PCI/MSI: Add Devres managed IRQ vectors allocation Shawn Lin
@ 2026-04-24  7:57 ` Shawn Lin
  2026-04-24  7:57 ` [PATCH v4 7/7] PCI: vmd: " Shawn Lin
  2026-04-24 11:48 ` [PATCH v4 0/7] Add Devres managed IRQ vectors allocation Philipp Stanner
  7 siblings, 0 replies; 16+ messages in thread
From: Shawn Lin @ 2026-04-24  7:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Nirmal Patel, Jonathan Derrick, Kurt Schwemmer, Logan Gunthorpe,
	Philipp Stanner, linux-pci, Shawn Lin

Replace pci_alloc_irq_vectors() with pcim_alloc_irq_vectors() to
explicitly request devres-managed interrupt vectors. This makes the
driver's intention clear and avoids the ambiguous implicit management
previously provided by pcim_enable_device().

The change prepares the driver for the eventual removal of the hybrid
IRQ management pattern from pcim_enable_device(), ensuring consistent
resource management through devres.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
---

Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/pci/switch/switchtec.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c
index 93ebec9..ed42661 100644
--- a/drivers/pci/switch/switchtec.c
+++ b/drivers/pci/switch/switchtec.c
@@ -1493,9 +1493,9 @@ static int switchtec_init_isr(struct switchtec_dev *stdev)
 	if (nirqs < 4)
 		nirqs = 4;
 
-	nvecs = pci_alloc_irq_vectors(stdev->pdev, 1, nirqs,
-				      PCI_IRQ_MSIX | PCI_IRQ_MSI |
-				      PCI_IRQ_VIRTUAL);
+	nvecs = pcim_alloc_irq_vectors(stdev->pdev, 1, nirqs,
+				       PCI_IRQ_MSIX | PCI_IRQ_MSI |
+				       PCI_IRQ_VIRTUAL);
 	if (nvecs < 0)
 		return nvecs;
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 7/7] PCI: vmd: Replace pci_alloc_irq_vectors() with pcim_alloc_irq_vectors()
  2026-04-24  7:57 [PATCH v4 0/7] Add Devres managed IRQ vectors allocation Shawn Lin
                   ` (5 preceding siblings ...)
  2026-04-24  7:57 ` [PATCH v4 6/7] PCI: switchtec: Replace pci_alloc_irq_vectors() with pcim_alloc_irq_vectors() Shawn Lin
@ 2026-04-24  7:57 ` Shawn Lin
  2026-05-12 12:57   ` Manivannan Sadhasivam
  2026-04-24 11:48 ` [PATCH v4 0/7] Add Devres managed IRQ vectors allocation Philipp Stanner
  7 siblings, 1 reply; 16+ messages in thread
From: Shawn Lin @ 2026-04-24  7:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Nirmal Patel, Jonathan Derrick, Kurt Schwemmer, Logan Gunthorpe,
	Philipp Stanner, linux-pci, Shawn Lin

Replace pci_alloc_irq_vectors() with pcim_alloc_irq_vectors() to
explicitly request devres-managed interrupt vectors. This makes the
driver's intention clear and avoids the ambiguous implicit management
previously provided by pcim_enable_device().

The change prepares the driver for the eventual removal of the hybrid
IRQ management pattern from pcim_enable_device(), ensuring consistent
resource management through devres.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
---

Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/pci/controller/vmd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index d4ae250..d8de63a 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -684,8 +684,8 @@ static int vmd_alloc_irqs(struct vmd_dev *vmd)
 	if (vmd->msix_count < 0)
 		return -ENODEV;
 
-	vmd->msix_count = pci_alloc_irq_vectors(dev, vmd->first_vec + 1,
-						vmd->msix_count, PCI_IRQ_MSIX);
+	vmd->msix_count = pcim_alloc_irq_vectors(dev, vmd->first_vec + 1,
+						 vmd->msix_count, PCI_IRQ_MSIX);
 	if (vmd->msix_count < 0)
 		return vmd->msix_count;
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 0/7] Add Devres managed IRQ vectors allocation
  2026-04-24  7:57 [PATCH v4 0/7] Add Devres managed IRQ vectors allocation Shawn Lin
                   ` (6 preceding siblings ...)
  2026-04-24  7:57 ` [PATCH v4 7/7] PCI: vmd: " Shawn Lin
@ 2026-04-24 11:48 ` Philipp Stanner
  7 siblings, 0 replies; 16+ messages in thread
From: Philipp Stanner @ 2026-04-24 11:48 UTC (permalink / raw)
  To: Shawn Lin, Bjorn Helgaas
  Cc: Nirmal Patel, Jonathan Derrick, Kurt Schwemmer, Logan Gunthorpe,
	Philipp Stanner, linux-pci

Hi Shawn,

On Fri, 2026-04-24 at 15:57 +0800, Shawn Lin wrote:
> 
> There is a long-standing design ambiguity in the PCI/MSI subsystem regarding
> the management of IRQ vectors. Currently, the management operates in a "hybrid"
> mode. Sometimes it's handled by devres but sometimes not, creating potential
> for resource management bugs.
> 
> Historically, pcim_enable_device() implicitly triggers automatic IRQ vector
> management when calling pci_alloc_irq_vectors[_affinity], as pcim_enable_device()
> sets the is_managed flag, thus pcim_msi_release() will register a cleanup action.
> However, using pci_enable_device() will not make pci_alloc_irq_vectors[_affinity]
> register a cleanup action.
> 
> This creates an ambiguous ownership model. Many drivers follow a pattern of:
> 1. Calling pci_alloc_irq_vectors() to allocate interrupts.
> 2. Also calling pci_free_irq_vectors() in their error paths or remove routines.
> 
> When such a driver also uses pcim_enable_device(), the devres framework may
> attempt to free the IRQ vectors a second time upon device release, leading to
> a double-free. Analysis of the tree shows this hazardous pattern exists widely,
> while 37 other drivers correctly rely solely on the implicit cleanup.
> 
> To identify the scope of this issue, I used the following shell commands to
> search for drivers using pcim_enable_device and allocation functions but missing
> the explicit free call:
> 
> grep -rl "pcim_enable_device" drivers/ \
>   | xargs grep -l -E "pci_alloc_irq_vectors|pci_alloc_irq_vectors_affinity" 2>/dev/null \
>   | xargs grep -L "pci_free_irq_vectors" 2>/dev/null
> 
> drivers/dma/hsu/pci.c
> drivers/dma/hisi_dma.c
> drivers/hid/intel-thc-hidintel-quicki2c/pci-quicki2c.
> drivers/hid/intel-thc-hid/intel-quickspi/pci-quickspi.c
> drivers/hid/intel-ish-hid/ipc/pci-ish.c
> drivers/accel/ivpu/ivpu_drv.c
> drivers/accel/qaic/qaic_drv.c
> drivers/accel/amdxdna/aie2_pci.c
> drivers/platform/x86/intel/ehl_pse_io.c
> drivers/media/pci/intel/ipu6/ipu6.c
> drivers/mfd/intel-lpss-pci.c
> drivers/i3c/master/mipi-i3c-hci/mipi-i3c-hci-pci.c
> drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c
> drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c
> drivers/crypto/inside-secure/safexcel.c
> drivers/thunderbolt/nhi.c
> drivers/i2c/busses/i2c-mchp-pci1xxxx.c
> drivers/i2c/busses/i2c-amd-mp2-pci.c
> drivers/i2c/busses/i2c-thunderx-pcidrv.c
> drivers/i2c/busses/i2c-designware-pcidrv.c
> drivers/tty/serial/8250/8250_exar.c
> drivers/tty/serial/8250/8250_pci.c
> drivers/tty/serial/8250/8250_mid.c
> drivers/gpio/gpio-merrifield.c
> drivers/iommu/riscv/iommu-pci.c
> drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
> drivers/pci/switch/switchtec.c
> drivers/pci/controller/vmd.c
> drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c
> drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c
> drivers/net/ethernet/cavium/thunder/thunder_bgx.c
> drivers/net/ethernet/realtek/r8169_main.c
> drivers/mmc/host/cavium-thunderx.c
> drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c
> drivers/cxl/pci.c
> drivers/spi/spi-pxa2xx-pci.c
> drivers/spi/spi-pci1xxxx.c
> 
> To count how many drivers, including using lagacy APIs, have the double-free bugs, the
> following shell commands find out 60 drivers.
> grep -rl "pcim_enable_device" drivers/ --include="*.c" \
>   | xargs grep -l -E "pci_alloc_irq_vectors|pci_alloc_irq_vectors_affinity" 2>/dev/null \
>   | xargs grep -l "pci_free_irq_vectors" 2>/dev/null \
>   | tee /dev/tty \
>   | wc -l
> 
> drivers/dma/dw-edma/dw-edma-pcie.c
> drivers/dma/plx_dma.c
> drivers/dma/amd/ae4dma/ae4dma-pci.c
> drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c
> drivers/perf/hisilicon/hisi_pcie_pmu.c
> drivers/perf/hisilicon/hns3_pmu.c
> drivers/platform/x86/intel_ips.c
> ....
> drivers/hwtracing/intel_th/pci.c
> drivers/bus/mhi/host/pci_generic.c
> drivers/fpga/dfl-pci.c
> 51
> 
> grep -rl "pcim_enable_device" drivers/ --include="*.c" \
>   | xargs grep -l -E "pci_enable_msi|pci_enable_msix" 2>/dev/null \
>   | xargs grep -l "pci_disable_msi" 2>/dev/null \
>   | tee /dev/tty \
>   | wc -l
> 
> drivers/dma/ioat/init.c
> drivers/dma/amd/ptdma/ptdma-pci.c
> drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci_legacy.c
> drivers/crypto/ccp/sp-pci.c
> drivers/i2c/busses/i2c-ismt.c
> drivers/pci/msi/api.c
> drivers/misc/mei/pci-me.c
> drivers/misc/mei/pci-txe.c
> drivers/block/mtip32xx/mtip32xx.c
> 9
> 
> This series introduces new managed APIs: pcim_alloc_irq_vectors() and
> pcim_alloc_irq_vectors_affinity(). Drivers that wish to have devres-managed IRQ
> vectors should use these functions to ensure proper automatic cleanup without
> ambiguity.
> 
> Regarding the removal strategy discussed in v3 [1], two approaches were considered:
> a) Introducing another hybrid function temporarily.
> b) Duplicating code temporarily.
> 
> Following the discussion with Philipp, I agree that adding another hybrid function
> does not guarantee the completion of the final cleanup work. Therefore, starting
> from v4, I have chosen to duplicate the necessary code temporarily. Helper functions
> have been factored out to minimize code duplication.
> 
> In the short term, the series converts two drivers within the PCI subsystem to
> use the new APIs. The long-term goal is to convert all other drivers which wish to
> use these managed functions. For the legacy APIs, like pci_enable_msix_range_*() and
> pci_enable_msi(), we won't consider to provide devres-managed version for them but to
> fix the only 9 double-free drivers by using the new APIs provided by this series. And
> once ready, we could remove the problematic hybrid management pattern from
> pcim_setup_msi_release() entirely.
> 
> [1]: https://lore.kernel.org/linux-pci/9dc66010d61670dc5182062ef5f1a932f7374323.camel@mailbox.org/T/#t
> 
> 
> Changes in v4:
> - Rework to create dedicated devres-managed function (Philipp)

Thanks for the rework.

I skimmed through it and it very much looks spot-on, I don't see major
concerns.

I'm a bit loaded right now. Can do a more detailed review earliest next
week, probably later.


Regards
P.

> 
> Changes in v3:
> - Rework the commit message and function doc (Philipp)
> - Remove setting is_msi_managed flag from new APIs (Philipp)
> 
> Changes in v2:
> - Rebase
> - Introduce patches only for PCI subsystem to convert the API
> 
> Shawn Lin (7):
>   PCI/MSI: Split __pci_enable_msi_range() for reuse
>   PCI/MSI: Split __pci_enable_msix_range() for reuse
>   PCI/MSI: Introduce __pcim_enable_msi_range()
>   PCI/MSI: Introduce __pcim_enable_msix_range()
>   PCI/MSI: Add Devres managed IRQ vectors allocation
>   PCI: switchtec: Replace pci_alloc_irq_vectors() with
>     pcim_alloc_irq_vectors()
>   PCI: vmd: Replace pci_alloc_irq_vectors() with
>     pcim_alloc_irq_vectors()
> 
>  drivers/pci/controller/vmd.c   |   4 +-
>  drivers/pci/msi/api.c          |  84 +++++++++++++++++++++++++++++++
>  drivers/pci/msi/msi.c          | 110 +++++++++++++++++++++++++++++++++++------
>  drivers/pci/msi/msi.h          |   3 ++
>  drivers/pci/switch/switchtec.c |   6 +--
>  include/linux/pci.h            |  22 +++++++++
>  6 files changed, 210 insertions(+), 19 deletions(-)
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 7/7] PCI: vmd: Replace pci_alloc_irq_vectors() with pcim_alloc_irq_vectors()
  2026-04-24  7:57 ` [PATCH v4 7/7] PCI: vmd: " Shawn Lin
@ 2026-05-12 12:57   ` Manivannan Sadhasivam
  0 siblings, 0 replies; 16+ messages in thread
From: Manivannan Sadhasivam @ 2026-05-12 12:57 UTC (permalink / raw)
  To: Shawn Lin
  Cc: Bjorn Helgaas, Nirmal Patel, Jonathan Derrick, Kurt Schwemmer,
	Logan Gunthorpe, Philipp Stanner, linux-pci

On Fri, Apr 24, 2026 at 03:57:40PM +0800, Shawn Lin wrote:
> Replace pci_alloc_irq_vectors() with pcim_alloc_irq_vectors() to
> explicitly request devres-managed interrupt vectors. This makes the
> driver's intention clear and avoids the ambiguous implicit management
> previously provided by pcim_enable_device().
> 
> The change prepares the driver for the eventual removal of the hybrid
> IRQ management pattern from pcim_enable_device(), ensuring consistent
> resource management through devres.
> 
> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>

Acked-by: Manivannan Sadhasivam <mani@kernel.org>

- Mani

> ---
> 
> Changes in v4: None
> Changes in v3: None
> Changes in v2: None
> 
>  drivers/pci/controller/vmd.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index d4ae250..d8de63a 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -684,8 +684,8 @@ static int vmd_alloc_irqs(struct vmd_dev *vmd)
>  	if (vmd->msix_count < 0)
>  		return -ENODEV;
>  
> -	vmd->msix_count = pci_alloc_irq_vectors(dev, vmd->first_vec + 1,
> -						vmd->msix_count, PCI_IRQ_MSIX);
> +	vmd->msix_count = pcim_alloc_irq_vectors(dev, vmd->first_vec + 1,
> +						 vmd->msix_count, PCI_IRQ_MSIX);
>  	if (vmd->msix_count < 0)
>  		return vmd->msix_count;
>  
> -- 
> 2.7.4
> 
> 

-- 
மணிவண்ணன் சதாசிவம்

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 1/7] PCI/MSI: Split __pci_enable_msi_range() for reuse
  2026-04-24  7:57 ` [PATCH v4 1/7] PCI/MSI: Split __pci_enable_msi_range() for reuse Shawn Lin
@ 2026-05-12 13:08   ` Philipp Stanner
  2026-05-12 13:09     ` Philipp Stanner
  0 siblings, 1 reply; 16+ messages in thread
From: Philipp Stanner @ 2026-05-12 13:08 UTC (permalink / raw)
  To: Shawn Lin, Bjorn Helgaas
  Cc: Nirmal Patel, Jonathan Derrick, Kurt Schwemmer, Logan Gunthorpe,
	Philipp Stanner, linux-pci

On Fri, 2026-04-24 at 15:57 +0800, Shawn Lin wrote:
> Splits the __pci_enable_msi_range() function into two helper
> functions without changing the original behavior. The purpose is to allow
> future functions (particularly managed devres variants) to reuse these
> components.
> 
> The split is as follows:
> 
> 1. pci_msi_range_alloc(): Handles the allocation logic, including
>    parameter validation and vector number calculation. This function returns
>    the calculated number of vectors or an error code.
> 
> 2. pci_msi_range_init(): Handles the initialization of MSI context
>    and the actual MSI capability setup. This function takes the pre-calculated
>    number of vectors as input.
> 
> 3. The original __pci_enable_msi_range() is now a wrapper that calls these
>    two helper functions in sequence, maintaining exact same behavior.
> 
> This is a preparatory step for introducing devres-managed MSI allocation
> API that will share the allocation logic while providing automatic cleanup
> via devres.

nit: the actual goal is to remove the devres aspect from the pci_
function.

> 
> No functional changes intended.
> 
> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
> ---
> 
> Changes in v4: None
> Changes in v3: None
> Changes in v2: None
> 
>  drivers/pci/msi/msi.c | 30 ++++++++++++++++++++++++------
>  1 file changed, 24 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
> index 81d24a2..748dba6 100644
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -420,11 +420,9 @@ static int msi_capability_init(struct pci_dev *dev, int nvec,
>  	return __msi_capability_init(dev, nvec, masks);
>  }
>  
> -int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
> -			   struct irq_affinity *affd)
> +static int pci_msi_range_alloc(struct pci_dev *dev, int minvec, int maxvec)
>  {
>  	int nvec;
> -	int rc;
>  
>  	if (!pci_msi_supported(dev, minvec) || dev->current_state != PCI_D0)
>  		return -EINVAL;
> @@ -451,9 +449,13 @@ int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
>  	if (nvec < minvec)
>  		return -ENOSPC;
>  
> -	rc = pci_setup_msi_context(dev);
> -	if (rc)
> -		return rc;
> +	return nvec;
> +}
> +
> +static int pci_msi_range_init(struct pci_dev *dev, int minvec, int maxvec,
> +			      int nvec, struct irq_affinity *affd)
> +{
> +	int rc;
>  
>  	if (!pci_setup_msi_device_domain(dev, nvec))
>  		return -ENODEV;
> @@ -481,6 +483,22 @@ int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
>  	}
>  }
>  
> +int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
> +			   struct irq_affinity *affd)

Doesn't that trigger a dead-code warning?


P.

> +{
> +	int nvec, rc;
> +
> +	nvec = pci_msi_range_alloc(dev, minvec, maxvec);
> +	if (nvec < 0)
> +		return nvec;
> +
> +	rc = pci_setup_msi_context(dev);
> +	if (rc)
> +		return rc;
> +
> +	return pci_msi_range_init(dev, minvec, maxvec, nvec, affd);
> +}
> +
>  /**
>   * pci_msi_vec_count - Return the number of MSI vectors a device can send
>   * @dev: device to report about


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 1/7] PCI/MSI: Split __pci_enable_msi_range() for reuse
  2026-05-12 13:08   ` Philipp Stanner
@ 2026-05-12 13:09     ` Philipp Stanner
  0 siblings, 0 replies; 16+ messages in thread
From: Philipp Stanner @ 2026-05-12 13:09 UTC (permalink / raw)
  To: phasta, Shawn Lin, Bjorn Helgaas
  Cc: Nirmal Patel, Jonathan Derrick, Kurt Schwemmer, Logan Gunthorpe,
	linux-pci

On Tue, 2026-05-12 at 15:08 +0200, Philipp Stanner wrote:
> On Fri, 2026-04-24 at 15:57 +0800, Shawn Lin wrote:
> > Splits the __pci_enable_msi_range() function into two helper
> > functions without changing the original behavior. The purpose is to allow
> > future functions (particularly managed devres variants) to reuse these
> > components.
> > 
> > The split is as follows:
> > 
> > 1. pci_msi_range_alloc(): Handles the allocation logic, including
> >    parameter validation and vector number calculation. This function returns
> >    the calculated number of vectors or an error code.
> > 
> > 2. pci_msi_range_init(): Handles the initialization of MSI context
> >    and the actual MSI capability setup. This function takes the pre-calculated
> >    number of vectors as input.
> > 
> > 3. The original __pci_enable_msi_range() is now a wrapper that calls these
> >    two helper functions in sequence, maintaining exact same behavior.
> > 
> > This is a preparatory step for introducing devres-managed MSI allocation
> > API that will share the allocation logic while providing automatic cleanup
> > via devres.
> 
> nit: the actual goal is to remove the devres aspect from the pci_
> function.
> 
> > 
> > No functional changes intended.
> > 
> > Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
> > ---
> > 
> > Changes in v4: None
> > Changes in v3: None
> > Changes in v2: None
> > 
> >  drivers/pci/msi/msi.c | 30 ++++++++++++++++++++++++------
> >  1 file changed, 24 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
> > index 81d24a2..748dba6 100644
> > --- a/drivers/pci/msi/msi.c
> > +++ b/drivers/pci/msi/msi.c
> > @@ -420,11 +420,9 @@ static int msi_capability_init(struct pci_dev *dev, int nvec,
> >  	return __msi_capability_init(dev, nvec, masks);
> >  }
> >  
> > -int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
> > -			   struct irq_affinity *affd)
> > +static int pci_msi_range_alloc(struct pci_dev *dev, int minvec, int maxvec)
> >  {
> >  	int nvec;
> > -	int rc;
> >  
> >  	if (!pci_msi_supported(dev, minvec) || dev->current_state != PCI_D0)
> >  		return -EINVAL;
> > @@ -451,9 +449,13 @@ int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
> >  	if (nvec < minvec)
> >  		return -ENOSPC;
> >  
> > -	rc = pci_setup_msi_context(dev);
> > -	if (rc)
> > -		return rc;
> > +	return nvec;
> > +}
> > +
> > +static int pci_msi_range_init(struct pci_dev *dev, int minvec, int maxvec,
> > +			      int nvec, struct irq_affinity *affd)
> > +{
> > +	int rc;
> >  
> >  	if (!pci_setup_msi_device_domain(dev, nvec))
> >  		return -ENODEV;
> > @@ -481,6 +483,22 @@ int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
> >  	}
> >  }
> >  
> > +int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
> > +			   struct irq_affinity *affd)
> 
> Doesn't that trigger a dead-code warning?

Ah no, wait, ofc doesn't. My bad.


Reviewed-by: Philipp Stanner <phasta@kernel.org>

> 
> 
> P.
> 
> > +{
> > +	int nvec, rc;
> > +
> > +	nvec = pci_msi_range_alloc(dev, minvec, maxvec);
> > +	if (nvec < 0)
> > +		return nvec;
> > +
> > +	rc = pci_setup_msi_context(dev);
> > +	if (rc)
> > +		return rc;
> > +
> > +	return pci_msi_range_init(dev, minvec, maxvec, nvec, affd);
> > +}
> > +
> >  /**
> >   * pci_msi_vec_count - Return the number of MSI vectors a device can send
> >   * @dev: device to report about
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 2/7] PCI/MSI: Split __pci_enable_msix_range() for reuse
  2026-04-24  7:57 ` [PATCH v4 2/7] PCI/MSI: Split __pci_enable_msix_range() " Shawn Lin
@ 2026-05-12 13:20   ` Philipp Stanner
  0 siblings, 0 replies; 16+ messages in thread
From: Philipp Stanner @ 2026-05-12 13:20 UTC (permalink / raw)
  To: Shawn Lin, Bjorn Helgaas
  Cc: Nirmal Patel, Jonathan Derrick, Kurt Schwemmer, Logan Gunthorpe,
	Philipp Stanner, linux-pci

On Fri, 2026-04-24 at 15:57 +0800, Shawn Lin wrote:
> Splits the __pci_enable_msix_range() function into two helper
> functions without changing the original behavior. The purpose is to allow
> future functions (particularly managed devres variants) to reuse these
> components.
> 
> The split is as follows:
> 
> 1. pci_msix_range_alloc(): Handles the allocation logic, including
>    parameter validation, hardware vector count verification, and entry
>    validation. This function calculates the actual number of vectors
>    to allocate and returns the hardware-supported vector count.
> 
> 2. pci_msix_range_init(): Handles the initialization of MSI-X context
>    and the actual MSI-X capability setup. This function takes the
>    pre-determined number of vectors and hardware vector count as inputs.
> 
> 3. The original __pci_enable_msix_range() is now a wrapper that calls
>    these two helper functions in sequence, maintaining exact same
>    behavior.
> 
> This is a preparatory step for introducing devres-managed MSI-X allocation
> API that will share the allocation logic while providing automatic cleanup
> via devres.
> 
> No functional changes intended.
> 
> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
> ---
> 
> Changes in v4: None
> Changes in v3: None
> Changes in v2: None
> 
>  drivers/pci/msi/msi.c | 38 ++++++++++++++++++++++++++++++--------
>  1 file changed, 30 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
> index 748dba6..5c196c2 100644
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -820,10 +820,10 @@ static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *en
>  	return true;
>  }
>  
> -int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int minvec,
> -			    int maxvec, struct irq_affinity *affd, int flags)
> +static int pci_msix_range_alloc(struct pci_dev *dev, struct msix_entry *entries,
> +				int minvec, int maxvec, int flags, int *nvec_ret)

I think it's relatively common to have the 'flags' field as the last
function parameter. Also nvec seems to be suitable more next to the
other vec size parameters.

>  {
> -	int hwsize, rc, nvec = maxvec;
> +	int hwsize, nvec = maxvec;
>  
>  	if (maxvec < minvec)
>  		return -ERANGE;
> @@ -858,12 +858,16 @@ int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int
>  			nvec = hwsize;
>  	}
>  
> -	if (nvec < minvec)
> -		return -ENOSPC;
> +	*nvec_ret = nvec;
>  
> -	rc = pci_setup_msi_context(dev);
> -	if (rc)
> -		return rc;
> +	return hwsize;
> +}
> +
> +static int pci_msix_range_init(struct pci_dev *dev, struct msix_entry *entries,
> +			       int minvec, int nvec, int hwsize,
> +			       struct irq_affinity *affd)
> +{
> +	int rc;
>  
>  	if (!pci_setup_msix_device_domain(dev, hwsize))
>  		return -ENODEV;
> @@ -888,6 +892,24 @@ int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int
>  	}
>  }
>  
> +int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
> +			    int minvec, int maxvec, struct irq_affinity *affd,
> +			    int flags)
> +{
> +	int hwsize, nvec, rc;
> +
> +	hwsize = pci_msix_range_alloc(dev, entries, minvec,
> +				      maxvec, flags, &nvec);
> +	if (hwsize < 0)
> +		return hwsize;
> +
> +	rc = pci_setup_msi_context(dev);

OK, so IIUC the plan is then to later just remove this line once all
users are ported, correct?

If so, I'd put a TODO comment here: "Remove this after…"

> +	if (rc)
> +		return rc;
> +
> +	return pci_msix_range_init(dev, entries, minvec, nvec, hwsize, affd);
> +}
> +



>  void __pci_restore_msix_state(struct pci_dev *dev)
>  {
>  	struct msi_desc *entry;


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 3/7] PCI/MSI: Introduce __pcim_enable_msi_range()
  2026-04-24  7:57 ` [PATCH v4 3/7] PCI/MSI: Introduce __pcim_enable_msi_range() Shawn Lin
@ 2026-05-12 13:22   ` Philipp Stanner
  0 siblings, 0 replies; 16+ messages in thread
From: Philipp Stanner @ 2026-05-12 13:22 UTC (permalink / raw)
  To: Shawn Lin, Bjorn Helgaas
  Cc: Nirmal Patel, Jonathan Derrick, Kurt Schwemmer, Logan Gunthorpe,
	Philipp Stanner, linux-pci

On Fri, 2026-04-24 at 15:57 +0800, Shawn Lin wrote:
> Introduce __pcim_enable_msi_range(), a devres-managed variant of
> __pci_enable_msi_range(). The new function provides automatic cleanup
> of MSI interrupts via devres, reducing the risk of resource leaks and
> simplifying driver error handling.
> 
> This function is particularly useful for drivers that already use
> pcim_enable_device() and want consistent devres management for all
> PCI resources, including MSI interrupts.
> 
> Drivers can replace calls to __pci_enable_msi_range() with
> __pcim_enable_msi_range() to benefit from automatic cleanup without
> changing their core logic.
> 
> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>

Reviewed-by: Philipp Stanner <phasta@kernel.org>

> ---
> 
> Changes in v4: None
> Changes in v3: None
> Changes in v2: None
> 
>  drivers/pci/msi/msi.c | 20 ++++++++++++++++++++
>  drivers/pci/msi/msi.h |  1 +
>  2 files changed, 21 insertions(+)
> 
> diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
> index 5c196c2..0aaff57 100644
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -499,6 +499,26 @@ int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
>  	return pci_msi_range_init(dev, minvec, maxvec, nvec, affd);
>  }
>  
> +int __pcim_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
> +			    struct irq_affinity *affd)
> +{
> +	int nvec, rc;
> +
> +	nvec = pci_msi_range_alloc(dev, minvec, maxvec);
> +	if (nvec < 0)
> +		return nvec;
> +
> +	rc = msi_setup_device_data(&dev->dev);
> +	if (rc)
> +		return rc;
> +
> +	rc = devm_add_action(&dev->dev, pcim_msi_release, dev);

OK I guess. But pcim_msi_release also sets the is_msi_managed boolean,
which I think will not be needed anymore long-term? If you agree, I'd
add a TODO for removing that boolean at an appropriate place and patch,
too.


P.

> +	if (rc)
> +		return rc;
> +
> +	return pci_msi_range_init(dev, minvec, maxvec, nvec, affd);
> +}
> +
>  /**
>   * pci_msi_vec_count - Return the number of MSI vectors a device can send
>   * @dev: device to report about
> diff --git a/drivers/pci/msi/msi.h b/drivers/pci/msi/msi.h
> index 0b420b3..81c6b099 100644
> --- a/drivers/pci/msi/msi.h
> +++ b/drivers/pci/msi/msi.h
> @@ -94,6 +94,7 @@ void pci_msi_shutdown(struct pci_dev *dev);
>  void pci_msix_shutdown(struct pci_dev *dev);
>  void pci_free_msi_irqs(struct pci_dev *dev);
>  int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec, struct irq_affinity *affd);
> +int __pcim_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec, struct irq_affinity *affd);
>  int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int minvec,
>  			    int maxvec,  struct irq_affinity *affd, int flags);
>  void __pci_restore_msi_state(struct pci_dev *dev);


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 4/7] PCI/MSI: Introduce __pcim_enable_msix_range()
  2026-04-24  7:57 ` [PATCH v4 4/7] PCI/MSI: Introduce __pcim_enable_msix_range() Shawn Lin
@ 2026-05-12 13:26   ` Philipp Stanner
  0 siblings, 0 replies; 16+ messages in thread
From: Philipp Stanner @ 2026-05-12 13:26 UTC (permalink / raw)
  To: Shawn Lin, Bjorn Helgaas
  Cc: Nirmal Patel, Jonathan Derrick, Kurt Schwemmer, Logan Gunthorpe,
	Philipp Stanner, linux-pci

On Fri, 2026-04-24 at 15:57 +0800, Shawn Lin wrote:
> Introduce __pcim_enable_msix_range(), a devres-managed variant of
> __pci_enable_msix_range(). Similar to the previously added MSI variant,
> this function provides automatic cleanup of MSI-X interrupts via devres,
> reducing the risk of resource leaks and simplifying driver error handling.
> 
> This function is particularly useful for drivers that already use
> pcim_enable_device() and want consistent devres management for all
> PCI resources, including MSI-X interrupts.
> 
> Drivers can replace calls to __pci_enable_msix_range() with
> __pcim_enable_msix_range() to benefit from automatic cleanup without
> changing their core logic. The flags parameter (e.g., PCI_IRQ_VIRTUAL)
> is fully supported and passed through to the underlying functions.
> 
> This completes the set of devres-managed MSI/MSI-X allocation functions,
> providing a consistent API for driver authors who prefer automatic
> resource management.
> 
> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
> ---
> 
> Changes in v4: None
> Changes in v3: None
> Changes in v2: None
> 
>  drivers/pci/msi/msi.c | 22 ++++++++++++++++++++++
>  drivers/pci/msi/msi.h |  2 ++
>  2 files changed, 24 insertions(+)
> 
> diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
> index 0aaff57..8fdf40d 100644
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -930,6 +930,28 @@ int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
>  	return pci_msix_range_init(dev, entries, minvec, nvec, hwsize, affd);
>  }
>  
> +int __pcim_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
> +			     int minvec, int maxvec, struct irq_affinity *affd,
> +			     int flags)
> +{
> +	int hwsize, nvec, rc;
> +
> +	hwsize = pci_msix_range_alloc(dev, entries, minvec,
> +				      maxvec, flags, &nvec);
> +	if (hwsize < 0)
> +		return hwsize;
> +
> +	rc = msi_setup_device_data(&dev->dev);

Uh uh... wait a second.

msi_setup_device_data() also invokes devres below the surface:

int msi_setup_device_data(struct device *dev)
{
	struct msi_device_data *md;
	int ret, i;

	if (dev->msi.data)
		return 0;

	md = devres_alloc(msi_device_data_release, sizeof(*md), GFP_KERNEL);


Do you have an idea whether that is a problem?

Looks suspicious to me. That's more than just automatic memory
management.


Sorry for just seeing that now. I might have seen that last year and
concluded it's one of the reasons why porting is difficult.. can't
remember anymore.


P.


> +	if (rc)
> +		return rc;
> +
> +	rc = devm_add_action(&dev->dev, pcim_msi_release, dev);
> +	if (rc)
> +		return rc;
> +
> +	return pci_msix_range_init(dev, entries, minvec, nvec, hwsize, affd);
> +}
> +
>  void __pci_restore_msix_state(struct pci_dev *dev)
>  {
>  	struct msi_desc *entry;
> diff --git a/drivers/pci/msi/msi.h b/drivers/pci/msi/msi.h
> index 81c6b099..e6364a8 100644
> --- a/drivers/pci/msi/msi.h
> +++ b/drivers/pci/msi/msi.h
> @@ -97,6 +97,8 @@ int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec, struct i
>  int __pcim_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec, struct irq_affinity *affd);
>  int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int minvec,
>  			    int maxvec,  struct irq_affinity *affd, int flags);
> +int __pcim_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int minvec,
> +			     int maxvec,  struct irq_affinity *affd, int flags);
>  void __pci_restore_msi_state(struct pci_dev *dev);
>  void __pci_restore_msix_state(struct pci_dev *dev);
>  


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 5/7] PCI/MSI: Add Devres managed IRQ vectors allocation
  2026-04-24  7:57 ` [PATCH v4 5/7] PCI/MSI: Add Devres managed IRQ vectors allocation Shawn Lin
@ 2026-05-12 13:28   ` Philipp Stanner
  0 siblings, 0 replies; 16+ messages in thread
From: Philipp Stanner @ 2026-05-12 13:28 UTC (permalink / raw)
  To: Shawn Lin, Bjorn Helgaas
  Cc: Nirmal Patel, Jonathan Derrick, Kurt Schwemmer, Logan Gunthorpe,
	Philipp Stanner, linux-pci

On Fri, 2026-04-24 at 15:57 +0800, Shawn Lin wrote:
> There is a long-standing design ambiguity in the PCI/MSI subsystem
> regarding the management of IRQ vectors. Currently, the management
> operates in a "hybrid" mode. Sometimes it's handled by devres but
> sometimes not, creating potential for resource management bugs.
> 
> Historically, pcim_enable_device() implicitly triggers automatic IRQ
> vector management when calling pci_alloc_irq_vectors[_affinity], as
> pcim_enable_device() sets the is_managed flag, thus pcim_msi_release()
> will register a cleanup action. However, using pci_enable_device()
> will not make pci_alloc_irq_vectors[_affinity] register a cleanup
> action.
> 
> This ambiguity causes problems when drivers follow a common but
> hazardous pattern:
> 1. Using pcim_enable_device() (which implicitly marks IRQs as managed)
> 2. Explicitly calling pci_alloc_irq_vectors() for IRQ allocation
> 3. Also calling pci_free_irq_vectors() in error paths or remove routines
> 
> In this scenario, the devres framework may attempt to free the IRQ
> vectors a second time upon device release, leading to double-free
> issues. Analysis of the tree shows this hazardous pattern exists
> in multiple drivers, while 37 other drivers correctly rely solely
> on the implicit cleanup.
> 
> To resolve this ambiguity, introduce explicit managed APIs for IRQ
> vector allocation:
> - pcim_alloc_irq_vectors()
> - pcim_alloc_irq_vectors_affinity()
> 
> These functions are the devres-managed counterparts to
> pci_alloc_irq_vectors[_affinity](). Drivers that wish to have
> devres-managed IRQ vectors should use these new APIs instead.
> 
> The long-term goal is to convert all drivers which use pcim_enable_device()
> in combination with pci_alloc_irq_vectors[_affinity]() to use these managed
> functions, and eventually remove the problematic hybrid management
> pattern from pcim_enable_device() and pcim_setup_msi_release() entirely.
> This patch lays the foundation by introducing the APIs.

Cool stuff.


Though what I'd also do is to warn somewhere about the current API
being hybrid and fishy and therefore users should be careful with
pcim_enable_device().
That's not the case right now, is it?


Regards
P.

> 
> Suggested-by: Philipp Stanner <phasta@kernel.org>
> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
> 
> ---
> 
> Changes in v4:
> - Rework to create dedicated devres-managed function (Philipp)
> 
> Changes in v3:
> - Rework the commit message and function doc (Philipp)
> - Remove setting is_msi_managed flag from new APIs (Philipp)
> 
> Changes in v2:
> - Rebase
> - Introduce patches only for PCI subsystem to convert the API
> 
>  drivers/pci/msi/api.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pci.h   | 22 ++++++++++++++
>  2 files changed, 106 insertions(+)
> 
> diff --git a/drivers/pci/msi/api.c b/drivers/pci/msi/api.c
> index c18559b..405da7b 100644
> --- a/drivers/pci/msi/api.c
> +++ b/drivers/pci/msi/api.c
> @@ -297,6 +297,90 @@ int pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
>  EXPORT_SYMBOL(pci_alloc_irq_vectors_affinity);
>  
>  /**
> + * pcim_alloc_irq_vectors - devres managed pci_alloc_irq_vectors()
> + * @dev: the PCI device to operate on
> + * @min_vecs: minimum number of vectors required (must be >= 1)
> + * @max_vecs: maximum (desired) number of vectors
> + * @flags: flags for this allocation, see pci_alloc_irq_vectors()
> + *
> + * This is a device resource managed version of pci_alloc_irq_vectors().
> + * Interrupt vectors are automatically freed on driver detach by devres.
> + * Drivers MUST NOT call pci_free_irq_vectors().
> + *
> + * Returns number of vectors allocated (which might be smaller than
> + * @max_vecs) on success, or a negative error code on failure.
> + */
> +int pcim_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
> +			   unsigned int max_vecs, unsigned int flags)
> +{
> +	return pcim_alloc_irq_vectors_affinity(dev, min_vecs, max_vecs,
> +					       flags, NULL);
> +}
> +EXPORT_SYMBOL(pcim_alloc_irq_vectors);
> +
> +/**
> + * pcim_alloc_irq_vectors_affinity - devres managed pci_alloc_irq_vectors_affinity()
> + * @dev: the PCI device to operate on
> + * @min_vecs: minimum number of vectors required (must be >= 1)
> + * @max_vecs: maximum (desired) number of vectors
> + * @flags: flags for this allocation, see pci_alloc_irq_vectors()
> + * @affd: optional description of the affinity requirements
> + *
> + * This is a device resource managed version of pci_alloc_irq_vectors_affinity().
> + * Interrupt vectors are automatically freed on driver detach by devres.
> + * Drivers MUST NOT call pci_free_irq_vectors().
> + *
> + * Returns number of vectors allocated (which might be smaller than
> + * @max_vecs) on success, or a negative error code on failure.
> + */
> +int pcim_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
> +				    unsigned int max_vecs, unsigned int flags,
> +				    struct irq_affinity *affd)
> +{
> +	struct irq_affinity msi_default_affd = {0};
> +	int nvecs = -ENOSPC;
> +
> +	if (flags & PCI_IRQ_AFFINITY) {
> +		if (!affd)
> +			affd = &msi_default_affd;
> +	} else {
> +		if (WARN_ON(affd))
> +			affd = NULL;
> +	}
> +
> +	if (flags & PCI_IRQ_MSIX) {
> +		nvecs = __pcim_enable_msix_range(dev, NULL, min_vecs, max_vecs,
> +						 affd, flags);
> +		if (nvecs > 0)
> +			return nvecs;
> +	}
> +
> +	if (flags & PCI_IRQ_MSI) {
> +		nvecs = __pcim_enable_msi_range(dev, min_vecs, max_vecs, affd);
> +		if (nvecs > 0)
> +			return nvecs;
> +	}
> +
> +	/* use INTx IRQ if allowed */
> +	if (flags & PCI_IRQ_INTX) {
> +		if (min_vecs == 1 && dev->irq) {
> +			/*
> +			 * Invoke the affinity spreading logic to ensure that
> +			 * the device driver can adjust queue configuration
> +			 * for the single interrupt case.
> +			 */
> +			if (affd)
> +				kfree(irq_create_affinity_masks(1, affd));
> +			pci_intx(dev, 1);
> +			return 1;
> +		}
> +	}
> +
> +	return nvecs;
> +}
> +EXPORT_SYMBOL(pcim_alloc_irq_vectors_affinity);
> +
> +/**
>   * pci_irq_vector() - Get Linux IRQ number of a device interrupt vector
>   * @dev: the PCI device to operate on
>   * @nr:  device-relative interrupt vector index (0-based); has different
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 2c44545..6eab553 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1770,6 +1770,12 @@ int pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
>  				   unsigned int max_vecs, unsigned int flags,
>  				   struct irq_affinity *affd);
>  
> +int pcim_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
> +			   unsigned int max_vecs, unsigned int flags);
> +int pcim_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
> +				    unsigned int max_vecs, unsigned int flags,
> +				    struct irq_affinity *affd);
> +
>  bool pci_msix_can_alloc_dyn(struct pci_dev *dev);
>  struct msi_map pci_msix_alloc_irq_at(struct pci_dev *dev, unsigned int index,
>  				     const struct irq_affinity_desc *affdesc);
> @@ -1812,6 +1818,22 @@ pci_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
>  					      flags, NULL);
>  }
>  
> +static inline int
> +pcim_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
> +			       unsigned int max_vecs, unsigned int flags,
> +			       struct irq_affinity *aff_desc)
> +{
> +	return pci_alloc_irq_vectors_affinity(dev, min_vecs, max_vecs,
> +					      flags, aff_desc);
> +}
> +static inline int
> +pcim_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
> +		      unsigned int max_vecs, unsigned int flags)
> +{
> +	return pcim_alloc_irq_vectors_affinity(dev, min_vecs, max_vecs,
> +					      flags, NULL);
> +}
> +
>  static inline bool pci_msix_can_alloc_dyn(struct pci_dev *dev)
>  { return false; }
>  static inline struct msi_map pci_msix_alloc_irq_at(struct pci_dev *dev, unsigned int index,


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2026-05-12 13:28 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-24  7:57 [PATCH v4 0/7] Add Devres managed IRQ vectors allocation Shawn Lin
2026-04-24  7:57 ` [PATCH v4 1/7] PCI/MSI: Split __pci_enable_msi_range() for reuse Shawn Lin
2026-05-12 13:08   ` Philipp Stanner
2026-05-12 13:09     ` Philipp Stanner
2026-04-24  7:57 ` [PATCH v4 2/7] PCI/MSI: Split __pci_enable_msix_range() " Shawn Lin
2026-05-12 13:20   ` Philipp Stanner
2026-04-24  7:57 ` [PATCH v4 3/7] PCI/MSI: Introduce __pcim_enable_msi_range() Shawn Lin
2026-05-12 13:22   ` Philipp Stanner
2026-04-24  7:57 ` [PATCH v4 4/7] PCI/MSI: Introduce __pcim_enable_msix_range() Shawn Lin
2026-05-12 13:26   ` Philipp Stanner
2026-04-24  7:57 ` [PATCH v4 5/7] PCI/MSI: Add Devres managed IRQ vectors allocation Shawn Lin
2026-05-12 13:28   ` Philipp Stanner
2026-04-24  7:57 ` [PATCH v4 6/7] PCI: switchtec: Replace pci_alloc_irq_vectors() with pcim_alloc_irq_vectors() Shawn Lin
2026-04-24  7:57 ` [PATCH v4 7/7] PCI: vmd: " Shawn Lin
2026-05-12 12:57   ` Manivannan Sadhasivam
2026-04-24 11:48 ` [PATCH v4 0/7] Add Devres managed IRQ vectors allocation Philipp Stanner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox