public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/2] PCI: Allow disabling port services on broken root ports
@ 2026-03-31 17:56 Han Gao
  2026-03-31 17:56 ` [PATCH 1/2] PCI: Add per-device flag to disable native PCIe port services Han Gao
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Han Gao @ 2026-03-31 17:56 UTC (permalink / raw)
  To: Bjorn Helgaas, Uwe Kleine-König, Jonathan Cameron,
	Lukas Wunner, Ilpo Järvinen, Kees Cook, Han Gao, Chen Wang,
	Manivannan Sadhasivam
  Cc: linux-pci, sophgo, linux-kernel, linux-riscv, Han Gao,
	Icenowy Zheng, Inochi Amaoto, Vivian Wang, Yao Zi

Some PCIe root ports break MSI delivery to downstream devices when
native port services (AER, PME, bwctrl, etc.) are active. The existing
pcie_ports=compat kernel parameter works around this globally, but
affects all ports on the system.

This series adds a per-device mechanism to skip port service probing:
  1. Introduce PCI_DEV_FLAGS_NO_PORT_SERVICES flag and wire it into
     the PCIe port driver
  2. Apply it via quirk to Sophgo SG2042 root ports [1f1c:2042], which
     fail to deliver MSI interrupts when port services are enabled

SG2042's PCIe root ports only support MSI, not MSI-X. The MSI
controller provides only 32 vectors shared across all devices behind
each root port. When native port services claim vectors from this
limited pool, downstream devices are starved of interrupts, resulting
in zero interrupts delivered and driver timeouts (e.g. amdgpu fence
fallback timer expired on all rings).

Han Gao (2):
  PCI: Add per-device flag to disable native PCIe port services
  PCI: Add quirk to disable PCIe port services on Sophgo SG2042

 drivers/pci/pcie/portdrv.c |  3 +++
 drivers/pci/quirks.c       | 12 ++++++++++++
 include/linux/pci.h        |  2 ++
 include/linux/pci_ids.h    |  2 ++
 4 files changed, 19 insertions(+)

-- 
2.47.3



* [PATCH 1/2] PCI: Add per-device flag to disable native PCIe port services
  2026-03-31 17:56 [PATCH 0/2] PCI: Allow disabling port services on broken root ports Han Gao
@ 2026-03-31 17:56 ` Han Gao
  2026-03-31 17:56 ` [PATCH 2/2] PCI: Add quirk to disable PCIe port services on Sophgo SG2042 Han Gao
  2026-03-31 18:58 ` [PATCH 0/2] PCI: Allow disabling port services on broken root ports Lukas Wunner
  2 siblings, 0 replies; 12+ messages in thread
From: Han Gao @ 2026-03-31 17:56 UTC (permalink / raw)
  To: Bjorn Helgaas, Uwe Kleine-König, Jonathan Cameron,
	Lukas Wunner, Ilpo Järvinen, Kees Cook, Han Gao, Chen Wang,
	Manivannan Sadhasivam
  Cc: linux-pci, sophgo, linux-kernel, linux-riscv, Han Gao,
	Icenowy Zheng, Inochi Amaoto, Vivian Wang, Yao Zi, stable

Add PCI_DEV_FLAGS_NO_PORT_SERVICES to allow quirks to prevent the PCIe
port service driver from probing specific devices. This provides a
per-device equivalent of the global pcie_ports=compat kernel parameter.

Some platforms have PCIe root ports that break MSI delivery to downstream
devices when native port services (AER, PME, bwctrl, etc.) are active.
The existing pci_host_bridge native_* flags do not cover all services
(notably bwctrl), so a mechanism to skip port driver probing entirely
on a per-device basis is needed.

Cc: stable@vger.kernel.org
Signed-off-by: Han Gao <gaohan@iscas.ac.cn>
---
 drivers/pci/pcie/portdrv.c | 3 +++
 include/linux/pci.h        | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
index 2d6aa488fe7b..3386818d200d 100644
--- a/drivers/pci/pcie/portdrv.c
+++ b/drivers/pci/pcie/portdrv.c
@@ -685,6 +685,9 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
 static int pcie_portdrv_probe(struct pci_dev *dev,
 					const struct pci_device_id *id)
 {
+	if (dev->dev_flags & PCI_DEV_FLAGS_NO_PORT_SERVICES)
+		return -ENODEV;
+
 	int type = pci_pcie_type(dev);
 	int status;
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 1c270f1d5123..e038fe14ef78 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -253,6 +253,8 @@ enum pci_dev_flags {
 	 * integrated with the downstream devices and doesn't use real PCI.
 	 */
 	PCI_DEV_FLAGS_PCI_BRIDGE_NO_ALIAS = (__force pci_dev_flags_t) (1 << 14),
+	/* Do not use native PCIe port services (equivalent to pcie_ports=compat) */
+	PCI_DEV_FLAGS_NO_PORT_SERVICES = (__force pci_dev_flags_t) (1 << 15),
 };
 
 enum pci_irq_reroute_variant {
-- 
2.47.3



* [PATCH 2/2] PCI: Add quirk to disable PCIe port services on Sophgo SG2042
  2026-03-31 17:56 [PATCH 0/2] PCI: Allow disabling port services on broken root ports Han Gao
  2026-03-31 17:56 ` [PATCH 1/2] PCI: Add per-device flag to disable native PCIe port services Han Gao
@ 2026-03-31 17:56 ` Han Gao
  2026-05-01 16:53   ` Manivannan Sadhasivam
  2026-03-31 18:58 ` [PATCH 0/2] PCI: Allow disabling port services on broken root ports Lukas Wunner
  2 siblings, 1 reply; 12+ messages in thread
From: Han Gao @ 2026-03-31 17:56 UTC (permalink / raw)
  To: Bjorn Helgaas, Uwe Kleine-König, Jonathan Cameron,
	Lukas Wunner, Ilpo Järvinen, Kees Cook, Han Gao, Chen Wang,
	Manivannan Sadhasivam
  Cc: linux-pci, sophgo, linux-kernel, linux-riscv, Han Gao,
	Icenowy Zheng, Inochi Amaoto, Vivian Wang, Yao Zi, stable

SG2042's PCIe root ports [1f1c:2042] fail to deliver MSI interrupts to
downstream devices when native port services are enabled. Devices under
an affected root port receive zero interrupts despite successful vector
allocation, causing driver timeouts (e.g. amdgpu fence fallback timer
expired on all rings).

Set PCI_DEV_FLAGS_NO_PORT_SERVICES on SG2042 root ports to prevent the
port service driver from probing, restoring correct MSI delivery.

Fixes: 1c72774df028 ("PCI: sg2042: Add Sophgo SG2042 PCIe driver")
Cc: stable@vger.kernel.org
Signed-off-by: Han Gao <gaohan@iscas.ac.cn>
---
 drivers/pci/quirks.c    | 12 ++++++++++++
 include/linux/pci_ids.h |  2 ++
 2 files changed, 14 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 48946cca4be7..bbde482ff7cb 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -6380,3 +6380,15 @@ static void pci_mask_replay_timer_timeout(struct pci_dev *pdev)
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9750, pci_mask_replay_timer_timeout);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9755, pci_mask_replay_timer_timeout);
 #endif
+
+/*
+ * SG2042's PCIe root ports do not correctly deliver MSI interrupts to
+ * downstream devices when native PCIe port services are enabled. All
+ * services including bwctrl must be disabled, equivalent to pcie_ports=compat.
+ */
+static void quirk_sg2042_no_port_services(struct pci_dev *dev)
+{
+	pci_info(dev, "SG2042: disabling native PCIe port services\n");
+	dev->dev_flags |= PCI_DEV_FLAGS_NO_PORT_SERVICES;
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SOPHGO, 0x2042, quirk_sg2042_no_port_services);
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 406abf629be2..9663be526dd0 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2630,6 +2630,8 @@
 
 #define PCI_VENDOR_ID_CXL		0x1e98
 
+#define PCI_VENDOR_ID_SOPHGO		0x1f1c
+
 #define PCI_VENDOR_ID_TEHUTI		0x1fc9
 #define PCI_DEVICE_ID_TEHUTI_3009	0x3009
 #define PCI_DEVICE_ID_TEHUTI_3010	0x3010
-- 
2.47.3



* Re: [PATCH 0/2] PCI: Allow disabling port services on broken root ports
  2026-03-31 17:56 [PATCH 0/2] PCI: Allow disabling port services on broken root ports Han Gao
  2026-03-31 17:56 ` [PATCH 1/2] PCI: Add per-device flag to disable native PCIe port services Han Gao
  2026-03-31 17:56 ` [PATCH 2/2] PCI: Add quirk to disable PCIe port services on Sophgo SG2042 Han Gao
@ 2026-03-31 18:58 ` Lukas Wunner
  2026-03-31 19:07   ` Han Gao
  2 siblings, 1 reply; 12+ messages in thread
From: Lukas Wunner @ 2026-03-31 18:58 UTC (permalink / raw)
  To: Han Gao
  Cc: Bjorn Helgaas, Uwe Kleine-König, Jonathan Cameron,
	Ilpo Järvinen, Kees Cook, Chen Wang, Manivannan Sadhasivam,
	linux-pci, sophgo, linux-kernel, linux-riscv, Han Gao,
	Icenowy Zheng, Inochi Amaoto, Vivian Wang, Yao Zi

On Wed, Apr 01, 2026 at 01:56:56AM +0800, Han Gao wrote:
> SG2042's PCIe root ports only support MSI, not MSI-X. The MSI
> controller provides only 32 vectors shared across all devices behind
> each root port. When native port services claim vectors from this
> limited pool, downstream devices are starved of interrupts, resulting
> in zero interrupts delivered and driver timeouts (e.g. amdgpu fence
> fallback timer expired on all rings).

Have you considered setting the pci_dev::no_msi flag on the Root Ports
to force them to use INTx interrupts instead of MSI?  That would seem
like a cleaner solution.  There are already several devices for which
the flag is set in drivers/pci/quirks.c, see quirk_no_msi().

> Some PCIe root ports break MSI delivery to downstream devices when
> native port services (AER, PME, bwctrl, etc.) are active. The existing
> pcie_ports=compat kernel parameter works around this globally, but
> affects all ports on the system.
> 
> This series adds a per-device mechanism to skip port service probing:
>   1. Introduce PCI_DEV_FLAGS_NO_PORT_SERVICES flag and wire it into
>      the PCIe port driver
>   2. Apply it via quirk to Sophgo SG2042 root ports [1f1c:2042], which
>      fail to deliver MSI interrupts when port services are enabled

I think we should try to minimize such workarounds or at least make them
as non-intrusive as possible, so please try the no_msi approach instead.

I also don't see why the stable designation is needed TBH.

Thanks,

Lukas


* Re: [PATCH 0/2] PCI: Allow disabling port services on broken root ports
  2026-03-31 18:58 ` [PATCH 0/2] PCI: Allow disabling port services on broken root ports Lukas Wunner
@ 2026-03-31 19:07   ` Han Gao
  0 siblings, 0 replies; 12+ messages in thread
From: Han Gao @ 2026-03-31 19:07 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Han Gao, Bjorn Helgaas, Uwe Kleine-König, Jonathan Cameron,
	Ilpo Järvinen, Kees Cook, Chen Wang, Manivannan Sadhasivam,
	linux-pci, sophgo, linux-kernel, linux-riscv, Icenowy Zheng,
	Inochi Amaoto, Vivian Wang, Yao Zi

On Wed, Apr 1, 2026 at 2:58 AM Lukas Wunner <lukas@wunner.de> wrote:
>
> On Wed, Apr 01, 2026 at 01:56:56AM +0800, Han Gao wrote:
> > SG2042's PCIe root ports only support MSI, not MSI-X. The MSI
> > controller provides only 32 vectors shared across all devices behind
> > each root port. When native port services claim vectors from this
> > limited pool, downstream devices are starved of interrupts, resulting
> > in zero interrupts delivered and driver timeouts (e.g. amdgpu fence
> > fallback timer expired on all rings).
>
> Have you considered setting the pci_dev::no_msi flag on the Root Ports
> to force them to use INTx interrupts instead of MSI?  That would seem
> like a cleaner solution.  There are already several devices for which
> the flag is set in drivers/pci/quirks.c, see quirk_no_msi().

Unfortunately, the SG2042 has no INTx interrupts.

>
> > Some PCIe root ports break MSI delivery to downstream devices when
> > native port services (AER, PME, bwctrl, etc.) are active. The existing
> > pcie_ports=compat kernel parameter works around this globally, but
> > affects all ports on the system.
> >
> > This series adds a per-device mechanism to skip port service probing:
> >   1. Introduce PCI_DEV_FLAGS_NO_PORT_SERVICES flag and wire it into
> >      the PCIe port driver
> >   2. Apply it via quirk to Sophgo SG2042 root ports [1f1c:2042], which
> >      fail to deliver MSI interrupts when port services are enabled
>
> I think we should try to minimize such workarounds or at least make them
> as non-intrusive as possible, so please try the no_msi approach instead.
>
> I also don't see why the stable designation is needed TBH.

The PCIe driver has been merged as of 6.18, so a stable tag is required.

>
> Thanks,
>
> Lukas

Han


* Re: [PATCH 2/2] PCI: Add quirk to disable PCIe port services on Sophgo SG2042
  2026-03-31 17:56 ` [PATCH 2/2] PCI: Add quirk to disable PCIe port services on Sophgo SG2042 Han Gao
@ 2026-05-01 16:53   ` Manivannan Sadhasivam
  2026-05-02 13:58     ` Icenowy Zheng
  0 siblings, 1 reply; 12+ messages in thread
From: Manivannan Sadhasivam @ 2026-05-01 16:53 UTC (permalink / raw)
  To: Han Gao
  Cc: Bjorn Helgaas, Uwe Kleine-König, Jonathan Cameron,
	Lukas Wunner, Ilpo Järvinen, Kees Cook, Chen Wang, linux-pci,
	sophgo, linux-kernel, linux-riscv, Han Gao, Icenowy Zheng,
	Inochi Amaoto, Vivian Wang, Yao Zi, stable

On Wed, Apr 01, 2026 at 01:56:58AM +0800, Han Gao wrote:
> SG2042's PCIe root ports [1f1c:2042] fail to deliver MSI interrupts to
> downstream devices when native port services are enabled. Devices under
> an affected root port receive zero interrupts despite successful vector
> allocation, causing driver timeouts (e.g. amdgpu fence fallback timer
> expired on all rings).
> 

Have you investigated why the endpoint is not able to deliver MSIs to host when
Port services are enabled? Is it because the portdrv driver consumes all MSIs or
MSIs are masked in hw (if so why? due to hardware issue?) or something else?

Currently, the problem description is very vague.

- Mani

> Set PCI_DEV_FLAGS_NO_PORT_SERVICES on SG2042 root ports to prevent the
> port service driver from probing, restoring correct MSI delivery.
> 
> Fixes: 1c72774df028 ("PCI: sg2042: Add Sophgo SG2042 PCIe driver")
> Cc: stable@vger.kernel.org
> Signed-off-by: Han Gao <gaohan@iscas.ac.cn>
> ---
>  drivers/pci/quirks.c    | 12 ++++++++++++
>  include/linux/pci_ids.h |  2 ++
>  2 files changed, 14 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 48946cca4be7..bbde482ff7cb 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -6380,3 +6380,15 @@ static void pci_mask_replay_timer_timeout(struct pci_dev *pdev)
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9750, pci_mask_replay_timer_timeout);
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9755, pci_mask_replay_timer_timeout);
>  #endif
> +
> +/*
> + * SG2042's PCIe root ports do not correctly deliver MSI interrupts to
> + * downstream devices when native PCIe port services are enabled. All
> + * services including bwctrl must be disabled, equivalent to pcie_ports=compat.
> + */
> +static void quirk_sg2042_no_port_services(struct pci_dev *dev)
> +{
> +	pci_info(dev, "SG2042: disabling native PCIe port services\n");
> +	dev->dev_flags |= PCI_DEV_FLAGS_NO_PORT_SERVICES;
> +}
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SOPHGO, 0x2042, quirk_sg2042_no_port_services);
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index 406abf629be2..9663be526dd0 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -2630,6 +2630,8 @@
>  
>  #define PCI_VENDOR_ID_CXL		0x1e98
>  
> +#define PCI_VENDOR_ID_SOPHGO		0x1f1c
> +
>  #define PCI_VENDOR_ID_TEHUTI		0x1fc9
>  #define PCI_DEVICE_ID_TEHUTI_3009	0x3009
>  #define PCI_DEVICE_ID_TEHUTI_3010	0x3010
> -- 
> 2.47.3
> 

-- 
மணிவண்ணன் சதாசிவம்


* Re: [PATCH 2/2] PCI: Add quirk to disable PCIe port services on Sophgo SG2042
  2026-05-01 16:53   ` Manivannan Sadhasivam
@ 2026-05-02 13:58     ` Icenowy Zheng
  2026-05-02 19:47       ` Lukas Wunner
  0 siblings, 1 reply; 12+ messages in thread
From: Icenowy Zheng @ 2026-05-02 13:58 UTC (permalink / raw)
  To: Manivannan Sadhasivam, Han Gao
  Cc: Bjorn Helgaas, Uwe Kleine-König, Jonathan Cameron,
	Lukas Wunner, Ilpo Järvinen, Kees Cook, Chen Wang, linux-pci,
	sophgo, linux-kernel, linux-riscv, Han Gao, Inochi Amaoto,
	Vivian Wang, Yao Zi, stable

On Fri, 2026-05-01 at 22:23 +0530, Manivannan Sadhasivam wrote:
> On Wed, Apr 01, 2026 at 01:56:58AM +0800, Han Gao wrote:
> > SG2042's PCIe root ports [1f1c:2042] fail to deliver MSI interrupts to
> > downstream devices when native port services are enabled. Devices under
> > an affected root port receive zero interrupts despite successful vector
> > allocation, causing driver timeouts (e.g. amdgpu fence fallback timer
> > expired on all rings).
> > 
> 
> Have you investigated why the endpoint is not able to deliver MSIs to host when
> Port services are enabled? Is it because the portdrv driver consumes all MSIs or
> MSIs are masked in hw (if so why? due to hardware issue?) or something else?

The problem is that the MSI controller has only 16 usable MSIs (it was
previously wrongly described as having 32; a fix for this is pending
[1]), and the failing device has an onboard PCIe switch, which creates
many PCIe ports (and corresponding pcieport devices).

With pcieport devices activated, 11 MSIs are requested by the pcieport
drivers: 3 for SoC PCIe ports and 8 for switch downstream ports. That
leaves only 5 MSIs available, but there are still 10 downstream-facing
PCIe ports (and 5 of them are hardwired to onboard peripherals).
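
The arithmetic in the two paragraphs above can be sketched in C. This is only an illustrative model: the figures (16 usable MSIs, 3 SoC ports, 8 switch downstream ports, 10 downstream-facing ports) come from this message, while the function and constant names are hypothetical and are not kernel APIs. It also assumes that each endpoint behind a downstream-facing port needs one MSI of its own.

```c
#include <assert.h>

/* Figures quoted in this message; the names are hypothetical. */
enum {
	SG2042_USABLE_MSIS      = 16,
	SOC_ROOT_PORTS          = 3,
	SWITCH_DOWNSTREAM_PORTS = 8,
	DOWNSTREAM_FACING_PORTS = 10,
};

/*
 * Vectors left for endpoints after every port has claimed
 * `per_port` vectors for its port services.
 */
static int msis_left_for_endpoints(int usable, int ports, int per_port)
{
	int claimed;

	assert(usable >= 0 && ports >= 0 && per_port >= 0);
	claimed = ports * per_port;
	return claimed >= usable ? 0 : usable - claimed;
}

/*
 * Number of downstream-facing ports whose endpoint cannot get a
 * vector, assuming one MSI per endpoint.
 */
static int endpoints_without_msi(int left, int downstream_ports)
{
	return downstream_ports > left ? downstream_ports - left : 0;
}
```

With the figures above, msis_left_for_endpoints(16, 3 + 8, 1) gives 5 and endpoints_without_msi(5, 10) gives 5. With port services disabled (per_port = 0), all 16 vectors remain available.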

Thanks,
Icenowy

[1]
https://lore.kernel.org/all/20260407160143.1182430-1-zhengxingda@iscas.ac.cn/

> 
> Currently, the problem description is very vague.
> 
> - Mani
> 
> > Set PCI_DEV_FLAGS_NO_PORT_SERVICES on SG2042 root ports to prevent the
> > port service driver from probing, restoring correct MSI delivery.
> > 
> > Fixes: 1c72774df028 ("PCI: sg2042: Add Sophgo SG2042 PCIe driver")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Han Gao <gaohan@iscas.ac.cn>
> > ---
> >  drivers/pci/quirks.c    | 12 ++++++++++++
> >  include/linux/pci_ids.h |  2 ++
> >  2 files changed, 14 insertions(+)
> > 
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index 48946cca4be7..bbde482ff7cb 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -6380,3 +6380,15 @@ static void pci_mask_replay_timer_timeout(struct pci_dev *pdev)
> >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9750, pci_mask_replay_timer_timeout);
> >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9755, pci_mask_replay_timer_timeout);
> >  #endif
> > +
> > +/*
> > + * SG2042's PCIe root ports do not correctly deliver MSI interrupts to
> > + * downstream devices when native PCIe port services are enabled. All
> > + * services including bwctrl must be disabled, equivalent to pcie_ports=compat.
> > + */
> > +static void quirk_sg2042_no_port_services(struct pci_dev *dev)
> > +{
> > +	pci_info(dev, "SG2042: disabling native PCIe port services\n");
> > +	dev->dev_flags |= PCI_DEV_FLAGS_NO_PORT_SERVICES;
> > +}
> > +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SOPHGO, 0x2042, quirk_sg2042_no_port_services);
> > diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> > index 406abf629be2..9663be526dd0 100644
> > --- a/include/linux/pci_ids.h
> > +++ b/include/linux/pci_ids.h
> > @@ -2630,6 +2630,8 @@
> >  
> >  #define PCI_VENDOR_ID_CXL		0x1e98
> >  
> > +#define PCI_VENDOR_ID_SOPHGO		0x1f1c
> > +
> >  #define PCI_VENDOR_ID_TEHUTI		0x1fc9
> >  #define PCI_DEVICE_ID_TEHUTI_3009	0x3009
> >  #define PCI_DEVICE_ID_TEHUTI_3010	0x3010
> > -- 
> > 2.47.3
> > 



* Re: [PATCH 2/2] PCI: Add quirk to disable PCIe port services on Sophgo SG2042
  2026-05-02 13:58     ` Icenowy Zheng
@ 2026-05-02 19:47       ` Lukas Wunner
  2026-05-03  7:10         ` Icenowy Zheng
  0 siblings, 1 reply; 12+ messages in thread
From: Lukas Wunner @ 2026-05-02 19:47 UTC (permalink / raw)
  To: Icenowy Zheng
  Cc: Manivannan Sadhasivam, Han Gao, Bjorn Helgaas,
	Uwe Kleine-König, Jonathan Cameron, Ilpo Järvinen,
	Kees Cook, Chen Wang, linux-pci, sophgo, linux-kernel,
	linux-riscv, Han Gao, Inochi Amaoto, Vivian Wang, Yao Zi, stable

On Sat, May 02, 2026 at 09:58:04PM +0800, Icenowy Zheng wrote:
> The problem is that the MSI controller has only 16 MSIs usable (it's
> wrongly described as 32 previously, a fix to this is pending[1]), and
> the failing device have an onboard PCIe switch, which created many PCIe
> ports (and corresponding pcieport devices).

Is the SG2042 only used in that single product?  If it is used in other
products which do not have an on-board PCIe switch, why do you want to
disable MSIs on those other products as well?

My point is, you want to constrain this to a specific product, not to
the SoC.  Can you maybe solve this by not specifying interrupts in the
devicetree for the PCIe switch?

> With pcieport devices activated, 11 MSIs are requested by the pcieport
> drivers -- 3 SoC PCIe ports and 8 switch downstream ports. Then only 5
> MSIs are available, but there're still 10 downstream-facing PCIe ports
> now (and 5 of them are hardwired to onboard peripherals).

pcieport can make do with a single MSI vector because all port services
support a shared interrupt.  But I assume your point is that this
particular product has so many PCIe ports that you're still close
to the 16 MSIs limit?

Thanks,

Lukas


* Re: [PATCH 2/2] PCI: Add quirk to disable PCIe port services on Sophgo SG2042
  2026-05-02 19:47       ` Lukas Wunner
@ 2026-05-03  7:10         ` Icenowy Zheng
  2026-05-03  8:52           ` Lukas Wunner
  0 siblings, 1 reply; 12+ messages in thread
From: Icenowy Zheng @ 2026-05-03  7:10 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Manivannan Sadhasivam, Han Gao, Bjorn Helgaas,
	Uwe Kleine-König, Jonathan Cameron, Ilpo Järvinen,
	Kees Cook, Chen Wang, linux-pci, sophgo, linux-kernel,
	linux-riscv, Han Gao, Inochi Amaoto, Vivian Wang, Yao Zi, stable

On Sat, 2026-05-02 at 21:47 +0200, Lukas Wunner wrote:
> On Sat, May 02, 2026 at 09:58:04PM +0800, Icenowy Zheng wrote:
> > The problem is that the MSI controller has only 16 MSIs usable (it's
> > wrongly described as 32 previously, a fix to this is pending[1]), and
> > the failing device have an onboard PCIe switch, which created many PCIe
> > ports (and corresponding pcieport devices).
> 
> Is the SG2042 only used in that single product?  If it is used in other
> products which do not have an on-board PCIe switch, why do you want to
> disable MSIs on those other products as well?

It's used in multiple products, but only one of them (EVBv1, an early
EVB available to just a few people, including me) lacks an onboard
switch. Because the SG2042 is short on on-chip peripherals, all other
devices (including the two mainlined ones, EVBv2 and Milk-V Pioneer, and
the unmainlined dual-socket rack servers) have an onboard switch to
compensate. Milk-V Pioneer is probably the most popular device because
it was available off the shelf.

> 
> My point is, you want to constrain this to a specific product, not to
> the SoC.  Can you maybe solve this by not specifying interrupts in the
> devicetree for the PCIe switch?

The PCIe switches are not described in the device tree at all, because
they're all just discoverable; can we describe them in the DT and
redirect their interrupts to void?

> 
> > With pcieport devices activated, 11 MSIs are requested by the pcieport
> > drivers -- 3 SoC PCIe ports and 8 switch downstream ports. Then only 5
> > MSIs are available, but there're still 10 downstream-facing PCIe ports
> > now (and 5 of them are hardwired to onboard peripherals).
> 
> pcieport can make do with a single MSI vector because all port services
> support a shared interrupt.  But I assume your point is that this
> particular product has so many PCIe ports that you're still close
> to the 16 MSIs limit?

Yes, the different services of the same port now share a single MSI
(the 3 native ports have PME, aerdrv and bwctrl sharing the same IRQ,
while the only service available for switch downstream ports is bwctrl).
However, there are 11 ports (3 native ports + 8 switch downstream ports),
so this still leaves too little room for other cards.

Thanks,
Icenowy

> 
> Thanks,
> 
> Lukas
> 



* Re: [PATCH 2/2] PCI: Add quirk to disable PCIe port services on Sophgo SG2042
  2026-05-03  7:10         ` Icenowy Zheng
@ 2026-05-03  8:52           ` Lukas Wunner
  2026-05-06 13:39             ` Manivannan Sadhasivam
  0 siblings, 1 reply; 12+ messages in thread
From: Lukas Wunner @ 2026-05-03  8:52 UTC (permalink / raw)
  To: Icenowy Zheng
  Cc: Manivannan Sadhasivam, Han Gao, Bjorn Helgaas,
	Uwe Kleine-König, Jonathan Cameron, Ilpo Järvinen,
	Kees Cook, Chen Wang, linux-pci, sophgo, linux-kernel,
	linux-riscv, Han Gao, Inochi Amaoto, Vivian Wang, Yao Zi, stable

On Sun, May 03, 2026 at 03:10:58PM +0800, Icenowy Zheng wrote:
> It's used in multiple products, but only one of them (EVBv1, which is
> just an early EVB available for a few people including me) lacks an
> onboard switch, because SG2042 is short on on-chip peripherals. All
> other devices (including two mainlined ones, EVBv2 and Milk-V Pioneer,
> and unmainlined dual socket rack servers; Milk-V Pioneer should be the
> most popular device because it was on shelf) have an onboard switch to
> mitigate the lack of on-chip peripherals in SG2042.

Who knows, maybe someone will design a product which doesn't attach
a PCIe switch to the SoC, maybe the lack of peripherals isn't a
problem for them.

It seems reasonable to accommodate such non-switch use cases as well,
so I think you definitely do not want to quirk all products using that
SoC but only those that need it, regardless of whether they're the
majority.

> > My point is, you want to constrain this to a specific product, not to
> > the SoC.  Can you maybe solve this by not specifying interrupts in
> > the devicetree for the PCIe switch?
> 
> The PCIe switches are not described in the device tree at all, because
> they're all just discoverable; can we describe them in the DT and
> redirect their interrupts to void?

Yes, somebody did a writeup how to represent switches and endpoints
in the devicetree:

https://farlepet.github.io/linux/2024/02/20/using-linux-device-tree-with-pcie-devices.html

And then I would try providing an empty "interrupts" property for
those switch ports for which you want to avoid port services being
instantiated.

That way you could selectively *enable* port services for specific
ports where it's useful.  Let's say you need DPC on a specific port
to contain errors of an attached NVMe drive.  Just assign a single
MSI for that port and assign no MSIs for all the others.  Much more
flexible than globally disabling port services.

Thanks,

Lukas


* Re: [PATCH 2/2] PCI: Add quirk to disable PCIe port services on Sophgo SG2042
  2026-05-03  8:52           ` Lukas Wunner
@ 2026-05-06 13:39             ` Manivannan Sadhasivam
  2026-05-06 14:22               ` Icenowy Zheng
  0 siblings, 1 reply; 12+ messages in thread
From: Manivannan Sadhasivam @ 2026-05-06 13:39 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Icenowy Zheng, Han Gao, Bjorn Helgaas, Uwe Kleine-König,
	Jonathan Cameron, Ilpo Järvinen, Kees Cook, Chen Wang,
	linux-pci, sophgo, linux-kernel, linux-riscv, Han Gao,
	Inochi Amaoto, Vivian Wang, Yao Zi, stable

On Sun, May 03, 2026 at 10:52:06AM +0200, Lukas Wunner wrote:
> On Sun, May 03, 2026 at 03:10:58PM +0800, Icenowy Zheng wrote:
> > It's used in multiple products, but only one of them (EVBv1, which is
> > just an early EVB available for a few people including me) lacks an
> > onboard switch, because SG2042 is short on on-chip peripherals. All
> > other devices (including two mainlined ones, EVBv2 and Milk-V Pioneer,
> > and unmainlined dual socket rack servers; Milk-V Pioneer should be the
> > most popular device because it was on shelf) have an onboard switch to
> > mitigate the lack of on-chip peripherals in SG2042.
> 
> Who knows, maybe someone will design a product which doesn't attach
> a PCIe switch to the SoC, maybe the lack of peripherals isn't a
> problem for them.
> 
> It seems reasonable to accommodate such non-switch use cases as well,
> so I think you definitely do not want to quirk all products using that
> SoC but only those that need it, regardless whether it's the majority.
> 
> > > My point is, you want to constrain this to a specific product, not to
> > > the SoC.  Can you maybe solve this by not specifying interrupts in
> > > the devicetree for the PCIe switch?
> > 
> > The PCIe switches are not described in the device tree at all, because
> > they're all just discoverable; can we describe them in the DT and
> > redirect their interrupts to void?
> 
> Yes, somebody did a writeup how to represent switches and endpoints
> in the devicetree:
> 
> https://farlepet.github.io/linux/2024/02/20/using-linux-device-tree-with-pcie-devices.html
> 

I wouldn't recommend going this far... We do have some switches described in DT,
but they have some resource requirements like regulator, i2c...

> And then I would try providing an empty "interrupts" property for
> those switch ports for which you want to avoid port services being
> instantiated.
> 

There is no 'interrupts' property in the DT binding for PCI bridge nodes.
There is 'interrupt-map', but that's used for mapping INTx to the platform
interrupt controller.

Moreover, DT should just describe the hardware topology/resources, not
platform constraints.

I'd recommend introducing a new cmdline param to the portdrv driver to disable
using MSIs for services. But the platform limitation would hit one way or the
other if one of the endpoints consumes all MSIs...

- Mani

-- 
மணிவண்ணன் சதாசிவம்


* Re: [PATCH 2/2] PCI: Add quirk to disable PCIe port services on Sophgo SG2042
  2026-05-06 13:39             ` Manivannan Sadhasivam
@ 2026-05-06 14:22               ` Icenowy Zheng
  0 siblings, 0 replies; 12+ messages in thread
From: Icenowy Zheng @ 2026-05-06 14:22 UTC (permalink / raw)
  To: Manivannan Sadhasivam, Lukas Wunner
  Cc: Han Gao, Bjorn Helgaas, Uwe Kleine-König, Jonathan Cameron,
	Ilpo Järvinen, Kees Cook, Chen Wang, linux-pci, sophgo,
	linux-kernel, linux-riscv, Han Gao, Inochi Amaoto, Vivian Wang,
	Yao Zi, stable

On Wed, 2026-05-06 at 19:09 +0530, Manivannan Sadhasivam wrote:
> On Sun, May 03, 2026 at 10:52:06AM +0200, Lukas Wunner wrote:
> > On Sun, May 03, 2026 at 03:10:58PM +0800, Icenowy Zheng wrote:
> > > It's used in multiple products, but only one of them (EVBv1, which is
> > > just an early EVB available for a few people including me) lacks an
> > > onboard switch, because SG2042 is short on on-chip peripherals. All
> > > other devices (including two mainlined ones, EVBv2 and Milk-V Pioneer,
> > > and unmainlined dual socket rack servers; Milk-V Pioneer should be the
> > > most popular device because it was on shelf) have an onboard switch to
> > > mitigate the lack of on-chip peripherals in SG2042.
> > 
> > Who knows, maybe someone will design a product which doesn't attach
> > a PCIe switch to the SoC, maybe the lack of peripherals isn't a
> > problem for them.
> > 
> > It seems reasonable to accommodate such non-switch use cases as well,
> > so I think you definitely do not want to quirk all products using that
> > SoC but only those that need it, regardless whether it's the majority.
> > 
> > > > My point is, you want to constrain this to a specific product, not to
> > > > the SoC.  Can you maybe solve this by not specifying interrupts in
> > > > the devicetree for the PCIe switch?
> > > 
> > > The PCIe switches are not described in the device tree at all, because
> > > they're all just discoverable; can we describe them in the DT and
> > > redirect their interrupts to void?
> > 
> > Yes, somebody did a writeup how to represent switches and endpoints
> > in the devicetree:
> > 
> > https://farlepet.github.io/linux/2024/02/20/using-linux-device-tree-with-pcie-devices.html
> > 
> 
> I wouldn't recommend going this far... We do have some switches described in DT,
> but they have some resource requirements like regulator, i2c...
> 
> > And then I would try providing an empty "interrupts" property for
> > those switch ports for which you want to avoid port services being
> > instantiated.
> > 
> 
> There is no 'interrupts' property in DT binding for PCI bridge nodes. There is
> 'interrupt-map', but that's used for mapping INTx with platform interrupt
> controller.
> 
> Moreover, DT should just describe the hardware topology/resource, not
> platform constraints.
> 
> I'd recommend introducing a new cmdline param to the portdrv driver to disable
> using MSIs for services. But the platform limitation would hit one

Currently the `pcie_ports=compat` command line parameter is used to
work around the situation, and this patch is designed to integrate that
workaround into the kernel.

> way or the other if one of the endpoints consume all MSIs...

I think one EP claiming multiple MSIs (not MSI-X) is an extended
capability of the MSI controller, controlled by the
MSI_FLAG_MULTI_PCI_MSI flag, which isn't supported by the SG2042 MSI
controller driver.

In fact the SG2042 MSI controller wasn't originally designed for PCIe
MSIs: one bit in its doorbell register corresponds to one interrupt.
That's why there are only 16 MSIs available (only one doorbell register
is available and PCI MSI restricts message data to 16 bits), and also
why multiple PCI MSI isn't supported (multiple PCI MSIs must have
consecutive data values, which isn't possible in this case).
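
The incompatibility described above can be illustrated with a small C model. It assumes, as stated in this message, a doorbell in which each interrupt corresponds to exactly one bit of the 16-bit message data. It also relies on the PCI multi-message rule that a function granted a power-of-two number of vectors encodes the vector number in the low-order bits of its message data. The helper names are hypothetical and this is not the SG2042 driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if exactly one bit is set, i.e. the write rings one doorbell bit. */
static bool is_one_hot(uint16_t data)
{
	return data != 0 && (data & (data - 1)) == 0;
}

/*
 * Model of multi-message MSI against a one-bit-per-interrupt doorbell.
 * A function granted `nvec` vectors sends (aligned_base | v) for
 * vector v. The block is usable only if every such value is one-hot.
 */
static bool multi_msi_fits_doorbell(uint16_t base, unsigned int nvec)
{
	uint16_t aligned;
	unsigned int v;

	/* PCI MSI only allows power-of-two vector counts up to 32. */
	assert(nvec >= 1 && nvec <= 32 && (nvec & (nvec - 1)) == 0);

	/* Clear the low bits that the function will overwrite. */
	aligned = (uint16_t)(base & ~(nvec - 1));
	for (v = 0; v < nvec; v++) {
		if (!is_one_hot((uint16_t)(aligned | v)))
			return false;
	}
	return true;
}
```

In this model a single vector with one-hot data fits, for example multi_msi_fits_doorbell(1u << 3, 1). Any larger block fails: vector 0 needs the aligned base itself to be one-hot, and vector 1 then sets a second bit.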

Thanks,
Icenowy

> 
> - Mani



end of thread [~2026-05-06 14:23 UTC]

Thread overview: 12+ messages
2026-03-31 17:56 [PATCH 0/2] PCI: Allow disabling port services on broken root ports Han Gao
2026-03-31 17:56 ` [PATCH 1/2] PCI: Add per-device flag to disable native PCIe port services Han Gao
2026-03-31 17:56 ` [PATCH 2/2] PCI: Add quirk to disable PCIe port services on Sophgo SG2042 Han Gao
2026-05-01 16:53   ` Manivannan Sadhasivam
2026-05-02 13:58     ` Icenowy Zheng
2026-05-02 19:47       ` Lukas Wunner
2026-05-03  7:10         ` Icenowy Zheng
2026-05-03  8:52           ` Lukas Wunner
2026-05-06 13:39             ` Manivannan Sadhasivam
2026-05-06 14:22               ` Icenowy Zheng
2026-03-31 18:58 ` [PATCH 0/2] PCI: Allow disabling port services on broken root ports Lukas Wunner
2026-03-31 19:07   ` Han Gao
