linux-pci.vger.kernel.org archive mirror
* [PATCH V3 0/2] PCI: Fix problems about boot & kexec
From: Huacai Chen @ 2025-05-11  8:34 UTC
  To: Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Rob Herring
  Cc: linux-pci, Jianmin Lv, Xuefeng Li, Huacai Chen, Jiaxun Yang,
	Huacai Chen, Hongchen Zhang

This series fixes two PCI problems with boot and kexec. They were first
observed on Loongson but are not limited to Loongson.

V1 -> V2:
1. Update commit message.

V2 -> V3:
1. Resend two patches as a series.

Hongchen Zhang & Huacai Chen (2):
  PCI: Use local_pci_probe() when best selected cpu is offline
  PCI: Prevent LS7A Bus Master clearing on kexec

Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 drivers/pci/pci-driver.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
---
2.27.0



* [PATCH V3 1/2] PCI: Use local_pci_probe() when best selected cpu is offline
From: Huacai Chen @ 2025-05-11  8:34 UTC
  To: Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Rob Herring
  Cc: linux-pci, Jianmin Lv, Xuefeng Li, Huacai Chen, Jiaxun Yang,
	Hongchen Zhang, stable, Huacai Chen

From: Hongchen Zhang <zhanghongchen@loongson.cn>

When the best selected CPU is offline, work_on_cpu() will get stuck forever.
This can happen if a node is online while all its CPUs are offline
(we can use "maxcpus=1" without "nr_cpus=1" to reproduce it). Therefore,
in this case, we should call local_pci_probe() instead of work_on_cpu().

Cc: <stable@vger.kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn>
---
 drivers/pci/pci-driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index c8bd71a739f7..602838416e6a 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -386,7 +386,7 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 		free_cpumask_var(wq_domain_mask);
 	}
 
-	if (cpu < nr_cpu_ids)
+	if ((cpu < nr_cpu_ids) && cpu_online(cpu))
 		error = work_on_cpu(cpu, local_pci_probe, &ddi);
 	else
 		error = local_pci_probe(&ddi);
-- 
2.47.1
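
For readers following the thread, the decision made by the patched condition
can be sketched as a small userspace model. This is only an illustration under
stated assumptions, not kernel code: the names model_cpu_online(),
model_pick_probe_path() and enum probe_path are hypothetical, and the online
CPU set is modelled as a plain 64-bit mask instead of the kernel's cpumask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of the patched check; not kernel code. */
enum probe_path {
	PROBE_LOCAL,		/* stands in for calling local_pci_probe() directly */
	PROBE_ON_REMOTE_CPU	/* stands in for work_on_cpu(cpu, local_pci_probe, ...) */
};

/* Assumption: bit n of online_mask set means CPU n is online (up to 64 CPUs). */
static bool model_cpu_online(uint64_t online_mask, unsigned int cpu)
{
	return cpu < 64 && ((online_mask >> cpu) & 1u);
}

/*
 * Mirrors the condition in the diff: only hand the probe to another CPU
 * when that CPU is both a valid ID and currently online. Otherwise fall
 * back to probing on the current CPU, so nothing waits on an offline CPU.
 */
static enum probe_path model_pick_probe_path(unsigned int cpu,
					     unsigned int nr_cpu_ids,
					     uint64_t online_mask)
{
	if (cpu < nr_cpu_ids && model_cpu_online(online_mask, cpu))
		return PROBE_ON_REMOTE_CPU;
	return PROBE_LOCAL;
}
```

With "maxcpus=1" and eight possible CPUs, the model would be called with
nr_cpu_ids = 8 and online_mask = 0x1: a selected CPU of 4 passes the old
range check but is offline, so the model picks the local path.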



* [PATCH V3 2/2] PCI: Prevent LS7A Bus Master clearing on kexec
From: Huacai Chen @ 2025-05-11  8:34 UTC
  To: Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Rob Herring
  Cc: linux-pci, Jianmin Lv, Xuefeng Li, Huacai Chen, Jiaxun Yang,
	Huacai Chen, stable, Ming Wang

This is similar to commit 62b6dee1b44a ("PCI/portdrv: Prevent LS7A Bus
Master clearing on shutdown"), but it prevents LS7A Bus Master clearing
on kexec.

The key point of this is to work around the LS7A defect that clearing
PCI_COMMAND_MASTER prevents MMIO requests from going downstream, and
we may still need MMIO after .shutdown(), e.g., to print console
messages. In this case we rely on .shutdown() of the downstream
devices to disable interrupts and DMA.

Only skip Bus Master clearing on bridges because endpoint devices still
need it.

Cc: <stable@vger.kernel.org>
Signed-off-by: Ming Wang <wangming01@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 drivers/pci/pci-driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 602838416e6a..8a1e32367a06 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -517,7 +517,7 @@ static void pci_device_shutdown(struct device *dev)
 	 * If it is not a kexec reboot, firmware will hit the PCI
 	 * devices with big hammer and stop their DMA any way.
 	 */
-	if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
+	if (kexec_in_progress && !pci_is_bridge(pci_dev) && (pci_dev->current_state <= PCI_D3hot))
 		pci_clear_master(pci_dev);
 }
 
-- 
2.47.1
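
The condition changed by the patch above can be illustrated with a small
userspace model. It is a sketch under stated assumptions, not kernel code:
struct model_pci_dev, enum model_power_state and model_should_clear_master()
are hypothetical names, and the is_bridge flag stands in for the result of
pci_is_bridge(). The power states are ordered as in the kernel
(D0 < D1 < D2 < D3hot < D3cold).

```c
#include <stdbool.h>

/* Hypothetical model of the PCI power states, in the kernel's order. */
enum model_power_state {
	MODEL_D0,
	MODEL_D1,
	MODEL_D2,
	MODEL_D3HOT,
	MODEL_D3COLD
};

/* Hypothetical stand-in for the fields of struct pci_dev used by the check. */
struct model_pci_dev {
	bool is_bridge;				/* models pci_is_bridge(pci_dev) */
	enum model_power_state current_state;	/* models pci_dev->current_state */
};

/*
 * Mirrors the patched condition: Bus Master is cleared only during a kexec
 * reboot, only on non-bridge devices, and only while the device is still
 * accessible (D3hot or shallower).
 */
static bool model_should_clear_master(bool kexec_in_progress,
				      const struct model_pci_dev *dev)
{
	return kexec_in_progress && !dev->is_bridge &&
	       dev->current_state <= MODEL_D3HOT;
}
```

In this model a bridge in D0 is never asked to clear Bus Master, while an
endpoint in D0 is asked to clear it only when kexec_in_progress is true.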



* Re: [PATCH V3 1/2] PCI: Use local_pci_probe() when best selected cpu is offline
From: Manivannan Sadhasivam @ 2025-06-13  7:11 UTC
  To: Huacai Chen
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Rob Herring, linux-pci, Jianmin Lv, Xuefeng Li, Huacai Chen,
	Jiaxun Yang, Hongchen Zhang, stable

On Sun, May 11, 2025 at 04:34:12PM +0800, Huacai Chen wrote:
> From: Hongchen Zhang <zhanghongchen@loongson.cn>
> 
> When the best selected CPU is offline, work_on_cpu() will get stuck forever.
> This can happen if a node is online while all its CPUs are offline
> (we can use "maxcpus=1" without "nr_cpus=1" to reproduce it). Therefore,
> in this case, we should call local_pci_probe() instead of work_on_cpu().
> 

Just curious, did you encounter this problem in a real-world use case, or did
you just find it while playing with the maxcpus/nr_cpus parameters?

> Cc: <stable@vger.kernel.org>

I believe the fixes tag for this patch is 873392ca514f8.

- Mani

> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn>
> ---
>  drivers/pci/pci-driver.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index c8bd71a739f7..602838416e6a 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -386,7 +386,7 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
>  		free_cpumask_var(wq_domain_mask);
>  	}
>  
> -	if (cpu < nr_cpu_ids)
> +	if ((cpu < nr_cpu_ids) && cpu_online(cpu))
>  		error = work_on_cpu(cpu, local_pci_probe, &ddi);
>  	else
>  		error = local_pci_probe(&ddi);
> -- 
> 2.47.1
> 

-- 
மணிவண்ணன் சதாசிவம்


* Re: [PATCH V3 2/2] PCI: Prevent LS7A Bus Master clearing on kexec
From: Manivannan Sadhasivam @ 2025-06-13  7:15 UTC
  To: Huacai Chen
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Rob Herring, linux-pci, Jianmin Lv, Xuefeng Li, Huacai Chen,
	Jiaxun Yang, stable, Ming Wang

On Sun, May 11, 2025 at 04:34:13PM +0800, Huacai Chen wrote:
> This is similar to commit 62b6dee1b44a ("PCI/portdrv: Prevent LS7A Bus
> Master clearing on shutdown"), but it prevents LS7A Bus Master clearing
> on kexec.
> 

So 62b6dee1b44a never worked as intended because the PCI core still cleared the
bus master bit?

> The key point of this is to work around the LS7A defect that clearing
> PCI_COMMAND_MASTER prevents MMIO requests from going downstream, and
> we may still need MMIO after .shutdown(), e.g., to print console
> messages. In this case we rely on .shutdown() of the downstream
> devices to disable interrupts and DMA.
> 
> Only skip Bus Master clearing on bridges because endpoint devices still
> need it.
> 
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Ming Wang <wangming01@loongson.cn>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>  drivers/pci/pci-driver.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 602838416e6a..8a1e32367a06 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -517,7 +517,7 @@ static void pci_device_shutdown(struct device *dev)
>  	 * If it is not a kexec reboot, firmware will hit the PCI
>  	 * devices with big hammer and stop their DMA any way.
>  	 */
> -	if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
> +	if (kexec_in_progress && !pci_is_bridge(pci_dev) && (pci_dev->current_state <= PCI_D3hot))

I'm not a kexec expert, but is it safe to skip clearing bus mastering on all
PCI bridges? You are making a generic change for a defect in your hardware, so
it might not apply to all other hardware.

- Mani

-- 
மணிவண்ணன் சதாசிவம்


* Re: [PATCH V3 1/2] PCI: Use local_pci_probe() when best selected cpu is offline
From: Huacai Chen @ 2025-06-19 12:25 UTC
  To: Manivannan Sadhasivam
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Rob Herring, linux-pci, Jianmin Lv,
	Xuefeng Li, Jiaxun Yang, Hongchen Zhang, stable

Hi, Manivannan,

Sorry for the late reply.

On Fri, Jun 13, 2025 at 3:11 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
>
> On Sun, May 11, 2025 at 04:34:12PM +0800, Huacai Chen wrote:
> > From: Hongchen Zhang <zhanghongchen@loongson.cn>
> >
> > When the best selected CPU is offline, work_on_cpu() will get stuck forever.
> > This can happen if a node is online while all its CPUs are offline
> > (we can use "maxcpus=1" without "nr_cpus=1" to reproduce it). Therefore,
> > in this case, we should call local_pci_probe() instead of work_on_cpu().
> >
>
> Just curious, did you encounter this problem in a real-world use case, or did
> you just find it while playing with the maxcpus/nr_cpus parameters?
While debugging kdump, we tried different maxcpus/nr_cpus combinations
and found this problem.

>
> > Cc: <stable@vger.kernel.org>
>
> I believe the fixes tag for this patch is 873392ca514f8.
Yes, but the code has changed many times since then, so this patch cannot
be applied as far back as 873392ca514f8.


Huacai

>
> - Mani
>
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn>
> > ---
> >  drivers/pci/pci-driver.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > index c8bd71a739f7..602838416e6a 100644
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -386,7 +386,7 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
> >               free_cpumask_var(wq_domain_mask);
> >       }
> >
> > -     if (cpu < nr_cpu_ids)
> > +     if ((cpu < nr_cpu_ids) && cpu_online(cpu))
> >               error = work_on_cpu(cpu, local_pci_probe, &ddi);
> >       else
> >               error = local_pci_probe(&ddi);
> > --
> > 2.47.1
> >
>
> --
> மணிவண்ணன் சதாசிவம்


* Re: [PATCH V3 2/2] PCI: Prevent LS7A Bus Master clearing on kexec
From: Huacai Chen @ 2025-06-19 12:27 UTC
  To: Manivannan Sadhasivam
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Rob Herring, linux-pci, Jianmin Lv,
	Xuefeng Li, Jiaxun Yang, stable, Ming Wang

Hi, Manivannan,

Sorry for the late reply.

On Fri, Jun 13, 2025 at 3:15 PM Manivannan Sadhasivam <mani@kernel.org> wrote:
>
> On Sun, May 11, 2025 at 04:34:13PM +0800, Huacai Chen wrote:
> > This is similar to commit 62b6dee1b44a ("PCI/portdrv: Prevent LS7A Bus
> > Master clearing on shutdown"), but it prevents LS7A Bus Master clearing
> > on kexec.
> >
>
> So 62b6dee1b44a never worked as intended because the PCI core still cleared the
> bus master bit?
Commit 62b6dee1b44a only solved the poweroff/reboot problem, because
in those cases kexec_in_progress is false and pci_clear_master() is
skipped.


>
> > The key point of this is to work around the LS7A defect that clearing
> > PCI_COMMAND_MASTER prevents MMIO requests from going downstream, and
> > we may still need MMIO after .shutdown(), e.g., to print console
> > messages. In this case we rely on .shutdown() of the downstream
> > devices to disable interrupts and DMA.
> >
> > Only skip Bus Master clearing on bridges because endpoint devices still
> > need it.
> >
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Ming Wang <wangming01@loongson.cn>
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > ---
> >  drivers/pci/pci-driver.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > index 602838416e6a..8a1e32367a06 100644
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -517,7 +517,7 @@ static void pci_device_shutdown(struct device *dev)
> >        * If it is not a kexec reboot, firmware will hit the PCI
> >        * devices with big hammer and stop their DMA any way.
> >        */
> > -     if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
> > +     if (kexec_in_progress && !pci_is_bridge(pci_dev) && (pci_dev->current_state <= PCI_D3hot))
>
> I'm not a kexec expert, but is it safe to skip clearing bus mastering on all
> PCI bridges? You are making a generic change for a defect in your hardware, so
> it might not apply to all other hardware.
I think most DMA comes from endpoint devices rather than bridges, so
kexec is probably safe. When I solved the problem in commit
62b6dee1b44a, I wanted to make a special case for Loongson, but Bjorn
suggested doing a generic change, so I also made a generic change for
kexec.


Huacai

>
> - Mani
>
> --
> மணிவண்ணன் சதாசிவம்

