* [PATCH v2] PCI: take the rescan lock when adding devices during host probe
@ 2024-10-03  8:43 Bartosz Golaszewski
  2024-10-03  8:50 ` Manivannan Sadhasivam
  2024-10-10  9:17 ` Bartosz Golaszewski
  0 siblings, 2 replies; 5+ messages in thread
From: Bartosz Golaszewski @ 2024-10-03  8:43 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, linux-kernel, Bartosz Golaszewski, Konrad Dybcio

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Since adding the PCI power control code, we may end up with a race
between the pwrctl platform device rescanning the bus and the host
controller probe function. The latter needs to take the rescan lock when
adding devices or we may end up in an undefined state having two
incompletely added devices and hit the following crash when trying to
remove the device over sysfs:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Internal error: Oops: 0000000096000004 [#1] SMP
Call trace:
  __pi_strlen+0x14/0x150
  kernfs_find_ns+0x80/0x13c
  kernfs_remove_by_name_ns+0x54/0xf0
  sysfs_remove_bin_file+0x24/0x34
  pci_remove_resource_files+0x3c/0x84
  pci_remove_sysfs_dev_files+0x28/0x38
  pci_stop_bus_device+0x8c/0xd8
  pci_stop_bus_device+0x40/0xd8
  pci_stop_and_remove_bus_device_locked+0x28/0x48
  remove_store+0x70/0xb0
  dev_attr_store+0x20/0x38
  sysfs_kf_write+0x58/0x78
  kernfs_fop_write_iter+0xe8/0x184
  vfs_write+0x2dc/0x308
  ksys_write+0x7c/0xec

Reported-by: Konrad Dybcio <konradybcio@kernel.org>
Tested-by: Konrad Dybcio <konradybcio@kernel.org>
Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
v1 -> v2:
- improve the commit message, add example stack trace

 drivers/pci/probe.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4f68414c3086..f1615805f5b0 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3105,7 +3105,9 @@ int pci_host_probe(struct pci_host_bridge *bridge)
 	list_for_each_entry(child, &bus->children, node)
 		pcie_bus_configure_settings(child);
 
+	pci_lock_rescan_remove();
 	pci_bus_add_devices(bus);
+	pci_unlock_rescan_remove();
 	return 0;
 }
 EXPORT_SYMBOL_GPL(pci_host_probe);
-- 
2.30.2
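
Editor's illustration (not part of the patch): a minimal userspace sketch of
the locking pattern the patch applies. It assumes a pthread mutex as a
stand-in for the lock taken by pci_lock_rescan_remove(), and it models
"adding a device" as a check-then-act step that must run only once per
device. All names here (fake_dev, fake_bus, fake_bus_add_devices, run_demo)
are hypothetical and exist only for this example; the real kernel data
structures and code paths are considerably more involved.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_DEVS 8

/* Userspace stand-in for the kernel's global rescan/remove lock. */
static pthread_mutex_t rescan_remove_lock = PTHREAD_MUTEX_INITIALIZER;

struct fake_dev {
	int present;   /* device was found by an earlier scan */
	int added;     /* device has been fully added */
	int add_calls; /* how many times the add step ran for this device */
};

struct fake_bus {
	struct fake_dev devs[MAX_DEVS];
};

/*
 * Hypothetical analog of adding all devices on a bus: every device that
 * is present but not yet added gets added. The check of 'added' and the
 * update are not atomic, so callers must serialize against each other.
 */
static void fake_bus_add_devices(struct fake_bus *bus)
{
	for (size_t i = 0; i < MAX_DEVS; i++) {
		struct fake_dev *d = &bus->devs[i];

		if (d->present && !d->added) {
			d->add_calls++;
			d->added = 1;
		}
	}
}

/* Both paths (host probe and rescan) take the same lock around the add. */
static void *locked_add_thread(void *arg)
{
	struct fake_bus *bus = arg;

	pthread_mutex_lock(&rescan_remove_lock);
	fake_bus_add_devices(bus);
	pthread_mutex_unlock(&rescan_remove_lock);
	return NULL;
}

/*
 * Runs a "host probe" thread and a "rescan" thread concurrently on one
 * bus and returns the largest number of times any single device was
 * added, or -1 if a thread could not be started.
 */
int run_demo(void)
{
	struct fake_bus bus = { 0 };
	pthread_t probe, rescan;
	int max_calls = 0;

	for (size_t i = 0; i < MAX_DEVS; i++)
		bus.devs[i].present = 1;

	if (pthread_create(&probe, NULL, locked_add_thread, &bus) != 0)
		return -1;
	if (pthread_create(&rescan, NULL, locked_add_thread, &bus) != 0) {
		pthread_join(probe, NULL);
		return -1;
	}
	pthread_join(probe, NULL);
	pthread_join(rescan, NULL);

	for (size_t i = 0; i < MAX_DEVS; i++)
		if (bus.devs[i].add_calls > max_calls)
			max_calls = bus.devs[i].add_calls;
	return max_calls;
}
```

Because both threads hold the same mutex around the add step, whichever
thread runs second sees every device already marked as added and skips it.
Without the lock, both threads could pass the "not yet added" check for the
same device, which is the userspace analog of the half-added state described
in the commit message.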



* Re: [PATCH v2] PCI: take the rescan lock when adding devices during host probe
  2024-10-03  8:43 [PATCH v2] PCI: take the rescan lock when adding devices during host probe Bartosz Golaszewski
@ 2024-10-03  8:50 ` Manivannan Sadhasivam
  2024-10-10  9:17 ` Bartosz Golaszewski
  1 sibling, 0 replies; 5+ messages in thread
From: Manivannan Sadhasivam @ 2024-10-03  8:50 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: Bjorn Helgaas, linux-pci, linux-kernel, Bartosz Golaszewski,
	Konrad Dybcio

On Thu, Oct 03, 2024 at 10:43:41AM +0200, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> 
> Since adding the PCI power control code, we may end up with a race
> between the pwrctl platform device rescanning the bus and the host
> controller probe function. The latter needs to take the rescan lock when
> adding devices or we may end up in an undefined state having two
> incompletely added devices and hit the following crash when trying to
> remove the device over sysfs:
> 
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> Internal error: Oops: 0000000096000004 [#1] SMP
> Call trace:
>   __pi_strlen+0x14/0x150
>   kernfs_find_ns+0x80/0x13c
>   kernfs_remove_by_name_ns+0x54/0xf0
>   sysfs_remove_bin_file+0x24/0x34
>   pci_remove_resource_files+0x3c/0x84
>   pci_remove_sysfs_dev_files+0x28/0x38
>   pci_stop_bus_device+0x8c/0xd8
>   pci_stop_bus_device+0x40/0xd8
>   pci_stop_and_remove_bus_device_locked+0x28/0x48
>   remove_store+0x70/0xb0
>   dev_attr_store+0x20/0x38
>   sysfs_kf_write+0x58/0x78
>   kernfs_fop_write_iter+0xe8/0x184
>   vfs_write+0x2dc/0x308
>   ksys_write+0x7c/0xec
> 

Thanks for adding the crash log. It always helps to have the log in the patch
description so that people searching for the crash can find *this* patch.

> Reported-by: Konrad Dybcio <konradybcio@kernel.org>
> Tested-by: Konrad Dybcio <konradybcio@kernel.org>
> Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>

- Mani

> ---
> v1 -> v2:
> - improve the commit message, add example stack trace
> 
>  drivers/pci/probe.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 4f68414c3086..f1615805f5b0 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -3105,7 +3105,9 @@ int pci_host_probe(struct pci_host_bridge *bridge)
>  	list_for_each_entry(child, &bus->children, node)
>  		pcie_bus_configure_settings(child);
>  
> +	pci_lock_rescan_remove();
>  	pci_bus_add_devices(bus);
> +	pci_unlock_rescan_remove();
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(pci_host_probe);
> -- 
> 2.30.2
> 
> 

-- 
மணிவண்ணன் சதாசிவம்


* Re: [PATCH v2] PCI: take the rescan lock when adding devices during host probe
  2024-10-03  8:43 [PATCH v2] PCI: take the rescan lock when adding devices during host probe Bartosz Golaszewski
  2024-10-03  8:50 ` Manivannan Sadhasivam
@ 2024-10-10  9:17 ` Bartosz Golaszewski
  2024-10-12 14:31   ` Bjorn Helgaas
  1 sibling, 1 reply; 5+ messages in thread
From: Bartosz Golaszewski @ 2024-10-10  9:17 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, linux-kernel, Bartosz Golaszewski, Konrad Dybcio

On Thu, Oct 3, 2024 at 10:43 AM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
>
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> Since adding the PCI power control code, we may end up with a race
> between the pwrctl platform device rescanning the bus and the host
> controller probe function. The latter needs to take the rescan lock when
> adding devices or we may end up in an undefined state having two
> incompletely added devices and hit the following crash when trying to
> remove the device over sysfs:
>
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> Internal error: Oops: 0000000096000004 [#1] SMP
> Call trace:
>   __pi_strlen+0x14/0x150
>   kernfs_find_ns+0x80/0x13c
>   kernfs_remove_by_name_ns+0x54/0xf0
>   sysfs_remove_bin_file+0x24/0x34
>   pci_remove_resource_files+0x3c/0x84
>   pci_remove_sysfs_dev_files+0x28/0x38
>   pci_stop_bus_device+0x8c/0xd8
>   pci_stop_bus_device+0x40/0xd8
>   pci_stop_and_remove_bus_device_locked+0x28/0x48
>   remove_store+0x70/0xb0
>   dev_attr_store+0x20/0x38
>   sysfs_kf_write+0x58/0x78
>   kernfs_fop_write_iter+0xe8/0x184
>   vfs_write+0x2dc/0x308
>   ksys_write+0x7c/0xec
>
> Reported-by: Konrad Dybcio <konradybcio@kernel.org>
> Tested-by: Konrad Dybcio <konradybcio@kernel.org>
> Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> ---

It's been a week, so gentle ping - can this be picked up into v6.12?

Thanks,
Bartosz


* Re: [PATCH v2] PCI: take the rescan lock when adding devices during host probe
  2024-10-10  9:17 ` Bartosz Golaszewski
@ 2024-10-12 14:31   ` Bjorn Helgaas
  2024-10-14 12:21     ` Bartosz Golaszewski
  0 siblings, 1 reply; 5+ messages in thread
From: Bjorn Helgaas @ 2024-10-12 14:31 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: Bjorn Helgaas, linux-pci, linux-kernel, Bartosz Golaszewski,
	Konrad Dybcio

On Thu, Oct 10, 2024 at 11:17:47AM +0200, Bartosz Golaszewski wrote:
> On Thu, Oct 3, 2024 at 10:43 AM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
> >
> > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >
> > Since adding the PCI power control code, we may end up with a race
> > between the pwrctl platform device rescanning the bus and the host
> > controller probe function. The latter needs to take the rescan lock when
> > adding devices or we may end up in an undefined state having two
> > incompletely added devices and hit the following crash when trying to
> > remove the device over sysfs:
> >
> > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> > Internal error: Oops: 0000000096000004 [#1] SMP
> > Call trace:
> >   __pi_strlen+0x14/0x150
> >   kernfs_find_ns+0x80/0x13c
> >   kernfs_remove_by_name_ns+0x54/0xf0
> >   sysfs_remove_bin_file+0x24/0x34
> >   pci_remove_resource_files+0x3c/0x84
> >   pci_remove_sysfs_dev_files+0x28/0x38
> >   pci_stop_bus_device+0x8c/0xd8
> >   pci_stop_bus_device+0x40/0xd8
> >   pci_stop_and_remove_bus_device_locked+0x28/0x48
> >   remove_store+0x70/0xb0
> >   dev_attr_store+0x20/0x38
> >   sysfs_kf_write+0x58/0x78
> >   kernfs_fop_write_iter+0xe8/0x184
> >   vfs_write+0x2dc/0x308
> >   ksys_write+0x7c/0xec
> >
> > Reported-by: Konrad Dybcio <konradybcio@kernel.org>
> > Tested-by: Konrad Dybcio <konradybcio@kernel.org>
> > Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
> > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > ---
> 
> It's been a week, so gentle ping - can this be picked up into v6.12?

I hoped we could fix the similar latent issues in other drivers, but
yes, we can get this in v6.12.  Thanks for the hint that it should go
there.  I'll pick it up when I return from vacation on Wednesday.

Bjorn


* Re: [PATCH v2] PCI: take the rescan lock when adding devices during host probe
  2024-10-12 14:31   ` Bjorn Helgaas
@ 2024-10-14 12:21     ` Bartosz Golaszewski
  0 siblings, 0 replies; 5+ messages in thread
From: Bartosz Golaszewski @ 2024-10-14 12:21 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, linux-pci, linux-kernel, Bartosz Golaszewski,
	Konrad Dybcio

On Sat, Oct 12, 2024 at 4:31 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Oct 10, 2024 at 11:17:47AM +0200, Bartosz Golaszewski wrote:
> > On Thu, Oct 3, 2024 at 10:43 AM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
> > >
> > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > >
> > > Since adding the PCI power control code, we may end up with a race
> > > between the pwrctl platform device rescanning the bus and the host
> > > controller probe function. The latter needs to take the rescan lock when
> > > adding devices or we may end up in an undefined state having two
> > > incompletely added devices and hit the following crash when trying to
> > > remove the device over sysfs:
> > >
> > > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> > > Internal error: Oops: 0000000096000004 [#1] SMP
> > > Call trace:
> > >   __pi_strlen+0x14/0x150
> > >   kernfs_find_ns+0x80/0x13c
> > >   kernfs_remove_by_name_ns+0x54/0xf0
> > >   sysfs_remove_bin_file+0x24/0x34
> > >   pci_remove_resource_files+0x3c/0x84
> > >   pci_remove_sysfs_dev_files+0x28/0x38
> > >   pci_stop_bus_device+0x8c/0xd8
> > >   pci_stop_bus_device+0x40/0xd8
> > >   pci_stop_and_remove_bus_device_locked+0x28/0x48
> > >   remove_store+0x70/0xb0
> > >   dev_attr_store+0x20/0x38
> > >   sysfs_kf_write+0x58/0x78
> > >   kernfs_fop_write_iter+0xe8/0x184
> > >   vfs_write+0x2dc/0x308
> > >   ksys_write+0x7c/0xec
> > >
> > > Reported-by: Konrad Dybcio <konradybcio@kernel.org>
> > > Tested-by: Konrad Dybcio <konradybcio@kernel.org>
> > > Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
> > > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > > ---
> >
> > It's been a week, so gentle ping - can this be picked up into v6.12?
>
> I hoped we could fix the similar latent issues in other drivers, but
> yes, we can get this in v6.12.  Thanks for the hint that it should go
> there.  I'll pick it up when I return from vacation on Wednesday.
>

Sure, this can still be done, but this patch fixes an urgent issue and I
think it warrants fast-tracking to mainline.

Bart
