* [PATCH] PCI: take the rescan lock when adding devices during host probe
@ 2024-09-26 13:09 Bartosz Golaszewski
2024-09-30 16:09 ` Konrad Dybcio
2024-10-01 21:11 ` Bjorn Helgaas
0 siblings, 2 replies; 6+ messages in thread
From: Bartosz Golaszewski @ 2024-09-26 13:09 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: linux-pci, linux-kernel, Bartosz Golaszewski, Konrad Dybcio
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Since adding the PCI power control code, we may end up with a race
between the pwrctl platform device rescanning the bus and the host
controller probe function. The latter needs to take the rescan lock when
adding devices or may crash.
Reported-by: Konrad Dybcio <konradybcio@kernel.org>
Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
drivers/pci/probe.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4f68414c3086..f1615805f5b0 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3105,7 +3105,9 @@ int pci_host_probe(struct pci_host_bridge *bridge)
list_for_each_entry(child, &bus->children, node)
pcie_bus_configure_settings(child);
+ pci_lock_rescan_remove();
pci_bus_add_devices(bus);
+ pci_unlock_rescan_remove();
return 0;
}
EXPORT_SYMBOL_GPL(pci_host_probe);
--
2.30.2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] PCI: take the rescan lock when adding devices during host probe
2024-09-26 13:09 [PATCH] PCI: take the rescan lock when adding devices during host probe Bartosz Golaszewski
@ 2024-09-30 16:09 ` Konrad Dybcio
2024-10-01 21:11 ` Bjorn Helgaas
1 sibling, 0 replies; 6+ messages in thread
From: Konrad Dybcio @ 2024-09-30 16:09 UTC (permalink / raw)
To: Bartosz Golaszewski, Bjorn Helgaas
Cc: linux-pci, linux-kernel, Bartosz Golaszewski, Konrad Dybcio
On 26.09.2024 3:09 PM, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> Since adding the PCI power control code, we may end up with a race
> between the pwrctl platform device rescanning the bus and the host
> controller probe function. The latter needs to take the rescan lock when
> adding devices or may crash.
>
> Reported-by: Konrad Dybcio <konradybcio@kernel.org>
> Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> ---
Tested-by: Konrad Dybcio <konradybcio@kernel.org>
Konrad
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] PCI: take the rescan lock when adding devices during host probe
2024-09-26 13:09 [PATCH] PCI: take the rescan lock when adding devices during host probe Bartosz Golaszewski
2024-09-30 16:09 ` Konrad Dybcio
@ 2024-10-01 21:11 ` Bjorn Helgaas
2024-10-02 8:36 ` Bartosz Golaszewski
1 sibling, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2024-10-01 21:11 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Bjorn Helgaas, linux-pci, linux-kernel, Bartosz Golaszewski,
Konrad Dybcio
On Thu, Sep 26, 2024 at 03:09:23PM +0200, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> Since adding the PCI power control code, we may end up with a race
> between the pwrctl platform device rescanning the bus and the host
> controller probe function. The latter needs to take the rescan lock when
> adding devices or may crash.
>
> Reported-by: Konrad Dybcio <konradybcio@kernel.org>
> Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> ---
> drivers/pci/probe.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 4f68414c3086..f1615805f5b0 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -3105,7 +3105,9 @@ int pci_host_probe(struct pci_host_bridge *bridge)
> list_for_each_entry(child, &bus->children, node)
> pcie_bus_configure_settings(child);
>
> + pci_lock_rescan_remove();
> pci_bus_add_devices(bus);
> + pci_unlock_rescan_remove();
Seems like we do need locking here, but don't we need a more
comprehensive change? There are many other callers of
pci_bus_add_devices(), and most of them look similarly unprotected.
> return 0;
> }
> EXPORT_SYMBOL_GPL(pci_host_probe);
> --
> 2.30.2
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] PCI: take the rescan lock when adding devices during host probe
2024-10-01 21:11 ` Bjorn Helgaas
@ 2024-10-02 8:36 ` Bartosz Golaszewski
2024-10-02 20:31 ` Konrad Dybcio
0 siblings, 1 reply; 6+ messages in thread
From: Bartosz Golaszewski @ 2024-10-02 8:36 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Bjorn Helgaas, linux-pci, linux-kernel, Bartosz Golaszewski,
Konrad Dybcio
On Tue, Oct 1, 2024 at 11:11 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Sep 26, 2024 at 03:09:23PM +0200, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >
> > Since adding the PCI power control code, we may end up with a race
> > between the pwrctl platform device rescanning the bus and the host
> > controller probe function. The latter needs to take the rescan lock when
> > adding devices or may crash.
> >
> > Reported-by: Konrad Dybcio <konradybcio@kernel.org>
> > Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
> > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > ---
> > drivers/pci/probe.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index 4f68414c3086..f1615805f5b0 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -3105,7 +3105,9 @@ int pci_host_probe(struct pci_host_bridge *bridge)
> > list_for_each_entry(child, &bus->children, node)
> > pcie_bus_configure_settings(child);
> >
> > + pci_lock_rescan_remove();
> > pci_bus_add_devices(bus);
> > + pci_unlock_rescan_remove();
>
> Seems like we do need locking here, but don't we need a more
> comprehensive change? There are many other callers of
> pci_bus_add_devices(), and most of them look similarly unprotected.
>
From a quick glance it looks like the majority of users are specific
drivers (controller, hotplug, etc.). The calls inside pci_rescan_bus()
and pci_rescan_bus_bridge_resize() are already protected from what I
can tell. I'm not saying that the driver calls shouldn't be fixed but
there's no immediate danger. This however fixes an issue we hit with
PCI core so I'd send it upstream now and then we can think about the
other use-cases.
Bart
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] PCI: take the rescan lock when adding devices during host probe
2024-10-02 8:36 ` Bartosz Golaszewski
@ 2024-10-02 20:31 ` Konrad Dybcio
2024-10-02 20:53 ` Bjorn Helgaas
0 siblings, 1 reply; 6+ messages in thread
From: Konrad Dybcio @ 2024-10-02 20:31 UTC (permalink / raw)
To: Bartosz Golaszewski, Bjorn Helgaas
Cc: Bjorn Helgaas, linux-pci, linux-kernel, Bartosz Golaszewski,
Konrad Dybcio
On 2.10.2024 10:36 AM, Bartosz Golaszewski wrote:
> On Tue, Oct 1, 2024 at 11:11 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>
>> On Thu, Sep 26, 2024 at 03:09:23PM +0200, Bartosz Golaszewski wrote:
>>> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>>>
>>> Since adding the PCI power control code, we may end up with a race
>>> between the pwrctl platform device rescanning the bus and the host
>>> controller probe function. The latter needs to take the rescan lock when
>>> adding devices or may crash.
>>>
>>> Reported-by: Konrad Dybcio <konradybcio@kernel.org>
>>> Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
>>> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>>> ---
>>> drivers/pci/probe.c | 2 ++
>>> 1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>>> index 4f68414c3086..f1615805f5b0 100644
>>> --- a/drivers/pci/probe.c
>>> +++ b/drivers/pci/probe.c
>>> @@ -3105,7 +3105,9 @@ int pci_host_probe(struct pci_host_bridge *bridge)
>>> list_for_each_entry(child, &bus->children, node)
>>> pcie_bus_configure_settings(child);
>>>
>>> + pci_lock_rescan_remove();
>>> pci_bus_add_devices(bus);
>>> + pci_unlock_rescan_remove();
>>
>> Seems like we do need locking here, but don't we need a more
>> comprehensive change? There are many other callers of
>> pci_bus_add_devices(), and most of them look similarly unprotected.
>>
>
> From a quick glance it looks like the majority of users are specific
> drivers (controller, hotplug, etc.). The calls inside pci_rescan_bus()
> and pci_rescan_bus_bridge_resize() are already protected from what I
> can tell. I'm not saying that the driver calls shouldn't be fixed but
> there's no immediate danger. This however fixes an issue we hit with
> PCI core so I'd send it upstream now and then we can think about the
> other use-cases.
Probably worth showing an example of how this can manifest:
removed a device through sysfs and called bus rescan:
[ 63.851901] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 63.861048] Mem abort info:
[ 63.863940] ESR = 0x0000000096000004
[ 63.867822] EC = 0x25: DABT (current EL), IL = 32 bits
[ 63.873291] SET = 0, FnV = 0
[ 63.876440] EA = 0, S1PTW = 0
[ 63.879688] FSC = 0x04: level 0 translation fault
[ 63.884711] Data abort info:
[ 63.887697] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 63.893344] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 63.898549] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 63.904009] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000116c36000
[ 63.910644] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[ 63.917683] Internal error: Oops: 0000000096000004 [#1] SMP
[ 63.923413] Modules linked in:
[ 63.926561] CPU: 1 UID: 0 PID: 530 Comm: bash Tainted: G W <snip> #10176
[ 63.938971] Tainted: [W]=WARN
[ 63.942037] Hardware name: <snip>
[ 63.951239] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 63.958398] pc : __pi_strlen+0x14/0x150
[ 63.962350] lr : kernfs_name_hash+0x24/0x88
[ 63.966668] sp : ffff800083c43af0
[ 63.970089] x29: ffff800083c43af0 x28: ffff519888b83500 x27: 0000000000000000
[ 63.977420] x26: 0000000000000000 x25: 0000000000000000 x24: ffff519888b83500
[ 63.984751] x23: 0000000000000011 x22: ffff519886bd6040 x21: ffff519886bd5b00
[ 63.992081] x20: 0000000000000000 x19: 0000000000000000 x18: ffff80008a0bd0a8
[ 63.999410] x17: 0000000000000001 x16: ffff519888b83de8 x15: ffffa08dea3f5bf0
[ 64.006741] x14: ffff519888b83de8 x13: ffffffffffffffff x12: 0000000000000028
[ 64.014076] x11: ffffa08dea3f5bf0 x10: 0000000000000012 x9 : 0000000000000001
[ 64.021403] x8 : 0101010101010101 x7 : ffffa08de7a5e844 x6 : 0000000000000000
[ 64.028730] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 64.036062] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
[ 64.043390] Call trace:
[ 64.045918] __pi_strlen+0x14/0x150
[ 64.049508] kernfs_find_ns+0x80/0x13c
[ 64.053375] kernfs_remove_by_name_ns+0x54/0xf0
[ 64.058036] sysfs_remove_bin_file+0x24/0x34
[ 64.062436] pci_remove_resource_files+0x3c/0x84
[ 64.067190] pci_remove_sysfs_dev_files+0x28/0x38
[ 64.072025] pci_stop_bus_device+0x8c/0xd8
[ 64.076241] pci_stop_bus_device+0x40/0xd8
[ 64.080463] pci_stop_and_remove_bus_device_locked+0x28/0x48
[ 64.086277] remove_store+0x70/0xb0
[ 64.089878] dev_attr_store+0x20/0x38
[ 64.093649] sysfs_kf_write+0x58/0x78
[ 64.097426] kernfs_fop_write_iter+0xe8/0x184
[ 64.101905] vfs_write+0x2dc/0x308
[ 64.105413] ksys_write+0x7c/0xec
[ 64.108834] __arm64_sys_write+0x24/0x34
[ 64.112880] invoke_syscall+0x48/0x100
[ 64.116744] el0_svc_common+0xb4/0xe8
[ 64.120522] do_el0_svc+0x24/0x34
[ 64.123938] el0_svc+0x58/0xb4
[ 64.127085] el0t_64_sync_handler+0x50/0x12c
[ 64.131474] el0t_64_sync+0x198/0x19c
[ 64.135244] Code: 92402c04 b200c3e8 f13fc09f 5400088c (a9400c02)
[ 64.141508] ---[ end trace 0000000000000000 ]---
Konrad
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] PCI: take the rescan lock when adding devices during host probe
2024-10-02 20:31 ` Konrad Dybcio
@ 2024-10-02 20:53 ` Bjorn Helgaas
0 siblings, 0 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2024-10-02 20:53 UTC (permalink / raw)
To: Konrad Dybcio
Cc: Bartosz Golaszewski, Bjorn Helgaas, linux-pci, linux-kernel,
Bartosz Golaszewski
On Wed, Oct 02, 2024 at 10:31:46PM +0200, Konrad Dybcio wrote:
> On 2.10.2024 10:36 AM, Bartosz Golaszewski wrote:
> > On Tue, Oct 1, 2024 at 11:11 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >>
> >> On Thu, Sep 26, 2024 at 03:09:23PM +0200, Bartosz Golaszewski wrote:
> >>> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >>>
> >>> Since adding the PCI power control code, we may end up with a race
> >>> between the pwrctl platform device rescanning the bus and the host
> >>> controller probe function. The latter needs to take the rescan lock when
> >>> adding devices or may crash.
> >>>
> >>> Reported-by: Konrad Dybcio <konradybcio@kernel.org>
> >>> Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
> >>> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >>> ---
> >>> drivers/pci/probe.c | 2 ++
> >>> 1 file changed, 2 insertions(+)
> >>>
> >>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> >>> index 4f68414c3086..f1615805f5b0 100644
> >>> --- a/drivers/pci/probe.c
> >>> +++ b/drivers/pci/probe.c
> >>> @@ -3105,7 +3105,9 @@ int pci_host_probe(struct pci_host_bridge *bridge)
> >>> list_for_each_entry(child, &bus->children, node)
> >>> pcie_bus_configure_settings(child);
> >>>
> >>> + pci_lock_rescan_remove();
> >>> pci_bus_add_devices(bus);
> >>> + pci_unlock_rescan_remove();
> >>
> >> Seems like we do need locking here, but don't we need a more
> >> comprehensive change? There are many other callers of
> >> pci_bus_add_devices(), and most of them look similarly unprotected.
> >
> > From a quick glance it looks like the majority of users are specific
> > drivers (controller, hotplug, etc.). The calls inside pci_rescan_bus()
> > and pci_rescan_bus_bridge_resize() are already protected from what I
> > can tell. I'm not saying that the driver calls shouldn't be fixed but
> > there's no immediate danger. This however fixes an issue we hit with
> > PCI core so I'd send it upstream now and then we can think about the
> > other use-cases.
Agreed that all current callers of pci_rescan_bus() and
pci_rescan_bus_bridge_resize() already do their own locking. Most of
the hotplug drivers that use pci_bus_add_devices() do their own
locking as well.
pci_host_probe() is used by many controller drivers, but my guess is
that a dozen or so controller drivers that use pci_bus_add_devices()
directly without locking are still at risk. It's sort of an
unfinished project to convert drivers like this over to using
pci_host_probe().
In the meantime, I wish we had a safer interface that could enforce
the locking internally.
> Probably worth showing an example of how this can manifest:
>
> removed a device through sysfs and called bus rescan:
Thanks for this; I was about to ask for it! I don't think we need
*all* the details, but something like the following might help people
recognize if we trip over another instance:
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> Internal error: Oops: 0000000096000004 [#1] SMP
> Call trace:
> __pi_strlen+0x14/0x150
> kernfs_find_ns+0x80/0x13c
> kernfs_remove_by_name_ns+0x54/0xf0
> sysfs_remove_bin_file+0x24/0x34
> pci_remove_resource_files+0x3c/0x84
> pci_remove_sysfs_dev_files+0x28/0x38
> pci_stop_bus_device+0x8c/0xd8
> pci_stop_bus_device+0x40/0xd8
> pci_stop_and_remove_bus_device_locked+0x28/0x48
> remove_store+0x70/0xb0
> dev_attr_store+0x20/0x38
> sysfs_kf_write+0x58/0x78
> kernfs_fop_write_iter+0xe8/0x184
> vfs_write+0x2dc/0x308
Bjorn
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-10-02 20:53 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-26 13:09 [PATCH] PCI: take the rescan lock when adding devices during host probe Bartosz Golaszewski
2024-09-30 16:09 ` Konrad Dybcio
2024-10-01 21:11 ` Bjorn Helgaas
2024-10-02 8:36 ` Bartosz Golaszewski
2024-10-02 20:31 ` Konrad Dybcio
2024-10-02 20:53 ` Bjorn Helgaas
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).