* [PATCHv2 1/5] pci: make pci_stop_dev concurrent safe
2024-08-27 19:28 [PATCHv2 0/5] pci cleanup/prep patches Keith Busch
@ 2024-08-27 19:28 ` Keith Busch
2024-10-02 23:39 ` Davidlohr Bueso
2024-08-27 19:28 ` [PATCHv2 2/5] pci: make pci_destroy_dev " Keith Busch
` (3 subsequent siblings)
4 siblings, 1 reply; 19+ messages in thread
From: Keith Busch @ 2024-08-27 19:28 UTC (permalink / raw)
To: linux-pci, bhelgaas; +Cc: Keith Busch, Jonathan Cameron
From: Keith Busch <kbusch@kernel.org>
Use the atomic ADDED bit flag to safely ensure concurrent callers can't
attempt to stop the device multiple times. Callers should currently be
holding the pci_rescan_remove_lock, so there shouldn't be fixing any
existing race. But that global lock can cause lock dependency issues, so
this is preparing to reduce reliance on that lock by using the existing
atomic bit ops.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/pci/bus.c | 2 +-
drivers/pci/pci.h | 9 +++++++--
drivers/pci/remove.c | 18 ++++++++----------
3 files changed, 16 insertions(+), 13 deletions(-)
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 55c8536860518..e41dfece0d969 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -348,7 +348,7 @@ void pci_bus_add_device(struct pci_dev *dev)
if (retval < 0 && retval != -EPROBE_DEFER)
pci_warn(dev, "device attach failed (%d)\n", retval);
- pci_dev_assign_added(dev, true);
+ pci_dev_assign_added(dev);
if (dev_of_node(&dev->dev) && pci_is_bridge(dev)) {
retval = of_platform_populate(dev_of_node(&dev->dev), NULL, NULL,
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 0e9b1c7b94a5a..802f7eac53115 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -444,9 +444,14 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
#define PCI_DPC_RECOVERED 1
#define PCI_DPC_RECOVERING 2
-static inline void pci_dev_assign_added(struct pci_dev *dev, bool added)
+static inline void pci_dev_assign_added(struct pci_dev *dev)
{
- assign_bit(PCI_DEV_ADDED, &dev->priv_flags, added);
+ set_bit(PCI_DEV_ADDED, &dev->priv_flags);
+}
+
+static inline bool pci_dev_test_and_clear_added(struct pci_dev *dev)
+{
+ return test_and_clear_bit(PCI_DEV_ADDED, &dev->priv_flags);
}
static inline bool pci_dev_is_added(const struct pci_dev *dev)
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index 910387e5bdbf9..ec3064a115bf8 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -16,17 +16,15 @@ static void pci_free_resources(struct pci_dev *dev)
static void pci_stop_dev(struct pci_dev *dev)
{
- pci_pme_active(dev, false);
-
- if (pci_dev_is_added(dev)) {
- of_platform_depopulate(&dev->dev);
- device_release_driver(&dev->dev);
- pci_proc_detach_device(dev);
- pci_remove_sysfs_dev_files(dev);
- of_pci_remove_node(dev);
+ if (!pci_dev_test_and_clear_added(dev))
+ return;
- pci_dev_assign_added(dev, false);
- }
+ pci_pme_active(dev, false);
+ of_platform_depopulate(&dev->dev);
+ device_release_driver(&dev->dev);
+ pci_proc_detach_device(dev);
+ pci_remove_sysfs_dev_files(dev);
+ of_pci_remove_node(dev);
}
static void pci_destroy_dev(struct pci_dev *dev)
--
2.43.5
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [PATCHv2 1/5] pci: make pci_stop_dev concurrent safe
2024-08-27 19:28 ` [PATCHv2 1/5] pci: make pci_stop_dev concurrent safe Keith Busch
@ 2024-10-02 23:39 ` Davidlohr Bueso
2024-10-03 0:04 ` Keith Busch
0 siblings, 1 reply; 19+ messages in thread
From: Davidlohr Bueso @ 2024-10-02 23:39 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-pci, bhelgaas, Keith Busch, Jonathan Cameron
On Tue, 27 Aug 2024, Keith Busch wrote:
>diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
>index 55c8536860518..e41dfece0d969 100644
>--- a/drivers/pci/bus.c
>+++ b/drivers/pci/bus.c
>@@ -348,7 +348,7 @@ void pci_bus_add_device(struct pci_dev *dev)
> if (retval < 0 && retval != -EPROBE_DEFER)
> pci_warn(dev, "device attach failed (%d)\n", retval);
>
>- pci_dev_assign_added(dev, true);
>+ pci_dev_assign_added(dev);
>
> if (dev_of_node(&dev->dev) && pci_is_bridge(dev)) {
> retval = of_platform_populate(dev_of_node(&dev->dev), NULL, NULL,
>diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>index 0e9b1c7b94a5a..802f7eac53115 100644
>--- a/drivers/pci/pci.h
>+++ b/drivers/pci/pci.h
>@@ -444,9 +444,14 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
> #define PCI_DPC_RECOVERED 1
> #define PCI_DPC_RECOVERING 2
>
>-static inline void pci_dev_assign_added(struct pci_dev *dev, bool added)
>+static inline void pci_dev_assign_added(struct pci_dev *dev)
> {
>- assign_bit(PCI_DEV_ADDED, &dev->priv_flags, added);
>+ set_bit(PCI_DEV_ADDED, &dev->priv_flags);
>+}
So set_bit does not imply any barriers. Does this matter in the future
when breaking up pci_rescan_remove_lock? For example, what prevents
things like:
pci_bus_add_device() pci_stop_dev()
pci_dev_assign_added()
dev->priv_flags [S]
pci_dev_test_and_clear_added() // true
dev->priv_flags [L]
device_attach(&dev->dev)
device_release_driver(&dev->dev)
... I guess that implied barrier from that device_lock() in device_attach().
I am not familiar with this code, but overall I think any locking rework should
explain more about the ordering implications in the changes if the end result
is finer grained locking.
Thanks,
Davidlohr
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [PATCHv2 1/5] pci: make pci_stop_dev concurrent safe
2024-10-02 23:39 ` Davidlohr Bueso
@ 2024-10-03 0:04 ` Keith Busch
0 siblings, 0 replies; 19+ messages in thread
From: Keith Busch @ 2024-10-03 0:04 UTC (permalink / raw)
To: Davidlohr Bueso; +Cc: Keith Busch, linux-pci, bhelgaas, Jonathan Cameron
On Wed, Oct 02, 2024 at 04:39:37PM -0700, Davidlohr Bueso wrote:
> > + set_bit(PCI_DEV_ADDED, &dev->priv_flags);
> > +}
>
> So set_bit does not imply any barriers.
Huh. Hannes told me the same thing just last weak, and I was thinking
"nah, it's an atomic operation." But I'm mistaken thinking that provides
a memory barrier.
> Does this matter in the future when breaking up
> pci_rescan_remove_lock? For example, what prevents things like:
We're still far from being able to remove the big pci rescan/remove
lock, but yes, that's the idea. This should be safe as-is since it is
still using that lock, I can add smp barriers to make the memroy
dependencies explicit.
> pci_bus_add_device() pci_stop_dev()
> pci_dev_assign_added()
> dev->priv_flags [S]
> pci_dev_test_and_clear_added() // true
> dev->priv_flags [L]
> device_attach(&dev->dev)
> device_release_driver(&dev->dev)
>
> ... I guess that implied barrier from that device_lock() in device_attach().
> I am not familiar with this code, but overall I think any locking rework should
> explain more about the ordering implications in the changes if the end result
Oh, goot point. This sequence shouldn't be possible with either the
existing or proposed bus locking, and I can certainly add more detailed
explanations.
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCHv2 2/5] pci: make pci_destroy_dev concurrent safe
2024-08-27 19:28 [PATCHv2 0/5] pci cleanup/prep patches Keith Busch
2024-08-27 19:28 ` [PATCHv2 1/5] pci: make pci_stop_dev concurrent safe Keith Busch
@ 2024-08-27 19:28 ` Keith Busch
2024-10-03 2:34 ` Davidlohr Bueso
2024-08-27 19:28 ` [PATCHv2 3/5] pci: move the walk bus lock to where its needed Keith Busch
` (2 subsequent siblings)
4 siblings, 1 reply; 19+ messages in thread
From: Keith Busch @ 2024-08-27 19:28 UTC (permalink / raw)
To: linux-pci, bhelgaas; +Cc: Keith Busch, Jonathan Cameron
From: Keith Busch <kbusch@kernel.org>
Use an atomic flag instead of the racey check against the device's kobj
parent. We shouldn't be poking into device implementation details at
this level anyway.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/pci/pci.h | 6 ++++++
drivers/pci/remove.c | 2 +-
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 802f7eac53115..a11a53f67a0ce 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -443,6 +443,7 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
#define PCI_DEV_ADDED 0
#define PCI_DPC_RECOVERED 1
#define PCI_DPC_RECOVERING 2
+#define PCI_DEV_REMOVED 3
static inline void pci_dev_assign_added(struct pci_dev *dev)
{
@@ -459,6 +460,11 @@ static inline bool pci_dev_is_added(const struct pci_dev *dev)
return test_bit(PCI_DEV_ADDED, &dev->priv_flags);
}
+static inline bool pci_dev_test_and_set_removed(struct pci_dev *dev)
+{
+ return test_and_set_bit(PCI_DEV_REMOVED, &dev->priv_flags);
+}
+
#ifdef CONFIG_PCIEAER
#include <linux/aer.h>
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index ec3064a115bf8..8284ab20949c9 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -29,7 +29,7 @@ static void pci_stop_dev(struct pci_dev *dev)
static void pci_destroy_dev(struct pci_dev *dev)
{
- if (!dev->dev.kobj.parent)
+ if (pci_dev_test_and_set_removed(dev))
return;
device_del(&dev->dev);
--
2.43.5
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [PATCHv2 2/5] pci: make pci_destroy_dev concurrent safe
2024-08-27 19:28 ` [PATCHv2 2/5] pci: make pci_destroy_dev " Keith Busch
@ 2024-10-03 2:34 ` Davidlohr Bueso
2024-10-03 14:54 ` Keith Busch
2024-10-22 20:29 ` Keith Busch
0 siblings, 2 replies; 19+ messages in thread
From: Davidlohr Bueso @ 2024-10-03 2:34 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-pci, bhelgaas, Keith Busch, Jonathan Cameron
On Tue, 27 Aug 2024, Keith Busch wrote:
>+static inline bool pci_dev_test_and_set_removed(struct pci_dev *dev)
>+{
>+ return test_and_set_bit(PCI_DEV_REMOVED, &dev->priv_flags);
>+}
Same ordering/dependency description observations as mentioned in
patch 1 (both these cases are fully ordered).
>+
> #ifdef CONFIG_PCIEAER
> #include <linux/aer.h>
>
>diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
>index ec3064a115bf8..8284ab20949c9 100644
>--- a/drivers/pci/remove.c
>+++ b/drivers/pci/remove.c
>@@ -29,7 +29,7 @@ static void pci_stop_dev(struct pci_dev *dev)
>
> static void pci_destroy_dev(struct pci_dev *dev)
> {
>- if (!dev->dev.kobj.parent)
>+ if (pci_dev_test_and_set_removed(dev))
Doesn't this want to be if (!pci_dev_test_and_set_removed()) ?
This also fixes a splat when triggering a removal when you add
subordinate refcounting is added:
https://git.kernel.org/pub/scm/linux/kernel/git/kbusch/linux.git/commit/?h=pci-bus-locking-2024-09-09&id=3883c485d5e45b5e17f685f77ff4020bec162336
fyi:
[ 22.739614] BUG: kernel NULL pointer dereference, address: 0000000000000028
[ 22.739910] #PF: supervisor read access in kernel mode
[ 22.740132] #PF: error_code(0x0000) - not-present page
[ 22.740351] PGD 0 P4D 0
[ 22.740468] Oops: Oops: 0000 [#1] PREEMPT SMP KASAN PTI
[ 22.740695] CPU: 0 UID: 0 PID: 266 Comm: bash Tainted: G B 6.11.0-rc1-g3883c485d5e4-dirty #13
[ 22.741111] Tainted: [B]=BAD_PAGE
[ 22.741258] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 22.741727] RIP: 0010:pcie_aspm_check_latency.isra.0+0x192/0x4d0
[ 22.741990] Code: 18 e8 e2 6f 6f ff 48 89 df e8 aa ad 92 ff 4c 8b 2b 49 8d 7d 18 e8 9e ad 92 ff 49 8b 6d 18 4c 8d 65 28 4c f
[ 22.743438] RSP: 0018:ffff88800554f970 EFLAGS: 00010282
[ 22.743673] RAX: 0000000000000001 RBX: ffff888001cc0c80 RCX: 0000000000000001
[ 22.743976] RDX: ffff88804c63ce00 RSI: 0000000000000000 RDI: 0000000000000007
[ 22.744285] RBP: 0000000000000000 R08: 0000000000000001 R09: fffffbfff7b7275c
[ 22.744596] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000028
[ 22.744906] R13: ffff888001f62000 R14: 0000000000000040 R15: 00000000000003e8
[ 22.745216] FS: 00007f14a687f740(0000) GS:ffff88806d200000(0000) knlGS:0000000000000000
[ 22.745565] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 22.745823] CR2: 0000000000000028 CR3: 000000004dc06000 CR4: 00000000000006f0
[ 22.746132] Call Trace:
[ 22.746246] <TASK>
[ 22.746346] ? show_regs+0x8c/0xa0
[ 22.746507] ? __die+0x2c/0x80
[ 22.746652] ? page_fault_oops+0x31a/0x830
[ 22.746839] ? __pfx_page_fault_oops+0x10/0x10
[ 22.747042] ? is_prefetch.constprop.0+0x9b/0x450
[ 22.747253] ? pcie_aspm_check_latency.isra.0+0x192/0x4d0
[ 22.747495] ? __pfx_is_prefetch.constprop.0+0x10/0x10
[ 22.747725] ? pcie_aspm_check_latency.isra.0+0x192/0x4d0
[ 22.747966] ? search_module_extables+0x93/0xc0
[ 22.748173] ? fixup_exception+0xd7/0x560
[ 22.748358] ? kernelmode_fixup_or_oops.constprop.0+0x9c/0xc0
[ 22.748613] ? __bad_area_nosemaphore+0x2f8/0x420
[ 22.748826] ? lock_mm_and_find_vma+0x90/0x4f0
[ 22.749028] ? do_user_addr_fault+0x58a/0xc80
[ 22.749226] ? rcu_is_watching+0x20/0x50
[ 22.749407] ? exc_page_fault+0x5c/0xd0
[ 22.749586] ? asm_exc_page_fault+0x26/0x30
[ 22.749776] ? pcie_aspm_check_latency.isra.0+0x192/0x4d0
[ 22.750020] ? __pfx_pcie_aspm_check_latency.isra.0+0x10/0x10
[ 22.750278] ? mark_held_locks+0x65/0x90
[ 22.750457] ? kobject_get+0x95/0x110
[ 22.750629] pcie_update_aspm_capable+0x128/0x1c0
[ 22.750843] pcie_aspm_exit_link_state+0x137/0x1e0
[ 22.751059] pci_remove_bus_device+0x15b/0x200
[ 22.751260] pci_remove_bus+0x4a/0x130
[ 22.751432] pci_remove_bus_device+0x88/0x200
[ 22.751631] pci_remove_bus+0x4a/0x130
[ 22.751802] pci_remove_bus_device+0x88/0x200
[ 22.752000] pci_remove_bus+0x4a/0x130
[ 22.752172] pci_remove_bus_device+0x88/0x200
[ 22.752370] pci_stop_and_remove_bus_device_locked+0x22/0x30
[ 22.752622] remove_store+0x125/0x140
[ 22.752791] ? __pfx_remove_store+0x10/0x10
[ 22.752981] ? __pfx___mutex_lock+0x10/0x10
[ 22.753170] ? __pfx__copy_from_iter+0x10/0x10
[ 22.753372] ? __pfx_remove_store+0x10/0x10
[ 22.753562] dev_attr_store+0x46/0x70
[ 22.753752] ? __pfx_dev_attr_store+0x10/0x10
[ 22.753965] sysfs_kf_write+0xa0/0xc0
[ 22.754142] kernfs_fop_write_iter+0x23d/0x300
[ 22.754346] ? __pfx_sysfs_kf_write+0x10/0x10
[ 22.754549] vfs_write+0x508/0xa90
[ 22.754709] ? __pfx_kernfs_fop_write_iter+0x10/0x10
[ 22.754935] ? __pfx_vfs_write+0x10/0x10
[ 22.755118] ? __fget_light+0xcd/0x120
[ 22.755295] ksys_write+0x108/0x200
[ 22.755458] ? __pfx_ksys_write+0x10/0x10
[ 22.755644] ? mark_held_locks+0x24/0x90
[ 22.755826] do_syscall_64+0xc1/0x1d0
[ 22.755998] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 22.756229] RIP: 0033:0x7f14a697a240
[ 22.756394] Code: 40 00 48 8b 15 c1 9b 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 80 3d a1 23 0e 00 00 74 17 9
[ 22.757184] RSP: 002b:00007ffdc41b64e8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 22.757508] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f14a697a240
[ 22.757813] RDX: 0000000000000002 RSI: 0000558b25aeda50 RDI: 0000000000000001
[ 22.758117] RBP: 0000558b25aeda50 R08: 00007f14a6a54d90 R09: 00007f14a6a54d90
[ 22.758422] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000002
[ 22.758726] R13: 00007f14a6a55760 R14: 0000000000000002 R15: 00007f14a6a509e0
[ 22.759039] </TASK>
> return;
>
> device_del(&dev->dev);
>--
>2.43.5
>
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [PATCHv2 2/5] pci: make pci_destroy_dev concurrent safe
2024-10-03 2:34 ` Davidlohr Bueso
@ 2024-10-03 14:54 ` Keith Busch
2024-10-03 17:04 ` Davidlohr Bueso
2024-10-22 20:29 ` Keith Busch
1 sibling, 1 reply; 19+ messages in thread
From: Keith Busch @ 2024-10-03 14:54 UTC (permalink / raw)
To: Davidlohr Bueso; +Cc: Keith Busch, linux-pci, bhelgaas, Jonathan Cameron
On Wed, Oct 02, 2024 at 07:34:13PM -0700, Davidlohr Bueso wrote:
> On Tue, 27 Aug 2024, Keith Busch wrote:
> > static void pci_destroy_dev(struct pci_dev *dev)
> > {
> > - if (!dev->dev.kobj.parent)
> > + if (pci_dev_test_and_set_removed(dev))
>
> Doesn't this want to be if (!pci_dev_test_and_set_removed()) ?
No, this function returns the previous value of the REMOVED flag. If
it's already set, then another thread already removed this device.
> This also fixes a splat when triggering a removal when you add
> subordinate refcounting is added:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/kbusch/linux.git/commit/?h=pci-bus-locking-2024-09-09&id=3883c485d5e45b5e17f685f77ff4020bec162336
Oh, that's pretty neat. Thanks for testing that! I think that reference
counting patch is pretty safe too. I can rebase the series with that one
included next time.
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [PATCHv2 2/5] pci: make pci_destroy_dev concurrent safe
2024-10-03 14:54 ` Keith Busch
@ 2024-10-03 17:04 ` Davidlohr Bueso
2024-10-03 17:59 ` Keith Busch
0 siblings, 1 reply; 19+ messages in thread
From: Davidlohr Bueso @ 2024-10-03 17:04 UTC (permalink / raw)
To: Keith Busch; +Cc: Keith Busch, linux-pci, bhelgaas, Jonathan Cameron
On Thu, 03 Oct 2024, Keith Busch wrote:
>Oh, that's pretty neat. Thanks for testing that! I think that reference
>counting patch is pretty safe too. I can rebase the series with that one
>included next time.
Well the thing is it is crashing on me, as reported. I was able to reproduce
the deadlock with rescan_remove_lock on a switched CXL topology (via sysfs
remove/rescan parent/child). This is why I tested your full pci-bus-locking-2024-09-09
branch hoping the last patch would fix it, but still cannot confirm because
of that nil ptr deref.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCHv2 2/5] pci: make pci_destroy_dev concurrent safe
2024-10-03 17:04 ` Davidlohr Bueso
@ 2024-10-03 17:59 ` Keith Busch
2024-10-08 2:15 ` Davidlohr Bueso
0 siblings, 1 reply; 19+ messages in thread
From: Keith Busch @ 2024-10-03 17:59 UTC (permalink / raw)
To: Davidlohr Bueso; +Cc: Keith Busch, linux-pci, bhelgaas, Jonathan Cameron
On Thu, Oct 03, 2024 at 10:04:16AM -0700, Davidlohr Bueso wrote:
> Well the thing is it is crashing on me, as reported. I was able to reproduce
> the deadlock with rescan_remove_lock on a switched CXL topology (via sysfs
> remove/rescan parent/child). This is why I tested your full pci-bus-locking-2024-09-09
> branch hoping the last patch would fix it, but still cannot confirm because
> of that nil ptr deref.
You might need something different than anything in that series if you
are doing concurrent sysfs access during removal. This is what I wrote
up for that issue:
https://lore.kernel.org/linux-pci/20240719185513.3376321-1-kbusch@meta.com/T/#u
The last reply from GregKH was against this proposal, but I briefly
spoke with him a couple weeks ago and I think he may accept this if I
resubmit.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCHv2 2/5] pci: make pci_destroy_dev concurrent safe
2024-10-03 17:59 ` Keith Busch
@ 2024-10-08 2:15 ` Davidlohr Bueso
0 siblings, 0 replies; 19+ messages in thread
From: Davidlohr Bueso @ 2024-10-08 2:15 UTC (permalink / raw)
To: Keith Busch; +Cc: Keith Busch, linux-pci, bhelgaas, Jonathan Cameron
On Thu, 03 Oct 2024, Keith Busch wrote:
>On Thu, Oct 03, 2024 at 10:04:16AM -0700, Davidlohr Bueso wrote:
>> Well the thing is it is crashing on me, as reported. I was able to reproduce
>> the deadlock with rescan_remove_lock on a switched CXL topology (via sysfs
>> remove/rescan parent/child). This is why I tested your full pci-bus-locking-2024-09-09
>> branch hoping the last patch would fix it, but still cannot confirm because
>> of that nil ptr deref.
>
>You might need something different than anything in that series if you
>are doing concurrent sysfs access during removal. This is what I wrote
>up for that issue:
The crash is *only* happening in your branch. The test runs into a deadlock
on a vanilla kernel.
>
> https://lore.kernel.org/linux-pci/20240719185513.3376321-1-kbusch@meta.com/T/#u
This does not address either, fyi.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCHv2 2/5] pci: make pci_destroy_dev concurrent safe
2024-10-03 2:34 ` Davidlohr Bueso
2024-10-03 14:54 ` Keith Busch
@ 2024-10-22 20:29 ` Keith Busch
1 sibling, 0 replies; 19+ messages in thread
From: Keith Busch @ 2024-10-22 20:29 UTC (permalink / raw)
To: Davidlohr Bueso; +Cc: Keith Busch, linux-pci, bhelgaas, Jonathan Cameron
On Wed, Oct 02, 2024 at 07:34:13PM -0700, Davidlohr Bueso wrote:
> On Tue, 27 Aug 2024, Keith Busch wrote:
>
> > +static inline bool pci_dev_test_and_set_removed(struct pci_dev *dev)
> > +{
> > + return test_and_set_bit(PCI_DEV_REMOVED, &dev->priv_flags);
> > +}
>
> Same ordering/dependency description observations as mentioned in
> patch 1 (both these cases are fully ordered).
Just rebasing everything so late reply here.
test_and_set_bit already has a memory barrier. It's the "set_bit" that
doesn't, but set_bit is not used for this new flag. This new flag only
indicates the device is being removed, so it's only set once before the
device is deleted. It's never accessed outside this path, and it's
safe compared to looking at the kobj parent.
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCHv2 3/5] pci: move the walk bus lock to where its needed
2024-08-27 19:28 [PATCHv2 0/5] pci cleanup/prep patches Keith Busch
2024-08-27 19:28 ` [PATCHv2 1/5] pci: make pci_stop_dev concurrent safe Keith Busch
2024-08-27 19:28 ` [PATCHv2 2/5] pci: make pci_destroy_dev " Keith Busch
@ 2024-08-27 19:28 ` Keith Busch
2024-10-03 0:32 ` Davidlohr Bueso
2024-10-09 11:09 ` Ilpo Järvinen
2024-08-27 19:28 ` [PATCHv2 4/5] pci: walk bus recursively Keith Busch
2024-08-27 19:28 ` [PATCHv2 5/5] pci: unexport pci_walk_bus_locked Keith Busch
4 siblings, 2 replies; 19+ messages in thread
From: Keith Busch @ 2024-08-27 19:28 UTC (permalink / raw)
To: linux-pci, bhelgaas; +Cc: Keith Busch, Jonathan Cameron
From: Keith Busch <kbusch@kernel.org>
Simplify the common function by removing an unnecessary parameter that
can be more easily handled in the only caller that wants it.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/pci/bus.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index e41dfece0d969..7c07a141e8772 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -390,7 +390,7 @@ void pci_bus_add_devices(const struct pci_bus *bus)
EXPORT_SYMBOL(pci_bus_add_devices);
static void __pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
- void *userdata, bool locked)
+ void *userdata)
{
struct pci_dev *dev;
struct pci_bus *bus;
@@ -398,8 +398,6 @@ static void __pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void
int retval;
bus = top;
- if (!locked)
- down_read(&pci_bus_sem);
next = top->devices.next;
for (;;) {
if (next == &bus->devices) {
@@ -422,8 +420,6 @@ static void __pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void
if (retval)
break;
}
- if (!locked)
- up_read(&pci_bus_sem);
}
/**
@@ -441,7 +437,9 @@ static void __pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void
*/
void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *), void *userdata)
{
- __pci_walk_bus(top, cb, userdata, false);
+ down_read(&pci_bus_sem);
+ __pci_walk_bus(top, cb, userdata);
+ up_read(&pci_bus_sem);
}
EXPORT_SYMBOL_GPL(pci_walk_bus);
@@ -449,7 +447,7 @@ void pci_walk_bus_locked(struct pci_bus *top, int (*cb)(struct pci_dev *, void *
{
lockdep_assert_held(&pci_bus_sem);
- __pci_walk_bus(top, cb, userdata, true);
+ __pci_walk_bus(top, cb, userdata);
}
EXPORT_SYMBOL_GPL(pci_walk_bus_locked);
--
2.43.5
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [PATCHv2 3/5] pci: move the walk bus lock to where its needed
2024-08-27 19:28 ` [PATCHv2 3/5] pci: move the walk bus lock to where its needed Keith Busch
@ 2024-10-03 0:32 ` Davidlohr Bueso
2024-10-09 11:09 ` Ilpo Järvinen
1 sibling, 0 replies; 19+ messages in thread
From: Davidlohr Bueso @ 2024-10-03 0:32 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-pci, bhelgaas, Keith Busch, Jonathan Cameron
On Tue, 27 Aug 2024, Keith Busch wrote:
>From: Keith Busch <kbusch@kernel.org>
>
>Simplify the common function by removing an unnecessary parameter that
>can be more easily handled in the only caller that wants it.
>
>Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>Signed-off-by: Keith Busch <kbusch@kernel.org>
Nice cleanup.
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCHv2 3/5] pci: move the walk bus lock to where its needed
2024-08-27 19:28 ` [PATCHv2 3/5] pci: move the walk bus lock to where its needed Keith Busch
2024-10-03 0:32 ` Davidlohr Bueso
@ 2024-10-09 11:09 ` Ilpo Järvinen
1 sibling, 0 replies; 19+ messages in thread
From: Ilpo Järvinen @ 2024-10-09 11:09 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-pci, bhelgaas, Keith Busch, Jonathan Cameron
[-- Attachment #1: Type: text/plain, Size: 2196 bytes --]
On Tue, 27 Aug 2024, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> Simplify the common function by removing an unnecessary parameter that
> can be more easily handled in the only caller that wants it.
>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
--
i.
> ---
> drivers/pci/bus.c | 12 +++++-------
> 1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index e41dfece0d969..7c07a141e8772 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -390,7 +390,7 @@ void pci_bus_add_devices(const struct pci_bus *bus)
> EXPORT_SYMBOL(pci_bus_add_devices);
>
> static void __pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
> - void *userdata, bool locked)
> + void *userdata)
> {
> struct pci_dev *dev;
> struct pci_bus *bus;
> @@ -398,8 +398,6 @@ static void __pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void
> int retval;
>
> bus = top;
> - if (!locked)
> - down_read(&pci_bus_sem);
> next = top->devices.next;
> for (;;) {
> if (next == &bus->devices) {
> @@ -422,8 +420,6 @@ static void __pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void
> if (retval)
> break;
> }
> - if (!locked)
> - up_read(&pci_bus_sem);
> }
>
> /**
> @@ -441,7 +437,9 @@ static void __pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void
> */
> void pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *), void *userdata)
> {
> - __pci_walk_bus(top, cb, userdata, false);
> + down_read(&pci_bus_sem);
> + __pci_walk_bus(top, cb, userdata);
> + up_read(&pci_bus_sem);
> }
> EXPORT_SYMBOL_GPL(pci_walk_bus);
>
> @@ -449,7 +447,7 @@ void pci_walk_bus_locked(struct pci_bus *top, int (*cb)(struct pci_dev *, void *
> {
> lockdep_assert_held(&pci_bus_sem);
>
> - __pci_walk_bus(top, cb, userdata, true);
> + __pci_walk_bus(top, cb, userdata);
> }
> EXPORT_SYMBOL_GPL(pci_walk_bus_locked);
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCHv2 4/5] pci: walk bus recursively
2024-08-27 19:28 [PATCHv2 0/5] pci cleanup/prep patches Keith Busch
` (2 preceding siblings ...)
2024-08-27 19:28 ` [PATCHv2 3/5] pci: move the walk bus lock to where its needed Keith Busch
@ 2024-08-27 19:28 ` Keith Busch
2024-10-09 12:08 ` Ilpo Järvinen
2024-08-27 19:28 ` [PATCHv2 5/5] pci: unexport pci_walk_bus_locked Keith Busch
4 siblings, 1 reply; 19+ messages in thread
From: Keith Busch @ 2024-08-27 19:28 UTC (permalink / raw)
To: linux-pci, bhelgaas; +Cc: Keith Busch, Jonathan Cameron
From: Keith Busch <kbusch@kernel.org>
The original implementation purposefully chose a non-recursive walk,
presumably as a precaution on stack use. We do recursive bus walking in
other places though. For example:
pci_bus_resettable
pci_stop_bus_device
pci_remove_bus_device
pci_bus_allocate_dev_resources
So, recursive pci bus walking is well tested and safe, and the
implementation is easier to follow. The motivation for changing it now
is to make it easier to introduce finer grain locking in the future.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/pci/bus.c | 36 +++++++++++-------------------------
1 file changed, 11 insertions(+), 25 deletions(-)
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 7c07a141e8772..8491e9c7f0586 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -389,37 +389,23 @@ void pci_bus_add_devices(const struct pci_bus *bus)
}
EXPORT_SYMBOL(pci_bus_add_devices);
-static void __pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
- void *userdata)
+static int __pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
+ void *userdata)
{
struct pci_dev *dev;
- struct pci_bus *bus;
- struct list_head *next;
- int retval;
+ int ret = 0;
- bus = top;
- next = top->devices.next;
- for (;;) {
- if (next == &bus->devices) {
- /* end of this bus, go up or finish */
- if (bus == top)
+ list_for_each_entry(dev, &top->devices, bus_list) {
+ ret = cb(dev, userdata);
+ if (ret)
+ break;
+ if (dev->subordinate) {
+ ret = __pci_walk_bus(dev->subordinate, cb, userdata);
+ if (ret)
break;
- next = bus->self->bus_list.next;
- bus = bus->self->bus;
- continue;
}
- dev = list_entry(next, struct pci_dev, bus_list);
- if (dev->subordinate) {
- /* this is a pci-pci bridge, do its devices next */
- next = dev->subordinate->devices.next;
- bus = dev->subordinate;
- } else
- next = dev->bus_list.next;
-
- retval = cb(dev, userdata);
- if (retval)
- break;
}
+ return ret;
}
/**
--
2.43.5
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [PATCHv2 4/5] pci: walk bus recursively
2024-08-27 19:28 ` [PATCHv2 4/5] pci: walk bus recursively Keith Busch
@ 2024-10-09 12:08 ` Ilpo Järvinen
0 siblings, 0 replies; 19+ messages in thread
From: Ilpo Järvinen @ 2024-10-09 12:08 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-pci, bhelgaas, Keith Busch, Jonathan Cameron
[-- Attachment #1: Type: text/plain, Size: 2597 bytes --]
On Tue, 27 Aug 2024, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> The original implementation purposefully chose a non-recursive walk,
> presumably as a precaution on stack use. We do recursive bus walking in
> other places though. For example:
>
> pci_bus_resettable
> pci_stop_bus_device
> pci_remove_bus_device
> pci_bus_allocate_dev_resources
>
> So, recursive pci bus walking is well tested and safe, and the
> implementation is easier to follow. The motivation for changing it now
> is to make it easier to introduce finer grain locking in the future.
>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> drivers/pci/bus.c | 36 +++++++++++-------------------------
> 1 file changed, 11 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index 7c07a141e8772..8491e9c7f0586 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -389,37 +389,23 @@ void pci_bus_add_devices(const struct pci_bus *bus)
> }
> EXPORT_SYMBOL(pci_bus_add_devices);
>
> -static void __pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
> - void *userdata)
> +static int __pci_walk_bus(struct pci_bus *top, int (*cb)(struct pci_dev *, void *),
I guess the reason why this parameter was called "top" is now gone so you
could just name it "bus"?
Other than that, the change looks okay,
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
--
i.
> + void *userdata)
> {
> struct pci_dev *dev;
> - struct pci_bus *bus;
> - struct list_head *next;
> - int retval;
> + int ret = 0;
>
> - bus = top;
> - next = top->devices.next;
> - for (;;) {
> - if (next == &bus->devices) {
> - /* end of this bus, go up or finish */
> - if (bus == top)
> + list_for_each_entry(dev, &top->devices, bus_list) {
> + ret = cb(dev, userdata);
> + if (ret)
> + break;
> + if (dev->subordinate) {
> + ret = __pci_walk_bus(dev->subordinate, cb, userdata);
> + if (ret)
> break;
> - next = bus->self->bus_list.next;
> - bus = bus->self->bus;
> - continue;
> }
> - dev = list_entry(next, struct pci_dev, bus_list);
> - if (dev->subordinate) {
> - /* this is a pci-pci bridge, do its devices next */
> - next = dev->subordinate->devices.next;
> - bus = dev->subordinate;
> - } else
> - next = dev->bus_list.next;
> -
> - retval = cb(dev, userdata);
> - if (retval)
> - break;
> }
> + return ret;
> }
>
> /**
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCHv2 5/5] pci: unexport pci_walk_bus_locked
2024-08-27 19:28 [PATCHv2 0/5] pci cleanup/prep patches Keith Busch
` (3 preceding siblings ...)
2024-08-27 19:28 ` [PATCHv2 4/5] pci: walk bus recursively Keith Busch
@ 2024-08-27 19:28 ` Keith Busch
2024-10-03 0:35 ` Davidlohr Bueso
2024-10-09 12:20 ` Ilpo Järvinen
4 siblings, 2 replies; 19+ messages in thread
From: Keith Busch @ 2024-08-27 19:28 UTC (permalink / raw)
To: linux-pci, bhelgaas; +Cc: Keith Busch, Jonathan Cameron
From: Keith Busch <kbusch@kernel.org>
There's only one user of this, and it's internal, so no need to export
it.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/pci/bus.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 8491e9c7f0586..30620f3bb0e2d 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -435,7 +435,6 @@ void pci_walk_bus_locked(struct pci_bus *top, int (*cb)(struct pci_dev *, void *
__pci_walk_bus(top, cb, userdata);
}
-EXPORT_SYMBOL_GPL(pci_walk_bus_locked);
struct pci_bus *pci_bus_get(struct pci_bus *bus)
{
--
2.43.5
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [PATCHv2 5/5] pci: unexport pci_walk_bus_locked
2024-08-27 19:28 ` [PATCHv2 5/5] pci: unexport pci_walk_bus_locked Keith Busch
@ 2024-10-03 0:35 ` Davidlohr Bueso
2024-10-09 12:20 ` Ilpo Järvinen
1 sibling, 0 replies; 19+ messages in thread
From: Davidlohr Bueso @ 2024-10-03 0:35 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-pci, bhelgaas, Keith Busch, Jonathan Cameron
On Tue, 27 Aug 2024, Keith Busch wrote:
>From: Keith Busch <kbusch@kernel.org>
>
>There's only one user of this, and it's internal, so no need to export
>it.
>
>Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCHv2 5/5] pci: unexport pci_walk_bus_locked
2024-08-27 19:28 ` [PATCHv2 5/5] pci: unexport pci_walk_bus_locked Keith Busch
2024-10-03 0:35 ` Davidlohr Bueso
@ 2024-10-09 12:20 ` Ilpo Järvinen
1 sibling, 0 replies; 19+ messages in thread
From: Ilpo Järvinen @ 2024-10-09 12:20 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-pci, bhelgaas, Keith Busch, Jonathan Cameron
[-- Attachment #1: Type: text/plain, Size: 785 bytes --]
On Tue, 27 Aug 2024, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> There's only one user of this, and it's internal, so no need to export
> it.
>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> drivers/pci/bus.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index 8491e9c7f0586..30620f3bb0e2d 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -435,7 +435,6 @@ void pci_walk_bus_locked(struct pci_bus *top, int (*cb)(struct pci_dev *, void *
>
> __pci_walk_bus(top, cb, userdata);
> }
> -EXPORT_SYMBOL_GPL(pci_walk_bus_locked);
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
--
i.
^ permalink raw reply [flat|nested] 19+ messages in thread