* [PATCH] pci/aer_inject: switching inject_lock to raw_spinlock_t
@ 2025-10-07 6:02 Guangbo Cui
2025-10-07 15:13 ` Waiman Long
0 siblings, 1 reply; 5+ messages in thread
From: Guangbo Cui @ 2025-10-07 6:02 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng,
Thomas Gleixner, Bjorn Helgaas
Cc: linux-rt-devel, Waiman Long, linux-kernel, linux-pci, Guangbo Cui
When injecting AER errors under PREEMPT_RT, the kernel may trigger a
lockdep warning about an invalid wait context:
```
[ 1850.950780] [ BUG: Invalid wait context ]
[ 1850.951152] 6.17.0-11316-g7a405dbb0f03-dirty #7 Not tainted
[ 1850.951457] -----------------------------
[ 1850.951680] irq/16-PCIe PME/56 is trying to lock:
[ 1850.952004] ffff800082865238 (inject_lock){+.+.}-{3:3}, at: aer_inj_read_config+0x38/0x1dc
[ 1850.952731] other info that might help us debug this:
[ 1850.952997] context-{5:5}
[ 1850.953192] 5 locks held by irq/16-PCIe PME/56:
[ 1850.953415] #0: ffff800082647390 (local_bh){.+.+}-{1:3}, at: __local_bh_disable_ip+0x30/0x268
[ 1850.953931] #1: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
[ 1850.954453] #2: ffff000004bb6c58 (&data->lock){+...}-{3:3}, at: pcie_pme_irq+0x34/0xc4
[ 1850.954949] #3: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
[ 1850.955420] #4: ffff800082863d10 (pci_lock){....}-{2:2}, at: pci_bus_read_config_dword+0x5c/0xd8
```
This happens because the AER injection path (`aer_inj_read_config()`)
is called in the context of the PCIe PME interrupt thread, which runs
through `irq_forced_thread_fn()` under PREEMPT_RT. In this context,
`pci_lock` (a raw_spinlock_t) is held with interrupts disabled
(`spin_lock_irqsave()`), and then `aer_inj_read_config()` tries to
acquire `inject_lock`, which is a `rt_spin_lock`. (Thanks Waiman Long)
`rt_spin_lock` may sleep, so acquiring it while holding a raw spinlock
with IRQs disabled violates the lock ordering rules. This leads to
the “Invalid wait context” lockdep warning.
In other words, the lock order looks like this:
```
raw_spin_lock_irqsave(&pci_lock);
↓
rt_spin_lock(&inject_lock); <-- not allowed
```
To fix this, convert `inject_lock` from an `rt_spin_lock` to a
`raw_spinlock_t`, a raw spinlock is safe and consistent with the
surrounding locking scheme.
This resolves the lockdep “Invalid wait context” warning observed when
injecting correctable AER errors through `/dev/aer_inject` on PREEMPT_RT.
This was discovered while testing PCIe AER error injection on an arm64
QEMU virtual machine:
```
qemu-system-aarch64 \
-nographic \
-machine virt,highmem=off,gic-version=3 \
-cpu cortex-a72 \
-kernel arch/arm64/boot/Image \
-initrd initramfs.cpio.gz \
-append "console=ttyAMA0 root=/dev/ram rdinit=/linuxrc nokaslr" \
-m 2G \
-smp 1 \
-netdev user,id=net0,hostfwd=tcp::2223-:22 \
-device virtio-net-pci,netdev=net0 \
-device pcie-root-port,id=rp0,chassis=1,slot=0x0 \
-device pci-testdev -s -S
```
Injecting a correctable PCIe error via /dev/aer_inject caused a BUG
report with "Invalid wait context" in the irq/PCIe thread.
```
~ # export HEX="00020000000000000100000000000000000000000000000000000000"
~ # echo -n "$HEX" | xxd -r -p | tee /dev/aer_inject >/dev/null
[ 1850.947170] pcieport 0000:00:02.0: aer_inject: Injecting errors 00000001/00000000 into device 0000:00:02.0
[ 1850.949951]
[ 1850.950479] =============================
[ 1850.950780] [ BUG: Invalid wait context ]
[ 1850.951152] 6.17.0-11316-g7a405dbb0f03-dirty #7 Not tainted
[ 1850.951457] -----------------------------
[ 1850.951680] irq/16-PCIe PME/56 is trying to lock:
[ 1850.952004] ffff800082865238 (inject_lock){+.+.}-{3:3}, at: aer_inj_read_config+0x38/0x1dc
[ 1850.952731] other info that might help us debug this:
[ 1850.952997] context-{5:5}
[ 1850.953192] 5 locks held by irq/16-PCIe PME/56:
[ 1850.953415] #0: ffff800082647390 (local_bh){.+.+}-{1:3}, at: __local_bh_disable_ip+0x30/0x268
[ 1850.953931] #1: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
[ 1850.954453] #2: ffff000004bb6c58 (&data->lock){+...}-{3:3}, at: pcie_pme_irq+0x34/0xc4
[ 1850.954949] #3: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
[ 1850.955420] #4: ffff800082863d10 (pci_lock){....}-{2:2}, at: pci_bus_read_config_dword+0x5c/0xd8
[ 1850.955932] stack backtrace:
[ 1850.956412] CPU: 0 UID: 0 PID: 56 Comm: irq/16-PCIe PME Not tainted 6.17.0-11316-g7a405dbb0f03-dirty #7 PREEMPT_{RT,(full)}
[ 1850.957039] Hardware name: linux,dummy-virt (DT)
[ 1850.957409] Call trace:
[ 1850.957727] show_stack+0x18/0x24 (C)
[ 1850.958089] dump_stack_lvl+0x40/0xbc
[ 1850.958339] dump_stack+0x18/0x24
[ 1850.958586] __lock_acquire+0xa84/0x3008
[ 1850.958907] lock_acquire+0x128/0x2a8
[ 1850.959171] rt_spin_lock+0x50/0x1b8
[ 1850.959476] aer_inj_read_config+0x38/0x1dc
[ 1850.959821] pci_bus_read_config_dword+0x80/0xd8
[ 1850.960079] pcie_capability_read_dword+0xac/0xd8
[ 1850.960454] pcie_pme_irq+0x44/0xc4
[ 1850.960728] irq_forced_thread_fn+0x30/0x94
[ 1850.960984] irq_thread+0x1ac/0x3a4
[ 1850.961308] kthread+0x1b4/0x208
[ 1850.961557] ret_from_fork+0x10/0x20
[ 1850.963088] pcieport 0000:00:02.0: AER: Correctable error message received from 0000:00:02.0
[ 1850.963330] pcieport 0000:00:02.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
[ 1850.963351] pcieport 0000:00:02.0: device [1b36:000c] error status/mask=00000001/0000e000
[ 1850.963385] pcieport 0000:00:02.0: [ 0] RxErr (First)
```
Signed-off-by: Guangbo Cui <jckeep.cuiguangbo@gmail.com>
---
drivers/pci/pcie/aer_inject.c | 32 +++++++++++---------------------
1 file changed, 11 insertions(+), 21 deletions(-)
diff --git a/drivers/pci/pcie/aer_inject.c b/drivers/pci/pcie/aer_inject.c
index 91acc7b17f68..e4c9d08c1657 100644
--- a/drivers/pci/pcie/aer_inject.c
+++ b/drivers/pci/pcie/aer_inject.c
@@ -72,7 +72,7 @@ static LIST_HEAD(einjected);
static LIST_HEAD(pci_bus_ops_list);
/* Protect einjected and pci_bus_ops_list */
-static DEFINE_SPINLOCK(inject_lock);
+static DEFINE_RAW_SPINLOCK(inject_lock);
static void aer_error_init(struct aer_error *err, u32 domain,
unsigned int bus, unsigned int devfn,
@@ -123,15 +123,13 @@ static struct pci_ops *__find_pci_bus_ops(struct pci_bus *bus)
static struct pci_bus_ops *pci_bus_ops_pop(void)
{
- unsigned long flags;
struct pci_bus_ops *bus_ops;
- spin_lock_irqsave(&inject_lock, flags);
+ guard(raw_spinlock_irqsave)(&inject_lock);
bus_ops = list_first_entry_or_null(&pci_bus_ops_list,
struct pci_bus_ops, list);
if (bus_ops)
list_del(&bus_ops->list);
- spin_unlock_irqrestore(&inject_lock, flags);
return bus_ops;
}
@@ -219,11 +217,10 @@ static int aer_inj_read_config(struct pci_bus *bus, unsigned int devfn,
{
u32 *sim;
struct aer_error *err;
- unsigned long flags;
int domain;
int rv;
- spin_lock_irqsave(&inject_lock, flags);
+ guard(raw_spinlock_irqsave)(&inject_lock);
if (size != sizeof(u32))
goto out;
domain = pci_domain_nr(bus);
@@ -236,12 +233,10 @@ static int aer_inj_read_config(struct pci_bus *bus, unsigned int devfn,
sim = find_pci_config_dword(err, where, NULL);
if (sim) {
*val = *sim;
- spin_unlock_irqrestore(&inject_lock, flags);
return 0;
}
out:
rv = aer_inj_read(bus, devfn, where, size, val);
- spin_unlock_irqrestore(&inject_lock, flags);
return rv;
}
@@ -250,12 +245,11 @@ static int aer_inj_write_config(struct pci_bus *bus, unsigned int devfn,
{
u32 *sim;
struct aer_error *err;
- unsigned long flags;
int rw1cs;
int domain;
int rv;
- spin_lock_irqsave(&inject_lock, flags);
+ guard(raw_spinlock_irqsave)(&inject_lock);
if (size != sizeof(u32))
goto out;
domain = pci_domain_nr(bus);
@@ -271,12 +265,10 @@ static int aer_inj_write_config(struct pci_bus *bus, unsigned int devfn,
*sim ^= val;
else
*sim = val;
- spin_unlock_irqrestore(&inject_lock, flags);
return 0;
}
out:
rv = aer_inj_write(bus, devfn, where, size, val);
- spin_unlock_irqrestore(&inject_lock, flags);
return rv;
}
@@ -304,14 +296,14 @@ static int pci_bus_set_aer_ops(struct pci_bus *bus)
if (!bus_ops)
return -ENOMEM;
ops = pci_bus_set_ops(bus, &aer_inj_pci_ops);
- spin_lock_irqsave(&inject_lock, flags);
+ raw_spin_lock_irqsave(&inject_lock, flags);
if (ops == &aer_inj_pci_ops)
goto out;
pci_bus_ops_init(bus_ops, bus, ops);
list_add(&bus_ops->list, &pci_bus_ops_list);
bus_ops = NULL;
out:
- spin_unlock_irqrestore(&inject_lock, flags);
+ raw_spin_unlock_irqrestore(&inject_lock, flags);
kfree(bus_ops);
return 0;
}
@@ -383,7 +375,7 @@ static int aer_inject(struct aer_error_inj *einj)
uncor_mask);
}
- spin_lock_irqsave(&inject_lock, flags);
+ raw_spin_lock_irqsave(&inject_lock, flags);
err = __find_aer_error_by_dev(dev);
if (!err) {
@@ -404,14 +396,14 @@ static int aer_inject(struct aer_error_inj *einj)
!(einj->cor_status & ~cor_mask)) {
ret = -EINVAL;
pci_warn(dev, "The correctable error(s) is masked by device\n");
- spin_unlock_irqrestore(&inject_lock, flags);
+ raw_spin_unlock_irqrestore(&inject_lock, flags);
goto out_put;
}
if (!aer_mask_override && einj->uncor_status &&
!(einj->uncor_status & ~uncor_mask)) {
ret = -EINVAL;
pci_warn(dev, "The uncorrectable error(s) is masked by device\n");
- spin_unlock_irqrestore(&inject_lock, flags);
+ raw_spin_unlock_irqrestore(&inject_lock, flags);
goto out_put;
}
@@ -445,7 +437,7 @@ static int aer_inject(struct aer_error_inj *einj)
rperr->source_id &= 0x0000ffff;
rperr->source_id |= PCI_DEVID(einj->bus, devfn) << 16;
}
- spin_unlock_irqrestore(&inject_lock, flags);
+ raw_spin_unlock_irqrestore(&inject_lock, flags);
if (aer_mask_override) {
pci_write_config_dword(dev, pos_cap_err + PCI_ERR_COR_MASK,
@@ -523,7 +515,6 @@ static int __init aer_inject_init(void)
static void __exit aer_inject_exit(void)
{
struct aer_error *err, *err_next;
- unsigned long flags;
struct pci_bus_ops *bus_ops;
misc_deregister(&aer_inject_device);
@@ -533,12 +524,11 @@ static void __exit aer_inject_exit(void)
kfree(bus_ops);
}
- spin_lock_irqsave(&inject_lock, flags);
+ guard(raw_spinlock_irqsave)(&inject_lock);
list_for_each_entry_safe(err, err_next, &einjected, list) {
list_del(&err->list);
kfree(err);
}
- spin_unlock_irqrestore(&inject_lock, flags);
}
module_init(aer_inject_init);
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH] pci/aer_inject: switching inject_lock to raw_spinlock_t
2025-10-07 6:02 [PATCH] pci/aer_inject: switching inject_lock to raw_spinlock_t Guangbo Cui
@ 2025-10-07 15:13 ` Waiman Long
2025-10-07 16:10 ` guangbo cui
0 siblings, 1 reply; 5+ messages in thread
From: Waiman Long @ 2025-10-07 15:13 UTC (permalink / raw)
To: Guangbo Cui, Sebastian Andrzej Siewior, Clark Williams,
Steven Rostedt, Peter Zijlstra, Ingo Molnar, Will Deacon,
Boqun Feng, Thomas Gleixner, Bjorn Helgaas
Cc: linux-rt-devel, linux-kernel, linux-pci
On 10/7/25 2:02 AM, Guangbo Cui wrote:
> When injecting AER errors under PREEMPT_RT, the kernel may trigger a
> lockdep warning about an invalid wait context:
>
> ```
> [ 1850.950780] [ BUG: Invalid wait context ]
> [ 1850.951152] 6.17.0-11316-g7a405dbb0f03-dirty #7 Not tainted
> [ 1850.951457] -----------------------------
> [ 1850.951680] irq/16-PCIe PME/56 is trying to lock:
> [ 1850.952004] ffff800082865238 (inject_lock){+.+.}-{3:3}, at: aer_inj_read_config+0x38/0x1dc
> [ 1850.952731] other info that might help us debug this:
> [ 1850.952997] context-{5:5}
> [ 1850.953192] 5 locks held by irq/16-PCIe PME/56:
> [ 1850.953415] #0: ffff800082647390 (local_bh){.+.+}-{1:3}, at: __local_bh_disable_ip+0x30/0x268
> [ 1850.953931] #1: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
> [ 1850.954453] #2: ffff000004bb6c58 (&data->lock){+...}-{3:3}, at: pcie_pme_irq+0x34/0xc4
> [ 1850.954949] #3: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
> [ 1850.955420] #4: ffff800082863d10 (pci_lock){....}-{2:2}, at: pci_bus_read_config_dword+0x5c/0xd8
> ```
>
> This happens because the AER injection path (`aer_inj_read_config()`)
> is called in the context of the PCIe PME interrupt thread, which runs
> through `irq_forced_thread_fn()` under PREEMPT_RT. In this context,
> `pci_lock` (a raw_spinlock_t) is held with interrupts disabled
> (`spin_lock_irqsave()`), and then `aer_inj_read_config()` tries to
> acquire `inject_lock`, which is a `rt_spin_lock`. (Thanks Waiman Long)
>
> `rt_spin_lock` may sleep, so acquiring it while holding a raw spinlock
> with IRQs disabled violates the lock ordering rules. This leads to
> the “Invalid wait context” lockdep warning.
>
> In other words, the lock order looks like this:
>
> ```
> raw_spin_lock_irqsave(&pci_lock);
> ↓
> rt_spin_lock(&inject_lock); <-- not allowed
> ```
>
> To fix this, convert `inject_lock` from an `rt_spin_lock` to a
> `raw_spinlock_t`, a raw spinlock is safe and consistent with the
> surrounding locking scheme.
>
> This resolves the lockdep “Invalid wait context” warning observed when
> injecting correctable AER errors through `/dev/aer_inject` on PREEMPT_RT.
>
> This was discovered while testing PCIe AER error injection on an arm64
> QEMU virtual machine:
>
> ```
> qemu-system-aarch64 \
> -nographic \
> -machine virt,highmem=off,gic-version=3 \
> -cpu cortex-a72 \
> -kernel arch/arm64/boot/Image \
> -initrd initramfs.cpio.gz \
> -append "console=ttyAMA0 root=/dev/ram rdinit=/linuxrc nokaslr" \
> -m 2G \
> -smp 1 \
> -netdev user,id=net0,hostfwd=tcp::2223-:22 \
> -device virtio-net-pci,netdev=net0 \
> -device pcie-root-port,id=rp0,chassis=1,slot=0x0 \
> -device pci-testdev -s -S
> ```
>
> Injecting a correctable PCIe error via /dev/aer_inject caused a BUG
> report with "Invalid wait context" in the irq/PCIe thread.
>
> ```
> ~ # export HEX="00020000000000000100000000000000000000000000000000000000"
> ~ # echo -n "$HEX" | xxd -r -p | tee /dev/aer_inject >/dev/null
> [ 1850.947170] pcieport 0000:00:02.0: aer_inject: Injecting errors 00000001/00000000 into device 0000:00:02.0
> [ 1850.949951]
> [ 1850.950479] =============================
> [ 1850.950780] [ BUG: Invalid wait context ]
> [ 1850.951152] 6.17.0-11316-g7a405dbb0f03-dirty #7 Not tainted
> [ 1850.951457] -----------------------------
> [ 1850.951680] irq/16-PCIe PME/56 is trying to lock:
> [ 1850.952004] ffff800082865238 (inject_lock){+.+.}-{3:3}, at: aer_inj_read_config+0x38/0x1dc
> [ 1850.952731] other info that might help us debug this:
> [ 1850.952997] context-{5:5}
> [ 1850.953192] 5 locks held by irq/16-PCIe PME/56:
> [ 1850.953415] #0: ffff800082647390 (local_bh){.+.+}-{1:3}, at: __local_bh_disable_ip+0x30/0x268
> [ 1850.953931] #1: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
> [ 1850.954453] #2: ffff000004bb6c58 (&data->lock){+...}-{3:3}, at: pcie_pme_irq+0x34/0xc4
> [ 1850.954949] #3: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
> [ 1850.955420] #4: ffff800082863d10 (pci_lock){....}-{2:2}, at: pci_bus_read_config_dword+0x5c/0xd8
> [ 1850.955932] stack backtrace:
> [ 1850.956412] CPU: 0 UID: 0 PID: 56 Comm: irq/16-PCIe PME Not tainted 6.17.0-11316-g7a405dbb0f03-dirty #7 PREEMPT_{RT,(full)}
> [ 1850.957039] Hardware name: linux,dummy-virt (DT)
> [ 1850.957409] Call trace:
> [ 1850.957727] show_stack+0x18/0x24 (C)
> [ 1850.958089] dump_stack_lvl+0x40/0xbc
> [ 1850.958339] dump_stack+0x18/0x24
> [ 1850.958586] __lock_acquire+0xa84/0x3008
> [ 1850.958907] lock_acquire+0x128/0x2a8
> [ 1850.959171] rt_spin_lock+0x50/0x1b8
> [ 1850.959476] aer_inj_read_config+0x38/0x1dc
> [ 1850.959821] pci_bus_read_config_dword+0x80/0xd8
> [ 1850.960079] pcie_capability_read_dword+0xac/0xd8
> [ 1850.960454] pcie_pme_irq+0x44/0xc4
> [ 1850.960728] irq_forced_thread_fn+0x30/0x94
> [ 1850.960984] irq_thread+0x1ac/0x3a4
> [ 1850.961308] kthread+0x1b4/0x208
> [ 1850.961557] ret_from_fork+0x10/0x20
> [ 1850.963088] pcieport 0000:00:02.0: AER: Correctable error message received from 0000:00:02.0
> [ 1850.963330] pcieport 0000:00:02.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
> [ 1850.963351] pcieport 0000:00:02.0: device [1b36:000c] error status/mask=00000001/0000e000
> [ 1850.963385] pcieport 0000:00:02.0: [ 0] RxErr (First)
> ```
>
> Signed-off-by: Guangbo Cui <jckeep.cuiguangbo@gmail.com>
Changing inject_lock into a raw_spinlock is the most obvious solution as
long as it meets the criteria that the lock hold time is deterministic
and relatively short and no other sleeping locks are being acquired down
the locking chain.
I am afraid that the these criteria are not met. First of all in
aer_inject_exit(), inject_lock is acquired while iterating the a linked
list which can last for while depending on how many items are in the
list. This may be OK as long as it is guaranteed the list will not be
long. Another problem is that it call kfree() while holding the lock.
kfree() will likely acquire another rt_spin_lock which is a sleeping
lock. You will have to consider pulling kfree() out from the lock
critical section.
Another function __find_aer_error() which does list iteration is called
while holding inject_lock. Again this may be a problem. If the linked
list can be long, you may have to consider breaking inject_lock into 2
or more separate locks to guard different data.
Cheers,
Longman
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH] pci/aer_inject: switching inject_lock to raw_spinlock_t
2025-10-07 15:13 ` Waiman Long
@ 2025-10-07 16:10 ` guangbo cui
2025-10-07 16:12 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 5+ messages in thread
From: guangbo cui @ 2025-10-07 16:10 UTC (permalink / raw)
To: Waiman Long
Cc: Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng,
Thomas Gleixner, Bjorn Helgaas, linux-rt-devel, linux-kernel,
linux-pci
On Tue, Oct 07, 2025 at 11:13:33AM -0400, Waiman Long wrote:
> On 10/7/25 2:02 AM, Guangbo Cui wrote:
> > [ 1850.947170] pcieport 0000:00:02.0: aer_inject: Injecting errors 00000001/00000000 into device 0000:00:02.0
> > [ 1850.949951]
> > [ 1850.950479] =============================
> > [ 1850.950780] [ BUG: Invalid wait context ]
> > [ 1850.951152] 6.17.0-11316-g7a405dbb0f03-dirty #7 Not tainted
> > [ 1850.951457] -----------------------------
> > [ 1850.951680] irq/16-PCIe PME/56 is trying to lock:
> > [ 1850.952004] ffff800082865238 (inject_lock){+.+.}-{3:3}, at: aer_inj_read_config+0x38/0x1dc
> > [ 1850.952731] other info that might help us debug this:
> > [ 1850.952997] context-{5:5}
> > [ 1850.953192] 5 locks held by irq/16-PCIe PME/56:
> > [ 1850.953415] #0: ffff800082647390 (local_bh){.+.+}-{1:3}, at: __local_bh_disable_ip+0x30/0x268
> > [ 1850.953931] #1: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
> > [ 1850.954453] #2: ffff000004bb6c58 (&data->lock){+...}-{3:3}, at: pcie_pme_irq+0x34/0xc4
> > [ 1850.954949] #3: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
> > [ 1850.955420] #4: ffff800082863d10 (pci_lock){....}-{2:2}, at: pci_bus_read_config_dword+0x5c/0xd8
> > [ 1850.955932] stack backtrace:
> > [ 1850.956412] CPU: 0 UID: 0 PID: 56 Comm: irq/16-PCIe PME Not tainted 6.17.0-11316-g7a405dbb0f03-dirty #7 PREEMPT_{RT,(full)}
> > [ 1850.957039] Hardware name: linux,dummy-virt (DT)
> > [ 1850.957409] Call trace:
> > [ 1850.957727] show_stack+0x18/0x24 (C)
> > [ 1850.958089] dump_stack_lvl+0x40/0xbc
> > [ 1850.958339] dump_stack+0x18/0x24
> > [ 1850.958586] __lock_acquire+0xa84/0x3008
> > [ 1850.958907] lock_acquire+0x128/0x2a8
> > [ 1850.959171] rt_spin_lock+0x50/0x1b8
> > [ 1850.959476] aer_inj_read_config+0x38/0x1dc
> > [ 1850.959821] pci_bus_read_config_dword+0x80/0xd8
> > [ 1850.960079] pcie_capability_read_dword+0xac/0xd8
> > [ 1850.960454] pcie_pme_irq+0x44/0xc4
> > [ 1850.960728] irq_forced_thread_fn+0x30/0x94
> > [ 1850.960984] irq_thread+0x1ac/0x3a4
> > [ 1850.961308] kthread+0x1b4/0x208
> > [ 1850.961557] ret_from_fork+0x10/0x20
> > [ 1850.963088] pcieport 0000:00:02.0: AER: Correctable error message received from 0000:00:02.0
> > [ 1850.963330] pcieport 0000:00:02.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
> > [ 1850.963351] pcieport 0000:00:02.0: device [1b36:000c] error status/mask=00000001/0000e000
> > [ 1850.963385] pcieport 0000:00:02.0: [ 0] RxErr (First)
>
> Changing inject_lock into a raw_spinlock is the most obvious solution as
> long as it meets the criteria that the lock hold time is deterministic and
> relatively short and no other sleeping locks are being acquired down the
> locking chain.
>
> I am afraid that the these criteria are not met. First of all in
> aer_inject_exit(), inject_lock is acquired while iterating the a linked list
> which can last for while depending on how many items are in the list. This
> may be OK as long as it is guaranteed the list will not be long. Another
> problem is that it call kfree() while holding the lock. kfree() will likely
> acquire another rt_spin_lock which is a sleeping lock. You will have to
> consider pulling kfree() out from the lock critical section.
The list length depends on how many error injections the user performs,
which is typically small since this feature is mainly used for development
and debugging purposes. So the list traversal time should be acceptable
in practice.
Yeah, pulling kfree() out from the lock critical section is right, I
will fix it in the next version.
> Another function __find_aer_error() which does list iteration is called
> while holding inject_lock. Again this may be a problem. If the linked list
> can be long, you may have to consider breaking inject_lock into 2 or more
> separate locks to guard different data.
As mentioned above, the list is usually short in typical use cases,
since error injection is mainly used for debugging or development
purposes. Perhaps we can also get some advice from the PCI folks.
Best regards,
Guangbo
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH] pci/aer_inject: switching inject_lock to raw_spinlock_t
2025-10-07 16:10 ` guangbo cui
@ 2025-10-07 16:12 ` Sebastian Andrzej Siewior
2025-10-08 3:44 ` Guangbo Cui
0 siblings, 1 reply; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-10-07 16:12 UTC (permalink / raw)
To: guangbo cui
Cc: Waiman Long, Clark Williams, Steven Rostedt, Peter Zijlstra,
Ingo Molnar, Will Deacon, Boqun Feng, Thomas Gleixner,
Bjorn Helgaas, linux-rt-devel, linux-kernel, linux-pci
On 2025-10-08 00:10:45 [+0800], guangbo cui wrote:
> As mentioned above, the list is usually short in typical use cases,
> since error injection is mainly used for debugging or development
> purposes. Perhaps we can also get some advice from the PCI folks.
Is this something that is used in "production" or more debugging/
development?
> Best regards,
> Guangbo
Sebastian
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] pci/aer_inject: switching inject_lock to raw_spinlock_t
2025-10-07 16:12 ` Sebastian Andrzej Siewior
@ 2025-10-08 3:44 ` Guangbo Cui
0 siblings, 0 replies; 5+ messages in thread
From: Guangbo Cui @ 2025-10-08 3:44 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Waiman Long, Clark Williams, Steven Rostedt, Peter Zijlstra,
Ingo Molnar, Will Deacon, Boqun Feng, Thomas Gleixner,
Bjorn Helgaas, linux-rt-devel, linux-kernel, linux-pci
On Tue, Oct 07, 2025 at 06:12:29PM +0200, Sebastian Andrzej Siewior wrote:
> On 2025-10-08 00:10:45 [+0800], guangbo cui wrote:
> > As mentioned above, the list is usually short in typical use cases,
> > since error injection is mainly used for debugging or development
> > purposes. Perhaps we can also get some advice from the PCI folks.
>
> Is this something that is used in "production" or more debugging/
> development?
It's mainly for debugging/development. Debugging AER code is difficult
cause it's hard to trigger real hardware errors.
Best regards,
Guangbo
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-10-08 3:44 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-07 6:02 [PATCH] pci/aer_inject: switching inject_lock to raw_spinlock_t Guangbo Cui
2025-10-07 15:13 ` Waiman Long
2025-10-07 16:10 ` guangbo cui
2025-10-07 16:12 ` Sebastian Andrzej Siewior
2025-10-08 3:44 ` Guangbo Cui
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).