* [BUG] soc: fsl: qbman: lockdep invalid wait context with qman_update_cgr_smp_call
@ 2023-12-28 10:19 Steffen Trumtrar
2023-12-28 16:17 ` Sean Anderson
2023-12-28 16:24 ` Sean Anderson
0 siblings, 2 replies; 4+ messages in thread
From: Steffen Trumtrar @ 2023-12-28 10:19 UTC (permalink / raw)
To: Sean Anderson; +Cc: Camelia Groza, Li Yang, David S. Miller, linux-arm-kernel
Hi,
I noticed that lockdep reports a BUG in the qman driver since commit
914f8b228ede709274b8c80514b352248ec9da00
Author: Sean Anderson <sean.anderson@seco.com>
AuthorDate: Fri Sep 2 17:57:35 2022 -0400
Commit: David S. Miller <davem@davemloft.net>
CommitDate: Mon Sep 5 14:27:39 2022 +0100
soc: fsl: qbman: Add CGR update function
This adds a function to update a CGR with new parameters. qman_create_cgr
can almost be used for this (with flags=0), but it's not suitable because
it also registers the callback function. The _safe variant was modeled off
of qman_cgr_delete_safe. However, we handle multiple arguments and a return
value.
The stack trace looks something like:
[ 20.192060] =============================
[ 20.196067] [ BUG: Invalid wait context ]
[ 20.200073] 6.7.0-rc6 #73 Not tainted
[ 20.203733] -----------------------------
[ 20.207738] systemd-journal/114 is trying to lock:
[ 20.212528] ffff000973403860 (&portal->cgr_lock){+.+.}-{3:3}, at: qman_update_cgr_smp_call+0x40/0xb0
[ 20.221688] other info that might help us debug this:
[ 20.226736] context-{2:2}
[ 20.229350] 1 lock held by systemd-journal/114:
[ 20.233878] #0: ffff0008001a0208 (&root->kernfs_iattr_rwsem){++++}-{4:4}, at: kernfs_iop_permission+0x48/0xa0
[ 20.243902] stack backtrace:
[ 20.246779] CPU: 2 PID: 114 Comm: systemd-journal Not tainted 6.7.0-rc6 #73
[ 20.253743] Hardware name: TQ TQMLS1046A SoM on Arkona AT1130 (AT300) board (DT)
[ 20.261144] Call trace:
[ 20.261147] dump_backtrace+0xa0/0x128
[ 20.261154] show_stack+0x20/0x38
[ 20.261158] dump_stack_lvl+0x74/0xd8
[ 20.274303] dump_stack+0x18/0x28
[ 20.279004] __lock_acquire+0x920/0x1b58
[ 20.284309] lock_acquire+0x1fc/0x348
[ 20.289354] _raw_spin_lock_irqsave+0x6c/0xd0
[ 20.294748] qman_update_cgr_smp_call+0x40/0xb0
[ 20.299278] __flush_smp_call_function_queue+0x1d0/0x3e0
[ 20.304593] generic_smp_call_function_single_interrupt+0x1c/0x30
[ 20.310689] ipi_handler+0x250/0x290
[ 20.314263] handle_percpu_devid_irq+0xb0/0x170
[ 20.318793] generic_handle_domain_irq+0x34/0x58
[ 20.323411] gic_handle_irq+0x4c/0xd8
[ 20.327070] call_on_irq_stack+0x24/0x58
[ 20.330991] do_interrupt_handler+0xdc/0xe8
[ 20.335173] el1_interrupt+0x34/0x68
[ 20.338747] el1h_64_irq_handler+0x18/0x28
[ 20.342843] el1h_64_irq+0x64/0x68
[ 20.346240] lock_acquired+0x198/0x448
[ 20.349988] down_read+0x98/0x1c0
[ 20.353300] kernfs_iop_permission+0x48/0xa0
[ 20.357569] inode_permission+0x118/0x190
[ 20.361578] link_path_walk.part.0.constprop.0+0x2b0/0x398
[ 20.367065] path_lookupat+0x44/0x1b8
[ 20.370726] filename_lookup+0x9c/0x170
[ 20.374561] user_path_at_empty+0x54/0x88
[ 20.378571] do_faccessat+0x88/0x308
[ 20.382144] __arm64_sys_access+0x2c/0x40
[ 20.386152] invoke_syscall+0x50/0x120
[ 20.389901] el0_svc_common.constprop.0+0xc8/0xf0
[ 20.394606] do_el0_svc_compat+0x24/0x40
[ 20.398528] el0_svc_compat+0x4c/0x148
[ 20.402275] el0t_32_sync_handler+0xb0/0x138
[ 20.406545] el0t_32_sync+0x194/0x198
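The numeric annotations in the report ({3:3} on the lock, context-{2:2} for the current context) can be read mechanically. The following is a minimal Python sketch, not taken from the thread. It assumes the numbering of enum lockdep_wait_type with CONFIG_PROVE_RAW_LOCK_NESTING enabled, and it assumes that lockdep rejects an acquisition when the lock's wait type is greater than the wait type of the current context. The helper names wait_pair and is_invalid_wait_context are made up for illustration.

```python
import re

# Assumed numbering of enum lockdep_wait_type when
# CONFIG_PROVE_RAW_LOCK_NESTING=y (an assumption, not stated in the thread).
WAIT_TYPES = {
    0: "LD_WAIT_INV",     # not checked
    1: "LD_WAIT_FREE",    # wait-free, e.g. RCU
    2: "LD_WAIT_SPIN",    # raw_spinlock_t and similar
    3: "LD_WAIT_CONFIG",  # spinlock_t (sleeps on PREEMPT_RT)
    4: "LD_WAIT_SLEEP",   # mutex, rwsem and similar
}

# Matches the "{3:3}" style pair, but not the usage mask "{+.+.}".
PAIR = re.compile(r"\{(\d+):(\d+)\}")


def wait_pair(line):
    """Return the last numeric {a:b} pair found in a lockdep report line."""
    matches = PAIR.findall(line)
    if not matches:
        raise ValueError("no wait-type pair in: %r" % line)
    first, second = matches[-1]
    return int(first), int(second)


def is_invalid_wait_context(lock_line, context_line):
    """Assumed rule: a lock must not have a higher wait type than the context."""
    lock_type = wait_pair(lock_line)[0]
    context_type = wait_pair(context_line)[0]
    return lock_type > context_type


if __name__ == "__main__":
    lock = ("ffff000973403860 (&portal->cgr_lock){+.+.}-{3:3}, "
            "at: qman_update_cgr_smp_call+0x40/0xb0")
    context = "context-{2:2}"
    print(WAIT_TYPES[wait_pair(lock)[0]], "requested in",
          WAIT_TYPES[wait_pair(context)[0]], "context")
    print("invalid:", is_invalid_wait_context(lock, context))
```

Under these assumptions the report reads as follows: cgr_lock has the wait type of a spinlock_t (3), and it is taken from the IPI handler in hard interrupt context (2), where only raw-spinlock-class locks would be accepted.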
The task named in
[ 20.207738] systemd-journal/114 is trying to lock:
can be any process; it does not have to be systemd-journal. For example, when barebox-state triggers the stack trace, the function calls look like:
#                                _-----=> irqs-off/BH-disabled
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| / _-=> migrate-disable
#                              |||| /     delay
#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
         systemd-1       [002] ...2.     6.871198: qm_modify_cgr <-qman_init_cgr_all
(...)
     kworker/2:1-38      [002] ...1.    19.070335: qman_update_cgr_safe <-dpaa_eth_cgr_set_speed
   barebox-state-211     [001] d.h1.    19.070344: qman_update_cgr_smp_call <-__flush_smp_call_function_queue
   barebox-state-211     [001] d.h3.    19.260311: qm_modify_cgr <-qman_update_cgr_smp_call
     kworker/2:1-38      [002] ...1.    19.305517: qman_update_cgr_safe <-dpaa_eth_cgr_set_speed
          <idle>-0       [001] d.h2.    19.305524: qman_update_cgr_smp_call <-__flush_smp_call_function_queue
          <idle>-0       [001] d.h4.    19.305526: qm_modify_cgr <-qman_update_cgr_smp_call
     kworker/3:1-40      [003] ...1.    19.354259: qman_update_cgr_safe <-dpaa_eth_cgr_set_speed
          <idle>-0       [001] d.h2.    19.354265: qman_update_cgr_smp_call <-__flush_smp_call_function_queue
          <idle>-0       [001] d.h4.    19.354267: qm_modify_cgr <-qman_update_cgr_smp_call
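The five-character flag column in this trace can be decoded from the header legend. A minimal Python sketch, assuming the usual ftrace meaning of the characters ('d' or 'D' for interrupts disabled, 'h' for hard interrupt context, 's' for softirq, 'H' for a hard interrupt inside a softirq, hexadecimal digits for the two counters); parse_event and decode_latency_flags are hypothetical helper names:

```python
import re

# One function-tracer line: TASK-PID [CPU] FLAGS TIMESTAMP: FUNCTION <-CALLER
EVENT = re.compile(
    r"^(?P<task>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+(?P<flags>\S{5})\s+"
    r"(?P<ts>\d+\.\d+):\s+(?P<func>\S+)\s+<-(?P<caller>\S+)$"
)


def decode_latency_flags(flags):
    """Decode a flag field such as 'd.h1.' using the legend in the trace header."""
    if len(flags) != 5:
        raise ValueError("expected 5 flag characters, got %r" % flags)
    irq, resched, ctx, depth, migrate = flags
    return {
        "irqs_off": irq in ("d", "D"),
        "need_resched": resched != ".",
        "hardirq": ctx in ("h", "H"),
        "softirq": ctx in ("s", "H"),
        "preempt_depth": 0 if depth == "." else int(depth, 16),
        "migrate_disable": 0 if migrate == "." else int(migrate, 16),
    }


def parse_event(line):
    """Split one trace line into its fields; raises ValueError if it does not match."""
    match = EVENT.match(line.strip())
    if match is None:
        raise ValueError("not a trace event: %r" % line)
    return {
        "task": match.group("task"),
        "pid": int(match.group("pid")),
        "cpu": int(match.group("cpu")),
        "flags": decode_latency_flags(match.group("flags")),
        "timestamp": float(match.group("ts")),
        "func": match.group("func"),
        "caller": match.group("caller"),
    }


if __name__ == "__main__":
    event = parse_event(
        "barebox-state-211 [001] d.h1. 19.070344: "
        "qman_update_cgr_smp_call <-__flush_smp_call_function_queue"
    )
    print(event["task"], "on CPU", event["cpu"], event["flags"])
```

Applied to the lines above, this shows qman_update_cgr_safe running in task context on CPU 2 or 3, while qman_update_cgr_smp_call always runs on CPU 1 with interrupts off and in hard interrupt context, on top of whatever task happened to be running there.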
I'm not sure why the CPU# detection in the patch is necessary, but maybe you have an idea what is happening here.
Best regards,
Steffen
--
Pengutronix e.K. | Dipl.-Inform. Steffen Trumtrar |
Steuerwalder Str. 21 | https://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686| Fax: +49-5121-206917-5555 |
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* Re: [BUG] soc: fsl: qbman: lockdep invalid wait context with qman_update_cgr_smp_call
2023-12-28 10:19 [BUG] soc: fsl: qbman: lockdep invalid wait context with qman_update_cgr_smp_call Steffen Trumtrar
@ 2023-12-28 16:17 ` Sean Anderson
2024-01-08 7:54 ` Steffen Trumtrar
2023-12-28 16:24 ` Sean Anderson
1 sibling, 1 reply; 4+ messages in thread
From: Sean Anderson @ 2023-12-28 16:17 UTC (permalink / raw)
To: Steffen Trumtrar
Cc: Camelia Groza, Li Yang, David S. Miller, linux-arm-kernel
On 12/28/23 05:19, Steffen Trumtrar wrote:
> I'm not sure why the CPU# detection in the patch is necessary, but maybe you have an idea what is happening here.
Can you try [1]?
If that works for you I'll resend.
--Sean
[1] https://lore.kernel.org/linux-arm-kernel/20230404145557.2356894-1-sean.anderson@seco.com/
* Re: [BUG] soc: fsl: qbman: lockdep invalid wait context with qman_update_cgr_smp_call
2023-12-28 10:19 [BUG] soc: fsl: qbman: lockdep invalid wait context with qman_update_cgr_smp_call Steffen Trumtrar
2023-12-28 16:17 ` Sean Anderson
@ 2023-12-28 16:24 ` Sean Anderson
1 sibling, 0 replies; 4+ messages in thread
From: Sean Anderson @ 2023-12-28 16:24 UTC (permalink / raw)
To: Steffen Trumtrar
Cc: Camelia Groza, Li Yang, David S. Miller, linux-arm-kernel
On 12/28/23 05:19, Steffen Trumtrar wrote:
> I'm not sure why the CPU# detection in the patch is necessary, but maybe you have an idea what is happening here.
And to keep this from being lost again:
#regzbot introduced 914f8b228ede709274b8c80514b352248ec9da00 https://lore.kernel.org/netdev/20230323153935.nofnjucqjqnz34ej@skbuf/
#regzbot monitor https://lore.kernel.org/linux-arm-kernel/20230404145557.2356894-1-sean.anderson@seco.com/
#regzbot dup https://lore.kernel.org/netdev/20230323153935.nofnjucqjqnz34ej@skbuf/
* Re: [BUG] soc: fsl: qbman: lockdep invalid wait context with qman_update_cgr_smp_call
2023-12-28 16:17 ` Sean Anderson
@ 2024-01-08 7:54 ` Steffen Trumtrar
0 siblings, 0 replies; 4+ messages in thread
From: Steffen Trumtrar @ 2024-01-08 7:54 UTC (permalink / raw)
To: Sean Anderson; +Cc: Camelia Groza, Li Yang, David S. Miller, linux-arm-kernel
On 2023-12-28 at 11:17 -05, Sean Anderson <sean.anderson@seco.com> wrote:
> Can you try [1]?
>
> If that works for you I'll resend.
Works fine for me \o/
Thanks,
Steffen
--
Pengutronix e.K. | Dipl.-Inform. Steffen Trumtrar |
Steuerwalder Str. 21 | https://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686| Fax: +49-5121-206917-5555 |