From: Steffen Trumtrar <s.trumtrar@pengutronix.de>
To: Sean Anderson <sean.anderson@seco.com>
Cc: Camelia Groza <camelia.groza@nxp.com>,
	Li Yang <leoyang.li@nxp.com>,
	"David S. Miller" <davem@davemloft.net>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [BUG] soc: fsl: qbman: lockdep invalid wait context with qman_update_cgr_smp_call
Date: Mon, 08 Jan 2024 08:54:04 +0100
Message-ID: <87bk9w2rb0.fsf@pengutronix.de>
In-Reply-To: <33088f1b-4a44-425b-8694-3f602afb4537@seco.com>


On 2023-12-28 at 11:17 -05, Sean Anderson <sean.anderson@seco.com> wrote:

> On 12/28/23 05:19, Steffen Trumtrar wrote:
>>
>> Hi,
>>
>> I noticed that lockdep reports a BUG on the qman driver since
>>
>>    914f8b228ede709274b8c80514b352248ec9da00
>>    Author:     Sean Anderson <sean.anderson@seco.com>
>>    AuthorDate: Fri Sep 2 17:57:35 2022 -0400
>>    Commit:     David S. Miller <davem@davemloft.net>
>>    CommitDate: Mon Sep 5 14:27:39 2022 +0100
>>
>>    soc: fsl: qbman: Add CGR update function
>>
>>    This adds a function to update a CGR with new parameters. qman_create_cgr
>>    can almost be used for this (with flags=0), but it's not suitable because
>>    it also registers the callback function. The _safe variant was modeled off
>>    of qman_cgr_delete_safe. However, we handle multiple arguments and a return
>>    value.
>>
>> The stack trace looks something like:
>>
>>    [   20.192060] =============================
>>    [   20.196067] [ BUG: Invalid wait context ]
>>    [   20.200073] 6.7.0-rc6 #73 Not tainted
>>    [   20.203733] -----------------------------
>>    [   20.207738] systemd-journal/114 is trying to lock:
>>    [   20.212528] ffff000973403860 (&portal->cgr_lock){+.+.}-{3:3}, at: qman_update_cgr_smp_call+0x40/0xb0
>>    [   20.221688] other info that might help us debug this:
>>    [   20.226736] context-{2:2}
>>    [   20.229350] 1 lock held by systemd-journal/114:
>>    [   20.233878]  #0: ffff0008001a0208 (&root->kernfs_iattr_rwsem){++++}-{4:4}, at: kernfs_iop_permission+0x48/0xa0
>>    [   20.243902] stack backtrace:
>>    [   20.246779] CPU: 2 PID: 114 Comm: systemd-journal Not tainted 6.7.0-rc6 #73
>>    [   20.253743] Hardware name: TQ TQMLS1046A SoM on Arkona AT1130 (AT300) board (DT)
>>    [   20.261144] Call trace:
>>    [   20.261147]  dump_backtrace+0xa0/0x128
>>    [   20.261154]  show_stack+0x20/0x38
>>    [   20.261158]  dump_stack_lvl+0x74/0xd8
>>    [   20.274303]  dump_stack+0x18/0x28
>>    [   20.279004]  __lock_acquire+0x920/0x1b58
>>    [   20.284309]  lock_acquire+0x1fc/0x348
>>    [   20.289354]  _raw_spin_lock_irqsave+0x6c/0xd0
>>    [   20.294748]  qman_update_cgr_smp_call+0x40/0xb0
>>    [   20.299278]  __flush_smp_call_function_queue+0x1d0/0x3e0
>>    [   20.304593]  generic_smp_call_function_single_interrupt+0x1c/0x30
>>    [   20.310689]  ipi_handler+0x250/0x290
>>    [   20.314263]  handle_percpu_devid_irq+0xb0/0x170
>>    [   20.318793]  generic_handle_domain_irq+0x34/0x58
>>    [   20.323411]  gic_handle_irq+0x4c/0xd8
>>    [   20.327070]  call_on_irq_stack+0x24/0x58
>>    [   20.330991]  do_interrupt_handler+0xdc/0xe8
>>    [   20.335173]  el1_interrupt+0x34/0x68
>>    [   20.338747]  el1h_64_irq_handler+0x18/0x28
>>    [   20.342843]  el1h_64_irq+0x64/0x68
>>    [   20.346240]  lock_acquired+0x198/0x448
>>    [   20.349988]  down_read+0x98/0x1c0
>>    [   20.353300]  kernfs_iop_permission+0x48/0xa0
>>    [   20.357569]  inode_permission+0x118/0x190
>>    [   20.361578]  link_path_walk.part.0.constprop.0+0x2b0/0x398
>>    [   20.367065]  path_lookupat+0x44/0x1b8
>>    [   20.370726]  filename_lookup+0x9c/0x170
>>    [   20.374561]  user_path_at_empty+0x54/0x88
>>    [   20.378571]  do_faccessat+0x88/0x308
>>    [   20.382144]  __arm64_sys_access+0x2c/0x40
>>    [   20.386152]  invoke_syscall+0x50/0x120
>>    [   20.389901]  el0_svc_common.constprop.0+0xc8/0xf0
>>    [   20.394606]  do_el0_svc_compat+0x24/0x40
>>    [   20.398528]  el0_svc_compat+0x4c/0x148
>>    [   20.402275]  el0t_32_sync_handler+0xb0/0x138
>>    [   20.406545]  el0t_32_sync+0x194/0x198
>>
>> The process named in
>>    [   20.207738] systemd-journal/114 is trying to lock:
>> can be any process; it does not have to be systemd-journal. For example, when
>> barebox-state triggers the stack trace, the function calls look like:
>>
>> #                                _-----=> irqs-off/BH-disabled
>> #                               / _----=> need-resched
>> #                              | / _---=> hardirq/softirq
>> #                              || / _--=> preempt-depth
>> #                              ||| / _-=> migrate-disable
>> #                              |||| /     delay
>> #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
>> #              | |         |   |||||     |         |
>>         systemd-1       [002] ...2.     6.871198: qm_modify_cgr <-qman_init_cgr_all
>>         (...)
>>     kworker/2:1-38      [002] ...1.    19.070335: qman_update_cgr_safe <-dpaa_eth_cgr_set_speed
>>   barebox-state-211     [001] d.h1.    19.070344: qman_update_cgr_smp_call <-__flush_smp_call_function_queue
>>   barebox-state-211     [001] d.h3.    19.260311: qm_modify_cgr <-qman_update_cgr_smp_call
>>     kworker/2:1-38      [002] ...1.    19.305517: qman_update_cgr_safe <-dpaa_eth_cgr_set_speed
>>          <idle>-0       [001] d.h2.    19.305524: qman_update_cgr_smp_call <-__flush_smp_call_function_queue
>>          <idle>-0       [001] d.h4.    19.305526: qm_modify_cgr <-qman_update_cgr_smp_call
>>     kworker/3:1-40      [003] ...1.    19.354259: qman_update_cgr_safe <-dpaa_eth_cgr_set_speed
>>          <idle>-0       [001] d.h2.    19.354265: qman_update_cgr_smp_call <-__flush_smp_call_function_queue
>>          <idle>-0       [001] d.h4.    19.354267: qm_modify_cgr <-qman_update_cgr_smp_call
>>
>> I'm not sure why the CPU# detection in the patch is necessary, but maybe you have an idea what is happening here.
>
> Can you try [1]?
>
> If that works for you I'll resend.

Works fine for me \o/
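
For anyone reading the splat above: the {3:3} on cgr_lock and the
context-{2:2} line can be compared mechanically. Below is a minimal
sketch in Python, not kernel code. It assumes the numbers follow
enum lockdep_wait_type with CONFIG_PROVE_RAW_LOCK_NESTING enabled
(2 = spin, 3 = config, 4 = sleep) and that a lock is flagged when its
outer wait type exceeds the inner wait type of the current context.
The function names are made up for illustration.

```python
import re

# Assumed numbering, mirroring enum lockdep_wait_type with
# CONFIG_PROVE_RAW_LOCK_NESTING enabled.
WAIT_TYPES = {0: "inv", 1: "free", 2: "spin", 3: "config", 4: "sleep"}

# Matches e.g. "(&portal->cgr_lock){+.+.}-{3:3}"
LOCK_RE = re.compile(
    r"\((?P<name>[^()]+)\)\{[^}]*\}-\{(?P<outer>\d+):(?P<inner>\d+)\}"
)
# Matches e.g. "context-{2:2}"
CTX_RE = re.compile(r"context-\{(?P<outer>\d+):(?P<inner>\d+)\}")


def parse_lock(line):
    """Return (name, outer, inner) for a lock line of a lockdep report."""
    m = LOCK_RE.search(line)
    if m is None:
        raise ValueError("no lock annotation found: %r" % line)
    return m["name"], int(m["outer"]), int(m["inner"])


def parse_context(line):
    """Return the inner wait type of the context line."""
    m = CTX_RE.search(line)
    if m is None:
        raise ValueError("no context annotation found: %r" % line)
    return int(m["inner"])


def is_invalid_wait(lock_outer, ctx_inner):
    """Simplified check: the lock needs a 'heavier' wait type than the
    current context allows."""
    return lock_outer > ctx_inner


# Lines copied from the report above.
lock_line = ("ffff000973403860 (&portal->cgr_lock){+.+.}-{3:3}, "
             "at: qman_update_cgr_smp_call+0x40/0xb0")
ctx_line = "context-{2:2}"

name, outer, inner = parse_lock(lock_line)
ctx = parse_context(ctx_line)
print(name, "needs", WAIT_TYPES[outer], "but context allows", WAIT_TYPES[ctx])
print("invalid wait context:", is_invalid_wait(outer, ctx))
```

Under those assumptions the hardirq context of the IPI handler allows
only spin-type locks, while cgr_lock is reported with the config wait
type, which is what the report complains about.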

Thanks,
Steffen

--
Pengutronix e.K.                | Dipl.-Inform. Steffen Trumtrar |
Steuerwalder Str. 21            | https://www.pengutronix.de/    |
31137 Hildesheim, Germany       | Phone: +49-5121-206917-0       |
Amtsgericht Hildesheim, HRA 2686| Fax:   +49-5121-206917-5555    |

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

Thread overview: 4+ messages
2023-12-28 10:19 [BUG] soc: fsl: qbman: lockdep invalid wait context with qman_update_cgr_smp_call Steffen Trumtrar
2023-12-28 16:17 ` Sean Anderson
2024-01-08  7:54   ` Steffen Trumtrar [this message]
2023-12-28 16:24 ` Sean Anderson
