From: Steffen Trumtrar <s.trumtrar@pengutronix.de>
To: Sean Anderson <sean.anderson@seco.com>
Cc: Camelia Groza <camelia.groza@nxp.com>,
	Li Yang <leoyang.li@nxp.com>,
	David S. Miller <davem@davemloft.net>,
	linux-arm-kernel@lists.infradead.org
Subject: [BUG] soc: fsl: qbman: lockdep invalid wait context with qman_update_cgr_smp_call
Date: Thu, 28 Dec 2023 11:19:06 +0100
Message-ID: <87wmsyvclu.fsf@pengutronix.de>


Hi,

I noticed that lockdep reports a BUG in the qman driver since commit

    914f8b228ede709274b8c80514b352248ec9da00
    Author:     Sean Anderson <sean.anderson@seco.com>
    AuthorDate: Fri Sep 2 17:57:35 2022 -0400
    Commit:     David S. Miller <davem@davemloft.net>
    CommitDate: Mon Sep 5 14:27:39 2022 +0100

    soc: fsl: qbman: Add CGR update function

    This adds a function to update a CGR with new parameters. qman_create_cgr
    can almost be used for this (with flags=0), but it's not suitable because
    it also registers the callback function. The _safe variant was modeled off
    of qman_cgr_delete_safe. However, we handle multiple arguments and a return
    value.

The stack trace looks something like:

    [   20.192060] =============================
    [   20.196067] [ BUG: Invalid wait context ]
    [   20.200073] 6.7.0-rc6 #73 Not tainted
    [   20.203733] -----------------------------
    [   20.207738] systemd-journal/114 is trying to lock:
    [   20.212528] ffff000973403860 (&portal->cgr_lock){+.+.}-{3:3}, at: qman_update_cgr_smp_call+0x40/0xb0
    [   20.221688] other info that might help us debug this:
    [   20.226736] context-{2:2}
    [   20.229350] 1 lock held by systemd-journal/114:
    [   20.233878]  #0: ffff0008001a0208 (&root->kernfs_iattr_rwsem){++++}-{4:4}, at: kernfs_iop_permission+0x48/0xa0
    [   20.243902] stack backtrace:
    [   20.246779] CPU: 2 PID: 114 Comm: systemd-journal Not tainted 6.7.0-rc6 #73
    [   20.253743] Hardware name: TQ TQMLS1046A SoM on Arkona AT1130 (AT300) board (DT)
    [   20.261144] Call trace:
    [   20.261147]  dump_backtrace+0xa0/0x128
    [   20.261154]  show_stack+0x20/0x38
    [   20.261158]  dump_stack_lvl+0x74/0xd8
    [   20.274303]  dump_stack+0x18/0x28
    [   20.279004]  __lock_acquire+0x920/0x1b58
    [   20.284309]  lock_acquire+0x1fc/0x348
    [   20.289354]  _raw_spin_lock_irqsave+0x6c/0xd0
    [   20.294748]  qman_update_cgr_smp_call+0x40/0xb0
    [   20.299278]  __flush_smp_call_function_queue+0x1d0/0x3e0
    [   20.304593]  generic_smp_call_function_single_interrupt+0x1c/0x30
    [   20.310689]  ipi_handler+0x250/0x290
    [   20.314263]  handle_percpu_devid_irq+0xb0/0x170
    [   20.318793]  generic_handle_domain_irq+0x34/0x58
    [   20.323411]  gic_handle_irq+0x4c/0xd8
    [   20.327070]  call_on_irq_stack+0x24/0x58
    [   20.330991]  do_interrupt_handler+0xdc/0xe8
    [   20.335173]  el1_interrupt+0x34/0x68
    [   20.338747]  el1h_64_irq_handler+0x18/0x28
    [   20.342843]  el1h_64_irq+0x64/0x68
    [   20.346240]  lock_acquired+0x198/0x448
    [   20.349988]  down_read+0x98/0x1c0
    [   20.353300]  kernfs_iop_permission+0x48/0xa0
    [   20.357569]  inode_permission+0x118/0x190
    [   20.361578]  link_path_walk.part.0.constprop.0+0x2b0/0x398
    [   20.367065]  path_lookupat+0x44/0x1b8
    [   20.370726]  filename_lookup+0x9c/0x170
    [   20.374561]  user_path_at_empty+0x54/0x88
    [   20.378571]  do_faccessat+0x88/0x308
    [   20.382144]  __arm64_sys_access+0x2c/0x40
    [   20.386152]  invoke_syscall+0x50/0x120
    [   20.389901]  el0_svc_common.constprop.0+0xc8/0xf0
    [   20.394606]  do_el0_svc_compat+0x24/0x40
    [   20.398528]  el0_svc_compat+0x4c/0x148
    [   20.402275]  el0t_32_sync_handler+0xb0/0x138
    [   20.406545]  el0t_32_sync+0x194/0x198
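
As far as I understand the lockdep annotations, the {3:3} on cgr_lock and the
context-{2:2} line are wait-type levels: the lock is a spinlock_t (level 3,
which may sleep on PREEMPT_RT), while the IPI handler runs in hardirq context,
which only allows locks up to level 2. The user-space sketch below models just
that comparison. The enum mirrors my reading of enum lockdep_wait_type with
CONFIG_PROVE_RAW_LOCK_NESTING enabled; the names wait_type, wait_context_ok
and check_reported_case are made up for illustration, and the real check in
lockdep is more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match enum lockdep_wait_type with PROVE_RAW_LOCK_NESTING=y. */
enum wait_type {
    WAIT_INV = 0,  /* not checked */
    WAIT_FREE,     /* 1: wait-free, e.g. RCU */
    WAIT_SPIN,     /* 2: raw_spinlock_t; limit of a hardirq context */
    WAIT_CONFIG,   /* 3: spinlock_t, preemptible on PREEMPT_RT */
    WAIT_SLEEP     /* 4: sleeping locks such as mutex and rwsem */
};

/*
 * Simplified model of the check: a lock may be taken only if its wait
 * type does not exceed what the current context allows.
 */
bool wait_context_ok(enum wait_type context_inner, enum wait_type lock_outer)
{
    return lock_outer <= context_inner;
}

/* The case from the report: cgr_lock {3:3} taken in context-{2:2}. */
void check_reported_case(void)
{
    assert(!wait_context_ok(WAIT_SPIN, WAIT_CONFIG));
}
```

In this model the kernfs_iattr_rwsem acquisition ({4:4}) is fine because it
happens in task context, whereas taking cgr_lock from the IPI handler is not.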

The task named in
    [   20.207738] systemd-journal/114 is trying to lock:
is not necessarily systemd-journal; it can be any process that happens to be interrupted. For example, when barebox-state triggers the stack trace, the function trace looks like:

#                                _-----=> irqs-off/BH-disabled
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| / _-=> migrate-disable
#                              |||| /     delay
#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
         systemd-1       [002] ...2.     6.871198: qm_modify_cgr <-qman_init_cgr_all
         (...)
     kworker/2:1-38      [002] ...1.    19.070335: qman_update_cgr_safe <-dpaa_eth_cgr_set_speed
   barebox-state-211     [001] d.h1.    19.070344: qman_update_cgr_smp_call <-__flush_smp_call_function_queue
   barebox-state-211     [001] d.h3.    19.260311: qm_modify_cgr <-qman_update_cgr_smp_call
     kworker/2:1-38      [002] ...1.    19.305517: qman_update_cgr_safe <-dpaa_eth_cgr_set_speed
          <idle>-0       [001] d.h2.    19.305524: qman_update_cgr_smp_call <-__flush_smp_call_function_queue
          <idle>-0       [001] d.h4.    19.305526: qm_modify_cgr <-qman_update_cgr_smp_call
     kworker/3:1-40      [003] ...1.    19.354259: qman_update_cgr_safe <-dpaa_eth_cgr_set_speed
          <idle>-0       [001] d.h2.    19.354265: qman_update_cgr_smp_call <-__flush_smp_call_function_queue
          <idle>-0       [001] d.h4.    19.354267: qm_modify_cgr <-qman_update_cgr_smp_call
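
Going by the column legend in the trace header, the flags field can be decoded
mechanically. The sketch below does that for the three columns relevant here
(irqs-off, hardirq, preempt-depth). It is only an illustration: struct
trace_flags and decode_flags are hypothetical names, and the decoder ignores
the other flag characters ftrace can emit.

```c
#include <stdbool.h>
#include <string.h>

/* Subset of the ftrace latency flags, per the header legend above. */
struct trace_flags {
    bool irqs_off;      /* column 1: 'd' (or 'D' when BH is disabled too) */
    bool hardirq;       /* column 3: 'h' (or 'H' for hardirq in softirq) */
    int  preempt_depth; /* column 4: a digit, '.' means 0 */
};

/* Decode a five-character flags field such as "d.h1.". */
struct trace_flags decode_flags(const char *f)
{
    struct trace_flags t = { false, false, 0 };

    if (strlen(f) < 4)
        return t; /* too short to contain the columns we read */

    t.irqs_off = (f[0] == 'd' || f[0] == 'D');
    t.hardirq = (f[2] == 'h' || f[2] == 'H');
    if (f[3] >= '0' && f[3] <= '9')
        t.preempt_depth = f[3] - '0';
    return t;
}
```

Applied to the lines above, "d.h1." on qman_update_cgr_smp_call decodes to
interrupts off and hardirq context on CPU 1, while "...1." on
qman_update_cgr_safe is plain task context with preemption disabled.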

I'm not sure why the CPU# detection in the patch is necessary, but maybe you have an idea of what is happening here.


Best regards,
Steffen

--
Pengutronix e.K.                | Dipl.-Inform. Steffen Trumtrar |
Steuerwalder Str. 21            | https://www.pengutronix.de/    |
31137 Hildesheim, Germany       | Phone: +49-5121-206917-0       |
Amtsgericht Hildesheim, HRA 2686| Fax:   +49-5121-206917-5555    |


