From: Ben Greear <greearb@candelatech.com>
To: Tejun Heo <tj@kernel.org>
Cc: Johannes Berg <johannes@sipsolutions.net>,
	linux-wireless <linux-wireless@vger.kernel.org>,
	Miriam Rachel <miriam.rachel.korenblit@intel.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
Date: Tue, 10 Mar 2026 09:10:56 -0700
Message-ID: <bba74cab-7305-a052-7e1c-7a7736ba4531@candelatech.com>
In-Reply-To: <68c1ca1381d1871fff72b211890a64eb@kernel.org>

On 3/4/26 09:14, Tejun Heo wrote:
> Hello,
> 
> (Partially drafted with the help of Claude)
> 
> On Tue, Mar 03, 2026 at 04:02:14PM -0800, Ben Greear wrote:
>> Could the logic that detects blocked work-queues instead be instrumented
>> to print out more useful information so that just reproducing the problem
>> and providing dmesg output will be sufficient?  Or does dmesg already provide
>> enough that would give you a clue as to what is going on?
> 
> It may not be exactly the same issue, but Breno just posted a patch that
> might help. The current watchdog only prints backtraces for workers that
> are actively running on CPU, so sleeping culprits are invisible. His
> patch removes that filter so all in-flight workers get printed:
> 
>    http://lkml.kernel.org/r/aag4tTyeiZyw0jID@gmail.com
> 
> Might be worth trying.

Hello Tejun,

I applied the first four patches from v2 of that series to my 6.18.14 kernel,
with my use-kthread-for-regdom patches reverted.

The stock 6.18.16 kernel crashes too often in the wifi driver for me to
reliably reproduce the deadlock there.

Both reg_todo (in-flight) and lru_add_drain_per_cpu (pending) are on CPU 5,
but I'm not sure whether that really matters.

Does the info below show anything useful to you?

Mar 10 08:59:15 ct523c-2103 kernel: BUG: workqueue lockup - pool cpus=5 node=0 flags=0x0 nice=0 stuck for 57507s!
Mar 10 08:59:15 ct523c-2103 kernel: Showing busy workqueues and worker pools:
Mar 10 08:59:15 ct523c-2103 kernel: workqueue events: flags=0x100
Mar 10 08:59:15 ct523c-2103 kernel:   pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=2 refcnt=3
Mar 10 08:59:15 ct523c-2103 kernel:     in-flight: 264128:disconnect_work [cfg80211] for 57502s disconnect_work [cfg80211]
Mar 10 08:59:15 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=9 refcnt=10
Mar 10 08:59:15 ct523c-2103 kernel:     in-flight: 271323:reg_todo [cfg80211] for 57507s
Mar 10 08:59:15 ct523c-2103 kernel:     pending: reg_todo [cfg80211], igb_watchdog_task [igb], output_poll_execute [drm_kms_helper], kernfs_notify_workfn, key_garbage_collector, 2*update_super_work, netstamp_clear
Mar 10 08:59:15 ct523c-2103 kernel: workqueue events_unbound: flags=0x2
Mar 10 08:59:15 ct523c-2103 kernel:   pwq 32: cpus=0-7 flags=0x4 nice=0 active=2 refcnt=3
Mar 10 08:59:15 ct523c-2103 kernel:     in-flight: 236337:cfg80211_wiphy_work [cfg80211] for 57507s cfg80211_wiphy_work [cfg80211]
Mar 10 08:59:15 ct523c-2103 kernel: workqueue events_unbound: flags=0x2
Mar 10 08:59:15 ct523c-2103 kernel:   pwq 32: cpus=0-7 flags=0x4 nice=0 active=1 refcnt=2
Mar 10 08:59:15 ct523c-2103 kernel:     in-flight: 142096:linkwatch_event for 57502s linkwatch_event
Mar 10 08:59:15 ct523c-2103 kernel:   pwq 32: cpus=0-7 flags=0x4 nice=0 active=3 refcnt=4
Mar 10 08:59:15 ct523c-2103 kernel:     in-flight: 218388:fsnotify_mark_destroy_workfn for 55995s fsnotify_mark_destroy_workfn BAR(1309) ,267638:fsnotify_connector_destroy_workfn for 55995s fsnotify_connector_destroy_workfn
Mar 10 08:59:15 ct523c-2103 kernel:   pwq 32: cpus=0-7 flags=0x4 nice=0 active=2 refcnt=4
Mar 10 08:59:15 ct523c-2103 kernel: workqueue events_freezable: flags=0x104
Mar 10 08:59:15 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=1 refcnt=2
Mar 10 08:59:15 ct523c-2103 kernel:     pending: pci_pme_list_scan
Mar 10 08:59:15 ct523c-2103 kernel: workqueue events_power_efficient: flags=0x180
Mar 10 08:59:15 ct523c-2103 kernel:   pwq 14: cpus=3 node=0 flags=0x0 nice=0 active=2 refcnt=3
Mar 10 08:59:15 ct523c-2103 kernel:     in-flight: 268226:reg_check_chans_work [cfg80211] for 57441s reg_check_chans_work [cfg80211]
Mar 10 08:59:15 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=218 refcnt=219
Mar 10 08:59:15 ct523c-2103 kernel:     pending: gc_worker [nf_conntrack], hub_post_resume, 216*ioc_release_fn
Mar 10 08:59:15 ct523c-2103 kernel: workqueue rcu_gp: flags=0x108
Mar 10 08:59:15 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=1 refcnt=2
Mar 10 08:59:15 ct523c-2103 kernel:     pending: process_srcu
Mar 10 08:59:15 ct523c-2103 kernel: workqueue mm_percpu_wq: flags=0x8
Mar 10 08:59:15 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=2 refcnt=4
Mar 10 08:59:15 ct523c-2103 kernel:     pending: lru_add_drain_per_cpu BAR(236337), vmstat_update
Mar 10 08:59:15 ct523c-2103 kernel: workqueue cgroup_offline: flags=0x100
Mar 10 08:59:15 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=1 refcnt=107
Mar 10 08:59:15 ct523c-2103 kernel:     pending: css_killed_work_fn
Mar 10 08:59:15 ct523c-2103 kernel:     inactive: 105*css_killed_work_fn
Mar 10 08:59:15 ct523c-2103 kernel: workqueue cgroup_release: flags=0x100
Mar 10 08:59:15 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=1 refcnt=15
Mar 10 08:59:15 ct523c-2103 kernel:     pending: css_release_work_fn
Mar 10 08:59:15 ct523c-2103 kernel:     inactive: 13*css_release_work_fn
Mar 10 08:59:15 ct523c-2103 kernel: workqueue cgroup_bpf_destroy: flags=0x100
Mar 10 08:59:15 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=1 refcnt=54
Mar 10 08:59:15 ct523c-2103 kernel:     pending: cgroup_bpf_release
Mar 10 08:59:15 ct523c-2103 kernel:     inactive: 52*cgroup_bpf_release
Mar 10 08:59:15 ct523c-2103 kernel: workqueue ipv6_addrconf: flags=0x6000a
Mar 10 08:59:15 ct523c-2103 kernel:   pwq 32: cpus=0-7 flags=0x4 nice=0 active=1 refcnt=14
Mar 10 08:59:15 ct523c-2103 kernel:     in-flight: 202972:addrconf_dad_work for 57505s
Mar 10 08:59:15 ct523c-2103 kernel:     inactive: 4*addrconf_verify_work
Mar 10 08:59:15 ct523c-2103 kernel: pool 2: cpus=0 node=0 flags=0x0 nice=0 hung=0s workers=3 idle: 203067 243191
Mar 10 08:59:15 ct523c-2103 kernel: pool 14: cpus=3 node=0 flags=0x0 nice=0 hung=0s workers=3 idle: 709498 699449
Mar 10 08:59:15 ct523c-2103 kernel: pool 22: cpus=5 node=0 flags=0x0 nice=0 hung=57507s workers=3 idle: 166929 260518
Mar 10 08:59:15 ct523c-2103 kernel: pool 32: cpus=0-7 flags=0x4 nice=0 hung=0s workers=9 idle: 631693 712021 717023 671858
Mar 10 08:59:15 ct523c-2103 kernel: Showing backtraces of busy workers in stalled CPU-bound worker pools:
Mar 10 08:59:15 ct523c-2103 kernel: pool 22:
Mar 10 08:59:15 ct523c-2103 kernel: task:kworker/5:2     state:D stack:0     pid:271323 tgid:271323 ppid:2      task_flags:0x4208060 flags:0x00080000
Mar 10 08:59:15 ct523c-2103 kernel: Workqueue: events reg_todo [cfg80211]
Mar 10 08:59:15 ct523c-2103 kernel: Call Trace:
Mar 10 08:59:15 ct523c-2103 kernel:  <TASK>
Mar 10 08:59:15 ct523c-2103 kernel:  __schedule+0x106f/0x4340
Mar 10 08:59:15 ct523c-2103 kernel:  ? lock_acquire+0x155/0x2e0
Mar 10 08:59:15 ct523c-2103 kernel:  ? io_schedule_timeout+0x150/0x150
Mar 10 08:59:15 ct523c-2103 kernel:  ? __schedule+0x1865/0x4340
Mar 10 08:59:15 ct523c-2103 kernel:  preempt_schedule_notrace+0x4c/0x70
Mar 10 08:59:15 ct523c-2103 kernel:  preempt_schedule_notrace_thunk+0x16/0x30
Mar 10 08:59:15 ct523c-2103 kernel:  rcu_is_watching+0x59/0x70
Mar 10 08:59:15 ct523c-2103 kernel:  lock_acquire+0x291/0x2e0
Mar 10 08:59:15 ct523c-2103 kernel:  schedule+0x211/0x3a0
Mar 10 08:59:15 ct523c-2103 kernel:  ? schedule+0x1f2/0x3a0
Mar 10 08:59:15 ct523c-2103 kernel:  schedule_preempt_disabled+0x11/0x20
Mar 10 08:59:15 ct523c-2103 kernel:  __mutex_lock+0xd02/0x1d60
Mar 10 08:59:15 ct523c-2103 kernel:  ? reg_process_self_managed_hints+0x70/0x190 [cfg80211]
Mar 10 08:59:15 ct523c-2103 kernel:  ? ww_mutex_lock+0x160/0x160
Mar 10 08:59:15 ct523c-2103 kernel:  ? __mutex_unlock_slowpath+0x15d/0x770
Mar 10 08:59:15 ct523c-2103 kernel:  ? wait_for_completion_io_timeout+0x20/0x20
Mar 10 08:59:15 ct523c-2103 kernel:  ? reg_process_self_managed_hints+0x70/0x190 [cfg80211]
Mar 10 08:59:15 ct523c-2103 kernel:  reg_process_self_managed_hints+0x70/0x190 [cfg80211]
Mar 10 08:59:15 ct523c-2103 kernel:  reg_todo+0x52e/0x7c0 [cfg80211]
Mar 10 08:59:15 ct523c-2103 kernel:  ? lock_release+0xce/0x290
Mar 10 08:59:15 ct523c-2103 kernel:  process_one_work+0x88b/0x1820
Mar 10 08:59:15 ct523c-2103 kernel:  ? pwq_dec_nr_in_flight+0xe00/0xe00
Mar 10 08:59:15 ct523c-2103 kernel:  ? reg_process_hint+0x1480/0x1480 [cfg80211]
Mar 10 08:59:15 ct523c-2103 kernel:  worker_thread+0x5a1/0xfd0
Mar 10 08:59:15 ct523c-2103 kernel:  ? __kthread_parkme+0xc6/0x1f0
Mar 10 08:59:15 ct523c-2103 kernel:  ? rescuer_thread+0x1350/0x1350
Mar 10 08:59:15 ct523c-2103 kernel:  kthread+0x3b7/0x770
Mar 10 08:59:15 ct523c-2103 kernel:  ? kthread_is_per_cpu+0xb0/0xb0
Mar 10 08:59:15 ct523c-2103 kernel:  ? ret_from_fork+0x17/0x3a0
Mar 10 08:59:15 ct523c-2103 kernel:  ? lock_release+0xce/0x290
Mar 10 08:59:15 ct523c-2103 kernel:  ? kthread_is_per_cpu+0xb0/0xb0
Mar 10 08:59:15 ct523c-2103 kernel:  ret_from_fork+0x28b/0x3a0
Mar 10 08:59:15 ct523c-2103 kernel:  ? kthread_is_per_cpu+0xb0/0xb0
Mar 10 08:59:15 ct523c-2103 kernel:  ret_from_fork_asm+0x11/0x20
Mar 10 08:59:15 ct523c-2103 kernel:  </TASK>
Mar 10 08:59:46 ct523c-2103 kernel: BUG: workqueue lockup - pool cpus=5 node=0 flags=0x0 nice=0 stuck for 57537s!
Mar 10 08:59:46 ct523c-2103 kernel: Showing busy workqueues and worker pools:
Mar 10 08:59:46 ct523c-2103 kernel: workqueue events: flags=0x100
Mar 10 08:59:46 ct523c-2103 kernel:   pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=2 refcnt=3
Mar 10 08:59:46 ct523c-2103 kernel:     in-flight: 264128:disconnect_work [cfg80211] for 57533s disconnect_work [cfg80211]
Mar 10 08:59:46 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=9 refcnt=10
Mar 10 08:59:46 ct523c-2103 kernel:     in-flight: 271323:reg_todo [cfg80211] for 57537s
Mar 10 08:59:46 ct523c-2103 kernel:     pending: reg_todo [cfg80211], igb_watchdog_task [igb], output_poll_execute [drm_kms_helper], kernfs_notify_workfn, key_garbage_collector, 2*update_super_work, netstamp_clear
Mar 10 08:59:46 ct523c-2103 kernel: workqueue events_unbound: flags=0x2
Mar 10 08:59:46 ct523c-2103 kernel:   pwq 32: cpus=0-7 flags=0x4 nice=0 active=2 refcnt=3
Mar 10 08:59:46 ct523c-2103 kernel:     in-flight: 236337:cfg80211_wiphy_work [cfg80211] for 57537s cfg80211_wiphy_work [cfg80211]
Mar 10 08:59:46 ct523c-2103 kernel: workqueue events_unbound: flags=0x2
Mar 10 08:59:46 ct523c-2103 kernel:   pwq 32: cpus=0-7 flags=0x4 nice=0 active=1 refcnt=2
Mar 10 08:59:46 ct523c-2103 kernel:     in-flight: 142096:linkwatch_event for 57533s linkwatch_event
Mar 10 08:59:46 ct523c-2103 kernel:   pwq 32: cpus=0-7 flags=0x4 nice=0 active=3 refcnt=4
Mar 10 08:59:46 ct523c-2103 kernel:     in-flight: 218388:fsnotify_mark_destroy_workfn for 56026s fsnotify_mark_destroy_workfn BAR(1309) ,267638:fsnotify_connector_destroy_workfn for 56026s fsnotify_connector_destroy_workfn
Mar 10 08:59:46 ct523c-2103 kernel:   pwq 32: cpus=0-7 flags=0x4 nice=0 active=2 refcnt=4
Mar 10 08:59:46 ct523c-2103 kernel: workqueue events_freezable: flags=0x104
Mar 10 08:59:46 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=1 refcnt=2
Mar 10 08:59:46 ct523c-2103 kernel:     pending: pci_pme_list_scan
Mar 10 08:59:46 ct523c-2103 kernel: workqueue events_power_efficient: flags=0x180
Mar 10 08:59:46 ct523c-2103 kernel:   pwq 14: cpus=3 node=0 flags=0x0 nice=0 active=2 refcnt=3
Mar 10 08:59:46 ct523c-2103 kernel:     in-flight: 268226:reg_check_chans_work [cfg80211] for 57472s reg_check_chans_work [cfg80211]
Mar 10 08:59:46 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=218 refcnt=219
Mar 10 08:59:46 ct523c-2103 kernel:     pending: gc_worker [nf_conntrack], hub_post_resume, 216*ioc_release_fn
Mar 10 08:59:46 ct523c-2103 kernel: workqueue rcu_gp: flags=0x108
Mar 10 08:59:46 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=1 refcnt=2
Mar 10 08:59:46 ct523c-2103 kernel:     pending: process_srcu
Mar 10 08:59:46 ct523c-2103 kernel: workqueue mm_percpu_wq: flags=0x8
Mar 10 08:59:46 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=2 refcnt=4
Mar 10 08:59:46 ct523c-2103 kernel:     pending: lru_add_drain_per_cpu BAR(236337), vmstat_update
Mar 10 08:59:46 ct523c-2103 kernel: workqueue cgroup_offline: flags=0x100
Mar 10 08:59:46 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=1 refcnt=107
Mar 10 08:59:46 ct523c-2103 kernel:     pending: css_killed_work_fn
Mar 10 08:59:46 ct523c-2103 kernel:     inactive: 105*css_killed_work_fn
Mar 10 08:59:46 ct523c-2103 kernel: workqueue cgroup_release: flags=0x100
Mar 10 08:59:46 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=1 refcnt=15
Mar 10 08:59:46 ct523c-2103 kernel:     pending: css_release_work_fn
Mar 10 08:59:46 ct523c-2103 kernel:     inactive: 13*css_release_work_fn
Mar 10 08:59:46 ct523c-2103 kernel: workqueue cgroup_bpf_destroy: flags=0x100
Mar 10 08:59:46 ct523c-2103 kernel:   pwq 22: cpus=5 node=0 flags=0x0 nice=0 active=1 refcnt=54
Mar 10 08:59:46 ct523c-2103 kernel:     pending: cgroup_bpf_release
Mar 10 08:59:46 ct523c-2103 kernel:     inactive: 52*cgroup_bpf_release
Mar 10 08:59:46 ct523c-2103 kernel: workqueue ipv6_addrconf: flags=0x6000a
Mar 10 08:59:46 ct523c-2103 kernel:   pwq 32: cpus=0-7 flags=0x4 nice=0 active=1 refcnt=14
Mar 10 08:59:46 ct523c-2103 kernel:     in-flight: 202972:addrconf_dad_work for 57536s
Mar 10 08:59:46 ct523c-2103 kernel:     inactive: 4*addrconf_verify_work
Mar 10 08:59:46 ct523c-2103 kernel: pool 2: cpus=0 node=0 flags=0x0 nice=0 hung=0s workers=3 idle: 203067 243191
Mar 10 08:59:46 ct523c-2103 kernel: pool 14: cpus=3 node=0 flags=0x0 nice=0 hung=0s workers=3 idle: 709498 699449
Mar 10 08:59:46 ct523c-2103 kernel: pool 22: cpus=5 node=0 flags=0x0 nice=0 hung=57537s workers=3 idle: 166929 260518
Mar 10 08:59:46 ct523c-2103 kernel: pool 32: cpus=0-7 flags=0x4 nice=0 hung=0s workers=9 idle: 631693 712021 717023 671858
Mar 10 08:59:46 ct523c-2103 kernel: Showing backtraces of busy workers in stalled CPU-bound worker pools:
Mar 10 08:59:46 ct523c-2103 kernel: pool 22:
Mar 10 08:59:46 ct523c-2103 kernel: task:kworker/5:2     state:D stack:0     pid:271323 tgid:271323 ppid:2      task_flags:0x4208060 flags:0x00080000
Mar 10 08:59:46 ct523c-2103 kernel: Workqueue: events reg_todo [cfg80211]
Mar 10 08:59:46 ct523c-2103 kernel: Call Trace:
Mar 10 08:59:46 ct523c-2103 kernel:  <TASK>
Mar 10 08:59:46 ct523c-2103 kernel:  __schedule+0x106f/0x4340
Mar 10 08:59:46 ct523c-2103 kernel:  ? lock_acquire+0x155/0x2e0
Mar 10 08:59:46 ct523c-2103 kernel:  ? io_schedule_timeout+0x150/0x150
Mar 10 08:59:46 ct523c-2103 kernel:  ? __schedule+0x1865/0x4340
Mar 10 08:59:46 ct523c-2103 kernel:  preempt_schedule_notrace+0x4c/0x70
Mar 10 08:59:46 ct523c-2103 kernel:  preempt_schedule_notrace_thunk+0x16/0x30
Mar 10 08:59:46 ct523c-2103 kernel:  rcu_is_watching+0x59/0x70
Mar 10 08:59:46 ct523c-2103 kernel:  lock_acquire+0x291/0x2e0
Mar 10 08:59:46 ct523c-2103 kernel:  schedule+0x211/0x3a0
Mar 10 08:59:46 ct523c-2103 kernel:  ? schedule+0x1f2/0x3a0
Mar 10 08:59:46 ct523c-2103 kernel:  schedule_preempt_disabled+0x11/0x20
Mar 10 08:59:46 ct523c-2103 kernel:  __mutex_lock+0xd02/0x1d60
Mar 10 08:59:46 ct523c-2103 kernel:  ? reg_process_self_managed_hints+0x70/0x190 [cfg80211]
Mar 10 08:59:46 ct523c-2103 kernel:  ? ww_mutex_lock+0x160/0x160
Mar 10 08:59:46 ct523c-2103 kernel:  ? __mutex_unlock_slowpath+0x15d/0x770
Mar 10 08:59:46 ct523c-2103 kernel:  ? wait_for_completion_io_timeout+0x20/0x20
Mar 10 08:59:46 ct523c-2103 kernel:  ? reg_process_self_managed_hints+0x70/0x190 [cfg80211]
Mar 10 08:59:46 ct523c-2103 kernel:  reg_process_self_managed_hints+0x70/0x190 [cfg80211]
Mar 10 08:59:46 ct523c-2103 kernel:  reg_todo+0x52e/0x7c0 [cfg80211]
Mar 10 08:59:46 ct523c-2103 kernel:  ? lock_release+0xce/0x290
Mar 10 08:59:46 ct523c-2103 kernel:  process_one_work+0x88b/0x1820
Mar 10 08:59:46 ct523c-2103 kernel:  ? pwq_dec_nr_in_flight+0xe00/0xe00
Mar 10 08:59:46 ct523c-2103 kernel:  ? reg_process_hint+0x1480/0x1480 [cfg80211]
Mar 10 08:59:46 ct523c-2103 kernel:  worker_thread+0x5a1/0xfd0
Mar 10 08:59:46 ct523c-2103 kernel:  ? __kthread_parkme+0xc6/0x1f0
Mar 10 08:59:46 ct523c-2103 kernel:  ? rescuer_thread+0x1350/0x1350
Mar 10 08:59:46 ct523c-2103 kernel:  kthread+0x3b7/0x770
Mar 10 08:59:46 ct523c-2103 kernel:  ? kthread_is_per_cpu+0xb0/0xb0
Mar 10 08:59:46 ct523c-2103 kernel:  ? ret_from_fork+0x17/0x3a0
Mar 10 08:59:46 ct523c-2103 kernel:  ? lock_release+0xce/0x290
Mar 10 08:59:46 ct523c-2103 kernel:  ? kthread_is_per_cpu+0xb0/0xb0
Mar 10 08:59:46 ct523c-2103 kernel:  ret_from_fork+0x28b/0x3a0
Mar 10 08:59:46 ct523c-2103 kernel:  ? kthread_is_per_cpu+0xb0/0xb0
Mar 10 08:59:46 ct523c-2103 kernel:  ret_from_fork_asm+0x11/0x20
Mar 10 08:59:46 ct523c-2103 kernel:  </TASK>
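
For anyone skimming dumps like the two above, here is a rough sketch of a
script that pulls out the hung pools and the in-flight work items. It is not
part of any kernel tooling: the names parse_dump, PoolState and InFlight are
made up for illustration, and the regular expressions assume the exact line
format shown in this log (including the "for <N>s" suffix on in-flight items).

```python
import re
from typing import List, NamedTuple, Tuple


class PoolState(NamedTuple):
    pool_id: int
    cpus: str
    hung_seconds: int


class InFlight(NamedTuple):
    pid: int
    function: str
    seconds: int


# "pool 22: cpus=5 node=0 flags=0x0 nice=0 hung=57507s workers=3 ..."
POOL_RE = re.compile(r"pool (\d+): cpus=(\S+) .*?hung=(\d+)s")
# "271323:reg_todo [cfg80211] for 57507s" (the module name is optional)
INFLIGHT_RE = re.compile(r"(\d+):(\w+)(?: \[\w+\])? for (\d+)s")

# Lines copied from the dump above, used only as sample input.
SAMPLE = """\
Mar 10 08:59:15 ct523c-2103 kernel:     in-flight: 271323:reg_todo [cfg80211] for 57507s
Mar 10 08:59:15 ct523c-2103 kernel:     in-flight: 218388:fsnotify_mark_destroy_workfn for 55995s fsnotify_mark_destroy_workfn BAR(1309) ,267638:fsnotify_connector_destroy_workfn for 55995s fsnotify_connector_destroy_workfn
Mar 10 08:59:15 ct523c-2103 kernel: pool 14: cpus=3 node=0 flags=0x0 nice=0 hung=0s workers=3 idle: 709498 699449
Mar 10 08:59:15 ct523c-2103 kernel: pool 22: cpus=5 node=0 flags=0x0 nice=0 hung=57507s workers=3 idle: 166929 260518
"""


def parse_dump(text: str) -> Tuple[List[PoolState], List[InFlight]]:
    """Collect per-pool hang times and in-flight work items from a dump."""
    pools: List[PoolState] = []
    in_flight: List[InFlight] = []
    for line in text.splitlines():
        pool = POOL_RE.search(line)
        if pool:
            pools.append(
                PoolState(int(pool.group(1)), pool.group(2), int(pool.group(3)))
            )
            continue
        if "in-flight:" in line:
            # Only scan the part after the marker so the syslog timestamp
            # is never mistaken for a "pid:function" pair.
            tail = line.split("in-flight:", 1)[1]
            for pid, function, seconds in INFLIGHT_RE.findall(tail):
                in_flight.append(InFlight(int(pid), function, int(seconds)))
    return pools, in_flight


if __name__ == "__main__":
    pools, work = parse_dump(SAMPLE)
    for p in pools:
        if p.hung_seconds > 0:
            print(f"pool {p.pool_id} (cpus={p.cpus}) hung for {p.hung_seconds}s")
    for w in sorted(work, key=lambda item: item.seconds, reverse=True):
        print(f"pid {w.pid}: {w.function} in flight for {w.seconds}s")
```

To run it against a full dump, the text could be fed in from a saved dmesg or
journal file in place of SAMPLE.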


>> If I were to attempt to use AI on the coredump, would echoing 'c' to
>> /proc/sysrq-trigger with kdump enabled (when deadlock is happening) be
>> the appropriate action to grab the core file?
> 
> Yes, that's right, but you need to set up kdump first. The quickest way
> depends on your distro:
> 
>   - Fedora/RHEL: dnf install kexec-tools, then kdumpctl reset-crashkernel,
>     systemctl enable --now kdump
>   - Ubuntu/Debian: apt install kdump-tools (say Yes to enable), reboot
>   - Arch: Install kexec-tools, add crashkernel=512M to your kernel
>     cmdline, create a kdump.service that runs
>     kexec -p /boot/vmlinuz-linux --initrd=/boot/initramfs-linux.img \
>       --append="root=<your-root> irqpoll nr_cpus=1 reset_devices"
> 
> After reboot, verify with: cat /sys/kernel/kexec_crash_size (should be
> non-zero). Then when the deadlock happens:
> 
>    echo c > /proc/sysrq-trigger
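
The verification step quoted above (a non-zero value in
/sys/kernel/kexec_crash_size) could be scripted roughly as follows. This is
only a sketch: the function name crash_kernel_reserved is made up for
illustration, and it assumes the sysfs file holds a single decimal byte count.

```python
from pathlib import Path


def crash_kernel_reserved(path: str = "/sys/kernel/kexec_crash_size") -> int:
    """Return the reserved crash-kernel size in bytes, or 0 if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # A missing file or unexpected content is treated as "nothing reserved".
        return 0


if __name__ == "__main__":
    size = crash_kernel_reserved()
    if size > 0:
        print(f"crash kernel memory reserved: {size} bytes")
    else:
        print("no crash kernel memory reserved; kdump will not capture a vmcore")
```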

I have kdump enabled already, and I could create a vmcore this way.
I have never used drgn or Claude.  Based on the logs above, do
you still think it would be helpful to try drgn?  If so, can you please
suggest some commands or approaches specific to this particular bug?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



