From: Ben Greear <greearb@candelatech.com>
To: linux-wireless <linux-wireless@vger.kernel.org>
Cc: "Korenblit, Miriam Rachel" <miriam.rachel.korenblit@intel.com>,
	linux-mm@kvack.org
Subject: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
Date: Mon, 23 Feb 2026 14:36:59 -0800
Message-ID: <fa4e82ee-eb14-3930-c76c-f3bd59c5f258@candelatech.com>

Hello,

I hit a deadlock in which a CMA memory allocation tries to flush work
while holding a wifi-related mutex, at the same time that a work-queue is trying to process
a wifi regdomain work item.  I really don't see any good way to fix this.
It would seem that any code holding a mutex that could block a work-queue
cannot safely allocate CMA memory?  Hopefully someone else has a better idea.

For whatever reason, my hacked-up kernel prints the sysrq process stack traces I need
to understand this, and my stable 6.18.13 does not.  But the held locks match in both cases, so this is
almost certainly the same problem.  I can reproduce it on both un-modified stable
and my own kernel.  The details below are from my modified 6.18.9+ kernel.

I only hit this (reliably?) with a KASAN-enabled kernel, likely because KASAN makes things slow enough to
hit the problem and/or causes CMA allocations to happen in a different manner.

The general way to reproduce is to have a large number of Intel BE200 radios in a system, and to bring them
admin up and down.


## From 6.18.13 (un-modified)

40479 Feb 23 14:13:31 ct523c-de7c kernel: 5 locks held by kworker/u32:11/34989:
40480 Feb 23 14:13:31 ct523c-de7c kernel:  #0: ffff888120161148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0xf7a/0x17b0
40481 Feb 23 14:13:31 ct523c-de7c kernel:  #1: ffff8881a561fd20 ((work_completion)(&rdev->wiphy_work)){+.+.}-{0:0}, at: process_one_work+0x7ca/0x17b0
40482 Feb 23 14:13:31 ct523c-de7c kernel:  #2: ffff88815e618788 (&rdev->wiphy.mtx){+.+.}-{4:4}, at: cfg80211_wiphy_work+0x5c/0x570 [cfg80211]
40483 Feb 23 14:13:31 ct523c-de7c kernel:  #3: ffffffff87232e60 (&cma->alloc_mutex){+.+.}-{4:4}, at: __cma_alloc+0x3c5/0xd20
40484 Feb 23 14:13:31 ct523c-de7c kernel:  #4: ffffffff8534f668 (lock#5){+.+.}-{4:4}, at: __lru_add_drain_all+0x5f/0x530

40488 Feb 23 14:13:31 ct523c-de7c kernel: 4 locks held by kworker/1:0/39480:
40489 Feb 23 14:13:31 ct523c-de7c kernel:  #0: ffff88812006b148 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0xf7a/0x17b0
40490 Feb 23 14:13:31 ct523c-de7c kernel:  #1: ffff88814087fd20 (reg_work){+.+.}-{0:0}, at: process_one_work+0x7ca/0x17b0
40491 Feb 23 14:13:31 ct523c-de7c kernel:  #2: ffffffff85970028 (rtnl_mutex){+.+.}-{4:4}, at: reg_todo+0x18/0x770 [cfg80211]
40492 Feb 23 14:13:31 ct523c-de7c kernel:  #3: ffff88815e618788 (&rdev->wiphy.mtx){+.+.}-{4:4}, at: reg_process_self_managed_hints+0x70/0x190 [cfg80211]


## Rest of this is from my 6.18.9+ hacks kernel.

### The thread trying to allocate CMA is blocked here, trying to flush work.

Type "apropos word" to search for commands related to "word"...
Reading symbols from vmlinux...
(gdb) l *(alloc_contig_range_noprof+0x1de)
0xffffffff8162453e is in alloc_contig_range_noprof (/home2/greearb/git/linux-6.18.dev.y/mm/page_alloc.c:6798).
6793			.reason = MR_CONTIG_RANGE,
6794		};
6795	
6796		lru_cache_disable();
6797	
6798		while (pfn < end || !list_empty(&cc->migratepages)) {
6799			if (fatal_signal_pending(current)) {
6800				ret = -EINTR;
6801				break;
6802			}
(gdb) l *(__lru_add_drain_all+0x19b)
0xffffffff815ae44b is in __lru_add_drain_all (/home2/greearb/git/linux-6.18.dev.y/mm/swap.c:884).
879				queue_work_on(cpu, mm_percpu_wq, work);
880				__cpumask_set_cpu(cpu, &has_work);
881			}
882		}
883	
884		for_each_cpu(cpu, &has_work)
885			flush_work(&per_cpu(lru_add_drain_work, cpu));
886	
887	done:
888		mutex_unlock(&lock);
(gdb)


#### And the other thread is trying to process a regdom request, and trying to use RCU and rtnl???

Type "apropos word" to search for commands related to "word"...
Reading symbols from net/wireless/cfg80211.ko...
(gdb) l *(reg_todo+0x18)
0xe238 is in reg_todo (/home2/greearb/git/linux-6.18.dev.y/net/wireless/reg.c:3107).
3102	 */
3103	static void reg_process_pending_hints(void)
3104	{
3105		struct regulatory_request *reg_request, *lr;
3106	
3107		lr = get_last_request();
3108	
3109		/* When last_request->processed becomes true this will be rescheduled */
3110		if (lr && !lr->processed) {
3111			pr_debug("Pending regulatory request, waiting for it to be processed...\n");
(gdb)

static struct regulatory_request *get_last_request(void)
{
	return rcu_dereference_rtnl(last_request);
}


task:kworker/6:0     state:D stack:0     pid:56    tgid:56    ppid:2      task_flags:0x4208060 flags:0x00080000
Workqueue: events reg_todo [cfg80211]
Call Trace:
  <TASK>
  __schedule+0x526/0x1290
  preempt_schedule_notrace+0x35/0x50
  preempt_schedule_notrace_thunk+0x16/0x30
  rcu_is_watching+0x2a/0x30
  lock_acquire+0x26d/0x2c0
  schedule+0xac/0x120
  ? schedule+0x8d/0x120
  schedule_preempt_disabled+0x11/0x20
  __mutex_lock+0x726/0x1070
  ? reg_todo+0x18/0x2b0 [cfg80211]
  ? reg_todo+0x18/0x2b0 [cfg80211]
  reg_todo+0x18/0x2b0 [cfg80211]
  process_one_work+0x221/0x6d0
  worker_thread+0x1e5/0x3b0
  ? rescuer_thread+0x450/0x450
  kthread+0x108/0x220
  ? kthreads_online_cpu+0x110/0x110
  ret_from_fork+0x1c6/0x220
  ? kthreads_online_cpu+0x110/0x110
  ret_from_fork_asm+0x11/0x20
  </TASK>

task:ip              state:D stack:0     pid:72857 tgid:72857 ppid:72843  task_flags:0x400100 flags:0x00080001
Call Trace:
  <TASK>
  __schedule+0x526/0x1290
  ? schedule+0x8d/0x120
  ? schedule+0xe2/0x120
  schedule+0x36/0x120
  schedule_timeout+0xf9/0x110
  ? mark_held_locks+0x40/0x70
  __wait_for_common+0xbe/0x1e0
  ? hrtimer_nanosleep_restart+0x120/0x120
  ? __flush_work+0x20b/0x530
  __flush_work+0x34e/0x530
  ? flush_workqueue_prep_pwqs+0x160/0x160
  ? bpf_prog_test_run_tracing+0x160/0x2d0
  __lru_add_drain_all+0x19b/0x220
  alloc_contig_range_noprof+0x1de/0x8a0
  __cma_alloc+0x1f1/0x6a0
  __dma_direct_alloc_pages.isra.0+0xcb/0x2f0
  dma_direct_alloc+0x7b/0x250
  dma_alloc_attrs+0xa1/0x2a0
  _iwl_pcie_ctxt_info_dma_alloc_coherent+0x31/0xb0 [iwlwifi]
  iwl_pcie_ctxt_info_alloc_dma+0x20/0x50 [iwlwifi]
  iwl_pcie_init_fw_sec+0x2fc/0x380 [iwlwifi]
  iwl_pcie_ctxt_info_v2_alloc+0x19e/0x530 [iwlwifi]
  iwl_trans_pcie_gen2_start_fw+0x2e2/0x820 [iwlwifi]
  ? lock_is_held_type+0x92/0x100
  iwl_trans_start_fw+0x77/0x90 [iwlwifi]
  iwl_mld_load_fw_wait_alive+0x97/0x2c0 [iwlmld]
  ? iwl_mld_mac80211_sta_state+0x780/0x780 [iwlmld]
  ? lock_is_held_type+0x92/0x100
  iwl_mld_load_fw+0x91/0x240 [iwlmld]
  ? ieee80211_open+0x3d/0xe0 [mac80211]
  ? lock_is_held_type+0x92/0x100
  iwl_mld_start_fw+0x44/0x470 [iwlmld]
  iwl_mld_mac80211_start+0x3d/0x1b0 [iwlmld]
  drv_start+0x6f/0x1d0 [mac80211]
  ieee80211_do_open+0x2d6/0x960 [mac80211]
  ieee80211_open+0x62/0xe0 [mac80211]
  __dev_open+0x11a/0x2e0
  __dev_change_flags+0x1f8/0x280
  netif_change_flags+0x22/0x60
  do_setlink.isra.0+0xe57/0x11a0
  ? __mutex_lock+0xb0/0x1070
  ? __mutex_lock+0x99e/0x1070
  ? __nla_validate_parse+0x5e/0xcd0
  ? rtnl_newlink+0x355/0xb50
  ? cap_capable+0x90/0x100
  ? security_capable+0x72/0x80
  rtnl_newlink+0x7e8/0xb50
  ? __lock_acquire+0x436/0x2190
  ? lock_acquire+0xc2/0x2c0
  ? rtnetlink_rcv_msg+0x97/0x660
  ? find_held_lock+0x2b/0x80
  ? do_setlink.isra.0+0x11a0/0x11a0
  ? rtnetlink_rcv_msg+0x3ea/0x660
  ? lock_release+0xcc/0x290
  ? do_setlink.isra.0+0x11a0/0x11a0
  rtnetlink_rcv_msg+0x409/0x660
  ? rtnl_fdb_dump+0x240/0x240
  netlink_rcv_skb+0x56/0x100
  netlink_unicast+0x1e1/0x2d0
  netlink_sendmsg+0x219/0x460
  __sock_sendmsg+0x38/0x70
  ____sys_sendmsg+0x214/0x280
  ? import_iovec+0x2c/0x30
  ? copy_msghdr_from_user+0x6c/0xa0
  ___sys_sendmsg+0x85/0xd0
  ? __lock_acquire+0x436/0x2190
  ? find_held_lock+0x2b/0x80
  ? lock_acquire+0xc2/0x2c0
  ? mntput_no_expire+0x43/0x460
  ? find_held_lock+0x2b/0x80
  ? mntput_no_expire+0x8c/0x460
  __sys_sendmsg+0x6b/0xc0
  do_syscall_64+0x6b/0x11b0
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



