From: Hillf Danton <hdanton@sina.com>
To: Johannes Berg <johannes@sipsolutions.net>
Cc: Ben Greear <greearb@candelatech.com>,
linux-wireless <linux-wireless@vger.kernel.org>,
"Korenblit, Miriam Rachel" <miriam.rachel.korenblit@intel.com>,
linux-mm@kvack.org, Tejun Heo <tj@kernel.org>,
linux-kernel@vger.kernel.org
Subject: Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
Date: Wed, 4 Mar 2026 11:08:34 +0800
Message-ID: <20260304030835.610-1-hdanton@sina.com>
In-Reply-To: <35779061f94c2a55bb58dcd619ae91c618509cf4.camel@sipsolutions.net>
On Tue, 03 Mar 2026 12:49:24 +0100 Johannes Berg wrote:
>On Mon, 2026-03-02 at 07:50 -0800, Ben Greear wrote:
>> On 3/2/26 07:38, Johannes Berg wrote:
>> > On Mon, 2026-03-02 at 07:26 -0800, Ben Greear wrote:
>> > > >
>> > > > Was this with lockdep? If so, did it complain about anything?
>> > > >
>> > > > I'm having a hard time seeing why it would deadlock at all when wifi
>> > > > uses schedule_work() and therefore the system_percpu_wq, and
>> > > > __lru_add_drain_all() flushes lru_add_drain_work on mm_percpu_wq, and
>> > > > lru_add_and_bh_lrus_drain() doesn't really _seem_ to do anything related
>> > > > to RTNL etc.?
>> > > >
>> > > > I think we need a real explanation here rather than "if I randomly
>> > > > change this, it no longer appears".
>> > >
>> > > The path where iwlwifi acquires CMA holds rtnl and/or wiphy locks before
>> > > allocating CMA memory, as expected.
>> > >
>> > > And the CMA allocation path attempts to flush the work queues in
>> > > at least some cases.
>> > >
>> > > If there is a work item queued that is trying to grab rtnl and/or wiphy lock
>> > > when CMA attempts to flush, then the flush work cannot complete, so it deadlocks.
>> > >
>> > > Lockdep doesn't warn about this.
>> >
>> > It really should, in cases where it can actually happen, I wrote the
>> > code myself for that... Though things have changed since, and the checks
>> > were lost at least once (and re-added), so I suppose it's possible that
>> > they were lost _again_, but the flushing system is far more flexible now
>> > and it's not flushing the same workqueue anyway, so it shouldn't happen.
>> >
>> > I stand by what I said before, need to show more precisely what depends
>> > on what, and I'm not going to accept a random kthread into this.
>>
>> My first email on the topic has process stack traces as well as lockdep
>> locks-held printout that points to the deadlock. I'm not sure what else to offer...please let me know
>> what you'd like to see.
>
> Fair. I don't know, I don't think there's anything that even shows that
> there's a dependency between the two workqueues and the
> "((wq_completion)events_unbound)" and "((wq_completion)events)", and
> there would have to be for it to deadlock this way because of that?
>
Given the locks held [1],
kworker/1:0/39480 kworker/u32:11/34989
rtnl_mutex
&rdev->wiphy.mtx
__lru_add_drain_all
flush_work(&per_cpu(lru_add_drain_work, cpu))
&rdev->wiphy.mtx
__if__ there is a work item queued __before__ one of the flush targets on
the workqueue, and that item tries to acquire the rtnl mutex, then no
deadlock can arise: worker-xyz goes off CPU after failing to take the rtnl
lock, then worker-xyz+1 dequeues the flush target and completes it, because
the flush target has nothing to do with rtnl. The same applies to the wiphy
lock.
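The argument above can be illustrated with a userspace analogy. This is only a minimal sketch, not kernel code: a two-thread Python pool stands in for the worker pool, a plain lock stands in for rtnl_mutex, and the names run_flush_demo, rtnl_work and drain_work are made up for the illustration. It assumes a second worker is available to pick up the flush target once the first one blocks.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-in for rtnl_mutex (assumption: any plain mutex shows the point).
rtnl = threading.Lock()


def rtnl_work(done):
    # Work item queued *before* the flush target; it needs the lock
    # that the flusher is holding, so its worker goes to sleep.
    with rtnl:
        done.append("rtnl_work")


def drain_work(done):
    # Flush target: does not touch the lock at all.
    done.append("drain_work")


def run_flush_demo():
    done = []
    pool = ThreadPoolExecutor(max_workers=2)  # two workers available
    rtnl.acquire()                            # the flusher holds the lock
    first = pool.submit(rtnl_work, done)      # queued first, will block
    target = pool.submit(drain_work, done)    # the flush target
    target.result(timeout=5)                  # the "flush" still completes
    first_still_blocked = not first.done()
    rtnl.release()                            # now the blocked item can run
    first.result(timeout=5)
    pool.shutdown()
    return done, first_still_blocked


if __name__ == "__main__":
    print(run_flush_demo())
```

With max_workers=1 the same sequence would time out in target.result(), which is the userspace equivalent of having no other worker free to run the flush target.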
BTW, is there any chance that work which acquires the rtnl lock gets queued
on mm_percpu_wq?
[1] Subject: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
https://lore.kernel.org/linux-wireless/fa4e82ee-eb14-3930-c76c-f3bd59c5f258@candelatech.com/