From: Mauricio Faria de Oliveira <mfo@igalia.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Daniel Lezcano <daniel.lezcano@kernel.org>,
Zhang Rui <rui.zhang@intel.com>,
Lukasz Luba <lukasz.luba@arm.com>,
linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
kernel-dev@igalia.com,
syzbot+3b3852c6031d0f30dfaf@syzkaller.appspotmail.com
Subject: Re: [PATCH] thermal: core: fix use-after-free due to init/cancel delayed_work race
Date: Wed, 25 Mar 2026 11:17:46 -0300 [thread overview]
Message-ID: <772a77c80b6ad216dec4cc10d3fbb133@igalia.com> (raw)
In-Reply-To: <CAJZ5v0iBh6BjtwhS7RK_GE8QmYLAwW71P+YpA92SiGu4wy+syw@mail.gmail.com>
Thanks for looking into this.
On 2026-03-25 09:47, Rafael J. Wysocki wrote:
> On Wed, Mar 25, 2026 at 1:10 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>>
>> On Wed, Mar 25, 2026 at 12:51 AM Mauricio Faria de Oliveira
>> <mfo@igalia.com> wrote:
>> >
>> > If INIT_DELAYED_WORK() is called for a currently running work item,
>> > cancel_delayed_work_sync() is unable to cancel/wait for it anymore,
>> > as the work item's data bits required for that are cleared.
>> >
>> > In the resume path, INIT_DELAYED_WORK() is called twice:
>> > 1) to replace the work function: thermal_zone_device_check/resume()
>> > 2) to restore it.
>> >
>> > Both cases might race with the unregister path and bypass the call to
>> > cancel_delayed_work_sync(),
>>
>> So this is the problem, isn't it?
>>
>> > after which struct thermal_zone_device *tz
>> > is freed, and the non-canceled/non-waited for work hits use-after-free.
>>
>> Which basically means that a TZ_STATE_FLAG_EXIT check is missing in
>> both thermal_zone_pm_complete() and thermal_zone_device_resume().
>
> Actually, thermal_zone_pm_complete() runs under thermal_list_lock and
> thermal_zone_device_unregister() removes the zone from
> thermal_tz_list, also under thermal_list_lock, before calling
> cancel_delayed_work_sync().
>
> So either thermal_zone_device_unregister() removes the zone from the
> list before thermal_zone_pm_complete() can run, in which case the
> latter won't run for the given zone at all because that zone is not
> there in thermal_tz_list, or the cancel_delayed_work_sync() will see
> the work item queued up by thermal_zone_pm_complete().
(I think this reply clarifies your previous points too; or tell me.)
Regarding the latter: yes, cancel_delayed_work_sync() will see the
work item queued up by thermal_zone_pm_complete(), but the issue is
it will not see the _previously queued up (and running)_ work item,
due to the INIT_DELAYED_WORK() in thermal_zone_pm_complete().
That work item, thermal_zone_device_check() might be past a check
for TZ_STATE_FLAG_EXIT by the time it is set. Please read on.
> So where's the race between thermal_zone_pm_complete() and
> thermal_zone_device_unregister()?
I'll append a more detailed view below, but the summary is:
thermal_pm_notify_complete() can acquire thermal_list_lock before
thermal_zone_device_unregister() -> thermal_zone_exit() does,
thus proceed to thermal_zone_pm_complete() which calls
INIT_DELAYED_WORK().
This makes future calls to cancel_delayed_work_sync() to wait only
for the newly initialized work function, not a previous one, which
might be running -- in this case, thermal_zone_device_check().
Then thermal_zone_pm_complete() and thermal_pm_notify_complete()
return, releasing thermal_list_lock.
Now, thermal_zone_device_unregister() -> thermal_zone_exit()
can set finally set TZ_STATE_FLAG_EXIT, but it might be too late.
The previously queued, currently running, work item might be past
any check for it (say, it acquired the lock to read it earlier,
or before thermal_zone_device_unregister() was even called).
So, thermal_zone_device_unregister() reaches cancel_delayed_work_sync(),
which waits for thermal_zone_device_resume() (new work function),
but _not_ for thermal_zone_device_check() (old work function).
That finishes, tz is freed, and now the old work funtion continues
to run and accesses tz -- hitting a use-after-free.
> I can see the one between thermal_zone_device_unregister() and
> thermal_zone_device_resume(), but that can be addressed by adding a
> TZ_STATE_FLAG_EXIT check to the latter AFAICS.
In the example describe above and detailed below, apparently that
is not sufficient, if I'm not missing anything. See, if _resume()
is reached with thermal_list_lock held, thermal_zone_device_exit()
is waiting for thermal_list_lock before setting TZ_STATE_FLAG_EXIT,
thus a check for it in _resume() would find it clear yet.
Hope this helps. Please let me know your thoughts. Thanks!
Detailed view)
Thread A:
...
WORK: tz->poll_queue
thermal_zone_device_check() // started running before
TZ_STATE_FLAG_EXIT could be set
... // wait a bit
Thread B:
thermal_pm_notify()
thermal_pm_notify_complete()
guard(mutex)(&thermal_list_lock) // acquires the lock
...
Thread C:
thermal_zone_device_unregister()
thermal_zone_exit()
guard(mutex)(&thermal_list_lock) // blocks waiting for the lock
...
tz->state |= TZ_STATE_FLAG_EXIT // not run yet
list_del_init(&tz->node) // not run yet
Thread B: // continues with tz in thermal_tz_list and without
TZ_STATE_FLAG_EXIT
...
list_for_each_entry(tz, &thermal_tz_list, node)
thermal_zone_pm_complete(tz)
cancel_delayed_work(&tz->poll_queue) // does not wait for
thermal_zone_device_check() to finish
INIT_DELAYED_WORK(&tz->poll_queue, thermal_zone_device_resume)
...
returns, releases the thermal_list_lock.
From now on, cancel_delayed_work_sync(&tz->poll_queue)
will _not_ wait for thermal_zone_device_check() anymore
(note it _is_ running in Thread A)
it will only wait for thermal_zone_device_resume().
Thread C:
tz->state |= TZ_STATE_FLAG_EXIT
list_del_init(&tz->node)
These are now set, but no longer effective for the already running
thermal_zone_device_check().
cancel_delayed_work_sync(&tz->poll_queue) // as mentioned, waits
for the other function.
kfree(tz)
Thread A: // not waited for
...
thermal_zone_device_update()
// access to tz (freed)
--
Mauricio
next prev parent reply other threads:[~2026-03-25 14:18 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-24 23:50 [PATCH] thermal: core: fix use-after-free due to init/cancel delayed_work race Mauricio Faria de Oliveira
2026-03-25 12:10 ` Rafael J. Wysocki
2026-03-25 12:47 ` Rafael J. Wysocki
2026-03-25 14:17 ` Mauricio Faria de Oliveira [this message]
2026-03-25 14:28 ` Mauricio Faria de Oliveira
2026-03-25 15:13 ` Mauricio Faria de Oliveira
2026-03-25 16:24 ` Rafael J. Wysocki
2026-03-25 19:22 ` Mauricio Faria de Oliveira
2026-03-25 19:29 ` Rafael J. Wysocki
2026-03-26 17:41 ` Mauricio Faria de Oliveira
2026-03-25 20:20 ` Rafael J. Wysocki
2026-03-26 17:45 ` Mauricio Faria de Oliveira
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=772a77c80b6ad216dec4cc10d3fbb133@igalia.com \
--to=mfo@igalia.com \
--cc=daniel.lezcano@kernel.org \
--cc=kernel-dev@igalia.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=lukasz.luba@arm.com \
--cc=rafael@kernel.org \
--cc=rui.zhang@intel.com \
--cc=syzbot+3b3852c6031d0f30dfaf@syzkaller.appspotmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox