From: Qiliang Yuan <realwujing@gmail.com>
To: dianders@chromium.org
Cc: akpm@linux-foundation.org, lihuafei1@huawei.com,
linux-kernel@vger.kernel.org, mingo@kernel.org,
mm-commits@vger.kernel.org, realwujing@gmail.com,
song@kernel.org, stable@vger.kernel.org, sunshx@chinatelecom.cn,
thorsten.blum@linux.dev, wangjinchao600@gmail.com,
yangyicong@hisilicon.com, yuanql9@chinatelecom.cn,
zhangjn11@chinatelecom.cn
Subject: Re: [PATCH v3] watchdog/hardlockup: Fix UAF in perf event cleanup due to migration race
Date: Sun, 25 Jan 2026 22:30:12 -0500
Message-ID: <20260126033012.934143-1-realwujing@gmail.com>
In-Reply-To: <CAD=FV=Vmk1jA+dAgJNVDMtxrhhrPxgnXkNxiqJXWBvgUcZZUxQ@mail.gmail.com>
Hi Doug,
Thanks for your further questions and for digging into the 4.19 vs ToT
differences.
On Sat, 24 Jan 2026 15:36:01 Doug Anderson <dianders@chromium.org> wrote:
> The part that doesn't make a lot of sense to me, though, is that v4.19
> also doesn't have commit 930d8f8dbab9 ("watchdog/perf: adapt the
> watchdog_perf interface for async model"), which is where we are
> saying the problem was introduced.
>
> ...so in v4.19 I think:
> * hardlockup_detector_perf_init() is only called from watchdog_nmi_probe()
> * watchdog_nmi_probe() is only called from lockup_detector_init()
> * lockup_detector_init() is only called from kernel_init_freeable()
> right before smp_init()
>
> Thus I'm super confused about how you could have seen the problem on
> v4.19. Maybe your v4.19 kernel has some backported patches that makes
> this possible?
Good catch. The difference comes down to where `lockup_detector_init()` runs
relative to `smp_init()`:
1. Mainline (ToT):
   - `lockup_detector_init()` is always called before `smp_init()`
     (pre-SMP phase).
   - Risk source: The asynchronous retry path (`lockup_detector_delay_init`)
     introduced by 930d8f8dbab9, which runs in a workqueue (post-SMP)
     context and triggers the UAF.

2. openEuler (4.19/5.10):
   - Local `euler inclusion` patches moved `lockup_detector_init()` after
     `do_basic_setup()` (post-SMP phase).
   - Risk source: The initial probe occurs directly in a post-SMP
     environment, exposing the race condition.
For the openEuler (4.19/5.10) kernel, the call stack looks like this:
kernel_init()
  -> kernel_init_freeable()
    -> lockup_detector_init() <-- Called after smp_init()
      -> watchdog_nmi_probe()
        -> hardlockup_detector_perf_init()
          -> hardlockup_detector_event_create()
In mainline (ToT), the initial probe (safe) call stack is:
kernel_init()
  -> kernel_init_freeable()
    -> lockup_detector_init() <-- Called before smp_init()
      -> watchdog_hardlockup_probe()
        -> hardlockup_detector_event_create()
However, the asynchronous retry mechanism (commit 930d8f8dbab9) executes the
probe logic in a post-SMP, preemptible context.
For the mainline (ToT) retry path (at risk), the call stack is:
kworker thread
  -> process_one_work()
    -> lockup_detector_delay_init()
      -> watchdog_hardlockup_probe()
        -> hardlockup_detector_event_create()
Thus, `930d8f8dbab9` remains the correct "Fixes" target for ToT.
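
To make the race concrete, here is a rough userspace model (not kernel code).
It assumes the probe parks its test event in a per-CPU slot and then looks the
slot up again to release it. Every name in it (`slot`, `current_cpu`,
`probe_stateful`, `demo_race`) is hypothetical, and "release" only clears a
flag so the result can be inspected safely.

```c
#include <assert.h>

#define NR_CPUS   2
#define POOL_SIZE 8

/* Stand-in for a perf event. Release clears "live" instead of freeing. */
struct event {
	int cpu;
	int live;
};

static struct event pool[POOL_SIZE];
static int pool_used;
static struct event *slot[NR_CPUS];	/* models a per-CPU event pointer */
static int current_cpu;			/* models the CPU the task runs on */

static struct event *event_create(int cpu)
{
	struct event *e;

	assert(pool_used < POOL_SIZE && cpu >= 0 && cpu < NR_CPUS);
	e = &pool[pool_used++];
	e->cpu = cpu;
	e->live = 1;
	return e;
}

static void event_release(struct event *e)
{
	if (e)
		e->live = 0;
}

/*
 * Stateful probe: store the test event in the current CPU's slot, then
 * read the slot back to release it. migrate_to >= 0 simulates the task
 * being moved to another CPU between the two steps.
 */
static void probe_stateful(int migrate_to)
{
	slot[current_cpu] = event_create(current_cpu);
	if (migrate_to >= 0)
		current_cpu = migrate_to;
	event_release(slot[current_cpu]);	/* may be another CPU's event */
	slot[current_cpu] = 0;
}

/* Returns 1 if CPU 1's in-use event was released by a probe started on CPU 0. */
static int demo_race(int migrate_to)
{
	struct event *cpu1_watchdog;

	pool_used = 0;
	cpu1_watchdog = event_create(1);
	slot[0] = 0;
	slot[1] = cpu1_watchdog;	/* CPU 1's watchdog is already running */
	current_cpu = 0;
	probe_stateful(migrate_to);
	return !cpu1_watchdog->live;
}
```

In this model `demo_race(-1)` (no migration) returns 0, while `demo_race(1)`
returns 1: the probe releases the event that CPU 1 is still using and leaves
its own test event behind in CPU 0's slot.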
> OK, fair enough. ...but I'm a bit curious why nobody else saw this
> WARN_ON(). I'm also curious if you have tested the hardlockup detector
> on newer kernels, or if all of your work has been done on 4.19. If all
> your work has been done on 4.19, do we need to find someone to test
> your patch on a newer kernel and make sure it works OK? If you've
> tested on a newer kernel, did the hardlockup detector init from the
> kernel's early-init code, or the retry code?
In newer kernels, when the initial probe fails and falls back to the retry
workqueue (or even during early init if preemption is enabled), the
`WARN_ON(!is_percpu_thread())` in `hardlockup_detector_event_create()` does
trigger, because `watchdog_hardlockup_probe()` is called from a context that
is not bound to a single CPU.
I have verified this patch on the openEuler 4.19 kernel. In our stress
testing, where we start dozens of VMs simultaneously to create high resource
contention, the UAF reproduced consistently without this fix and no longer
reproduces with it.
The v4 patch addresses this by refactoring the creation logic to be stateless
and adding `cpu_hotplug_disable()` to ensure the probed CPU stays alive.
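
As a rough illustration of what "stateless" can mean here, the userspace model
below (not the v4 code) keeps the test event in a local variable and never
touches the per-CPU slots. All names are hypothetical, release only clears a
flag, and the `cpu_hotplug_disable()` part is not modeled.

```c
#include <assert.h>

#define NR_CPUS   2
#define POOL_SIZE 8

/* Stand-in for a perf event. Release clears "live" instead of freeing. */
struct event {
	int cpu;
	int live;
};

static struct event pool[POOL_SIZE];
static int pool_used;
static struct event *slot[NR_CPUS];	/* models a per-CPU event pointer */
static int current_cpu;			/* models the CPU the task runs on */

static struct event *event_create(int cpu)
{
	struct event *e;

	assert(pool_used < POOL_SIZE && cpu >= 0 && cpu < NR_CPUS);
	e = &pool[pool_used++];
	e->cpu = cpu;
	e->live = 1;
	return e;
}

static void event_release(struct event *e)
{
	if (e)
		e->live = 0;
}

/*
 * Stateless probe: sample the CPU once, keep the test event in a local
 * variable, and release exactly that event. migrate_to >= 0 simulates
 * the task moving to another CPU in between.
 */
static void probe_stateless(int migrate_to)
{
	int cpu = current_cpu;
	struct event *e = event_create(cpu);

	if (migrate_to >= 0)
		current_cpu = migrate_to;
	event_release(e);
}

/* Returns 1 if CPU 1's in-use event and both slots are left untouched. */
static int demo_stateless(int migrate_to)
{
	struct event *cpu1_watchdog;

	pool_used = 0;
	cpu1_watchdog = event_create(1);
	slot[0] = 0;
	slot[1] = cpu1_watchdog;	/* CPU 1's watchdog is already running */
	current_cpu = 0;
	probe_stateless(migrate_to);
	return cpu1_watchdog->live && slot[1] == cpu1_watchdog && slot[0] == 0;
}
```

Here `demo_stateless()` returns 1 with or without the simulated migration,
since the release no longer depends on which CPU the task ends up on.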
I'll wait for your further thoughts on v4:
https://lore.kernel.org/all/20260124070814.806828-1-realwujing@gmail.com/
Best regards,
Qiliang