From: Thomas Gleixner <tglx@kernel.org>
To: Jing Wu <realwujing@gmail.com>, Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Boqun Feng <boqun@kernel.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Zqiang <qiang.zhang@linux.dev>,
	Anna-Maria Behnsen <anna-maria@linutronix.de>,
	Tejun Heo <tj@kernel.org>, Jonathan Corbet <corbet@lwn.net>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Shuah Khan <shuah@kernel.org>
Cc: linux-kernel@vger.kernel.org, rcu@vger.kernel.org,
	cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kselftest@vger.kernel.org, Jing Wu <realwujing@gmail.com>,
	Qiliang Yuan <yuanql9@chinatelecom.cn>
Subject: Re: [PATCH v3 10/13] sched: Guard sched_tick_start/stop against uninitialized tick_work_cpu
Date: Thu, 18 Jun 2026 22:50:35 +0200
Message-ID: <87a4srefok.ffs@fw13>
In-Reply-To: <20260618-wujing-dhm-v3-10-28f1a4d83b68@gmail.com>

On Thu, Jun 18 2026 at 11:11, Jing Wu wrote:
> sched_tick_start() and sched_tick_stop() are called during CPU hotplug
> for CPUs not in the HK_TYPE_KERNEL_NOISE set.  They dereference
> tick_work_cpu, which is allocated by sched_tick_offload_init() and only
> called from housekeeping_init() when nohz_full= is present at boot.
>
> When the DHM subsystem first-enables HK_TYPE_KERNEL_NOISE at runtime via
> housekeeping_update_types(), tick_work_cpu remains NULL because
> sched_tick_offload_init() is __init-only and cannot be re-invoked.  A
> subsequent CPU offline/online cycle for an isolated CPU triggers
> WARN_ON_ONCE(!tick_work_cpu) followed by a NULL-pointer dereference in
> per_cpu_ptr(tick_work_cpu, cpu), crashing the kernel.
>
> Since nohz_full= was not active at boot, tick_nohz_full_running remains
> false and the tick-offload infrastructure is never activated; isolated
> CPUs continue to receive their own ticks.  Guard both helpers with an
> additional !tick_work_cpu check so they become no-ops in this case.

This is the same fake functionality as with the tick itself. Seriously?

> -	if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE))
> +	if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE) || !tick_work_cpu)
>  		return;
>  
>  	WARN_ON_ONCE(!tick_work_cpu);
> @@ -5799,7 +5799,7 @@ static void sched_tick_stop(int cpu)
>  	struct tick_work *twork;
>  	int os;
>  
> -	if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE))
> +	if (housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE) || !tick_work_cpu)
>  		return;
>  
>  	WARN_ON_ONCE(!tick_work_cpu);

Brilliant stuff that. Guard against tick_work_cpu == NULL and then keep
the WARN_ON() there, which became completely pointless.
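
To illustrate the point about the leftover warning, here is a minimal
user-space sketch. It is not the kernel code: model_tick_work_cpu,
model_housekeeping_cpu(), model_warn_on() and model_sched_tick_start()
are hypothetical stand-ins, and CPU 0 is assumed to be the only
housekeeping CPU. It only shows that once the early return tests the
pointer, the warning that follows can never see a true condition.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU pointer; NULL until "initialized". */
static int *model_tick_work_cpu;

/* Counts how often the modeled warning fires. */
static int model_warn_hits;

/* Simplified warning helper: counts every hit (no "once" semantics). */
static bool model_warn_on(bool cond)
{
	if (cond)
		model_warn_hits++;
	return cond;
}

/* Assumption for this sketch: only CPU 0 is a housekeeping CPU. */
static bool model_housekeeping_cpu(int cpu)
{
	return cpu == 0;
}

/* Returns true if control reaches the point where the pointer would be used. */
static bool model_sched_tick_start(int cpu)
{
	/* The guard as proposed in the quoted patch. */
	if (model_housekeeping_cpu(cpu) || !model_tick_work_cpu)
		return false;

	/* Dead check: the guard above already returned for a NULL pointer. */
	model_warn_on(!model_tick_work_cpu);
	return true;
}
```

Calling model_sched_tick_start(1) with the pointer still NULL returns
false without touching model_warn_hits, and calling it again after the
pointer is set returns true with the counter still at zero.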

But that's all just mindless tinkering and fixing the symptoms.

If all of this is runtime managed, then all the initialization needs to
be made unconditional. Yes, that wastes a few bytes of memory per CPU if
it's not used, but avoids these completely inconsistent hacks all over
the place and provides a coherent user interface.
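
A rough user-space sketch of what unconditional initialization could
look like follows. Everything in it is an assumption made for
illustration (MODEL_NR_CPUS, struct model_tick_work,
model_tick_offload_init(), model_set_isolated(), model_tick_start());
it is not the scheduler code and not a proposed patch. The idea is only
that the per-CPU state is allocated regardless of the boot command
line, so the hotplug helper needs no NULL special case when isolation
is enabled later at runtime.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for the sketch */

/* Hypothetical per-CPU state, one slot per possible CPU. */
struct model_tick_work {
	int state;	/* 0 = stopped, 1 = running */
};

static struct model_tick_work *model_tick_work_cpu;
static bool model_isolated[MODEL_NR_CPUS];

/* Allocate always, whether or not isolation was requested at boot. */
static int model_tick_offload_init(void)
{
	model_tick_work_cpu = calloc(MODEL_NR_CPUS, sizeof(*model_tick_work_cpu));
	return model_tick_work_cpu ? 0 : -1;
}

/* Runtime toggle standing in for a dynamic housekeeping update. */
static void model_set_isolated(int cpu, bool on)
{
	model_isolated[cpu] = on;
}

/* Returns 1 if the remote tick was started for this CPU, 0 otherwise. */
static int model_tick_start(int cpu)
{
	if (!model_isolated[cpu])
		return 0;

	/* No NULL check needed: the state exists since init. */
	model_tick_work_cpu[cpu].state = 1;
	return 1;
}
```

The cost is one small structure per CPU even when nothing is isolated,
which is the trade-off described above.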

Stop trying to duct tape this in. This needs more thought than just
sprinkling a few works-for-me hacks all over the place.

Thanks,

        tglx

Thread overview: 22+ messages
2026-06-18  3:11 [PATCH v3 00/13] Dynamic Housekeeping Management (DHM) via CPUSets Jing Wu
2026-06-18  3:11 ` [PATCH v3 01/13] sched/isolation: Replace notifier chain with explicit callback interface Jing Wu
2026-06-18  3:11 ` [PATCH v3 02/13] sched/isolation: Add housekeeping_update_types() for kernel-noise masks Jing Wu
2026-06-18  3:11 ` [PATCH v3 03/13] sched/isolation: RCU-protect all housekeeping cpumask readers Jing Wu
2026-06-18  3:11 ` [PATCH v3 04/13] sched/isolation: Fix RCU protection for runtime-mutable cpumask callers Jing Wu
2026-06-18  3:11 ` [PATCH v3 05/13] cpu/hotplug: Reserve CPUHP states for nohz_full and managed IRQ down-paths Jing Wu
2026-06-18 16:06   ` Thomas Gleixner
2026-06-18 21:01     ` Thomas Gleixner
2026-06-18  3:11 ` [PATCH v3 06/13] tick/nohz, context_tracking: Prepare for runtime nohz_full updates Jing Wu
2026-06-18 17:27   ` Thomas Gleixner
2026-06-18 19:49     ` Thomas Gleixner
2026-06-18  3:11 ` [PATCH v3 07/13] rcu/nocb: Add explicit housekeeping callback for runtime NOCB toggling Jing Wu
2026-06-18  3:11 ` [PATCH v3 08/13] genirq: Add explicit housekeeping callback for managed IRQ migration Jing Wu
2026-06-18 20:27   ` Thomas Gleixner
2026-06-18 21:11     ` Thomas Gleixner
2026-06-18  3:11 ` [PATCH v3 09/13] watchdog/lockup_detector: Register housekeeping callback for kernel-noise Jing Wu
2026-06-18  3:11 ` [PATCH v3 10/13] sched: Guard sched_tick_start/stop against uninitialized tick_work_cpu Jing Wu
2026-06-18 20:50   ` Thomas Gleixner [this message]
2026-06-18  3:11 ` [PATCH v3 11/13] cgroup/cpuset: Extend isolated partition to trigger kernel-noise isolation Jing Wu
2026-06-18 20:55   ` Thomas Gleixner
2026-06-18  3:11 ` [PATCH v3 12/13] docs: cgroup-v2: Document kernel-noise isolation via isolated partitions Jing Wu
2026-06-18  3:11 ` [PATCH v3 13/13] selftests/cgroup: Add kernel-noise isolation test to cpuset selftest Jing Wu
